Search for good starting values
prefit(data, distr, method = c("mle", "mme", "qme", "mge"),
feasible.par, memp=NULL, order=NULL,
probs=NULL, qtype=7, gof=NULL, fix.arg=NULL, lower,
upper, weights=NULL, silent=TRUE, …)
A numeric vector.
A character string "name" naming a distribution for which the corresponding density function dname, the corresponding distribution function pname and the corresponding quantile function qname must be defined, or directly the density function.
A character string coding for the fitting method: "mle" for 'maximum likelihood estimation', "mme" for 'moment matching estimation', "qme" for 'quantile matching estimation' and "mge" for 'maximum goodness-of-fit estimation'.
A named list giving the initial values of parameters of the named distribution, or a function of data computing initial values and returning a named list. This argument may be omitted (default) for some distributions for which reasonable starting values are computed (see the 'details' section of mledist). It may not be taken into account for closed-form formulas.
A numeric vector for the moment order(s). The length of this vector must be equal to the number of parameters to estimate.
A function implementing empirical moments, raw or centered, which must be consistent with the distr argument (and the weights argument).
A numeric vector of the probabilities for which the quantile matching is done. The length of this vector must be equal to the number of parameters to estimate.
The quantile type used by the R quantile function to compute the empirical quantiles (the default, 7, corresponds to the default quantile method in R).
A character string coding for the name of the goodness-of-fit distance used: "CvM" for Cramer-von Mises distance, "KS" for Kolmogorov-Smirnov distance, "AD" for Anderson-Darling distance, "ADR", "ADL", "AD2R", "AD2L" and "AD2" for variants of Anderson-Darling distance described by Luceno (2006).
An optional named list giving the values of fixed parameters of the named distribution, or a function of data computing (fixed) parameter values and returning a named list. Parameters with fixed values are thus NOT estimated by the fitting procedure. The use of this argument is not possible if method="mme" and a closed-form formula is used.
An optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector. If non-NULL, weighted MLE is used, otherwise ordinary MLE.
A logical to remove or show warnings.
Lower bounds on the parameters.
Upper bounds on the parameters.
A named list.
Good starting values are sought by transforming the parameters of the probability distribution from their constrained interval to the real line:
positive parameters in \((0,\infty)\) are transformed using the logarithm \(\log(x)\) (typically the scale parameter sd of a normal distribution, see Normal),
parameters in \((1,\infty)\) are transformed using the function \(\log(x-1)\),
probability parameters in \((0,1)\) are transformed using the logit function \(\log(x/(1-x))\) (typically the parameter prob of a geometric distribution, see Geometric),
negative probability parameters in \((-1,0)\) are transformed using the function \(\log(-x/(1+x))\),
real parameters are not transformed at all (typically the mean of a normal distribution, see Normal).
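The transformations listed above, together with their inverses, can be sketched in base R as follows. This is only an illustration: the lists to_real and from_real and their element names are hypothetical and are not part of fitdistrplus, whose internal code may differ.

```r
# Illustrative helpers (hypothetical names, not part of fitdistrplus).
# to_real maps a constrained interval to the real line.
to_real <- list(
  positive  = function(x) log(x),            # (0, Inf)
  above_one = function(x) log(x - 1),        # (1, Inf)
  unit      = function(x) log(x / (1 - x)),  # (0, 1), logit
  neg_unit  = function(x) log(-x / (1 + x)), # (-1, 0)
  real      = function(x) x                  # no transformation
)

# from_real holds the corresponding inverse maps.
from_real <- list(
  positive  = function(y) exp(y),
  above_one = function(y) 1 + exp(y),
  unit      = function(y) 1 / (1 + exp(-y)),
  neg_unit  = function(y) -1 / (1 + exp(-y)),
  real      = function(y) y
)

# Round trip: transform one value per constraint type and map it back.
x0 <- c(positive = 2.5, above_one = 3, unit = 0.2, neg_unit = -0.7, real = -4)
back <- sapply(names(x0), function(k) from_real[[k]](to_real[[k]](x0[[k]])))
all.equal(unname(back), unname(x0))
```

Working on the transformed scale lets an unconstrained optimizer propose any real value while the back-transformed parameter always stays inside its admissible interval.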
Once the parameters are transformed, an optimization is carried out by a quasi-Newton algorithm (typically BFGS), and the resulting values are then transformed back to the original parameter scale.
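The transform, optimize and back-transform sequence can be sketched with base R only. The example below is a minimal illustration for a gamma likelihood, assuming a shape/rate parameterization and a hand-written objective nll (a hypothetical name); it is not the code used inside prefit.

```r
set.seed(1)
x <- rgamma(1e3, shape = 5/2, rate = 7/2)

# Negative log-likelihood written in terms of
# theta = (log(shape), log(rate)), so theta is unconstrained.
nll <- function(theta) {
  -sum(dgamma(x, shape = exp(theta[1]), rate = exp(theta[2]), log = TRUE))
}

# Unconstrained quasi-Newton (BFGS) search, started from shape = 3, rate = 3.
opt <- optim(log(c(3, 3)), nll, method = "BFGS")

# Back-transform to the original parameter scale and return a named list.
est <- setNames(as.list(exp(opt$par)), c("shape", "rate"))
est
```

Because both gamma parameters are positive, the logarithm is the only transformation needed here; a distribution with a probability parameter would use the logit in the same way.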
Delignette-Muller ML and Dutang C (2015), fitdistrplus: An R Package for Fitting Distributions. Journal of Statistical Software, 64(4), 1-34.
See mledist, mmedist, qmedist, mgedist for details on parameter estimation.
See fitdist for the main procedure.
# (1) fit of a gamma distribution by maximum likelihood estimation
#
library(fitdistrplus)
x <- rgamma(1e3, 5/2, 7/2)
prefit(x, "gamma", "mle", list(shape=3, scale=3), lower=-Inf, upper=Inf)