fitdistrplus-package: Overview of the fitdistrplus package

Description

The idea of this package emerged in 2008 from a collaboration between JB Denis, R Pouillot and ML Delignette, who at that time worked in the area of quantitative risk assessment. The implementation of this package was part of a more general project named "Risk assessment with R", which gathered different packages and was hosted on R-forge.

The fitdistrplus package was first written by ML Delignette-Muller, made available on CRAN in 2009, and presented at the 2009 useR conference in Rennes. A few months later, C Dutang joined the project and began contributing to the implementation of the package. The package was also presented at the 2011 useR conference and at the 2èmes Rencontres R in 2013 (https://r2013-lyon.sciencesconf.org/).

Three vignettes are available within the package:

a general overview of the package published in the Journal of Statistical Software (doi:10.18637/jss.v064.i04),
an HTML document answering the most Frequently Asked Questions,
an HTML document presenting a benchmark of the optimization algorithms used to estimate parameters.

The fitdistrplus package is a general package that aims to help fit univariate parametric distributions to censored or non-censored data. The two main functions are fitdist, for fitting non-censored data, and fitdistcens, for fitting censored data.
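
As a minimal sketch of the two entry points, the following fits a log-normal distribution to simulated data, once fully observed and once with values above a threshold treated as right-censored. The simulated sample, the threshold of 5 and the object names are illustrative assumptions. fitdistcens expects a data frame with columns left and right, where NA in right marks a right-censored observation.

```r
library(fitdistrplus)

set.seed(1)
# Simulated sample (illustrative, not taken from the package)
x <- rlnorm(100, meanlog = 1, sdlog = 0.5)

# Non-censored fit (maximum likelihood by default)
fit_nc <- fitdist(x, "lnorm")

# Build a censored version: values above 5 are only known to exceed 5.
# Exact observations have left == right; right-censored ones have right = NA.
cens <- data.frame(left = pmin(x, 5), right = ifelse(x > 5, NA, x))
fit_c <- fitdistcens(cens, "lnorm")

# Compare the parameter estimates from both fits
rbind(non_censored = coef(fit_nc), censored = coef(fit_c))
```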

The choice of candidate distributions to fit can be guided by the functions descdist and plotdist for non-censored data, and plotdistcens for censored data.
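
The exploratory step might look like the sketch below, which assumes a simulated gamma sample and an arbitrary censoring threshold of 8. descdist draws a skewness-kurtosis graph and returns summary statistics, while plotdist and plotdistcens display the empirical distribution.

```r
library(fitdistrplus)

set.seed(2)
# Simulated non-censored sample (illustrative)
x <- rgamma(200, shape = 2, rate = 0.5)

# Skewness-kurtosis graph, with 100 bootstrap samples overlaid
d <- descdist(x, boot = 100)

# Histogram and empirical CDF of the raw data
plotdist(x, histo = TRUE, demp = TRUE)

# Censored version of the same data: values above 8 are right-censored
cens <- data.frame(left = pmin(x, 8), right = ifelse(x > 8, NA, x))
plotdistcens(cens)
```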

Using functions fitdist and fitdistcens, different methods can be used to estimate the distribution parameters:

maximum likelihood estimation by default (mledist),
moment matching estimation (mmedist),
quantile matching estimation (qmedist),
maximum goodness-of-fit estimation (mgedist).
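
As an illustration of the four methods listed above, the sketch below fits a gamma distribution to the same simulated sample with each of them through the method argument of fitdist. The data, the probabilities passed to probs and the choice of the Cramer-von Mises distance for gof are assumptions made for the example.

```r
library(fitdistrplus)

set.seed(3)
# Simulated sample (illustrative)
x <- rgamma(200, shape = 2, rate = 0.5)

fit_mle <- fitdist(x, "gamma")                        # maximum likelihood (default)
fit_mme <- fitdist(x, "gamma", method = "mme")        # moment matching
fit_qme <- fitdist(x, "gamma", method = "qme",
                   probs = c(1/3, 2/3))               # quantile matching, one probability per parameter
fit_mge <- fitdist(x, "gamma", method = "mge",
                   gof = "CvM")                       # maximum goodness-of-fit

# One row of estimates per method
estimates <- rbind(mle = coef(fit_mle), mme = coef(fit_mme),
                   qme = coef(fit_qme), mge = coef(fit_mge))
estimates
```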

For classical distributions, initial values are calculated automatically if the user does not provide them. The graphical functions plotdist and plotdistcens can help with manual calibration of initial values for the parameters of non-classical distributions. The function prefit helps define good starting values in the special case of constrained parameters. When maximum likelihood is the chosen estimation method, the function llplot can be used to visualize log-likelihood contours.
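
A small sketch of supplying starting values explicitly and then inspecting the likelihood surface is given below. The Weibull sample and the particular starting values are assumptions chosen for illustration; for a classical distribution such as the Weibull, the start argument could equally be omitted.

```r
library(fitdistrplus)

set.seed(4)
# Simulated sample (illustrative)
x <- rweibull(150, shape = 2, scale = 10)

# User-supplied starting values, passed as a named list
fit_w <- fitdist(x, "weibull", start = list(shape = 1, scale = mean(x)))
summary(fit_w)

# Log-likelihood contours around the maximum likelihood estimate
llplot(fit_w)
```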

The goodness-of-fit of fitted distributions (a single fit or multiple fits) can be explored using different graphical functions (cdfcomp, denscomp, qqcomp and ppcomp for non-censored data and cdfcompcens for censored data). Goodness-of-fit statistics are also provided for non-censored data using function gofstat.
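
The comparison of several candidate fits could be sketched as follows, assuming a simulated sample and three candidate distributions picked for the example. The four graphical functions accept a list of fits, and gofstat returns statistics and information criteria for each of them.

```r
library(fitdistrplus)

set.seed(5)
# Simulated sample (illustrative)
x <- rgamma(200, shape = 3, rate = 1)

# Three candidate fits on the same data
fits <- list(fitdist(x, "weibull"), fitdist(x, "gamma"), fitdist(x, "lnorm"))
labels <- c("Weibull", "gamma", "lognormal")

# Four goodness-of-fit plots on one page
op <- par(mfrow = c(2, 2))
cdfcomp(fits, legendtext = labels)
denscomp(fits, legendtext = labels)
qqcomp(fits, legendtext = labels)
ppcomp(fits, legendtext = labels)
par(op)

# Goodness-of-fit statistics and criteria for all fits
gof <- gofstat(fits, fitnames = labels)
gof
gof$aic
```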

Bootstrapping is provided to quantify the uncertainty in parameter estimates (functions bootdist and bootdistcens) and in the CDF or quantiles estimated from the fitted distribution (quantile and CIcdfplot).
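
A minimal sketch of this bootstrap workflow on non-censored data follows. The simulated sample, the reduced number of iterations (200, to keep the example fast) and the chosen probabilities are assumptions made for illustration.

```r
library(fitdistrplus)

set.seed(6)
# Simulated sample (illustrative)
x <- rgamma(100, shape = 2, rate = 0.5)
fit <- fitdist(x, "gamma")

# Parametric bootstrap of the fitted parameters
b <- bootdist(fit, bootmethod = "param", niter = 200)
summary(b)

# Bootstrap confidence intervals on selected quantiles
q <- quantile(b, probs = c(0.05, 0.5, 0.95))
q

# Confidence band around the fitted CDF, computed on quantiles
CIcdfplot(b, CI.output = "quantile")
```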

Author

Marie-Laure Delignette-Muller and Christophe Dutang.