Data frame summarizing information about available probability distributions in R and the EnvStats package, and which distributions have associated functions for estimating distribution parameters.
Distribution.df
A data frame with 35 rows corresponding to 35 different available probability distributions, and 25 columns containing information associated with these probability distributions.
Name
a character vector containing the name of the probability distribution (see the column labeled Name in the table below).
Type
a character vector indicating the type of distribution (see the column labeled Type in the table below). Possible values are "Finite Discrete", "Discrete", "Continuous", and "Mixed".
Support.Min
a character vector indicating the minimum value the random variable can assume (see the column labeled Range in the table below). This is a character vector rather than a numeric vector because some distributions have a lower bound that depends on the value of a distribution parameter. For example, the minimum value for a Uniform distribution is given by the value of the parameter min.
Support.Max
a character vector indicating the maximum value the random variable can assume (see the column labeled Range in the table below). This is a character vector rather than a numeric vector because some distributions have an upper bound that depends on the value of a distribution parameter. For example, the maximum value for a Uniform distribution is given by the value of the parameter max.
Estimation.Method(s)
a character vector indicating the names of the methods available to estimate the distribution parameter(s) (see the column labeled Estimation Method(s) in the table below). Possible values include "mle" (maximum likelihood), "mme" (method of moments), "mmue" (method of moments based on the unbiased estimate of variance), "mvue" (minimum variance unbiased), "qmle" (quasi-mle), etc., or some combination of these. When a single estimator qualifies as more than one kind, a slash (/) joins all the methods that estimator covers. For example, for the Binomial distribution, the sample proportion is the maximum likelihood, method of moments, and minimum variance unbiased estimator, so this method is denoted as "mle/mme/mvue". See the help files for the specific function listed under Estimating Distribution Parameters for an explanation of each of these estimation methods.
Quantile.Estimation.Method(s)
a character vector indicating the names of the methods available to estimate the distribution quantiles. For many distributions, these are the same as Estimation.Method(s). See the help files for the specific function listed under Estimating Distribution Quantiles for an explanation of each of these estimation methods.
Prediction.Interval.Method(s)
a character vector indicating the names of the methods available to create prediction intervals. See the help files for the specific function listed under Prediction Intervals for an explanation of each of these estimation methods.
Singly.Censored.Estimation.Method(s)
a character vector indicating the names of the methods available to estimate the distribution parameter(s) for Type I singly-censored data. See the help files for the specific function listed under Estimating Distribution Parameters in the help file for Censored Data for an explanation of each of these estimation methods.
Multiply.Censored.Estimation.Method(s)
a character vector indicating the names of the methods available to estimate the distribution parameter(s) for Type I multiply-censored data. See the help files for the specific function listed under Estimating Distribution Parameters in the help file for Censored Data for an explanation of each of these estimation methods.
Number.parameters
a numeric vector indicating the number of parameters associated with the distribution (see the column labeled Parameters in the table below).
Parameter.1
the columns labeled Parameter.1, Parameter.2, ..., Parameter.5 are character vectors containing the names of the distribution parameters (see the column labeled Parameters in the table below). If a distribution has \(n\) parameters and \(n < 5\), then the columns labeled Parameter.n+1, ..., Parameter.5 are empty. For example, the Normal distribution has only two parameters associated with it (mean and sd), so the fields in Parameter.3, Parameter.4, and Parameter.5 are empty.
Parameter.2
see Parameter.1
Parameter.3
see Parameter.1
Parameter.4
see Parameter.1
Parameter.5
see Parameter.1
Parameter.1.Min
the columns labeled Parameter.1.Min, Parameter.2.Min, ..., Parameter.5.Min are character vectors containing the minimum values that can be assumed by the distribution parameters (see the column labeled Parameter Range(s) in the table below).
These are character vectors rather than numeric vectors because some parameters have a lower bound of 0 but must be strictly greater than 0 (e.g., the parameter sd for the Normal distribution), in which case the lower bound is .Machine$double.eps, which may vary from machine to machine. Also, some parameters have a lower bound that depends on the value of another parameter. For example, the parameter max for a Uniform distribution is bounded below by the value of the parameter min.
If a distribution has \(n\) parameters and \(n < 5\), then the columns labeled Parameter.n+1.Min, ..., Parameter.5.Min have the missing value code (NA). For example, the Normal distribution has only two parameters associated with it (mean and sd), so the fields in Parameter.3.Min, Parameter.4.Min, and Parameter.5.Min contain NAs.
Parameter.2.Min
see Parameter.1.Min
Parameter.3.Min
see Parameter.1.Min
Parameter.4.Min
see Parameter.1.Min
Parameter.5.Min
see Parameter.1.Min
Parameter.1.Max
the columns labeled Parameter.1.Max, Parameter.2.Max, ..., Parameter.5.Max are character vectors containing the maximum values that can be assumed by the distribution parameters (see the column labeled Parameter Range(s) in the table below).
These are character vectors rather than numeric vectors because some parameters have an upper bound that depends on the value of another parameter. For example, the parameter min for a Uniform distribution is bounded above by the value of the parameter max.
If a distribution has \(n\) parameters and \(n < 5\), then the columns labeled Parameter.n+1.Max, ..., Parameter.5.Max have the missing value code (NA). For example, the Normal distribution has only two parameters associated with it (mean and sd), so the fields in Parameter.3.Max, Parameter.4.Max, and Parameter.5.Max contain NAs.
Parameter.2.Max
see Parameter.1.Max
Parameter.3.Max
see Parameter.1.Max
Parameter.4.Max
see Parameter.1.Max
Parameter.5.Max
see Parameter.1.Max
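Because the support and parameter bounds are stored as character strings, they have to be evaluated before they can be used numerically. The following is a minimal sketch of one way to do that in base R. The helper resolve_bound is hypothetical (it is not part of EnvStats), and the strings passed to it are illustrative stand-ins based on the descriptions above; the exact strings stored in Distribution.df may differ.

```r
# Hypothetical helper (not part of EnvStats): convert a character bound
# into a number, given the current values of the distribution parameters.
resolve_bound <- function(bound, params = list()) {
  # Unused parameter slots are coded as NA
  if (is.na(bound)) return(NA_real_)
  # Evaluate the string with parameter names bound to their values;
  # baseenv() supplies .Machine, Inf, and arithmetic operators.
  eval(parse(text = bound), envir = params, enclos = baseenv())
}

# Illustrative strings only (assumed, not read from Distribution.df)
resolve_bound(".Machine$double.eps")                   # bound for a strictly positive parameter
resolve_bound("min", params = list(min = 0, max = 1))  # bound that depends on another parameter
resolve_bound("Inf")                                   # unbounded above
resolve_bound(NA_character_)                           # unused parameter slot
```

Evaluating strings with eval(parse()) is only reasonable here because the strings come from a trusted package data set; it should not be applied to untrusted input.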
The table below summarizes the probability distributions available in R and EnvStats. For each distribution, there are four associated functions for computing density values, cumulative probabilities, quantiles, and random numbers. The names of these functions take the form dabb, pabb, qabb, and rabb, where abb is the abbreviated name of the distribution (see table below). These functions are described in the help file with the name of the distribution (see the first column of the table below). For example, the help file for Beta describes the behavior of dbeta, pbeta, qbeta, and rbeta.
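As a small illustration of this naming convention, the sketch below builds the four function names from an abbreviation and looks them up. It uses only the Beta functions from the stats package that ships with R; the shape values are arbitrary choices for the example.

```r
# Build the four function names for a given abbreviation
abb <- "beta"
fun_names <- paste0(c("d", "p", "q", "r"), abb)

# Look up the functions by name (stats is attached by default)
funs <- lapply(fun_names, match.fun)
names(funs) <- fun_names

# Density at 0.5 for a Beta(2, 4) distribution
funs$dbeta(0.5, shape1 = 2, shape2 = 4)

# Cumulative probability P(X <= 0.5)
p <- funs$pbeta(0.5, shape1 = 2, shape2 = 4)

# The quantile function inverts the cumulative probability,
# so this recovers 0.5 up to numerical tolerance
funs$qbeta(p, shape1 = 2, shape2 = 4)

# Three random draws
set.seed(1)
funs$rbeta(3, shape1 = 2, shape2 = 4)
```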
For most distributions, there is also an associated function for estimating the distribution parameters, and the names of these functions take the form eabb, where abb is the abbreviated name of the distribution (see table below). All of these functions are listed in the help file Estimating Distribution Parameters. For example, the function ebeta estimates the shape parameters of a Beta distribution based on a random sample of observations from this distribution.
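A minimal sketch of the ebeta example just described, assuming the EnvStats package is installed. The seed, sample size, and true shape values are arbitrary, and the call is guarded so that the code still runs when EnvStats is unavailable. The method name "mle" is one of the methods listed for the Beta distribution in Table 1b.

```r
# Simulate a sample from a Beta(2, 4) distribution (values are arbitrary)
set.seed(250)
x <- rbeta(20, shape1 = 2, shape2 = 4)

# Estimate the shape parameters only if EnvStats is available
est_params <- NULL
if (requireNamespace("EnvStats", quietly = TRUE)) {
  fit <- EnvStats::ebeta(x, method = "mle")
  # The estimated shape1 and shape2 are stored in the parameters component
  est_params <- fit$parameters
  print(est_params)
}
```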
For some distributions, there are functions to estimate distribution parameters based on Type I censored data. The names of these functions take the form eabbSinglyCensored for singly censored data and eabbMultiplyCensored for multiply censored data. All of these functions are listed under the heading Estimating Distribution Parameters in the help file Censored Data.
Table 1a. Available Distributions: Name, Abbreviation, Type, and Range
Name | Abbreviation | Type | Range |
Beta | beta | Continuous | \([0, 1]\) |
Binomial | binom | Finite Discrete | \([0, size]\) (integer) |
Cauchy | cauchy | Continuous | \((-\infty, \infty)\) |
Chi | chi | Continuous | \([0, \infty)\) |
Chi-square | chisq | Continuous | \([0, \infty)\) |
Exponential | exp | Continuous | \([0, \infty)\) |
Extreme Value | evd | Continuous | \((-\infty, \infty)\) |
F | f | Continuous | \([0, \infty)\) |
Gamma | gamma | Continuous | \([0, \infty)\) |
Gamma (Alternative) | gammaAlt | Continuous | \([0, \infty)\) |
Generalized Extreme Value | gevd | Continuous | \((-\infty, \infty)\) for \(shape = 0\); \((-\infty, location + \frac{scale}{shape}]\) for \(shape > 0\); \([location + \frac{scale}{shape}, \infty)\) for \(shape < 0\) |
Geometric | geom | Discrete | \([0, \infty)\) (integer) |
Hypergeometric | hyper | Finite Discrete | \([0, min(k,m)]\) (integer) |
Logistic | logis | Continuous | \((-\infty, \infty)\) |
Lognormal | lnorm | Continuous | \([0, \infty)\) |
Lognormal (Alternative) | lnormAlt | Continuous | \([0, \infty)\) |
Lognormal Mixture | lnormMix | Continuous | \([0, \infty)\) |
Lognormal Mixture (Alternative) | lnormMixAlt | Continuous | \([0, \infty)\) |
Three-Parameter Lognormal | lnorm3 | Continuous | \([threshold, \infty)\) |
Truncated Lognormal | lnormTrunc | Continuous | \([min, max]\) |
Truncated Lognormal (Alternative) | lnormTruncAlt | Continuous | \([min, max]\) |
Negative Binomial | nbinom | Discrete | \([0, \infty)\) (integer) |
Normal | norm | Continuous | \((-\infty, \infty)\) |
Normal Mixture | normMix | Continuous | \((-\infty, \infty)\) |
Truncated Normal | normTrunc | Continuous | \([min, max]\) |
Pareto | pareto | Continuous | \([location, \infty)\) |
Poisson | pois | Discrete | \([0, \infty)\) (integer) |
Student's t | t | Continuous | \((-\infty, \infty)\) |
Triangular | tri | Continuous | \([min, max]\) |
Uniform | unif | Continuous | \([min, max]\) |
Weibull | weibull | Continuous | \([0, \infty)\) |
Wilcoxon Rank Sum | wilcox | Finite Discrete | \([0, m n]\) (integer) |
Zero-Modified Lognormal (Delta) | zmlnorm | Mixed | \([0, \infty)\) |
Zero-Modified Lognormal (Delta) (Alternative) | zmlnormAlt | Mixed | \([0, \infty)\) |
Zero-Modified Normal | zmnorm | Mixed | \((-\infty, \infty)\) |
Table 1b. Available Distributions: Name, Parameters, Parameter Default Values, Parameter Ranges, Estimation Method(s)
Name | Parameter(s) | Default Value(s) | Parameter Range(s) | Estimation Method(s) |
Beta | shape1 | | \((0, \infty)\) | mle, mme, mmue |
 | shape2 | | \((0, \infty)\) | |
 | ncp | 0 | \((0, \infty)\) | |
Binomial | size | | \([0, \infty)\) | mle/mme/mvue |
 | prob | | \([0, 1]\) | |
Cauchy | location | 0 | \((-\infty, \infty)\) | |
 | scale | 1 | \((0, \infty)\) | |
Chi | df | | \((0, \infty)\) | |
Chi-square | df | | \((0, \infty)\) | |
 | ncp | 0 | \((-\infty, \infty)\) | |
Exponential | rate | 1 | \((0, \infty)\) | mle/mme |
Extreme Value | location | 0 | \((-\infty, \infty)\) | mle, mme, mmue, pwme |
 | scale | 1 | \((0, \infty)\) | |
F | df1 | | \((0, \infty)\) | |
 | df2 | | \((0, \infty)\) | |
 | ncp | 0 | \((0, \infty)\) | |
Gamma | shape | | \((0, \infty)\) | mle, bcmle, mme, mmue |
 | scale | 1 | \((0, \infty)\) | |
Gamma (Alternative) | mean | | \((0, \infty)\) | mle, bcmle, mme, mmue |
 | cv | 1 | \((0, \infty)\) | |
Generalized Extreme Value | location | 0 | \((-\infty, \infty)\) | mle, pwme, tsoe |
 | scale | 1 | \((0, \infty)\) | |
 | shape | 0 | \((-\infty, \infty)\) | |
Geometric | prob | | \((0, 1)\) | mle/mme, mvue |
Hypergeometric | m | | \([0, \infty)\) | mle, mvue |
 | n | | \([0, \infty)\) | |
 | k | | \([1, m+n]\) | |
Logistic | location | 0 | \((-\infty, \infty)\) | mle, mme, mmue |
 | scale | 1 | \((0, \infty)\) | |
Lognormal | meanlog | 0 | \((-\infty, \infty)\) | mle/mme, mvue |
 | sdlog | 1 | \((0, \infty)\) | |
Lognormal (Alternative) | mean | exp(1/2) | \((0, \infty)\) | mle, mme, mmue, mvue, qmle |
 | cv | sqrt(exp(1)-1) | \((0, \infty)\) | |
Lognormal Mixture | meanlog1 | 0 | \((-\infty, \infty)\) | |
 | sdlog1 | 1 | \((0, \infty)\) | |
 | meanlog2 | 0 | \((-\infty, \infty)\) | |
 | sdlog2 | 1 | \((0, \infty)\) | |
 | p.mix | 0.5 | \([0, 1]\) | |
Lognormal Mixture (Alternative) | mean1 | exp(1/2) | \((0, \infty)\) | |
 | cv1 | sqrt(exp(1)-1) | \((0, \infty)\) | |
 | mean2 | exp(1/2) | \((0, \infty)\) | |
 | cv2 | sqrt(exp(1)-1) | \((0, \infty)\) | |
 | p.mix | 0.5 | \([0, 1]\) | |
Three-Parameter Lognormal | meanlog | 0 | \((-\infty, \infty)\) | lmle, mme, mmue, mmme, royston.skew, zero.skew |
 | sdlog | 1 | \((0, \infty)\) | |
 | threshold | 0 | \((-\infty, \infty)\) | |
Truncated Lognormal | meanlog | 0 | \((-\infty, \infty)\) | |
 | sdlog | 1 | \((0, \infty)\) | |
 | min | 0 | \([0, max)\) | |
 | max | Inf | \((min, \infty)\) | |
Truncated Lognormal (Alternative) | mean | exp(1/2) | \((0, \infty)\) | |
 | cv | sqrt(exp(1)-1) | \((0, \infty)\) | |
 | min | 0 | \([0, max)\) | |
 | max | Inf | \((min, \infty)\) | |
Negative Binomial | size | | \([1, \infty)\) | mle/mme, mvue |
 | prob | | \((0, 1]\) | |
 | mu | | \((0, \infty)\) | |
Normal | mean | 0 | \((-\infty, \infty)\) | mle/mme, mvue |
 | sd | 1 | \((0, \infty)\) | |
Normal Mixture | mean1 | 0 | \((-\infty, \infty)\) | |
 | sd1 | 1 | \((0, \infty)\) | |
 | mean2 | 0 | \((-\infty, \infty)\) | |
 | sd2 | 1 | \((0, \infty)\) | |
 | p.mix | 0.5 | \([0, 1]\) | |
Truncated Normal | mean | 0 | \((-\infty, \infty)\) | |
 | sd | 1 | \((0, \infty)\) | |
 | min | -Inf | \((-\infty, max)\) | |
 | max | Inf | \((min, \infty)\) | |
Pareto | location | | \((0, \infty)\) | lse, mle |
 | shape | 1 | \((0, \infty)\) | |
Poisson | lambda | | \((0, \infty)\) | mle/mme/mvue |
Student's t | df | | \((0, \infty)\) | |
 | ncp | 0 | \((-\infty, \infty)\) | |
Triangular | min | 0 | \((-\infty, max)\) | |
 | max | 1 | \((min, \infty)\) | |
 | mode | 0.5 | \((min, max)\) | |
Uniform | min | 0 | \((-\infty, max)\) | mle, mme, mmue |
 | max | 1 | \((min, \infty)\) | |
Weibull | shape | | \((0, \infty)\) | mle, mme, mmue |
 | scale | 1 | \((0, \infty)\) | |
Wilcoxon Rank Sum | m | | \([1, \infty)\) | |
 | n | | \([1, \infty)\) | |
Zero-Modified Lognormal (Delta) | meanlog | 0 | \((-\infty, \infty)\) | mvue |
 | sdlog | 1 | \((0, \infty)\) | |
 | p.zero | 0.5 | \([0, 1]\) | |
Zero-Modified Lognormal (Delta) (Alternative) | mean | exp(1/2) | \((0, \infty)\) | mvue |
 | cv | sqrt(exp(1)-1) | \((0, \infty)\) | |
 | p.zero | 0.5 | \([0, 1]\) | |
Zero-Modified Normal | mean | 0 | \((-\infty, \infty)\) | mvue |
 | sd | 1 | \((0, \infty)\) | |
 | p.zero | 0.5 | \([0, 1]\) | |
Millard, S.P. (2013). EnvStats: An R Package for Environmental Statistics. Springer, New York. https://link.springer.com/book/10.1007/978-1-4614-8456-1.