Goodness-of-fit statistics are computed. The Chi-squared statistic is computed using cells defined
by the argument chisqbreaks, or cells automatically defined from the data so that each cell holds
roughly the same number of observations, roughly equal to the argument meancount, or slightly more
if there are ties.
The choice to define cells from the empirical distribution (data), and not from the
theoretical distribution, was made to enable the comparison of Chi-squared values obtained
with different distributions fitted on the same data set.
If chisqbreaks and meancount are both omitted, meancount is fixed in order to obtain
roughly \((4n)^{2/5}\) cells, with \(n\) the length of the data set (Vose, 2000).
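
The cell definition described above can be sketched as follows. This is a minimal illustration in Python, not the package's implementation: the function names are hypothetical, and both the rounding and the rule "meancount = n divided by the number of cells" are assumptions made for the example.

```python
def default_meancount(n):
    """Assumed default: aim for roughly (4n)^(2/5) cells,
    hence about n / (4n)^(2/5) observations per cell."""
    n_cells = (4 * n) ** (2 / 5)
    return max(1, round(n / n_cells))


def cells_from_data(data, meancount):
    """Group sorted data into cells of at least `meancount` observations.

    Tied values are never split across two cells, so a cell may hold
    slightly more than `meancount` observations. Returns (breaks, counts),
    where `breaks` are the upper bounds of all cells except the last.
    """
    xs = sorted(data)
    n = len(xs)
    breaks, counts = [], []
    start = 0
    # Cut a new cell only while enough data remain for another full cell.
    while n - start >= 2 * meancount:
        end = start + meancount
        # Extend the cell over any run of tied values.
        while end < n and xs[end] == xs[end - 1]:
            end += 1
        if n - end < meancount:
            break  # too few observations left: merge them into the last cell
        breaks.append(xs[end - 1])
        counts.append(end - start)
        start = end
    counts.append(n - start)
    return breaks, counts


if __name__ == "__main__":
    sample = [1, 1, 1, 1, 2, 3, 4, 5, 6, 7]
    print(cells_from_data(sample, meancount=3))
```

Because the breaks depend only on the data, the same cells can be reused for every candidate distribution fitted to that data set.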
The Chi-squared statistic is not computed if the program fails
to define enough cells because the data set is too small. When the Chi-squared statistic is computed,
and if the degrees of freedom (number of cells - number of parameters - 1) of the corresponding
distribution are strictly positive, the p-value of the Chi-squared test is returned.
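
As an illustration of the statistic and its degrees of freedom, here is a minimal Python sketch. The function name and its arguments are hypothetical; the cell probabilities are assumed to come from the fitted distribution evaluated on the cells. The p-value is left out because it needs the Chi-squared distribution function, which the Python standard library does not provide.

```python
def chisq_statistic(observed, cell_probs, n_params):
    """Pearson Chi-squared statistic and its degrees of freedom.

    observed   : observed counts per cell
    cell_probs : theoretical probability of each cell under the fitted distribution
    n_params   : number of parameters estimated from the data
    """
    n = sum(observed)
    stat = sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, cell_probs))
    # Degrees of freedom: number of cells - number of parameters - 1.
    df = len(observed) - n_params - 1
    return stat, df


if __name__ == "__main__":
    stat, df = chisq_statistic([10, 20, 30], [1 / 3, 1 / 3, 1 / 3], n_params=1)
    # A p-value is meaningful only when df is strictly positive.
    print(stat, df, df > 0)
```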
For continuous distributions, Kolmogorov-Smirnov, Cramer-von Mises and
Anderson-Darling statistics are also computed, as defined by Stephens (1986).
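
For reference, the three statistics can be computed from the fitted cumulative distribution function with the usual computing formulas. The sketch below is written in Python under that assumption; the function name and the `cdf` argument are hypothetical and this is not the package's code.

```python
import math


def edf_statistics(data, cdf):
    """Kolmogorov-Smirnov, Cramer-von Mises and Anderson-Darling statistics.

    `cdf` is the fitted cumulative distribution function; it must return
    values strictly between 0 and 1 on the data for the logarithms to exist.
    """
    u = sorted(cdf(x) for x in data)  # u[i] = F(x_(i+1)), in increasing order
    n = len(u)
    # Kolmogorov-Smirnov: largest gap between empirical and fitted CDF.
    d_plus = max((i + 1) / n - u[i] for i in range(n))
    d_minus = max(u[i] - i / n for i in range(n))
    ks = max(d_plus, d_minus)
    # Cramer-von Mises: W^2 = 1/(12n) + sum (u_i - (2i-1)/(2n))^2.
    cvm = 1 / (12 * n) + sum((u[i] - (2 * i + 1) / (2 * n)) ** 2 for i in range(n))
    # Anderson-Darling: A^2 = -n - (1/n) sum (2i-1) [ln u_i + ln(1 - u_{n+1-i})].
    ad = -n - sum(
        (2 * i + 1) * (math.log(u[i]) + math.log(1 - u[n - 1 - i]))
        for i in range(n)
    ) / n
    return ks, cvm, ad


if __name__ == "__main__":
    # Exponential distribution with an assumed rate of 1, for illustration only.
    sample = [0.2, 0.5, 0.9, 1.4, 2.3]
    print(edf_statistics(sample, lambda x: 1 - math.exp(-x)))
```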
An approximate Kolmogorov-Smirnov test is
performed by assuming that the distribution parameters are known. The critical value defined by
Stephens (1986) for a completely specified distribution is used to decide whether to reject the
distribution at the 0.05 significance level. Because of this approximation, the test result
(the decision to reject the distribution or not) is returned only for data sets with more
than 30 observations. Note that this approximate test may be too conservative.
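
A sketch of such a decision rule is given below in Python. It assumes the commonly tabulated form for a completely specified distribution, in which the modified statistic D(sqrt(n) + 0.12 + 0.11/sqrt(n)) is compared with 1.358 at the 5% level; the constants, the function name and the returned labels are assumptions for illustration and should be checked against Stephens (1986).

```python
import math


def ks_approx_test(ks, n, critical=1.358, min_n=30):
    """Approximate Kolmogorov-Smirnov decision at the 0.05 level.

    Treats the parameters as known (completely specified distribution).
    Returns None when the data set has `min_n` observations or fewer,
    since no decision is reported in that case.
    """
    if n <= min_n:
        return None
    # Modified statistic (assumed form) compared with the 5% critical value.
    modified = ks * (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n))
    return "rejected" if modified > critical else "not rejected"


if __name__ == "__main__":
    print(ks_approx_test(0.05, 100))
    print(ks_approx_test(0.20, 100))
    print(ks_approx_test(0.20, 20))
```

Since the parameters are in fact estimated from the same data, the true distribution of the statistic is more concentrated than this rule assumes, which is why the test tends to be conservative.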
For data sets with more than 5 observations and for distributions for
which the test is described by Stephens (1986) for maximum likelihood estimations
("exp", "cauchy", "gamma" and "weibull"),
the Cramer-von Mises and Anderson-Darling tests are performed as described by Stephens (1986).
Those tests take into
account the fact that the parameters are not known but estimated from the data by maximum likelihood.
The result is the
decision to reject or not the distribution at the significance level 0.05. Those tests are available
only for maximum likelihood estimations.
Only recommended statistics are automatically printed, i.e.
Cramer-von Mises, Anderson-Darling and Kolmogorov-Smirnov statistics for continuous distributions and
Chi-squared statistics for discrete ones ("binom", "nbinom", "geom", "hyper" and "pois").
Results of the tests are not printed but stored in the output of the function.