goodfit: Goodness-of-fit Tests for Discrete Data

Description

Fits a discrete (count data) distribution for goodness-of-fit tests.

Usage

goodfit(x, type = c("poisson", "binomial", "nbinomial"),
  method = c("ML", "MinChisq"), par = NULL)
## S3 method for class 'goodfit':
predict(object, newcount = NULL, type = c("response", "prob"), ...)

Arguments

x

either a vector of counts, a 1-way table of frequencies of counts, or a data frame or matrix with frequencies in the first column and the corresponding counts in the second column.

type

a character string indicating which distribution should be fit (for goodfit) or indicating the type of prediction (fitted response or probabilities in predict) respectively.

method

a character string indicating whether the distribution should be fit via ML (Maximum Likelihood) or Minimum Chi-squared.

par

a named list giving the distribution parameters (named as in the corresponding density function). If set to NULL, the default, the parameters are estimated. If the parameter size is not specified for type "binomial", it is not estimated but taken as the maximum count.

object

an object of class "goodfit".

newcount

a vector of counts. By default the counts stored in object are used, i.e., the fitted values are computed. These can also be extracted by fitted(object).

...

currently not used.
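
As a minimal sketch of the arguments described above, the following shows the same data passed to goodfit in each accepted form of x, followed by a call to predict with newcount. The data values and the object names (counts, tab, df, gf_vec, and so on) are made up for illustration, and the snippet assumes the vcd package, which provides goodfit, is installed.

```r
library(vcd)  # goodfit() is provided by the vcd package

# Raw counts: each element is one observed count (illustrative values)
counts <- c(0, 0, 1, 1, 1, 2, 2, 3, 0, 1, 4, 2)

# The same data as a 1-way table of frequencies of counts
tab <- table(counts)

# The same data as a data frame: frequencies first, counts second
df <- data.frame(freq = as.vector(tab), count = as.numeric(names(tab)))

gf_vec <- goodfit(counts, type = "poisson")
gf_tab <- goodfit(tab, type = "poisson")
gf_df  <- goodfit(df, type = "poisson")

# All three input forms should lead to the same estimated parameter
c(gf_vec$par$lambda, gf_tab$par$lambda, gf_df$par$lambda)

# Predictions for counts 0..6: expected frequencies, then probabilities
predict(gf_vec, newcount = 0:6, type = "response")
predict(gf_vec, newcount = 0:6, type = "prob")
```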

Value

A list of class "goodfit" with elements:

observed

observed frequencies.

count

corresponding counts.

fitted

expected frequencies (fitted by ML).

type

a character string indicating the distribution fitted.

method

a character string indicating the fitting method (can be either "ML", "MinChisq" or "fixed" if the parameters were specified).

df

degrees of freedom.

par

a named list of the (estimated) distribution parameters.
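
The elements of the returned list can be inspected directly. The sketch below uses simulated Poisson data (the seed, sample size and rate are arbitrary choices for illustration) and assumes the vcd package is installed.

```r
library(vcd)  # goodfit() is provided by the vcd package

set.seed(1)                    # arbitrary seed, for reproducibility only
y <- rpois(100, lambda = 2)    # simulated counts (illustrative)

gf <- goodfit(y, type = "poisson", method = "ML")

gf$count      # the counts 0, 1, 2, ...
gf$observed   # observed frequency of each count
gf$fitted     # expected frequency of each count
gf$type       # distribution fitted
gf$method     # fitting method
gf$df         # degrees of freedom
gf$par        # named list of estimated parameters

# fitted() extracts the same expected frequencies
fitted(gf)
```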

Details

goodfit essentially computes the fitted values of a discrete distribution (either Poisson, binomial or negative binomial) to the count data given in x. If the parameters are not specified, they are estimated by either ML or Minimum Chi-squared.
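
To make the idea of fitted values concrete, here is a base R sketch of an ML Poisson fit done by hand. It is not the package's implementation, only an illustration of the computation under the assumption that the ML estimate of lambda is the frequency-weighted mean count. The frequencies and variable names are illustrative.

```r
# Illustrative frequencies for the counts 0..4
count <- 0:4
freq  <- c(109, 65, 22, 3, 1)

# ML estimate of the Poisson rate: the frequency-weighted mean count
lambda_hat <- weighted.mean(count, freq)

# Expected frequencies: sample size times the fitted Poisson probabilities
fitted_freq <- sum(freq) * dpois(count, lambda = lambda_hat)

# Observed and fitted frequencies side by side
round(cbind(count, observed = freq, fitted = fitted_freq), 2)
```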

par should be a named list specifying the parameters lambda for "poisson", and prob and size for "binomial" or "nbinomial", respectively. If size is not specified for "binomial", it is not estimated but taken as the maximum count.
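
The treatment of size for the binomial case can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: size is fixed at the maximum count, and prob is then estimated by ML as the mean count divided by size. The frequencies and variable names are made up.

```r
# Illustrative frequencies for the counts 0..6
count <- 0:6
freq  <- c(2, 8, 22, 35, 21, 10, 2)

# size is not estimated: it is taken as the maximum count
size_fixed <- max(count)

# Given size, the ML estimate of prob is the mean count divided by size
prob_hat <- weighted.mean(count, freq) / size_fixed

# Expected frequencies under the fitted binomial distribution
fitted_freq <- sum(freq) * dbinom(count, size = size_fixed, prob = prob_hat)

round(cbind(count, observed = freq, fitted = fitted_freq), 2)
```

In goodfit itself, the corresponding fixed values would be supplied through par, for example par = list(size = 6) as in the Examples section.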

The corresponding Pearson Chi-squared or likelihood ratio statistic is computed and reported with its p value by the summary method. The plot method produces a rootogram of the observed and fitted values.
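
For orientation, the two statistics can be computed by hand in base R. The sketch below is not the summary method itself. In particular, the degrees of freedom used here (number of count categories, minus one, minus the number of estimated parameters) are an assumption about the usual convention, and the data are illustrative.

```r
# Illustrative frequencies for the counts 0..4 and a Poisson fit by ML
count    <- 0:4
freq     <- c(109, 65, 22, 3, 1)
n        <- sum(freq)
lambda   <- weighted.mean(count, freq)
expected <- n * dpois(count, lambda)

# Pearson Chi-squared statistic
X2 <- sum((freq - expected)^2 / expected)

# Likelihood ratio statistic; cells with zero observed frequency contribute 0
pos <- freq > 0
G2  <- 2 * sum(freq[pos] * log(freq[pos] / expected[pos]))

# Assumed degrees of freedom: categories - 1 - estimated parameters (here 1)
df <- length(count) - 1 - 1

# Upper-tail p values from the Chi-squared reference distribution
p_X2 <- pchisq(X2, df = df, lower.tail = FALSE)
p_G2 <- pchisq(G2, df = df, lower.tail = FALSE)

c(X2 = X2, p_X2 = p_X2, G2 = G2, p_G2 = p_G2)
```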

References

M. Friendly (2000), Visualizing Categorical Data. SAS Institute, Cary, NC.

Examples

## Simulated data examples:
dummy <- rnbinom(200, size = 1.5, prob = 0.8)
gf <- goodfit(dummy, type = "nbinomial", method = "MinChisq")
summary(gf)
plot(gf)

dummy <- rbinom(100, size = 6, prob = 0.5)
gf1 <- goodfit(dummy, type = "binomial", par = list(size = 6))
gf2 <- goodfit(dummy, type = "binomial", par = list(prob = 0.6, size = 6))
summary(gf1)
plot(gf1)
summary(gf2)
plot(gf2)

## Real data examples:
data("HorseKicks")
HK.fit <- goodfit(HorseKicks)
summary(HK.fit)
plot(HK.fit)

data("Federalist")
## try geometric and full negative binomial distribution
F.fit <- goodfit(Federalist, type = "nbinomial", par = list(size = 1))
F.fit2 <- goodfit(Federalist, type = "nbinomial")
summary(F.fit)
summary(F.fit2)
plot(F.fit)
plot(F.fit2)
