fmgamma: MLE Fitting of Mixture of Gammas Using EM Algorithm

Description

Maximum likelihood estimation for fitting the mixture of gammas distribution using the EM algorithm.

Usage

fmgamma(x, M, pvector = NULL, std.err = TRUE, method = "BFGS",
  control = list(maxit = 10000), finitelik = TRUE, ...)
lmgamma(x, mgshape, mgscale, mgweight, log = TRUE)
nlmgamma(pvector, x, M, finitelik = FALSE)
nlEMmgamma(pvector, tau, mgweight, x, M, finitelik = FALSE)

Arguments

vector of sample data

number of gamma components in mixture

pvector

vector of initial values of GPD parameters (sigmau, xi) or NULL

std.err

logical, should standard errors be calculated

method

optimisation method (see optim)

control

optimisation control list (see optim)

finitelik

logical, should log-likelihood return finite value for invalid parameters

...

optional inputs passed to optim

mgshape

mgamma shape (positive) as vector of length M

mgscale

mgamma scale (positive) as vector of length M

mgweight

mgamma weights (positive) as vector of length M

log

logical, if TRUE then log-likelihood rather than likelihood is output

tau

matrix of posterior probability of being in each component (nxM where n is length(x))

Value

Log-likelihood is given by lmgamma and it's wrapper for negative log-likelihood from nlmgamma. The conditional negative log-likelihood using the posterior probabilities is given by nlEMmgamma. Fitting function fmgammagpd using EM algorithm returns a simple list with the following elements

`call`:	`optim` call
`x`:	data vector `x`
`init`:	`pvector`
`optim`:	complete `optim` output
`mle`:	vector of MLE of parameters
`cov`:	variance-covariance matrix of MLE of parameters
`se`:	vector of standard errors of MLE of parameters
`nllh`:	minimum negative log-likelihood
`n`:	total sample size
`M`:	number of gamma components
`mgshape`:	MLE of gamma shapes
`mgscale`:	MLE of gamma scales
`mgweight`:	MLE of gamma weights
`EMresults`:	EM results giving complete negative log-likelihood, estimated parameters and conditional "maximisation step" negative log-likelihood and convergence result
`posterior`:	posterior probabilites

Acknowledgments

Thanks to Daniela Laas, University of St Gallen, Switzerland for reporting various bugs in these functions.

Details

The weighted mixture of gammas distribution is fitted to the entire dataset by maximum likelihood estimation using the EM algorithm. The estimated parameters, variance-covariance matrix and their standard errors are automatically output.

The expectation step estimates the expected probability of being in each component conditional on gamma component parameters. The maximisation step optimizes the negative log-likelihood conditional on posterior probabilities of each observation being in each component.

The optimisation of the likelihood for these mixture models can be very sensitive to the initial parameter vector, as often there are numerous local modes. This is an inherent feature of such models and the EM algorithm. The EM algorithm is guaranteed to reach the maximum of the local mode. Multiple initial values should be considered to find the global maximum. If the pvector is input as NULL then random component probabilities are simulated as the initial values, so multiple such runs should be run to check the sensitivity to initial values. Alternatives to black-box likelihood optimisers (e.g. simulated annealing), or moving to computational Bayesian inference, are also worth considering.

The log-likelihood functions are provided for wider usage, e.g. constructing profile likelihood functions. The parameter vector pvector must be specified in the negative log-likelihood functions nlmgamma and nlEMmgamma.

Log-likelihood calculations are carried out in lmgamma, which takes parameters as inputs in the same form as the distribution functions. The negative log-likelihood function nlmgamma is a wrapper for lmgamma designed towards making it useable for optimisation, i.e. nlmgamma has complete parameter vector as first input. Similarly, for the maximisation step negative log-likelihood nlEMmgamma, which also has the second input as the component probability vector mgweight.

Missing values (NA and NaN) are assumed to be invalid data so are ignored.

The function lnormgpd carries out the calculations for the log-likelihood directly, which can be exponentiated to give actual likelihood using (log=FALSE).

The default optimisation algorithm in the "maximisation step" is "BFGS", which requires a finite negative log-likelihood function evaluation finitelik=TRUE. For invalid parameters, a zero likelihood is replaced with exp(-1e6). The "BFGS" optimisation algorithms require finite values for likelihood, so any user input for finitelik will be overridden and set to finitelik=TRUE if either of these optimisation methods is chosen.

It will display a warning for non-zero convergence result comes from optim function call or for common indicators of lack of convergence (e.g. any estimated parameters same as initial values).

If the hessian is of reduced rank then the variance covariance (from inverse hessian) and standard error of parameters cannot be calculated, then by default std.err=TRUE and the function will stop. If you want the parameter estimates even if the hessian is of reduced rank (e.g. in a simulation study) then set std.err=FALSE.

Suppose there are \(M\) gamma components with (scalar) shape and scale parameters and weight for each component. Only \(M-1\) are to be provided in the initial parameter vector, as the \(M\)th components weight is uniquely determined from the others.

For the fitting function fmgamma and negative log-likelihood functions the parameter vector pvector is a 3*M-1 length vector containing all \(M\) gamma component shape parameters first, followed by the corresponding \(M\) gamma scale parameters, then all the corresponding \(M-1\) probability weight parameters. The full parameter vector is then c(mgshape, mgscale, mgweight[1:(M-1)]).

For the maximisation step negative log-likelihood functions the parameter vector pvector is a 2*M length vector containing all \(M\) gamma component shape parameters first followed by the corresponding \(M\) gamma scale parameters. The partial parameter vector is then c(mgshape, mgscale).

For identifiability purposes the mean of each gamma component must be in ascending in order. If the initial parameter vector does not satisfy this constraint then an error is given.

Non-positive data are ignored as likelihood is infinite, except for gshape=1.

References

http://www.math.canterbury.ac.nz/~c.scarrott/evmix

http://en.wikipedia.org/wiki/Gamma_distribution

http://en.wikipedia.org/wiki/Mixture_model

McLachlan, G.J. and Peel, D. (2000). Finite Mixture Models. Wiley.

Examples

Run this code

# NOT RUN {
set.seed(1)
par(mfrow = c(1, 1))

x = c(rgamma(1000, shape = 1, scale = 1), rgamma(3000, shape = 6, scale = 2))
xx = seq(-1, 40, 0.01)
y = (dgamma(xx, shape = 1, scale = 1) + 3 * dgamma(xx, shape = 6, scale = 2))/4

# Fit by EM algorithm
fit = fmgamma(x, M = 2)
hist(x, breaks = 100, freq = FALSE, xlim = c(-1, 40))
lines(xx, y)
with(fit, lines(xx, dmgamma(xx, mgshape, mgscale, mgweight), col="red"))
# }
# NOT RUN {
# }

Run the code above in your browser using DataLab