Learn R Programming

poisson.glm.mix (version 1.4)

pois.glm.mix: Main call function of the package.

Description

This function is the main function of the package. User has only to call it by specifying the data (\(x\) and \(y\)), the vector \(L\), the parameterization (\(m\in \{1,2,3\}\)), the desirable range for the number of components, the type of initialization and the number of EM runs and iterations for the small-EM strategy. When \(K=K_{min}\), EM algorithm is initialized according to Initialization scheme 1 (the functions init1.1.jk.j, init1.2.jk.j, init1.k). For consecutive run (\(K>K_{min}\)), EM algorithm is initialized using Initialization 2 (the functions init2.jk.j or init2.k).

Usage

pois.glm.mix(reference, response, L, m, max.iter, Kmin, Kmax, 
             m1, m2, t1, t2, msplit, tsplit,mnr)

Value

information.criteria

numeric array of dimension \((Kmax-Kmin + 1)\times 3\) containing the values of BIC, ICL and loglikelihood for each \(K\). The latter is computed according to the function mylogLikePoisMix.

runs

A list containing the output for the estimated mixture for each \(K\). The output is the same as in the functions bjkmodel, bjmodel and bkmodel.

sel.mod.icl

The selected number of mixture components according to the ICL criterion.

sel.mod.bic

The selected number of mixture components according to the BIC.

est.sel.mod.icl

The final estimates for the selected number of mixture components according to the ICL criterion. It is a list containing \(\widehat{\pi}_k\), \(\widehat{\alpha}_{jk}\), \(\widehat{\beta}_{jkv}\), \(\widehat{\gamma}_{j\ell}\), \(\widehat{c}_{i}\), \(\widehat{\tau}_{ik}\), \(i=1,\ldots,n\), \(j=1,\ldots,J\), \(\ell=1,\ldots,L_j\), \(k=1,\ldots,\widehat{K}_{icl}\), \(v=1,\ldots,V\), while \(\widehat{c}_i\) denotes the estimated cluster of observation \(i\), according to the MAP rule.

est.sel.mod.bic

The final estimates for the selected number of mixture components according to the BIC. It is a list containing \(\widehat{\pi}_k\), \(\widehat{\alpha}_{jk}\), \(\widehat{\beta}_{jkv}\), \(\widehat{\gamma}_{j\ell}\), \(\widehat{c}_{i}\), \(\widehat{\tau}_{ik}\), \(i=1,\ldots,n\), \(j=1,\ldots,J\), \(\ell=1,\ldots,L_j\), \(k=1,\ldots,\widehat{K}_{bic}\), \(v=1,\ldots,V\).

Arguments

reference

a numeric array of dimension \(n\times V\) containing the \(V\) covariates for each of the \(n\) observations.

response

a numeric array of count data with dimension \(n\times d\) containing the \(d\) response variables for each of the \(n\) observations.

L

numeric vector of positive integers containing the partition of the \(d\) response variables into \(J\leq d\) blocks, with \(\sum_{j=1}^{J}L_j=d\).

m

variable denoting the parameterization of the model: 1 for \(\beta_{jk}\) , 2 for \(\beta_{j}\) and 3 for \(\beta_{k}\) parameterization.

max.iter

positive integer denoting the maximum number of EM iterations.

Kmin

the minimum number of mixture components.

Kmax

the maximum number of mixture components.

m1

positive integer denoting the number of iterations for each call of the 1st small EM iterations used by Initialization 1 (init1.1.jk.j). Leave blank in case of parameterization \(m=3\).

m2

positive integer denoting the number of iterations for each call of the overall small EM iterations used by Initialization 1 (init1.2.jk.j or init1.k).

t1

positive integer denoting the number of different runs of the 1st small EM used by Initialization 1 (init1.1.jk.j). Leave blank in case of parameterization \(m=3\).

t2

positive integer denoting the number of different runs of the overall small EM used by Initialization 1 (init1.2.jk.j or init1.k).

msplit

positive integer denoting the number of different runs for each call of the splitting small EM used by Initialization 2 (init2.jk.j or init2.k).

tsplit

positive integer denoting the number of different runs for each call of the splitting small EM used by Initialization 2 (init2.jk.j or init2.k).

mnr

positive integer denoting the maximum number of Newton-Raphson iterations.

Author

Panagiotis Papastamoulis

Details

The output of the function is a list of lists. During the run of the function pois.glm.mix two R graphic devices are opened: The first one contains the graph of the information criteria (BIC and ICL). In the second graphe, the resulting fitted clusters per condition are plotted until the ICL criterion no longer selects a better model. Notice that in this graph the \(L_j\) replicates of condition \(j=1,\ldots,J\) are summed.

The EM algorithm is run until the increase to the loglikelihood of the mixture model is less than \(10^{-6}\). The Newton - Raphson iterations at the Maximization step of EM algorithm are repeated until the square Euclidean norm of the gradient vector of the component specific parameters is less than \(10^{-10}\).

See Also

bjkmodel, bjmodel, bkmodel, init1.1.jk.j, init1.2.jk.j, init1.k, init2.jk.j, init2.k, mylogLikePoisMix

Examples

Run this code
## load a small dataset of 500 observations
data("simulated_data_15_components_bjk")
## in this example there is V = 1 covariates (x)
##   and d = 6 response variables (y). The design is
##   L = (3,2,1).
V <- 1
x <- array(sim.data[,1],dim=c(dim(sim.data)[1],V))
y <- sim.data[,-1]

## We will run the algorithm using parameterization
##   m = 1 and the number of components in the set
##   {2,3,4}.

rr<-pois.glm.mix(reference=x, response=y, L=c(3,2,1), m=1, 
                  max.iter=1000, Kmin=2, Kmax= 4, 
                  m1=3, m2=3, t1=3, t2=3, msplit=3, tsplit=3,mnr = 5)

# note: useR should specify larger values for Kmax, m1, m2, 
#	t1, t2, msplit and tsplit for a complete analysis.


# retrieve the selected models according to BIC or ICL
rr$sel.mod.icl
rr$sel.mod.bic
# retrieve the estimates according to ICL
# alpha
rr$est.sel.mod.icl$alpha
# beta
rr$est.sel.mod.icl$beta
# gamma
rr$est.sel.mod.icl$gamma
# pi
rr$est.sel.mod.icl$pi
# frequency table with estimated clusters
table(rr$est.sel.mod.icl$clust)
# histogram of the maximum conditional probabilities
hist(apply(rr$est.sel.mod.icl$tau,1,max),30)

##(the full data of 5000 observations can be loaded using 
##     data("simulated_data_15_components_bjk_full")

Run the code above in your browser using DataLab