poisson.glm.mix (version 1.4)

init2.jk.j: Initialization 2 for the \(\beta_{jk}\) (\(m=1\)) or \(\beta_{j}\) (\(m=2\)) parameterization.

Description

This function applies a random-splitting small-EM initialization scheme (Initialization 2) for parameterizations \(m=1\) or \(m=2\). It can be used only when a previous run of the EM algorithm is available for the same parameterization. The scheme proposes random splits of the existing clusters, increasing the number of mixture components by one. A short EM is then run for msplit iterations, and the whole procedure is repeated tsplit times. The values with the best observed loglikelihood are chosen to initialize the main EM algorithm (bjkmodel or bjmodel).
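The split-and-select idea can be sketched on toy data. This is a simplified illustration, not the package's implementation: `random_split` and `loglik_after_split` are hypothetical helpers, the data are made up, and the short EM run is replaced by a single M-step of a plain Poisson mixture without covariates.

```r
# Illustrative sketch only: NOT the poisson.glm.mix source code.
set.seed(1)

# Hypothetical helper: split one randomly chosen cluster of a hard clustering
# with K - 1 labels into two, returning a clustering with K labels.
random_split <- function(clust, K) {
  sizes <- tabulate(clust, nbins = K - 1)
  candidates <- which(sizes >= 2)            # only these can be split
  target <- candidates[sample.int(length(candidates), 1)]
  members <- which(clust == target)
  # Move a random non-empty proper subset of the members to the new label K.
  n_move <- sample.int(length(members) - 1, 1)
  moved <- members[sample.int(length(members), n_move)]
  clust[moved] <- K
  clust
}

# Hypothetical stand-in for the short EM: one M-step of a covariate-free
# Poisson mixture, followed by the observed log-likelihood.
loglik_after_split <- function(y, clust, K) {
  w <- tabulate(clust, nbins = K) / length(y)
  mu <- vapply(seq_len(K), function(k) mean(y[clust == k]), numeric(1))
  dens <- vapply(seq_len(K), function(k) w[k] * dpois(y, mu[k]),
                 numeric(length(y)))
  sum(log(rowSums(dens)))
}

y <- c(2, 3, 1, 10, 12, 9, 30, 2)             # toy counts
previous_clust <- c(1, 1, 1, 2, 2, 2, 2, 1)   # toy labels from a 2-component fit
tsplit <- 5                                   # number of random splits to try

best <- NULL
for (t in seq_len(tsplit)) {
  cand <- random_split(previous_clust, K = 3)
  ll <- loglik_after_split(y, cand, K = 3)
  # Keep the candidate with the highest observed log-likelihood.
  if (is.null(best) || ll > best$ll) best <- list(clust = cand, ll = ll)
}
best
```

In the actual function the candidate splits are refined by msplit EM iterations of the Poisson GLM mixture before they are compared, so the numbers produced by this sketch are not comparable to the `ll` value returned by `init2.jk.j`.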

Usage

init2.jk.j(reference, response, L, K, tsplit, model, msplit, 
           previousz, previousclust, previous.alpha, previous.beta,mnr)

Value

alpha

numeric array of dimension \(J \times K\) containing the selected values \(\alpha_{jk}^{(0)}\), \(j=1,\ldots,J\), \(k=1,\ldots,K\), that will be used to initialize the main EM (bjkmodel or bjmodel).

beta

numeric array of dimension \(J \times K \times T\) (if model = 1) or \(J \times T\) (if model = 2) containing the selected values \(\beta_{jk\tau}^{(0)}\) (or \(\beta_{j\tau}^{(0)}\)), \(j=1,\ldots,J\), \(k=1,\ldots,K\), \(\tau=1,\ldots,T\), that will be used to initialize the main EM.

psim

numeric vector of length \(K\) containing the weights that will initialize the main EM.

ll

numeric, the value of the loglikelihood, computed according to the mylogLikePoisMix function.

Arguments

reference

a numeric array of dimension \(n\times V\) containing the \(V\) covariates for each of the \(n\) observations.

response

a numeric array of count data with dimension \(n\times d\) containing the \(d\) response variables for each of the \(n\) observations.

L

numeric vector of positive integers containing the partition of the \(d\) response variables into \(J\leq d\) blocks, with \(\sum_{j=1}^{J}L_j=d\).

K

positive integer denoting the number of mixture components.

tsplit

positive integer denoting the number of different runs, that is, the number of random splits that are tried.

model

binary variable denoting the parameterization of the model: 1 for \(\beta_{jk}\) and 2 for \(\beta_{j}\) parameterization.

msplit

positive integer denoting the number of EM iterations for each run.

previousz

numeric array of dimension \(n\times(K-1)\) containing the estimates of the posterior probabilities according to the previous run of EM.

previousclust

numeric vector of length \(n\) containing the estimated clusters according to the MAP rule obtained by the previous run of EM.

previous.alpha

numeric array of dimension \(J\times (K-1)\) containing the matrix of the ML estimates of the regression constants \(\alpha_{jk}\), \(j=1,\ldots,J\), \(k=1,\ldots,K-1\), based on the previous run of EM algorithm.

previous.beta

numeric array of dimension \(J\times (K-1)\times T\) (if model = 1) or \(J\times T\) (if model = 2) containing the matrix of the ML estimates of the regression coefficients \(\beta_{jk\tau}\) or \(\beta_{j\tau}\), \(j=1,\ldots,J\), \(k=1,\ldots,K-1\), \(\tau=1,\ldots,T\), based on the previous run of EM algorithm.

mnr

positive integer denoting the maximum number of Newton-Raphson iterations.
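As a rough illustration of how the argument dimensions fit together, the sketch below builds toy inputs and checks their shapes. All values are made up, and `block_of_column` is a hypothetical helper variable, not something the package exposes.

```r
# Toy dimensions, chosen only for illustration.
n <- 6            # observations
L <- c(3, 2, 1)   # block sizes, so J = 3 blocks
d <- sum(L)       # total number of response variables (here 6)
J <- length(L)
K <- 3            # number of components requested from init2.jk.j

# Block index of each response column:
# columns 1-3 belong to block 1, columns 4-5 to block 2, column 6 to block 3.
block_of_column <- rep(seq_len(J), times = L)

# Made-up posterior probabilities from a previous (K - 1)-component fit.
set.seed(42)
previousz <- matrix(runif(n * (K - 1)), nrow = n)
previousz <- previousz / rowSums(previousz)   # each row sums to 1

# MAP rule: assign each observation to its most probable component.
previousclust <- max.col(previousz, ties.method = "first")

# Shape checks matching the argument descriptions above.
stopifnot(length(block_of_column) == d,
          ncol(previousz) == K - 1,
          length(previousclust) == n)
```

Note that `previousz`, `previous.alpha` and `previous.beta` refer to the fit with \(K-1\) components, while `K` is the number of components of the new model.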

Author

Panagiotis Papastamoulis

See Also

init1.1.jk.j, init1.2.jk.j, bjkmodel, bjmodel

Examples


data("simulated_data_15_components_bjk")
x <- sim.data[,1]
x <- array(x,dim=c(length(x),1))
y <- sim.data[,-1]

# First, a 2-component mixture is fitted using parameterization m = 1.
run.previous<-bjkmodel(reference=x, response=y, L=c(3,2,1), m=100, K=2, 
                       nr=-10*log(10), maxnr=5, m1=2, m2=2, t1=1, t2=2, 
                       msplit, tsplit, prev.z, prev.clust, start.type=1, 
                       prev.alpha, prev.beta)
## Then the estimated clusters and parameters are used to initialize a 
##   3-component mixture using Initialization 2. The number of different
##   runs is set to tsplit = 2, with each of them using msplit = 2 
##   EM iterations. 
q <- 3
tau <- 1
nc <- 3
z <- run.previous$z
ml <- length(run.previous$psim)/(nc - 1)
alpha <- array(run.previous$alpha[ml, , ], dim = c(q, nc - 1))
beta <- array(run.previous$beta[ml, , , ], dim = c(q, nc - 1, tau))
clust <- run.previous$clust
run<-init2.jk.j(reference=x, response=y, L=c(3,2,1), K=nc, tsplit=2, 
                model=1, msplit=2, previousz=z, previousclust=clust,
                previous.alpha=alpha, previous.beta=beta,mnr = 5)
# note: useR should specify larger values for msplit and tsplit for a complete analysis.