poLCA(formula, data, nclass = 2, maxiter = 1000, graphs = FALSE,
tol = 1e-10, na.rm = TRUE, probs.start = NULL, nrep = 1,
verbose = TRUE, calc.se = TRUE)
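formula: A formula expression of the form response ~ predictors. The details of model specification are given below.

data: A data frame containing variables in formula. Manifest variables must contain only integer values, and must be coded with consecutive values from 1 to the maximum number of outcomes for each variable. All missing values should be entered as NA.

nclass: The number of latent classes to assume in the model. Setting nclass=1 results in poLCA estimating the loglinear independence model. The default is two.

maxiter: The maximum number of iterations through which the estimation algorithm will cycle.

graphs: Logical, for whether poLCA should graphically display the parameter estimates at the completion of the estimation algorithm. The default is FALSE.

tol: A tolerance value for judging when convergence has been reached. When the one-iteration change in the estimated log-likelihood is less than tol, the estimation algorithm stops updating and considers the maximum log-likelihood to have been found.

na.rm: Logical, for how poLCA handles cases with missing values on the manifest variables. If TRUE, those cases are removed (listwise deleted) before estimating the model. If FALSE, cases with missing values are retained. The default is TRUE.

probs.start: A list of matrices of class-conditional response probabilities to be used as the starting values for the estimation algorithm. The default is NULL, producing random starting values.

nrep: Number of times to estimate the model, using different values of probs.start. The default is one. Setting nrep>1 automates the search for the global, rather than just a local, maximum of the log-likelihood function. poLCA returns the parameter estimates corresponding to the model with the greatest log-likelihood.

verbose: Logical, indicating whether poLCA should output to the screen the results of the model. If FALSE, no output is produced. The default is TRUE.

calc.se: Logical, indicating whether poLCA should calculate the standard errors of the estimated class-conditional response probabilities and mixing proportions. The default is TRUE; can only be set to FALSE if estimating a basic model with no concomitant variables specified in formula.

poLCA uses the assumption of local independence to estimate a mixture model of latent multi-way tables, the number of which (nclass) is specified by the user. Estimated parameters include the class-conditional response probabilities for each manifest variable, the "mixing" proportions denoting population share of observations corresponding to each latent multi-way table, and coefficients on any class-predictor covariates, if specified in the model.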
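As a rough illustration of what local independence implies, the likelihood of the basic model can be sketched as below. The notation is an assumption introduced here for illustration and does not appear in the text above: R latent classes (nclass), J manifest variables with K_j outcomes each, Y_ijk = 1 if case i gives outcome k on variable j (0 otherwise), class-conditional response probabilities \pi_{jrk}, mixing proportions p_r, covariate vector X_i, and logit coefficients \beta_r.

```latex
% Basic model: within a class, responses are independent across manifest variables
P(Y_i \mid \pi, p) \;=\; \sum_{r=1}^{R} p_r \prod_{j=1}^{J} \prod_{k=1}^{K_j} \left(\pi_{jrk}\right)^{Y_{ijk}},
\qquad
\ln L \;=\; \sum_{i=1}^{N} \ln P(Y_i \mid \pi, p)

% With class-predictor covariates, the mixing proportions become case-specific
% priors given by a multinomial logit with class 1 as the reference category
p_{ri} \;=\; \frac{\exp(X_i \beta_r)}{\sum_{q=1}^{R} \exp(X_i \beta_q)},
\qquad \beta_1 = 0
```

Under this sketch, fixing \beta_1 = 0 is what makes every logit coefficient interpretable relative to class 1.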
Model specification: Latent class models have more than one manifest variable, so the response variables are cbind(dv1,dv2,dv3...), where dv# refer to variable names in the data frame. For models with no covariates, the formula is cbind(dv1,dv2,dv3)~1. For models with covariates, replace the ~1 with the desired function of predictors iv1,iv2,iv3... as, for example, cbind(dv1,dv2,dv3)~iv1+iv2*iv3.
poLCA treats all manifest variables as qualitative/categorical/nominal, NOT as ordinal.
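The coding and formula rules above might look as follows in practice. This is a minimal sketch, not taken from the package: the data frame toy and its columns are hypothetical, generated at random purely to show the recoding to consecutive integers starting at 1 and the two formula shapes. The model is fitted only if the poLCA package happens to be installed.

```r
# Hypothetical toy data (random, for illustration only):
# three manifest variables and one covariate.
set.seed(42)
n <- 200
toy <- data.frame(
  dv1 = sample(c("no", "yes"), n, replace = TRUE),  # character labels
  dv2 = sample(0:2, n, replace = TRUE),             # coded from 0
  dv3 = sample(1:3, n, replace = TRUE),             # already coded 1..3
  iv1 = rnorm(n)                                    # covariate
)

# Recode manifest variables to consecutive integers starting at 1.
toy$dv1 <- as.integer(factor(toy$dv1))  # "no" -> 1, "yes" -> 2
toy$dv2 <- toy$dv2 + 1L                 # 0,1,2 -> 1,2,3

# Formula without covariates, and with class membership predicted by iv1.
f_basic <- cbind(dv1, dv2, dv3) ~ 1
f_covar <- cbind(dv1, dv2, dv3) ~ iv1

# Fit the basic two-class model only when poLCA is available.
if (requireNamespace("poLCA", quietly = TRUE)) {
  library(poLCA)
  fit <- poLCA(f_basic, data = toy, nclass = 2, verbose = FALSE)
  print(fit$P)  # estimated class sizes
}
```

Because the manifest variables are treated as nominal, the integer codes act only as labels; their ordering carries no meaning for the model.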
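poLCA returns an object of class poLCA; a list whose elements include:

N: number of cases used in the model.

Nobs: number of fully observed cases (less than or equal to N).

probs: estimated class-conditional response probabilities.

probs.se: standard errors of the estimated class-conditional response probabilities, in the same format as probs.

P: sizes of each latent class.

P.se: standard errors of the estimated P.

posterior: matrix of posterior class membership probabilities; also see function poLCA.posterior.

predcell: table of observed versus predicted cell counts for cases with no missing values; also see functions poLCA.table and poLCA.predcell.

coeff: multinomial logit coefficient estimates on covariates (when estimated). coeff is a matrix with nclass-1 columns, and one row for each covariate. All logit coefficients are calculated for classes with respect to class 1.

coeff.se: standard errors of the coefficient estimates on covariates (when estimated), in the same format as coeff.

attempts: a vector containing the maximum log-likelihood values found in each of the nrep attempts to fit the model.

eflag: Logical, error flag. TRUE if estimation algorithm needed to automatically restart with new initial parameters. A restart is caused in the event of computational/rounding errors that result in nonsensical parameter estimates.

probs.start: A list of matrices containing the class-conditional response probabilities used as starting values in the estimation algorithm. If the algorithm needed to restart (see eflag), then this contains the starting values used for the final, successful, run.

probs.start.ok: Logical. FALSE if probs.start was incorrectly specified by the user, otherwise TRUE.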

poLCA uses EM and Newton-Raphson algorithms to maximize the latent class model log-likelihood function. Depending on the starting parameters, this algorithm may only locate a local, rather than global, maximum. This becomes more and more of a problem as nclass increases. It is therefore highly advisable to run poLCA multiple times until you are relatively certain that you have located the global maximum log-likelihood. As long as probs.start=NULL, each function call will use different (random) initial starting parameters. Alternatively, setting nrep to a value greater than one enables the user to estimate the latent class model multiple times with a single call to poLCA, thus conducting the search for the global maximizer automatically.
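A multi-start search of this kind could be run as sketched below. The sketch assumes the poLCA package is installed and uses its bundled values data set (dichotomous items A, B, C, D). The choice of three classes, ten repetitions, and the seed are arbitrary illustrations, not recommendations.

```r
# Formula for a basic latent class model on the four items in `values`.
f <- cbind(A, B, C, D) ~ 1

if (requireNamespace("poLCA", quietly = TRUE)) {
  library(poLCA)
  data(values)   # example data shipped with poLCA

  set.seed(2024) # arbitrary seed, so the random starts are reproducible
  fit <- poLCA(f, values, nclass = 3, nrep = 10, verbose = FALSE)

  # Log-likelihood reached in each of the nrep attempts; several distinct
  # values suggest that some starts ended in local maxima.
  print(round(fit$attempts, 3))

  # Log-likelihood of the retained (best) model.
  print(fit$llik)
}
```

If the largest value in attempts is reached only once, increasing nrep is a cheap way to gain more confidence that it is the global maximum.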
The term "Latent class regression" (LCR) can have two meanings. In this package, LCR models refer to latent class models in which the probability of class membership is predicted by one or more covariates. However, in other contexts, LCR is also used to refer to regression models in which the manifest variable is partitioned into some specified number of latent classes as part of estimating the regression model. It is a way to simultaneously fit more than one regression to the data when the latent data partition is unknown. The flexmix
function in package