lcmixed is a method for the flexmix function in package flexmix. It
provides the necessary information to run an EM algorithm for maximum
likelihood estimation of a latent class mixture (clustering) model in
which some variables are continuous and modelled within the mixture
components by Gaussian distributions, and some variables are categorical
and modelled within components by independent multinomial distributions.
lcmixed can be called within flexmix. The function flexmixedruns is a
wrapper function that can be run to apply lcmixed.
Note that at least one categorical variable is needed, but it is possible to use data without continuous variables.
There are further format restrictions on the data (see below in the
documentation of continuous and discrete), which can be ignored when
running lcmixed through flexmixedruns.
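As a rough illustration of those format restrictions (continuous columns first, categorical columns last and coded as consecutive integers starting at 1), the following base R sketch prepares a small data set by hand. The data frame toy and all variable names in it are hypothetical; in practice discrete.recode or flexmixedruns can do this step.

```r
# Hypothetical toy data: two continuous and two categorical variables,
# deliberately stored in mixed column order.
toy <- data.frame(
  region = factor(c("north", "south", "south", "west", "north", "west")),
  income = c(31.2, 45.0, 38.7, 52.1, 29.9, 47.3),
  grade  = factor(c("b", "a", "c", "a", "b", "b")),
  age    = c(34, 51, 42, 60, 29, 48)
)

cont_cols <- c("income", "age")     # continuous variables must come first
disc_cols <- c("region", "grade")   # categorical variables must come last

# Recode each categorical variable as consecutive integers 1, 2, 3, ...
# droplevels() removes unused levels so that no integer code is skipped.
disc_coded <- lapply(toy[disc_cols],
                     function(v) as.integer(droplevels(factor(v))))

# Data matrix in the required layout: continuous columns, then categorical.
x <- cbind(as.matrix(toy[cont_cols]), do.call(cbind, disc_coded))

# Number of categories actually present in each categorical variable.
ppdim <- unname(vapply(disc_coded, function(v) length(unique(v)), integer(1)))
```

With this layout, the arguments of lcmixed would be continuous = 2, discrete = 2 and ppdim as computed here.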
lcmixed(formula = .~., continuous, discrete, ppdim,
        diagonal = TRUE, pred.ordinal = FALSE, printlik = FALSE)
An object of class FLXMC (not documented; only used internally by flexmix).
formula: a formula to specify response and explanatory variables. For
lcmixed this always has the form x~1, where x is a matrix or data frame
of all variables to be involved, because regression on explanatory
variables is not implemented.
continuous: number of continuous variables. Note that the continuous variables always need to be the first variables in the matrix or data frame.
discrete: number of categorical variables. These must always be the last variables in the matrix or data frame. Note that categorical variables must always be coded as consecutive integers 1, 2, 3, etc., with no gaps.
ppdim: vector of integers specifying, for each categorical variable, the number of categories that occur in the data.
diagonal: logical. If TRUE, Gaussian models are fitted restricted to
diagonal covariance matrices. Otherwise, covariance matrices are
unrestricted. TRUE is consistent with the "within class independence"
assumption for the multinomial variables.
pred.ordinal: logical. If FALSE, the within-component predicted value
for categorical variables is the probability mode, otherwise it is the
mean of the standard (1,2,3,...) scores, which may be better for ordinal
variables.
printlik: logical. If TRUE, the loglikelihood is printed out whenever
it is computed.
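The two kinds of within-component prediction described for pred.ordinal can be illustrated with a small base R calculation. This is only a sketch of the two summaries, not the package's internal code; the probability vector p is a made-up example.

```r
# Hypothetical within-component probabilities for a categorical
# variable with three categories coded 1, 2, 3.
p <- c(0.2, 0.5, 0.3)

# Probability mode (the pred.ordinal = FALSE style of prediction):
# the category with the highest probability.
pred_mode <- which.max(p)

# Mean of the standard scores 1, 2, 3 (the pred.ordinal = TRUE style):
# each category score weighted by its probability.
pred_mean <- sum(seq_along(p) * p)
```

Here the mode is category 2, while the mean score is 2.1, which only makes sense if the category order is meaningful.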
Christian Hennig christian.hennig@unibo.it https://www.unibo.it/sitoweb/christian.hennig/en
The data need to be organised case-wise. For example, if there are categorical variables only, and 15 cases have the values c(1,1,2) on the 3 variables, the data matrix needs 15 rows with values 1 1 2.
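If the data are only available as aggregated response patterns with counts, they can be expanded into this case-wise layout in base R. The sketch below assumes a hypothetical data frame patterns with one row per pattern and a count column; none of these names come from the package.

```r
# Hypothetical aggregated data: one row per response pattern plus its count.
patterns <- data.frame(
  v1    = c(1, 2, 2),
  v2    = c(1, 1, 2),
  v3    = c(2, 2, 1),
  count = c(15, 4, 6)
)
vars <- c("v1", "v2", "v3")

# Repeat each pattern's row index `count` times: one index per case.
idx <- rep(seq_len(nrow(patterns)), times = patterns$count)

# Case-wise data matrix: 15 rows of (1, 1, 2), then 4 rows of (2, 1, 2),
# then 6 rows of (2, 2, 1).
x <- as.matrix(patterns[idx, vars])
rownames(x) <- NULL
```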
General documentation on flexmix methods can be found in Chapter 4 of Friedrich Leisch's "FlexMix: A General Framework for Finite Mixture Models and Latent Class Regression in R", https://CRAN.R-project.org/package=flexmix
Hennig, C. and Liao, T. (2013) How to find an appropriate clustering for mixed-type variables with application to socio-economic stratification, Journal of the Royal Statistical Society, Series C Applied Statistics, 62, 309-369.
flexmixedruns, flexmix, flexmix-class, discrete.recode, which recodes a
dataset into the format required by lcmixed
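set.seed(112233)
options(digits=3)
require(MASS)     # provides the Cars93 data
require(flexmix)
require(fpc)      # provides lcmixed and discrete.recode
data(Cars93)
# Columns 3 and 10 (Type, DriveTrain) are categorical; 5 and 8 (Price, MPG.highway) are continuous.
Cars934 <- Cars93[,c(3,5,8,10)]
# Put the continuous variables first and recode the categorical ones as integers 1,2,3,...
cc <- discrete.recode(Cars934,xvarsorted=FALSE,continuous=c(2,3),discrete=c(1,4))
# Fit a two-component mixture; ppdim gives the category counts (6 types, 3 drive trains).
fcc <- flexmix(cc$data~1,k=2,
  model=lcmixed(continuous=2,discrete=2,ppdim=c(6,3),diagonal=TRUE))
summary(fcc)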