reglca: Regularized Latent Class Analysis

Description

Estimates the regularized latent class model for dichotomous responses, following the regularization approach of Chen, Liu, Xu, and Ying (2015) and Chen, Li, Liu, and Ying (2017). The SCAD and MCP penalty functions are available.
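
For orientation, the two penalty shapes can be written in a few lines of base R. This is a minimal sketch using the standard textbook forms of SCAD and MCP. The function names scad_penalty and mcp_penalty and the default shape constants a are illustrative assumptions, not necessarily the code or values used internally by reglca.

```r
# SCAD penalty at x for tuning parameter lambda.
# a = 3.7 is a commonly used shape constant (assumption, not taken from reglca).
scad_penalty <- function(x, lambda, a = 3.7) {
  ax <- abs(x)
  ifelse(ax <= lambda,
         lambda * ax,                                   # linear part, like the lasso
         ifelse(ax <= a * lambda,
                (2 * a * lambda * ax - ax^2 - lambda^2) / (2 * (a - 1)),  # quadratic transition
                lambda^2 * (a + 1) / 2))                # constant: large values are not shrunk further
}

# MCP penalty at x for tuning parameter lambda.
# a = 3 is an assumed illustrative shape constant.
mcp_penalty <- function(x, lambda, a = 3) {
  ax <- abs(x)
  ifelse(ax <= a * lambda,
         lambda * ax - ax^2 / (2 * a),                  # penalty rate decreases with |x|
         a * lambda^2 / 2)                              # constant beyond a * lambda
}

# Evaluate both penalties on a small grid of differences
x_grid <- seq(-0.5, 0.5, by = 0.25)
penalties <- cbind(x = x_grid,
                   scad = scad_penalty(x_grid, lambda = 0.08),
                   mcp  = mcp_penalty(x_grid, lambda = 0.08))
```

Both penalties behave like a lasso penalty near zero and flatten out for large arguments, so small differences are pushed to zero while large ones are left essentially unpenalized.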

Usage

reglca(dat, nclasses, weights=NULL, group=NULL, regular_type="scad",
   regular_lam=0, sd_noise_init=1, item_probs_init=NULL, class_probs_init=NULL,
   random_starts=1, random_iter=20, conv=1e-05, h=1e-04, mstep_iter=10,
   maxit=1000, verbose=TRUE, prob_min=.0001)
# S3 method for reglca
summary(object, digits=4, file=NULL, ...)

Arguments

dat

Matrix with dichotomous item responses. NAs are allowed.

nclasses

Number of classes

weights

Optional vector of sampling weights

group

Optional vector for grouping variable

regular_type

Regularization type. Can be scad or mcp. See gdina for more information.

regular_lam

Regularization parameter \(\lambda\)

sd_noise_init

Standard deviation for amount of noise in generating random starting values

item_probs_init

Optional matrix of initial item response probabilities

class_probs_init

Optional vector of initial class probabilities

random_starts

Number of random starts

random_iter

Number of initial iterations for random starts

conv

Convergence criterion

h

Numerical differentiation parameter

mstep_iter

Number of iterations in the M-step

maxit

Maximum number of iterations

verbose

Logical indicating whether convergence progress should be displayed

prob_min

Lower bound for probabilities in estimation

object

A required object of class reglca, obtained from a call to the function reglca.

digits

Number of digits after decimal separator to display.

file

Optional name of a file to which the summary output should be written.

...

Further arguments to be passed.

Value

A list containing the following elements (selection):

item_probs

Item response probabilities

class_probs

Latent class probabilities

p.aj.xi

Individual posterior

p.xi.aj

Individual likelihood

loglike

Log-likelihood value

Npars

Number of estimated parameters

Nskillpar

Number of skill class parameters

G

Number of groups

n.ik

Expected counts

Nipar

Number of item parameters

n_reg

Number of regularized parameters

n_reg_item

Number of regularized parameters per item

item

Data frame with item parameters

pjk

Item response probabilities (in an array)

N

Number of persons

I

Number of items
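
The individual posterior listed above is typically used to assign persons to their most probable class. The following is a minimal base R sketch on a hypothetical posterior matrix. It assumes that the posterior is arranged with persons in rows and latent classes in columns, which should be checked against the actual p.aj.xi element of a fitted object (for example with dim()).

```r
# Hypothetical posterior matrix: 3 persons (rows) by 2 latent classes (columns).
# These numbers are made up for illustration only.
post <- matrix(c(0.9, 0.1,
                 0.2, 0.8,
                 0.5, 0.5), ncol = 2, byrow = TRUE)

# Modal (maximum a posteriori) class per person;
# ties.method = "first" resolves ties deterministically
map_class <- max.col(post, ties.method = "first")

# Posterior probability of the assigned class, a simple measure of certainty
map_prob <- post[cbind(seq_len(nrow(post)), map_class)]
```

Under the same layout assumption, the corresponding call for a fitted model mod would be max.col(mod$p.aj.xi, ties.method="first").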

Details

The regularized latent class model for dichotomous item responses assumes \(C\) latent classes. The item response probabilities \(P(X_i=1|c)=p_{ic}\) are estimated such that the number of distinct \(p_{ic}\) values per item is minimized. This approach eases interpretation and makes it possible to recover the structure of a true (but unknown) cognitive diagnostic model.
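
To illustrate what "number of distinct values per item" means, the following base R sketch counts the distinct probabilities in each row of a hypothetical item-by-class matrix. The matrix values, the item-by-class orientation, and the rounding to three digits are assumptions made for illustration. The layout of item_probs in a fitted object should be checked before applying the same idea to it.

```r
# Hypothetical item response probabilities: 3 items (rows) by 4 classes (columns).
item_probs <- matrix(c(0.10, 0.10, 0.90, 0.90,
                       0.20, 0.20, 0.20, 0.90,
                       0.15, 0.40, 0.65, 0.90),
                     nrow = 3, byrow = TRUE)

# Count distinct values per item after rounding, so that tiny numerical
# differences are not counted as separate probability levels
n_distinct <- apply(item_probs, 1, function(p) length(unique(round(p, 3))))
```

In this toy matrix the first two items distinguish only two groups of classes, which is the kind of pattern a DINA-type item would produce, whereas the third item has a separate probability for every class.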

References

Chen, Y., Liu, J., Xu, G., & Ying, Z. (2015). Statistical analysis of Q-matrix based diagnostic classification models. Journal of the American Statistical Association, 110, 850-866.

Chen, Y., Li, X., Liu, J., & Ying, Z. (2017). Regularized latent class analysis with application in cognitive diagnosis. Psychometrika, 82, 660-692.

Examples

# NOT RUN {
#############################################################################
# EXAMPLE 1: Estimating a regularized LCA for DINA data
#############################################################################

#---- simulate data
I <- 12  # number of items
# define Q-matrix
q.matrix <- matrix(0,I,2)
q.matrix[ 1:(I/3), 1 ] <- 1
q.matrix[ I/3 + 1:(I/3), 2 ] <- 1
q.matrix[ 2*I/3 + 1:(I/3), c(1,2) ] <- 1
N <- 1000  # number of persons
guess <- rep(seq(.1,.3,length=I/3), 3)
slip <- .1
rho <- 0.3  # skill correlation
set.seed(987)
dat <- CDM::sim.din( N=N, q.matrix=q.matrix, guess=guess, slip=slip,
           mean=0*c( .2, -.2 ), Sigma=matrix( c( 1, rho,rho,1), 2, 2 ) )
dat <- dat$dat

#--- Model 1: Four latent classes without regularization
mod1 <- CDM::reglca(dat=dat, nclasses=4, regular_lam=0, random_starts=3,
               random_iter=10, conv=1E-4)
summary(mod1)

#--- Model 2: Four latent classes with regularization and lambda=.08
mod2 <- CDM::reglca(dat=dat, nclasses=4, regular_lam=0.08, regular_type="scad",
               random_starts=3, random_iter=10, conv=1E-4)
summary(mod2)

#--- Model 3: Four latent classes with regularization and lambda=.05 with warm start

# "warm start" -> use initial parameters from fitted model with higher lambda value
item_probs_init <- mod2$item_probs
class_probs_init <- mod2$class_probs
mod3 <- CDM::reglca(dat=dat, nclasses=4, regular_lam=0.05, regular_type="scad",
               item_probs_init=item_probs_init, class_probs_init=class_probs_init,
               random_starts=3, random_iter=10, conv=1E-4)
# }
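
Models fitted with different values of regular_lam, as in the example above, are often compared with an information criterion. The following is a minimal sketch that computes BIC from a log-likelihood, a parameter count, and a sample size. The helper name bic_value and all numbers in the data frame are hypothetical. Whether Npars is the appropriate complexity measure for a regularized fit is an assumption that should be checked.

```r
# BIC = -2 * log-likelihood + log(n) * number of parameters
bic_value <- function(loglike, npars, n) {
  -2 * loglike + log(n) * npars
}

# Hypothetical values for three fits (made up, not output of reglca)
fits <- data.frame(lambda  = c(0, 0.05, 0.08),
                   loglike = c(-100, -102, -110),
                   npars   = c(12, 8, 6))
fits$bic <- bic_value(fits$loglike, fits$npars, n = 200)

# Row index of the fit with the smallest BIC
best <- which.min(fits$bic)
```

For fitted objects, the inputs would be taken from the loglike and Npars elements, with the number of persons as n.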