lavTablesFitCp: Pairwise maximum likelihood fit statistics

Description

Three measures of fit for the pairwise maximum likelihood estimation method that are based on likelihood ratios (LR) are defined: $C_F$, $C_M$, and $C_P$. Subscript $F$ signifies a comparison of model-implied proportions of full response patterns with observed sample proportions, subscript $M$ signifies a comparison of model-implied proportions of full response patterns with the proportions implied by the assumption of multivariate normality, and subscript $P$ signifies a comparison of model-implied proportions of pairs of item responses with the observed proportions of pairs of item responses.

Usage

lavTablesFitCf(object)
lavTablesFitCp(object, alpha = 0.05)
lavTablesFitCm(object)

Arguments

object

An object of class lavaan.

alpha

The nominal significance level for the test of global fit.

Details

$C_F$

The $C_F$ statistic compares the log-likelihood of the model-implied proportions ($\pi_r$) with the observed proportions ($p_r$) of the full multivariate response patterns: $$C_F = 2N\sum_{r}p_{r}\ln[p_{r}/\hat{\pi}_{r}],$$ which asymptotically has a chi-square distribution with $$df_F = m^k - n - 1,$$ where $k$ denotes the number of items with discrete response scales, $m$ denotes the number of response options, and $n$ denotes the number of parameters to be estimated. Note that $C_F$ results may be biased when the multivariate contingency table contains a large number of empty cells.

$C_M$

The $C_M$ statistic is based on the $C_F$ statistic, and compares the proportions implied by the model of interest (Model 1) with the proportions implied by the assumption of an underlying multivariate normal distribution (Model 0): $$C_M = C_{F1} - C_{F0},$$ where $C_{F0}$ is $C_F$ for Model 0 and $C_{F1}$ is $C_F$ for Model 1. Statistic $C_M$ has a chi-square distribution with degrees of freedom $$df_M = k(k-1)/2 + k(m-1) - n_{1},$$ where $k$ denotes the number of items with discrete response scales, $m$ denotes the number of response options, $k(k-1)/2$ is the number of polychoric correlations, $k(m-1)$ is the number of thresholds, and $n_1$ is the number of parameters of the model of interest. Note that $C_M$ results may also be biased by a large number of empty cells in the multivariate contingency table. However, the bias may cancel out, because Model 1 and Model 0 share the same pattern of empty cells.

$C_P$

The $C_P$ statistic considers only pairs of responses, and compares the observed sample proportions ($p$) with the model-implied proportions ($\pi$) of pairs of responses. For items $i$ and $j$ we obtain a pairwise likelihood ratio test statistic $C_{P_{ij}}$: $$C_{P_{ij}}=2N\sum_{c_i=1}^m \sum_{c_j=1}^m p_{c_i,c_j}\ln[p_{c_i,c_j}/\hat{\pi}_{c_i,c_j}],$$ where $m$ denotes the number of response options and $N$ denotes the sample size.

The $C_P$ statistic has an asymptotic chi-square distribution with degrees of freedom equal to the information ($m^2 - 1$) minus the number of parameters ($2(m-1)$ thresholds and 1 correlation): $$df_P = m^{2} - 2(m - 1) - 2.$$ With $k$ items, there are $k(k-1)/2$ possible pairs of items. The $C_P$ statistic should therefore be applied with a Bonferroni-adjusted level of significance $\alpha^*$, with $$\alpha^* = \alpha / (k(k-1)/2),$$ to keep the family-wise error rate at $\alpha$. The hypothesis of overall goodness-of-fit is tested at $\alpha$ and rejected as soon as $C_P$ is significant at $\alpha^*$ for at least one pair of items. Note that with dichotomous items, $m = 2$ and $df_P = 0$, so the hypothesis cannot be tested.
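To make the $C_P$ procedure concrete, the following base R sketch computes $C_{P_{ij}}$, $df_P$, and the Bonferroni-adjusted level $\alpha^*$ for a single item pair. It is only an illustration of the formulas above, not the code used by lavTablesFitCp: the sample size, the number of items, and both proportion tables are made-up values, and all object names (p_obs, pi_hat, and so on) are hypothetical.

```r
# All inputs are hypothetical values chosen for illustration only.
N <- 300   # sample size
k <- 4     # number of items
m <- 3     # number of response options per item

# Observed proportions for one item pair (rows: item i, columns: item j)
p_obs <- matrix(c(0.20, 0.10, 0.05,
                  0.10, 0.15, 0.10,
                  0.05, 0.10, 0.15), nrow = m, byrow = TRUE)

# Model-implied proportions for the same pair
pi_hat <- matrix(c(0.18, 0.11, 0.06,
                   0.11, 0.14, 0.10,
                   0.06, 0.10, 0.14), nrow = m, byrow = TRUE)

# Pairwise likelihood ratio statistic; cells with p = 0 contribute nothing
nz <- p_obs > 0
C_P_ij <- 2 * N * sum(p_obs[nz] * log(p_obs[nz] / pi_hat[nz]))

# Degrees of freedom: m^2 - 1 minus 2(m - 1) thresholds and 1 correlation
df_P <- m^2 - 2 * (m - 1) - 2

# Bonferroni adjustment over all k(k - 1)/2 item pairs
alpha <- 0.05
n_pairs <- k * (k - 1) / 2
alpha_star <- alpha / n_pairs

# Upper-tail chi-square p-value, compared with the adjusted level
p_value <- pchisq(C_P_ij, df = df_P, lower.tail = FALSE)
reject_pair <- p_value < alpha_star

print(c(C_P_ij = C_P_ij, df_P = df_P, alpha_star = alpha_star,
        p_value = p_value))
```

Under this scheme, overall fit would be rejected at level $\alpha$ if reject_pair were TRUE for at least one of the n_pairs item pairs.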

References

Barendse, M. T., Ligtvoet, R., Timmerman, M. E., & Oort, F. J. (under review). Structural Equation Modeling of Discrete data: Model Fit after Pairwise Maximum Likelihood.

Joreskog, K. G., & Moustaki, I. (2001). Factor analysis of ordinal variables: A comparison of three approaches. Multivariate Behavioral Research, 36, 347-387.

Examples

# Data: dichotomize the nine Holzinger-Swineford test scores
library(lavaan)
HS9 <- HolzingerSwineford1939[,c("x1","x2","x3","x4","x5",
                                 "x6","x7","x8","x9")]
HSbinary <- as.data.frame( lapply(HS9, cut, 2, labels=FALSE) )

# Single group example with one latent factor
HS.model <- ' trait =~ x1 + x2 + x3 + x4 '
# Declare as ordered only the four items that are passed to cfa()
fit <- cfa(HS.model, data=HSbinary[,1:4], ordered=names(HSbinary)[1:4],
           estimator="PML")
lavTablesFitCm(fit)
lavTablesFitCp(fit)
lavTablesFitCf(fit)
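
As a rough check on the degrees of freedom described under Details, the sketch below works through the arithmetic for the example above, with k = 4 dichotomous items (m = 2). The parameter count is an assumption about the default identification of a one-factor model (first loading fixed, so 3 free loadings, 1 factor variance, and 4 thresholds); it should be verified against the fitted object before relying on these numbers. The names n_par, df_F, df_M, and df_P are illustrative only.

```r
# Setup assumed to match the example: 4 binary items, one latent factor
k <- 4   # number of items
m <- 2   # number of response options per item

# Assumption: 3 free loadings + 1 factor variance + 4 thresholds
n_par <- 3 + 1 + 4

# C_F: number of response patterns minus parameters minus 1
df_F <- m^k - n_par - 1

# C_M: polychoric correlations plus thresholds minus model parameters
df_M <- k * (k - 1) / 2 + k * (m - 1) - n_par

# C_P: per item pair; equals 0 for dichotomous items
df_P <- m^2 - 2 * (m - 1) - 2

print(c(df_F = df_F, df_M = df_M, df_P = df_P))
```

Under these assumptions $df_F = 7$ and $df_M = 2$, while $df_P = 0$, which illustrates why the pairwise test is not available for dichotomous items.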