Three measures of fit for the pairwise maximum likelihood estimation method that are based on likelihood ratios (LR) are defined: \(C_F\), \(C_M\), and \(C_P\). Subscript \(F\) signifies a comparison of model-implied proportions of full response patterns with observed sample proportions, subscript \(M\) signifies a comparison of model-implied proportions of full response patterns with the proportions implied by the assumption of multivariate normality, and subscript \(P\) signifies a comparison of model-implied proportions of pairs of item responses with the observed proportions of pairs of item responses.
lavTablesFitCf(object)
lavTablesFitCp(object, alpha = 0.05)
lavTablesFitCm(object)
An object of class lavaan.
The nominal level of significance of global fit.
The \(C_F\) statistic compares the log-likelihood of the model-implied proportions (\(\hat{\pi}_r\)) with that of the observed proportions (\(p_r\)) of the full multivariate response patterns: $$ C_F = 2N\sum_{r}p_{r}\ln[p_{r}/\hat{\pi}_{r}], $$ where \(N\) denotes sample size. \(C_F\) asymptotically has a chi-square distribution with degrees of freedom $$ df_F = m^k - n - 1, $$ where \(k\) denotes the number of items with discrete response scales, \(m\) denotes the number of response options, and \(n\) denotes the number of parameters to be estimated. Note that \(C_F\) results may be biased because of large numbers of empty cells in the multivariate contingency table.
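The formula above can be sketched in base R. This is only an illustration: `lr_stat` is a hypothetical helper, not a lavaan function, and the proportions, sample size, and parameter count are made-up values chosen to show the arithmetic. Cells with an observed proportion of zero are skipped, following the usual convention that \(0 \ln 0 = 0\).

```r
# Hypothetical helper (not part of lavaan): likelihood-ratio statistic
# 2N * sum_r p_r * ln(p_r / pi_hat_r) over response patterns r.
lr_stat <- function(p_obs, pi_hat, N) {
  stopifnot(length(p_obs) == length(pi_hat))
  keep <- p_obs > 0  # empty cells contribute 0 to the sum
  2 * N * sum(p_obs[keep] * log(p_obs[keep] / pi_hat[keep]))
}

# Made-up values for k = 2 items with m = 2 response options (m^k = 4 patterns)
N      <- 200
p_obs  <- c(0.40, 0.10, 0.15, 0.35)  # observed pattern proportions
pi_hat <- c(0.38, 0.12, 0.17, 0.33)  # model-implied pattern proportions

k <- 2; m <- 2
n <- 2                               # assumed number of estimated parameters
C_F  <- lr_stat(p_obs, pi_hat, N)
df_F <- m^k - n - 1

# Upper-tail chi-square p-value for the illustrative statistic
p_value <- pchisq(C_F, df = df_F, lower.tail = FALSE)
```

In practice `lavTablesFitCf()` computes the statistic from a fitted model; the sketch only mirrors the definition.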
The \(C_M\) statistic is based on the \(C_F\) statistic, and compares the proportions implied by the model of interest (Model 1) with proportions implied by the assumption of an underlying multivariate normal distribution (Model 0): $$ C_M = C_{F1} - C_{F0}, $$ where \(C_{F0}\) is \(C_F\) for Model 0 and \(C_{F1}\) is \(C_F\) for Model 1. Statistic \(C_M\) has a chi-square distribution with degrees of freedom $$ df_M = k(k-1)/2 + k(m-1) - n_{1}, $$ where \(k\) denotes the number of items with discrete response scales, \(m\) denotes the number of response options, \(k(k-1)/2\) denotes the number of polychoric correlations, \(k(m-1)\) denotes the number of thresholds, and \(n_1\) is the number of parameters of the model of interest. Note that \(C_M\) results may be biased because of large numbers of empty cells in the multivariate contingency table. However, the bias may cancel out, as both Model 1 and Model 0 contain the same pattern of empty responses.
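As a small worked instance of the degrees-of-freedom formula, the sketch below uses a hypothetical helper `df_cm` (not a lavaan function). The value `n1 = 8` is an assumption for a one-factor model with four dichotomous items (four thresholds plus four loading or variance parameters); the parameter count of an actual fitted model should be checked before relying on it.

```r
# Hypothetical helper (not part of lavaan): degrees of freedom of C_M
df_cm <- function(k, m, n1) {
  n_cor <- k * (k - 1) / 2  # number of polychoric correlations
  n_thr <- k * (m - 1)      # number of thresholds
  n_cor + n_thr - n1        # Model 0 parameters minus Model 1 parameters
}

# k = 4 dichotomous items (m = 2), assumed n1 = 8 free parameters:
# 6 correlations + 4 thresholds - 8 = 2
df_cm(k = 4, m = 2, n1 = 8)
```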
With the \(C_P\) statistic we only consider pairs of responses, and compare observed sample proportions (\(p\)) with model-implied proportions of pairs of responses (\(\pi\)). For items \(i\) and \(j\) we obtain a pairwise likelihood ratio test statistic \(C_{P_{ij}}\) $$ C_{P_{ij}}=2N\sum_{c_i=1}^m \sum_{c_j=1}^m p_{c_i,c_j}\ln[p_{c_i,c_j}/\hat{\pi}_{c_i,c_j}], $$ where \(m\) denotes the number of response options and \(N\) denotes sample size. The \(C_P\) statistic has an asymptotic chi-square distribution with degrees of freedom equal to the information \((m^2 -1)\) minus the number of parameters (\(2(m-1)\) thresholds and 1 correlation), $$ df_P = m^{2} - 2(m - 1) - 2. $$ As \(k\) denotes the number of items, there are \(k(k-1)/2\) possible pairs of items. The \(C_P\) statistic should therefore be applied with a Bonferroni-adjusted level of significance \(\alpha^*\), with $$ \alpha^*= \alpha / (k(k-1)/2), $$ to keep the family-wise error rate at \(\alpha\). The hypothesis of overall goodness-of-fit is tested at \(\alpha\) and rejected as soon as \(C_P\) is significant at \(\alpha^*\) for at least one pair of items. Note that with dichotomous items, \(m = 2\) and \(df_P = 0\), so that the hypothesis cannot be tested.
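The decision rule above can be made concrete with a short base R sketch. `cp_decision_rule` is a hypothetical helper, not part of lavaan; it only collects the quantities defined in the text (number of pairs, \(df_P\), \(\alpha^*\)) and the corresponding chi-square critical value, under the assumption that every item has the same number of response options \(m\).

```r
# Hypothetical helper (not part of lavaan): quantities behind the C_P test
cp_decision_rule <- function(k, m, alpha = 0.05) {
  n_pairs    <- choose(k, 2)            # k(k-1)/2 pairs of items
  df_P       <- m^2 - 2 * (m - 1) - 2   # (m^2 - 1) minus 2(m-1) thresholds and 1 correlation
  alpha_star <- alpha / n_pairs         # Bonferroni-adjusted significance level
  # No critical value exists when df_P = 0 (dichotomous items)
  crit <- if (df_P > 0) qchisq(alpha_star, df = df_P, lower.tail = FALSE) else NA_real_
  list(n_pairs = n_pairs, df_P = df_P,
       alpha_star = alpha_star, critical_value = crit)
}

# Four items with three response options: 6 pairs, df_P = 3, alpha* = 0.05 / 6
rule <- cp_decision_rule(k = 4, m = 3)

# Global fit would be rejected if any pairwise C_P exceeds rule$critical_value.
# With dichotomous items the test is not available:
cp_decision_rule(k = 4, m = 2)$df_P  # 0
```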
Barendse, M. T., Ligtvoet, R., Timmerman, M. E., & Oort, F. J. (2016). Structural Equation Modeling of Discrete Data: Model Fit after Pairwise Maximum Likelihood. Frontiers in Psychology, 7, 1-8.
Jöreskog, K. G., & Moustaki, I. (2001). Factor analysis of ordinal variables: A comparison of three approaches. Multivariate Behavioral Research, 36, 347-387.
lavTables, lavaan
library(lavaan)

# Data
HS9 <- HolzingerSwineford1939[,c("x1","x2","x3","x4","x5",
"x6","x7","x8","x9")]
HSbinary <- as.data.frame( lapply(HS9, cut, 2, labels=FALSE) )
# Single group example with one latent factor
HS.model <- ' trait =~ x1 + x2 + x3 + x4 '
fit <- cfa(HS.model, data=HSbinary[,1:4], ordered=names(HSbinary[,1:4]),
estimator="PML")
lavTablesFitCm(fit)
lavTablesFitCp(fit)
lavTablesFitCf(fit)