Computation of person fit statistics for ltm, rasch and tpm models.
person.fit(object, alternative = c("less", "greater", "two.sided"),
resp.patterns = NULL, FUN = NULL, simulate.p.value = FALSE,
B = 1000)
object: a model object inheriting from class ltm, class rasch or class tpm.
alternative: the alternative hypothesis; see Details for more info.
resp.patterns: a matrix or a data.frame of response patterns with columns denoting the items; if NULL, the person fit statistics are computed for the observed response patterns.
FUN: a function with three arguments calculating a user-defined person-fit statistic. The first argument must be a numeric matrix of (0, 1) response patterns. The second argument must be a numeric vector of length equal to the number of rows of the first argument, providing the ability estimates for each response pattern. The third argument must be a numeric matrix with number of rows equal to the number of items, providing the IRT model parameters. For ltm and rasch objects, this should be a two-column matrix, where the first column contains the easiness and the second one the discrimination parameters (i.e., the additive parameterization is assumed, which has the form \(\beta_{i0} + \beta_{i1}z\), where \(\beta_{i0}\) is the easiness and \(\beta_{i1}\) the discrimination parameter for the \(i\)th item). For tpm objects the first column of the third argument of FUN should contain the logit (i.e., use qlogis()) of the guessing parameters, the second column the easiness, and the third column the discrimination parameters. The function should return a numeric vector of length equal to the number of response patterns, containing the values of the user-defined person-fit statistics.
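As an illustration of the interface described above, the following base R sketch defines a hypothetical statistic, the log-likelihood of each response pattern. The function name loglik_stat, its argument names, and the toy inputs are assumptions made for this example; only the argument order, the parameter layout, and the return shape follow the description of FUN.

```r
# Hypothetical user-defined person-fit statistic: the log-likelihood of each
# response pattern. Argument names are illustrative; only their order matters.
loglik_stat <- function(data, z, betas) {
    data <- as.matrix(data)
    n_par <- ncol(betas)
    # The last two columns hold easiness and discrimination (additive form).
    easiness <- betas[, n_par - 1]
    discrim <- betas[, n_par]
    # A three-column (tpm-style) matrix carries logit-guessing in column 1.
    guess <- if (n_par == 3) plogis(betas[, 1]) else rep(0, nrow(betas))
    # Linear predictor beta_i0 + beta_i1 * z: patterns in rows, items in columns.
    eta <- sweep(outer(z, discrim), 2, easiness, "+")
    # Success probability: guess + (1 - guess) * logistic(eta).
    prob <- sweep(sweep(plogis(eta), 2, 1 - guess, "*"), 2, guess, "+")
    # One value per response pattern.
    rowSums(data * log(prob) + (1 - data) * log(1 - prob))
}

# Toy inputs (made up): two patterns on three items.
patterns <- rbind(c(1, 1, 0), c(0, 1, 1))
abilities <- c(0.5, -0.2)
params <- cbind(easiness = c(1, 0, -1), discrimination = c(1.2, 1, 0.8))
loglik_stat(patterns, abilities, params)
```

A function of this shape could then be supplied through the FUN argument, presumably together with simulate.p.value = TRUE, since no reference distribution is known in advance for a user-defined statistic.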
simulate.p.value: logical; if TRUE, the Monte Carlo procedure described in the Details section is used to approximate the distribution of the person-fit statistic(s) under the null hypothesis.
B: the number of replications in the Monte Carlo procedure.
An object of class persFit, which is a list with the following components:
the response patterns for which the fit statistics have been computed.
a numeric matrix with person-fit statistics for each response pattern.
a numeric matrix with the corresponding \(p\)-values.
the value of the statistic argument.
the value of the FUN argument.
the value of the alternative argument.
the value of the B argument.
a copy of the matched call of object.
The statistics calculated by default (i.e., if FUN = NULL) by person.fit() are the \(L_0\) statistic of Levine and Rubin (1979) and its standardized version \(L_z\) proposed by Drasgow et al. (1985). If simulate.p.value = FALSE, the \(p\)-values are calculated for \(L_z\) assuming a standard normal distribution for the statistic under the null hypothesis. If simulate.p.value = TRUE, a Monte Carlo procedure is used to approximate the distribution of the person-fit statistic(s) under the null hypothesis. In particular, the following steps are replicated B times for each response pattern:
1. Simulate a new ability estimate, say \(z^*\), from a normal distribution with mean the ability estimate of the response pattern under the fitted model (i.e., object), and standard deviation the standard error of the ability estimate, as returned by the factor.scores function.
2. Simulate a new response pattern of dichotomous items under the assumed IRT model, using \(z^*\) and the maximum likelihood estimates under object.
3. For the new response pattern and using \(z^*\) and the MLEs, compute the values of the person-fit statistic.
Denote by \(T_{obs}\) the value of the person-fit statistic for the original data-set. Then the \(p\)-value is approximated according to the formula
$$\left(1 + \sum_{b = 1}^B I(T_b \leq T_{obs})\right) / (1 + B),$$
if alternative = "less",
$$\left(1 + \sum_{b = 1}^B I(T_b \geq T_{obs})\right) / (1 + B),$$
if alternative = "greater", or
$$\left(1 + \sum_{b = 1}^B I(|T_b| \geq |T_{obs}|)\right) / (1 + B),$$
if alternative = "two.sided", where \(T_b\) denotes the value of the person-fit statistic in the \(b\)th simulated data-set, \(I(.)\) denotes the indicator function, and \(|.|\) denotes the absolute value.
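The Monte Carlo scheme above can be sketched in base R for a single response pattern. This is an illustration under stated assumptions, not the code used by person.fit(): the item parameters, ability estimate, and standard error are made-up values rather than output of a fitted model, and lz() is a hypothetical helper that applies the usual definition of \(L_z\) (the pattern log-likelihood \(L_0\) minus its expectation, divided by its standard deviation) to a two-parameter logistic model in the additive parameterization.

```r
set.seed(1)

# Made-up inputs; in practice these would come from a fitted model.
betas <- cbind(c(1.5, 0.5, 0, -0.5, -1.5), rep(1, 5))  # easiness, discrimination
x_obs <- c(0, 0, 1, 1, 1)  # fails the easy items, passes the hard ones
z_hat <- 0.3               # assumed ability estimate
se_z <- 0.7                # assumed standard error of the ability estimate
B <- 1000

# Hypothetical helper: L_z for one response pattern x at ability z.
lz <- function(x, z, betas) {
    p <- plogis(betas[, 1] + betas[, 2] * z)
    l0 <- sum(x * log(p) + (1 - x) * log(1 - p))        # L_0
    mu <- sum(p * log(p) + (1 - p) * log(1 - p))        # E(L_0)
    v <- sum(p * (1 - p) * log(p / (1 - p))^2)          # Var(L_0)
    (l0 - mu) / sqrt(v)
}

T_obs <- lz(x_obs, z_hat, betas)

T_b <- replicate(B, {
    z_star <- rnorm(1, mean = z_hat, sd = se_z)          # step 1
    p_star <- plogis(betas[, 1] + betas[, 2] * z_star)
    x_star <- rbinom(nrow(betas), size = 1, prob = p_star)  # step 2
    lz(x_star, z_star, betas)                            # step 3
})

# Monte Carlo p-values for the three alternatives.
p_less <- (1 + sum(T_b <= T_obs)) / (1 + B)
p_greater <- (1 + sum(T_b >= T_obs)) / (1 + B)
p_two_sided <- (1 + sum(abs(T_b) >= abs(T_obs))) / (1 + B)
```

Adding 1 to both numerator and denominator keeps the approximated \(p\)-value strictly positive, so it can never be reported as exactly zero however extreme \(T_{obs}\) is.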
For the \(L_z\) statistic, negative values (i.e., alternative = "less") indicate response patterns that are unlikely, given the measurement model and the ability estimate. Positive values (i.e., alternative = "greater") indicate that the examinee's response pattern is more consistent than the probabilistic IRT model would predict. Finally, alternative = "two.sided" captures both of the above settings.
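When simulate.p.value = FALSE, the three alternatives correspond to different tails of the standard normal reference distribution. The sketch below shows one natural mapping; p_normal() and the \(L_z\) values are hypothetical, and the tail assignment is an assumption based on the interpretation given above rather than a statement of what person.fit() does internally.

```r
# Made-up L_z values for three response patterns.
lz_values <- c(-2.5, 0.1, 1.8)

# Hypothetical helper: normal-theory p-value for a given alternative.
p_normal <- function(lz, alternative = c("less", "greater", "two.sided")) {
    alternative <- match.arg(alternative)
    switch(alternative,
        less = pnorm(lz),                         # lower tail: unlikely patterns
        greater = pnorm(lz, lower.tail = FALSE),  # upper tail: overly consistent patterns
        two.sided = 2 * pnorm(-abs(lz)))          # both tails
}

p_normal(lz_values, "less")
p_normal(lz_values, "greater")
p_normal(lz_values, "two.sided")
```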
This simulation scheme explicitly accounts for the fact that ability values are estimated, by drawing from their large sample distribution. Strictly speaking, drawing \(z^*\) from a normal distribution is not theoretically appropriate, since the posterior distribution for the latent abilities is not normal. However, the normality assumption will work reasonably well, especially when a large number of items is considered.
Drasgow, F., Levine, M. and Williams, E. (1985) Appropriateness measurement with polychotomous item response models and standardized indices. British Journal of Mathematical and Statistical Psychology, 38, 67–86.
Levine, M. and Rubin, D. (1979) Measuring the appropriateness of multiple-choice test scores. Journal of Educational Statistics, 4, 269–290.
Meijer, R. and Sijtsma, K. (2001) Methodology review: Evaluating person fit. Applied Psychological Measurement, 25, 107–135.
Reise, S. (1990) A comparison of item- and person-fit methods of assessing model-data fit in IRT. Applied Psychological Measurement, 14, 127–137.
# load the package providing person.fit() and the example data-sets
library(ltm)

# person-fit statistics for the Rasch model
# for the Abortion data-set
person.fit(rasch(Abortion))

# person-fit statistics for the two-parameter logistic model
# for the LSAT data-set, with Monte Carlo p-values (100 replications)
person.fit(ltm(LSAT ~ z1), simulate.p.value = TRUE, B = 100)