SitemFit1: Compute the S fit statistic for 1 item

Description

Implements the Kang & Chen (2007) polytomous extension to the S statistic of Orlando & Thissen (2000). Rows with missing data are ignored, but see the omit option.

Usage

SitemFit1(
  grp,
  item,
  free = 0,
  ...,
  method = "pearson",
  log = TRUE,
  qwidth = 6,
  qpoints = 49L,
  alt = FALSE,
  omit = 0L,
  .twotier = TRUE
)

Arguments

grp: a list containing the model and data. See the details section.
item: the item of interest
free: the number of free parameters involved in estimating the item (to adjust the df)
...: Not used. Forces remaining arguments to be specified by name.
method: which test to use, "pearson" or "rms"
log: whether to return p-values in log units
qwidth: Deprecated. Set qwidth in the group instead.
qpoints: Deprecated. Set qpoints in the group instead.
alt: whether to include the item of interest in the denominator
omit: number of items to omit or a character vector with the names of the items to omit when calculating the observed and expected sum-score tables
.twotier: whether to enable the two-tier optimization
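
The log argument matters most for badly misfitting items, where the raw p-value is too small to represent as a double. The following base R sketch illustrates the idea only. It assumes natural-log units, as produced by pchisq with log.p = TRUE, and the statistic and df values are made up.

```r
# Hypothetical test statistic and degrees of freedom, for illustration only.
stat <- 2000
df <- 12

# Upper-tail p-value on the natural-log scale stays finite.
log_p <- pchisq(stat, df, lower.tail = FALSE, log.p = TRUE)

# The same p-value on the raw scale underflows to zero.
raw_p <- pchisq(stat, df, lower.tail = FALSE)

# For moderate statistics, exp() recovers the raw p-value.
small_log_p <- pchisq(20, 10, lower.tail = FALSE, log.p = TRUE)
small_raw_p <- pchisq(20, 10, lower.tail = FALSE)
recovered <- exp(small_log_p)
```

Comparing log p-values directly preserves the ordering of items even when every raw p-value would print as 0.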

Format of a group

A model, or group within a model, is represented as a named list.

spec: list of response model objects
param: numeric matrix of item parameters
free: logical matrix indicating which parameters are free (TRUE) or fixed (FALSE)
mean: numeric vector giving the mean of the latent distribution
cov: numeric matrix giving the covariance of the latent distribution
data: data.frame containing observed item responses, and optionally, weights and frequencies
score: factor scores with response patterns in rows
weightColumn: name of the data column containing the numeric row weights (optional)
freqColumn: name of the data column containing the integral row frequencies (optional)
qwidth: width of the quadrature expressed in Z units
qpoints: number of quadrature points
minItemsPerScore: minimum number of non-missing items when estimating factor scores

The param matrix stores item parameters by column. If a column has more rows than are required to fully specify a model then the extra rows are ignored. The order of the items in spec and the order of the columns in param are assumed to match. All items should have the same number of latent dimensions. Loadings on latent dimensions are given in the first few rows and can be named by setting rownames. Item names are assigned by param colnames.
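
As a rough sketch of how these pieces fit together, the code below builds a group of 20 simulated items and tests the first one. It assumes the rpf helpers rpf.grm, rpf.rparam, and rpf.sample behave as in the package's own examples. The item names i1 to i20, the seed, the sample size, and the way free is counted are illustrative choices, not requirements.

```r
# Sketch only: needs the rpf package and is skipped if it is not installed.
if (requireNamespace("rpf", quietly = TRUE)) {
  library(rpf)
  set.seed(1)  # arbitrary seed so the simulated data are reproducible

  grp <- list(spec = list())
  grp$spec[1:20] <- list(rpf.grm())           # 20 graded response items
  grp$param <- sapply(grp$spec, rpf.rparam)   # one column of parameters per item
  colnames(grp$param) <- paste0("i", 1:20)    # item names come from colnames
  grp$mean <- 0                               # standard normal latent distribution
  grp$cov <- diag(1)
  grp$free <- grp$param != 0                  # treat every nonzero parameter as free
  grp$data <- rpf.sample(500, grp = grp)      # simulate 500 response patterns

  # Assumption: the df adjustment is the count of free parameters for the item.
  fit <- SitemFit1(grp, "i1", free = sum(grp$free[, "i1"]))
  str(fit)
}
```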

Currently only a multivariate normal distribution is available, parameterized by the mean and cov. If mean and cov are not specified then a standard normal distribution is assumed. The quadrature consists of equally spaced points. For example, qwidth=2 and qpoints=5 would produce points -2, -1, 0, 1, and 2. The quadrature specification is part of the group and not passed as extra arguments for the sake of consistency. As currently implemented, OpenMx uses EAP scores to estimate latent distribution parameters. By default, the exact same EAP scores should be produced by EAPscores.
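
The equally spaced quadrature grid can be reproduced in base R. This is a minimal sketch of the spacing rule described above. quad_points is a hypothetical helper, not a function from the package.

```r
# Hypothetical helper: equally spaced points from -qwidth to qwidth.
quad_points <- function(qwidth, qpoints) {
  seq(-qwidth, qwidth, length.out = qpoints)
}

# The example from the text: qwidth = 2 and qpoints = 5.
pts <- quad_points(2, 5)

# The defaults shown in Usage (qwidth = 6, qpoints = 49) give a step of 0.25.
default_pts <- quad_points(6, 49)
step <- diff(default_pts)[1]
```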

Details

This statistic is good at finding a small number of misfitting items among a large number of well fitting items. However, be aware that misfitting items can cause other items to misfit.

Observed tables cannot be computed when data are missing. Therefore, you can optionally omit the items with the greatest number of missing responses relative to the item of interest.
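
To see why omitting items can help, the base R sketch below ranks items by how many responses they are missing among rows where the item of interest was answered. It then counts how many complete rows are recovered by dropping the worst one. The data and the rank_by_missing helper are hypothetical, and the package's exact selection rule may differ.

```r
# Toy response data with missing values (NA), purely illustrative.
responses <- data.frame(
  i1 = c(1, 2, 1, 2, 1, 2),
  i2 = c(1, NA, NA, 2, NA, 1),
  i3 = c(2, 2, NA, 1, 1, 1),
  i4 = c(1, 1, 2, 2, 1, 2)
)

# Hypothetical helper: for each other item, count the rows where the item of
# interest was answered but that item was not, sorted from most to least.
rank_by_missing <- function(data, interest) {
  answered <- !is.na(data[[interest]])
  others <- setdiff(names(data), interest)
  n_missing <- vapply(others,
                      function(nm) sum(is.na(data[[nm]]) & answered),
                      integer(1))
  sort(n_missing, decreasing = TRUE)
}

ranked <- rank_by_missing(responses, "i1")
to_omit <- names(ranked)[seq_len(1)]  # analogous to omit = 1

# Complete rows available before and after dropping the omitted item.
complete_before <- sum(complete.cases(responses))
kept <- responses[setdiff(names(responses), to_omit)]
complete_after <- sum(complete.cases(kept))
```

In this toy data, dropping the single worst item raises the number of usable rows from 3 to 5.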

Pearson is slightly more powerful than RMS in most cases I examined.
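
For orientation, a Pearson-type statistic compares an observed table with a model-expected table cell by cell. The sketch below uses made-up counts for a dichotomous item. It assumes the degrees of freedom are rows times (outcomes minus 1) less the free argument, which is a simplification and may not match the package's handling of collapsed cells.

```r
# Made-up observed and expected counts: rows are sum-score groups,
# columns are the outcomes of the item of interest.
observed <- matrix(c(30, 10,
                     25, 20,
                     12, 28,
                      5, 35),
                   ncol = 2, byrow = TRUE)
expected <- matrix(c(28, 12,
                     24, 21,
                     14, 26,
                      6, 34),
                   ncol = 2, byrow = TRUE)

# Pearson statistic: squared differences scaled by the expected counts.
x2 <- sum((observed - expected)^2 / expected)

# Assumed df rule (see the lead-in); free = 2 stands in for the free argument.
free <- 2
df <- nrow(observed) * (ncol(observed) - 1) - free

# Upper-tail p-value from the chi-squared reference distribution.
p_value <- pchisq(x2, df, lower.tail = FALSE)
```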

Setting alt to TRUE causes the tables to match published articles. However, the default setting of FALSE probably provides slightly more power when there are fewer than 10 items.

The name of the test, "S", probably stands for sum-score.

References

Kang, T. and Chen, T. T. (2007). An investigation of the performance of the generalized S-Chisq item-fit index for polytomous IRT models. ACT Research Report Series.

Orlando, M. and Thissen, D. (2000). Likelihood-based item-fit indices for dichotomous item response theory models. Applied Psychological Measurement, 24(1), 50-64.