The genPolyMatrix function quickly generates a polytomous item bank in a format suitable for further use, e.g. for computing item response probabilities with the Pi function.
The six polytomous IRT models that are supported are:
the Graded Response Model (GRM; Samejima, 1969);
the Modified Graded Response Model (MGRM; Muraki, 1990);
the Partial Credit Model (PCM; Masters, 1982);
the Generalized Partial Credit Model (GPCM; Muraki, 1992);
the Rating Scale Model (RSM; Andrich, 1978);
the Nominal Response Model (NRM; Bock, 1972).
Each model is specified through the model argument, with its acronym surrounded by double quotes (e.g. "GRM" for GRM, "PCM" for PCM, etc.). The default value is "GRM".
For any item \(j\), let \((0, ..., g_j)\) denote the \(g_j+1\) possible response categories. The number of response categories can differ across items under the GRM, PCM, GPCM and NRM, but it is necessarily the same for all items under the MGRM and RSM. For the latter two models, let \(g\) denote the common value of \(g_j\), so that every item has \(g+1\) response categories. It is nevertheless possible to require all items to have the same number of response categories under any model, by setting the same.nrCat argument to TRUE.
Under the GRM, PCM, GPCM or NRM with same.nrCat set to FALSE, the number of response categories \(g_j+1\) of each item is drawn from a Poisson distribution with parameter nrCat, and this number is restricted to the interval [2; nrCat]. This ensures at least two and at most nrCat response categories. In all other cases, each \(g_j+1\) is fixed to \(g+1 = \) nrCat.
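The rule above can be illustrated with a minimal sketch, written in Python rather than R and not taken from the package source. The function names draw_poisson and draw_category_counts are hypothetical, and the way the drawn value is "restricted" to [2; nrCat] is assumed here to be simple clamping, since the text does not say whether values are clamped or redrawn.

```python
import math
import random

def draw_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def draw_category_counts(n_items, nr_cat, model="GRM", same_nr_cat=False, seed=1):
    """Return the number of response categories (g_j + 1) for each item."""
    rng = random.Random(seed)
    # MGRM and RSM always share one category count; same_nr_cat forces it elsewhere.
    if same_nr_cat or model in ("MGRM", "RSM"):
        return [nr_cat] * n_items
    # ASSUMPTION: the restriction to [2, nr_cat] is implemented as clamping.
    return [min(max(draw_poisson(nr_cat, rng), 2), nr_cat) for _ in range(n_items)]

counts = draw_category_counts(10, 4)
print(counts)
```

Every returned count lies between 2 and nr_cat, and the fixed-count branch returns nr_cat for all items.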
Denote further \(P_{jk}(\theta)\) as the probability of answering response category \(k \in \{0, ..., g_j\}\) of item \(j\). For GRM and MGRM, response probabilities \(P_{jk}(\theta)\) are defined through cumulative probabilities, while for PCM, GPCM, RSM and NRM they are directly computed.
For GRM and MGRM, let \(P_{jk}^*(\theta)\) denote the (cumulative) probability of answering response category \(k\) or "above", that is \(P_{jk}^*(\theta) = Pr(X_j \geq k | \theta)\), where \(X_j\) is the item response. It follows that for any \(\theta\), \(P_{j0}^*(\theta) = 1\) and \(P_{jk}^*(\theta) = 0\) when \(k>g_j\). Furthermore, response category probabilities are recovered through the relationship \(P_{jk}(\theta)= P_{jk}^*(\theta)-P_{j,k+1}^*(\theta)\). The GRM is then defined by (Samejima, 1969)
$$P_{jk}^*(\theta)=\frac{\exp\,[\alpha_j\,(\theta-\beta_{jk})]}{1+\exp\,[\alpha_j\,(\theta-\beta_{jk})]}$$
and the MGRM by (Muraki, 1990)
$$P_{jk}^*(\theta)=\frac{\exp\,[\alpha_j\,(\theta-b_j+c_k)]}{1+\exp\,[\alpha_j\,(\theta-b_j+c_k)]}.$$
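To make the link between cumulative and category probabilities concrete, here is a minimal Python sketch of the two formulas above. It is an illustration only, not the package's implementation; the function names logistic, grm_probs and mgrm_probs are hypothetical.

```python
import math

def logistic(x):
    """exp(x) / (1 + exp(x)), written in its equivalent 1 / (1 + exp(-x)) form."""
    return 1.0 / (1.0 + math.exp(-x))

def grm_probs(theta, alpha, betas):
    """Category probabilities P_j0..P_jg under the GRM.

    betas holds beta_j1..beta_jg in increasing order.
    """
    # Cumulative probabilities: P*_j0 = 1, then P*_j1..P*_jg, then P*_{j,g+1} = 0.
    cum = [1.0] + [logistic(alpha * (theta - b)) for b in betas] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(betas) + 1)]

def mgrm_probs(theta, alpha, b, cs):
    """Category probabilities under the MGRM.

    cs holds c_1..c_g in decreasing order.
    """
    cum = [1.0] + [logistic(alpha * (theta - b + c)) for c in cs] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(cs) + 1)]

probs = grm_probs(0.0, 1.0, [-1.0, 0.0, 1.0])
print(probs, sum(probs))
```

With \(b_j = 0\) and \(c_k = -\beta_{jk}\) the two functions return the same probabilities, which shows why the \(c_k\) must be sorted in the opposite direction to the \(\beta_{jk}\).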
The PCM, GPCM, RSM and NRM are defined as "divide-by-total" models (Embretson and Reise, 2000). The PCM has the following response category probability (Masters, 1982):
$$P_{jk}(\theta)=\frac{\exp\,\sum_{t=0}^k (\theta-\delta_{jt})}{\sum_{r=0}^{g_j}\,\exp\, \sum_{t=0}^r (\theta-\delta_{jt})}\quad \mbox{with} \quad \sum_{t=0}^0 (\theta-\delta_{jt})=0.$$
The GPCM has the following response category probability (Muraki, 1992):
$$P_{jk}(\theta)=\frac{\exp\,\sum_{t=0}^k \alpha_j\,(\theta-\delta_{jt})}{\sum_{r=0}^{g_j}\,\exp\, \sum_{t=0}^r \alpha_j\,(\theta-\delta_{jt})}\quad \mbox{with} \quad \sum_{t=0}^0 \alpha_j\,(\theta-\delta_{jt})=0.$$
The RSM has the following response category probability (Andrich, 1978):
$$P_{jk}(\theta)=\frac{\exp\,\sum_{t=0}^k [\theta-(\lambda_j+\delta_t)]}{\sum_{r=0}^{g_j}\,\exp\, \sum_{t=0}^r [\theta-(\lambda_j+\delta_t)]}\quad \mbox{with} \quad \sum_{t=0}^0 [\theta-(\lambda_j+\delta_t)]=0.$$
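The three formulas above share the same "divide-by-total" structure, which the following minimal Python sketch makes explicit. It is an illustration under the stated definitions, not the package's code; the function names gpcm_probs and rsm_probs are hypothetical. The PCM is obtained by leaving alpha at 1.

```python
import math

def gpcm_probs(theta, deltas, alpha=1.0):
    """Category probabilities P_j0..P_jg under the GPCM (PCM when alpha = 1).

    deltas holds delta_j1..delta_jg; the k = 0 term is zero by convention.
    """
    exponents = [0.0]  # cumulative sum for category 0 is fixed to zero
    for delta in deltas:
        exponents.append(exponents[-1] + alpha * (theta - delta))
    weights = [math.exp(e) for e in exponents]
    total = sum(weights)  # the "total" every category weight is divided by
    return [w / total for w in weights]

def rsm_probs(theta, lam, deltas):
    """Category probabilities under the RSM.

    lam is the item location lambda_j; deltas holds delta_1..delta_g,
    shared by all items.
    """
    return gpcm_probs(theta, [lam + d for d in deltas])

print(gpcm_probs(0.0, [-1.0, 1.0]))
print(rsm_probs(0.0, 0.5, [-0.5]))
```

Writing the RSM as a PCM with thresholds \(\lambda_j+\delta_t\) mirrors the formula: only the parametrization of the thresholds changes.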
Finally, the NRM has the following response category probability (Bock, 1972):
$$P_{jk}(\theta)=\frac{\exp (\alpha_{jk}\,\theta+c_{jk})}{\sum_{r=0}^{g_j} \exp (\alpha_{jr}\,\theta+c_{jr})}\quad \mbox{with} \quad \alpha_{j0}\,\theta+c_{j0}=0.$$
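The NRM formula above can be sketched in Python as follows. This is an illustration only; the function name nrm_probs is hypothetical, and subtracting the largest exponent before exponentiating is a numerical-stability choice of this sketch that leaves the probabilities unchanged.

```python
import math

def nrm_probs(theta, alphas, cs):
    """Category probabilities P_j0..P_jg under the NRM.

    alphas and cs hold alpha_j1..alpha_jg and c_j1..c_jg; the exponent of
    category 0 is fixed to zero, as in the constraint above.
    """
    exponents = [0.0] + [a * theta + c for a, c in zip(alphas, cs)]
    shift = max(exponents)  # cancels out in the ratio, avoids overflow
    weights = [math.exp(e - shift) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]

print(nrm_probs(0.0, [1.0, 2.0], [0.0, 0.0]))
print(nrm_probs(2.0, [1.0, 2.0], [0.0, 0.0]))
```

At \(\theta = 0\) with all \(c_{jk} = 0\) every category is equally likely; as \(\theta\) grows, the category with the largest \(\alpha_{jk}\) dominates.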
The following parent distributions are used to generate the item parameters. The \(\alpha_j\) parameters of the GRM, MGRM and GPCM, as well as the \(\alpha_{jk}\) parameters of the NRM, are drawn from a log-normal distribution with mean 0 and standard deviation 0.1225 (on the log scale). All other parameters are drawn from a standard normal distribution. Moreover, the \(\beta_{jk}\) parameters of the GRM and the \(c_k\) parameters of the MGRM are sorted in increasing and decreasing order of \(k\), respectively, to ensure that the cumulative probabilities \(P_{jk}^*(\theta)\) decrease with \(k\).
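As a minimal sketch of this generation scheme (in Python, not the package's R code), the snippet below draws the parameters of one GRM item and the common thresholds of the MGRM. The function names are hypothetical, and the stated mean 0 and standard deviation 0.1225 are taken to be the log-scale parameters of the log-normal distribution.

```python
import random

def draw_grm_item(n_thresholds, rng):
    """Draw (alpha_j, [beta_j1..beta_jg]) for one GRM item."""
    # ASSUMPTION: 0 and 0.1225 are the mean and sd of log(alpha_j).
    alpha = rng.lognormvariate(0.0, 0.1225)
    # Standard normal thresholds, sorted in increasing order of k.
    betas = sorted(rng.gauss(0.0, 1.0) for _ in range(n_thresholds))
    return alpha, betas

def draw_mgrm_thresholds(g, rng):
    """Draw c_1..c_g for the MGRM, sorted in decreasing order of k."""
    return sorted((rng.gauss(0.0, 1.0) for _ in range(g)), reverse=True)

rng = random.Random(1)
alpha, betas = draw_grm_item(3, rng)
cs = draw_mgrm_thresholds(3, rng)
print(alpha, betas, cs)
```

Sorting after drawing is what guarantees that each cumulative probability is no larger than the previous one, so that all category probabilities are non-negative.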
The output is a matrix with one row per item and as many columns as required to hold all item parameters. When an item has fewer response categories than the maximum, the parameters of the missing categories are replaced by NA values. Column names refer to the corresponding model parameters. See Details for further explanations and Examples for illustrations.
Finally, the output matrix can contain an additional column with the names of the subgroups to be used for content balancing purposes. To do so, the argument cbControl (whose default value is NULL) must contain a list of two elements: (a) the names element, with the names of the subgroups, and (b) the props element, with the proportions of items per subgroup (of the same length as the names element, with only positive numbers but not necessarily summing to one). The cbControl argument is similar to the one used in the nextItem and randomCAT functions to control for content balancing. The additional column holds the subgroup name randomly allocated to each item by random multinomial draws with the probabilities given by cbControl$props.
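The allocation step can be sketched as follows in Python. This is an illustration of multinomial allocation under the constraints stated above, not the package's code; the function name allocate_subgroups and the subgroup names "Audio" and "Written" are hypothetical.

```python
import random

def allocate_subgroups(n_items, names, props, seed=1):
    """Randomly assign one subgroup name to each item.

    props must have the same length as names and contain only positive
    numbers; it does not need to sum to one.
    """
    if len(names) != len(props):
        raise ValueError("names and props must have the same length")
    if any(p <= 0 for p in props):
        raise ValueError("props must contain only positive numbers")
    rng = random.Random(seed)
    # random.choices normalizes the weights, so props need not sum to one.
    return rng.choices(names, weights=props, k=n_items)

groups = allocate_subgroups(10, ["Audio", "Written"], [2, 3])
print(groups)
```

Because each item is allocated independently, the realized proportions only approximate props in a bank of finite size.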