Fit log-multiplicative row-column association models with transitional layer effect, which are related to the RC(M)-L model, with one or several dimensions. Supported variants include (for square tables) symmetric (homogeneous) row and column scores, possibly combined with separate diagonal parameters.
rcL.trans(tab, nd = 1,
symmetric = FALSE,
diagonal = c("none", "heterogeneous", "homogeneous"),
weighting = c("marginal", "uniform", "none"),
se = c("none", "jackknife", "bootstrap"),
nreplicates = 100, ncpus = getOption("boot.ncpus"),
family = poisson, weights = NULL,
start = NULL, etastart = NULL, tolerance = 1e-8,
iterMax = 5000, trace = FALSE, verbose = TRUE, ...)
a three-way table, or an object (such as a matrix) that can be coerced into a table; if present, dimensions above three will be collapsed.
the number of dimensions to include in the model. Cannot exceed min(nrow(tab) - 1, ncol(tab) - 1) if symmetric is FALSE (saturated model), and twice this threshold otherwise (quasi-symmetry model).
should row and column scores be constrained to be equal? Valid only for square tables.
what type of diagonal-specific parameters to include in the model, if any. This amounts to taking quasi-conditional independence, rather than conditional independence, as the baseline model. Valid only for square tables.
what weights should be used when normalizing the scores.
which method to use to compute standard errors for parameters.
the number of bootstrap replicates, if enabled.
the number of processes to use for jackknife or bootstrap parallel computing. Defaults to the number of cores (see detectCores), with a maximum of 5, but falls back to 1 (no parallelization) if package parallel is not available.
a specification of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function, or the result of a call to a family function. See family for details of family functions.
an optional vector of weights to be used in the fitting process.
either NA to use optimal starting values, NULL to use random starting values, or a vector of starting values for the parameters in the model.
starting values for the linear predictor; set to NULL to use either default starting values (if start = NA), or random starting values (in all other cases).
a positive numeric value specifying the tolerance level for convergence; higher values will speed up the fitting process, but beware of numerical instability of estimated scores!
a positive integer specifying the maximum number of main iterations to perform; consider raising this value if your model does not converge.
a logical value indicating whether the deviance should be printed after each iteration.
a logical value indicating whether progress indicators should be printed, including a diagnostic error message if the algorithm restarts.
more arguments to be passed to gnm
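As an illustration of the upper bound on nd described above, here is a minimal base R sketch. The helper max_nd() is hypothetical and not part of the package; it only restates the rule given for the nd argument, assuming rows and columns are the first two dimensions of the table.

```r
# Hypothetical helper (not part of the package): upper bound on `nd`
# as stated in the description of the `nd` argument.
max_nd <- function(tab, symmetric = FALSE) {
  # nrow() and ncol() return the first and second extents of an array
  base <- min(nrow(tab) - 1, ncol(tab) - 1)
  if (symmetric) 2 * base else base
}

# A made-up 4 x 4 x 3 table filled with ones, used only for its dimensions
tab <- array(1, dim = c(4, 4, 3))

max_nd(tab)                    # 3
max_nd(tab, symmetric = TRUE)  # 6
```

Checking this bound before fitting can help avoid requesting more dimensions than the table supports.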
An rcL object, with all the components of a gnm object, plus an assoc component holding the most relevant association information:
The intrinsic association parameters, one per dimension and per layer.
Row scores, normalized so that their (weighted) sum is 0, their (weighted) sum of squares is 1, and their (weighted) cross-dimensional correlation is null.
Column scores, normalized so that their (weighted) sum is 0, their (weighted) sum of squares is 1, and their (weighted) cross-dimensional correlation is null.
The name of the weighting method used, reflected by row.weights and col.weights.
The row weights used for the identification of scores, as specified by the weighting argument.
The column weights used for the identification of scores, as specified by the weighting argument.
The variance-covariance matrix for phi coefficients and normalized row and column scores. Only present if se was not “none”.
An array stacking on its third dimension one variance-covariance matrix for the adjusted scores of each layer in the model (used for plotting). Only present if se was not “none”.
The method used to compute the variance-covariance matrix (corresponding to the se argument).
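The normalization of row and column scores described above (weighted sum of 0, weighted sum of squares of 1) can be sketched in base R for a single dimension. The helper normalize_scores() and the values below are made up for illustration. Rescaling the weights to sum to 1 is an assumption of this sketch rather than a documented detail, and the cross-dimensional constraint, which only matters with several dimensions, is not handled here.

```r
# Hypothetical helper: center and scale one vector of scores under weights w
normalize_scores <- function(x, w = rep(1, length(x))) {
  w <- w / sum(w)            # assumption: weights are rescaled to sum to 1
  x <- x - sum(w * x)        # weighted sum becomes 0
  x / sqrt(sum(w * x^2))     # weighted sum of squares becomes 1
}

raw <- c(-1.2, 0.3, 0.8, 2.5)   # made-up raw scores
w   <- c(0.4, 0.3, 0.2, 0.1)    # made-up weights, e.g. marginal proportions
s   <- normalize_scores(raw, w)

sum(w * s)     # 0, up to rounding error
sum(w * s^2)   # 1, up to rounding error
```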
This function fits log-multiplicative row-column association models with regression-type layer effect, which are
experimental models combining the principles behind RC(M)-L (Wong, 2010; see rcL) and regression-type
models (Goodman & Hout, 1998). More specifically, as in RC(M)-L models, row and column scores are allowed to vary across
a layer variable, and the pattern of this variation follows the regression-type inspiration: for each dimension, one set of
scores describes the first layer, another set describes the total variation of these scores needed to describe the
association observed for the last layer, and one parameter per layer describes the position of the layer between the
first and the last layer. Compared with the RC(M)-L model with homogeneous scores across layers, this model allows
for a finer description of changes, since the ordering and distances of categories on a dimension are allowed to vary,
and not only the general strength of the association. It is designed to describe transitions from one state to another,
and is best suited for ordered layer variables like time (though the model is not sensitive to reordering of the layers).
The general equation of the model is:
$$ \log F_{ijk} = \lambda + \lambda^I_i + \lambda^J_j + \lambda^K_k
+ \lambda^{IK}_{ik} + \lambda^{JK}_{jk}
+ \sum_{m=1}^M { \phi_{mk} (\mu^S_{im} + \psi_{mk} \mu^V_{im}) (\nu^S_{jm} + \psi_{mk} \nu^V_{jm}) }$$
where \(F_{ijk}\) is the expected frequency for the cell at the intersection of row i, column j and layer k of tab, and M the number of dimensions. The \(\psi_{mk}\) parameter is constrained to be positive, equal to 0 for the first layer (\(k = 1\)), and equal to 1 for the last layer.
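To make the multiplicative term of the equation concrete, the following base R sketch computes it for one dimension (M = 1) and three layers. All parameter values are made up for illustration and are not estimates from any model; the scores are left unnormalized for brevity, and the variable names are arbitrary.

```r
# Made-up parameter values for a 3 x 3 x 3 table with one dimension (M = 1)
I <- 3; J <- 3; K <- 3
phi  <- c(1.0, 1.2, 1.5)      # phi_{1k}: intrinsic association, one per layer
psi  <- c(0, 0.4, 1)          # psi_{1k}: 0 for the first layer, 1 for the last
mu_S <- c(-1, 0, 1)           # row scores describing the first layer
mu_V <- c(0.2, -0.5, 0.3)     # total variation of row scores up to the last layer
nu_S <- c(-1, 0.2, 0.8)       # column scores describing the first layer
nu_V <- c(-0.1, 0.4, -0.3)    # total variation of column scores

# Association term phi_k * (mu_S + psi_k * mu_V) * (nu_S + psi_k * nu_V)
assoc <- array(NA_real_, dim = c(I, J, K))
for (k in seq_len(K)) {
  mu_k <- mu_S + psi[k] * mu_V
  nu_k <- nu_S + psi[k] * nu_V
  assoc[, , k] <- phi[k] * outer(mu_k, nu_k)
}

# First layer uses only the starting scores; last layer adds the full variation
all.equal(assoc[, , 1], phi[1] * outer(mu_S, nu_S))
all.equal(assoc[, , K], phi[K] * outer(mu_S + mu_V, nu_S + nu_V))
```

Because the two sets of scores are summed before being multiplied, each layer has a single pair of row and column score vectors per dimension.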
This model should not be confused with another combination of RC(M) models with the regression-type approach, presented by Goodman & Hout (1998:180), in which two separate RC(M) associations are used to describe respectively the stable and the varying components. In the present model, row and column scores for both components are summed before entering the multiplicative interaction, which means only one RC(M) association exists.
The returned object is a generic rcL association model describing the fitted scores for each layer. To analyze
more specifically the variation of each (normalized) score from the first to the last layer, use:
model$assoc$row[,,dim(model$assoc$row)[3]] - model$assoc$row[,,1]
(and similarly for column scores).
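The expression above can be tried on a stand-in array with the same layout as the one it indexes (rows, then dimensions, then layers). The array row_scores and its values are made up for illustration; with a fitted model, model$assoc$row would take its place.

```r
# Stand-in for model$assoc$row: 4 rows x 1 dimension x 3 layers (made-up values)
row_scores <- array(c(-1.0, -0.2, 0.3, 0.9,    # layer t1
                      -0.9, -0.3, 0.4, 0.8,    # layer t2
                      -0.7, -0.5, 0.6, 0.6),   # layer t3
                    dim = c(4, 1, 3),
                    dimnames = list(c("a", "b", "c", "d"),
                                    "Dim1",
                                    c("t1", "t2", "t3")))

# Difference between last-layer and first-layer scores, per category
last   <- dim(row_scores)[3]
change <- row_scores[, , last] - row_scores[, , 1]
change
```

With a single dimension the result is a named vector; with several dimensions it is a matrix with one column per dimension.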
Actual model fitting is performed using gnm, which implements the Newton-Raphson algorithm.
This function simply ensures correct start values are used, in addition to allowing for identification
of scores even with several dimensions, computation of their jackknife or bootstrap standard errors, and plotting.
The default starting values are taken from a model with a stable RC(M) association (“base model”). In some
complex cases, using start = NULL to get random starting values can be more efficient, but it is also
less stable and can converge to non-optimal solutions.
Goodman, L.A., and Hout, M. (1998). Statistical Methods and Graphical Displays for Analyzing How the Association Between Two Qualitative Variables Differs Among Countries, Among Groups, Or Over Time: A Modified Regression-Type Approach. Sociological Methodology 28(1), 175-230.
Wong, R.S-K. (2010). Association models. SAGE: Quantitative Applications in the Social Sciences.