Fits an extension of the skew-symmetric association model proposed in van der Heijden & Mooijaart (1995) to describe asymmetry of square tables. This model introduces a layer effect by which the strength of skew-symmetric association, and optionnally scores, can vary over the levels of the third dimension of the table. Skew-symmetric association can be combined with quasi-symmetry (the default), quasi-independence, or symmetric (homogeneous) RC(M) associations, with or without layer effect.
hmskewL(tab, nd.symm = NA,
layer.effect.skew = c("homogeneous.scores", "heterogeneous",
"none"),
layer.effect.symm = c("heterogeneous", "uniform",
"regression.type",
"homogeneous.scores", "none"),
diagonal = c("none", "heterogeneous", "homogeneous"),
weighting = c("marginal", "uniform", "none"),
se = c("none", "jackknife", "bootstrap"),
nreplicates = 100, ncpus = getOption("boot.ncpus"),
family = poisson, weights = NULL,
start = NULL, etastart = NULL, tolerance = 1e-8,
iterMax = 5000, trace = FALSE, verbose = TRUE, ...)
a three-way table, or an object (such as a matrix) that can be coerced into a table; if present, dimensions above three will be collapsed. First two dimensions must be symmetric (i.e. of the same length).
the number of dimensions to include in the symmetric RC(M) association. Cannot exceed
2 * min(nrow(tab) - 1, ncol(tab) - 1)
(quasi-symmetry model).
If NA
(the default), a full quasi-symmetric association is used instead of a RC(M) model; if 0
,
quasi-independence is used.
determines the form of the interaction between skew-symmetric association and layers. See “Details” below.
determines the form of the interaction between symmetric row-column association,
or quasi-symmetric association (if nd.symm = NA
) and layers. See “Details” below.
what type of diagonal-specific parameters to include in the model, if any. Only makes sense
when nd.symm
is not NA
(else, diagonal parameters are already included).
what weights should be used when normalizing the scores.
which method to use to compute standard errors for parameters.
the number of bootstrap replicates, if enabled.
the number of processes to use for jackknife or bootstrap parallel computing. Defaults to
the number of cores (see detectCores
), with a maximum of 5, but falls back to 1
(no parallelization) if package parallel
is not available.
a specification of the error distribution and link function
to be used in the model. This can be a character string naming
a family function; a family function, or the result of a call
to a family function. See family
details of family functions.
an optional vector of weights to be used in the fitting process.
either NA
to use optimal starting values, NULL
to use
random starting values, or a vector of starting values for the parameters in the model.
starting values for the linear predictor; set to NULL
to use either default
starting values (if start = NA
), or random starting values (in all other cases).
a positive numeric value specifying the tolerance level for convergence; higher values will speed up the fitting process, but beware of numerical instability of estimated scores!
a positive integer specifying the maximum number of main iterations to perform; consider raising this value if your model does not converge.
a logical value indicating whether the deviance should be printed after each iteration.
a logical value indicating whether progress indicators should be printed, including a diagnostic error message if the algorithm restarts.
more arguments to be passed to gnm
A hmskewL
object, which is a subclass of an rcL.symm
object (see rcL
) if
nd.symm
is strictly positive. In addition to this class, it contains a assoc.hmskew
component
holding information about the skew-symmetric association:
The intrisic association parameters, one per dimension and per layer.
Row scores, normalized so that their (weighted) sum is 0, their (weighted) sum of squares is 1, and their (weighted) cross-dimensional correlation is null.
Column scores, normalized so that their (weighted) sum is 0, their (weighted) sum of squares is 1, and their (weighted) cross-dimensional correlation is null.
The name of the weighting method used, reflected by row.weights
and col.weights
.
The row weights used for the identification of scores, as specified by the
weighting
argument.
The column weights used for the identification of scores, as specified by the
weighting
argument.
The variance-covariance matrix for phi coefficients and normalized row and column
scores. Only present if se
was not “none”.
An array stacking on its third dimension one variance-covariance matrix for
the adjusted scores of each layer in the model (used for plotting). Only present if se
was not “none”.
The method used to compute the variance-covariance matrix (corresponding to the
se
argument.
This model follows an equation inspired from that presented by van der Heijden & Mooijaart (1995) for two-way tables
(see hmskew
):
$$ log F_{ijk} = q_{ijk} + \phi_k (\nu_{ik} \mu_{jk} - \mu_{ik} \nu_{jk}) $$
where \(F_{ijk}\) is the expected frequency for the cell at the intersection of row i, column j and layer k of
tab
, and \(q_{ij}\) a quasi-symmetric specification, with either full interaction parameters, or
a RC(M) association. See reference for detailed information about the degrees of freedom and the identification
constraints applied to the scores.
If layer.effect.skew
is set to ‘heterogeneous’, different scores will be computed for each level,
which is equivalent to fitting separate models using hmskew
on the k two-way tables.
If it is set to ‘homogeneous.scores’, then \(\mu_{ik} = \mu_i\) and \(\nu_{ik} = \nu_i\) for all
layers k: only the \(\phi_k\) are allowed to vary across layers. If it is set to ‘none’, then in addition
to the previous conditions all \(\phi_{mk}\) are forced to be equal for all layers k, which amounts to a stability
of the association across layers.
When nd.symm
is different from NA
, the symmetric association works exactly like a call to rcL
,
with parameters nd.symm
and layer.effect.symm
translated respectively to nd
and
layer.effect
. When nd.symm == NA
, symmetric association parameters are either stable across layers,
are multiplied by a layer coefficient (UNIDIFF model, see unidiff
), follow a regression-type
(Goodman-Hout) specification, or are different for each layer, when layer.effect.symm
is respectively
none
, uniform
, regression.type
and heterogeneous
.
Actual model fitting is performed using gnm
, which implements the Newton-Raphson algorithm.
This function simply ensures correct start values are used, in addition to allowing for identification
of scores even with several dimensions, computation of their jackknife or bootstrap standard errors, and plotting.
The default starting values for skew association parameters are computed using an eigen value decomposition from the
results of the model without skew association component (“base model”); if nd.symm
is not NA
and
strictly positive, random starting values are used. In some complex cases, using start = NULL
to start with
random values can be more efficient, but it is also less stable and can converge to non-optimal solutions.
van der Heijden, P.G.M., and Mooijaart, A. (1995). Some new log bilinear models for the analysis of asymmetry in a square contingency table. Sociol. Methods and Research 24, 7-29.