Depending on the distributions specified for the outcome variables in the arguments dist_e and dist_c
and the type of missingness mechanism specified in the argument type, different pattern mixture models
are built and run in the background by the function pattern. The model for the outcomes is fitted in each missingness pattern,
and the parameters indexing the missing data distributions are identified using either the corresponding parameters identified from the observed data
in other patterns (under 'MAR') or a combination of the parameters identified by the observed data and some sensitivity parameters (under 'MNAR').
A simple example can be used to show how pattern mixture models are specified.
Consider a data set comprising a response variable \(y\) and a set of centered covariates \(X_j\). We denote with \(d_i\) the pattern indicator variable for each
subject in the trial \(i = 1, \ldots, n\) such that: \(d_i = 1\) indicates the completers (both e and c observed), \(d_i = 2\) and \(d_i = 3\) indicate that
only the costs or effects are observed, respectively, while \(d_i = 4\) indicates that neither of the two outcomes is observed. In general, a different number of patterns
can be observed between the treatment groups and missingHE accounts for this possibility by modelling a different pattern indicator variable for each arm.
For simplicity, in this example, we assume that the same number of patterns is observed in both groups. \(d_i\) is assigned a multinomial distribution,
whose probabilities are modelled using a Dirichlet prior (by default giving each pattern the same weight). Next, the model specified in dist_e and dist_c
is fitted in each pattern. The parameters that cannot be identified by the observed data in each pattern (d = 2, 3, 4), e.g. the means
\(\mu_e[d]\) and \(\mu_c[d]\), can be identified using the parameters estimated from other patterns. Two choices are currently available: the complete cases ('CC') or available cases ('AC').
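The pattern coding described above can be sketched in a few lines of base R. This is only an illustration on made-up data: the vectors eff and cost and their values are assumptions, and the conjugate Dirichlet update at the end is a shortcut used here for clarity, not the MCMC fit that the package runs in the background.

```r
# Toy outcomes for one arm; NA marks a missing value (numbers are invented).
eff  <- c(0.70, NA, 0.65, NA, 0.80, 0.55, NA, 0.60)
cost <- c(1200, 950, NA, NA, 1100, 1300, 1000, NA)

# Pattern indicator, following the coding in the text:
# 1 = both observed, 2 = only costs observed,
# 3 = only effects observed, 4 = neither observed.
d <- ifelse(!is.na(eff) & !is.na(cost), 1L,
     ifelse( is.na(eff) & !is.na(cost), 2L,
     ifelse(!is.na(eff) &  is.na(cost), 3L, 4L)))

# Number of subjects falling in each of the four patterns.
counts <- as.vector(table(factor(d, levels = 1:4)))

# Posterior mean of the pattern probabilities under a Dirichlet prior that
# gives every pattern the same weight (here Dirichlet(1, 1, 1, 1)).
pi_hat <- (counts + 1) / (length(d) + 4)

print(d)
print(counts)
print(pi_hat)
```

With a different number of observed patterns per arm, the same steps would be repeated separately within each treatment group.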
For example, using the 'CC' restriction, the parameters indexing the distributions of the missing data are identified as:
$$\mu_e[2] = \mu_e[4] = \mu_e[1] + \Delta_e$$
$$\mu_c[3] = \mu_c[4] = \mu_c[1] + \Delta_c$$
where
\(\mu_e[1]\) is the effects mean for the completers.
\(\mu_c[1]\) is the costs mean for the completers.
\(\Delta_e\) is the sensitivity parameter associated with the marginal effects mean.
\(\Delta_c\) is the sensitivity parameter associated with the marginal costs mean.
If the 'AC' restriction is chosen, only the parameters estimated from the observed data in pattern 2 (costs) and pattern 3 (effects) are used to identify those in the other patterns.
When \(\Delta_e = 0\) and \(\Delta_c = 0\) the model assumes a 'MAR' mechanism. When \(\Delta_e \neq 0\) and/or \(\Delta_c \neq 0\), 'MNAR' departures for the
effects and/or costs are explored assuming Uniform prior distributions for the sensitivity parameters. The range of values for these priors is defined based on the
boundaries specified in Delta_e and Delta_c (see Arguments), which must be provided by the user.
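To make the role of the sensitivity parameter concrete, the following base R sketch applies the 'CC' restriction to the effects on made-up data. Everything in it is an assumption made for illustration: the data, the bounds chosen for \(\Delta_e\), the use of empirical pattern proportions, and the construction of the marginal mean as the probability-weighted average of the pattern means (the usual pattern mixture construction, which the text above does not spell out). It does not call missingHE.

```r
set.seed(1)

# Toy effects data and pattern indicator (1 = completers, 2 = only costs,
# 3 = only effects, 4 = neither observed); values are invented.
eff <- c(0.70, NA, 0.65, NA, 0.80, 0.55, NA, 0.60)
d   <- c(1, 2, 3, 4, 1, 1, 2, 3)

# Empirical pattern proportions, used in place of a posterior for simplicity.
pi_d <- as.vector(table(factor(d, levels = 1:4))) / length(d)

# Means identified by the observed data.
mu_e1 <- mean(eff[d == 1])  # completers
mu_e3 <- mean(eff[d == 3])  # effects observed, costs missing

# Marginal effects mean under the 'CC' restriction:
# mu_e[2] = mu_e[4] = mu_e[1] + Delta_e.
marginal_effects_mean <- function(delta_e) {
  mu_e2 <- mu_e1 + delta_e
  mu_e4 <- mu_e1 + delta_e
  pi_d[1] * mu_e1 + pi_d[2] * mu_e2 + pi_d[3] * mu_e3 + pi_d[4] * mu_e4
}

# 'MAR': the sensitivity parameter is fixed at zero.
mu_mar <- marginal_effects_mean(0)

# 'MNAR': draw Delta_e from a Uniform prior on hypothetical bounds (-0.1, 0).
delta_draws <- runif(1000, min = -0.1, max = 0)
mu_mnar <- marginal_effects_mean(delta_draws)

print(mu_mar)
print(range(mu_mnar))
```

Because the hypothetical bounds are non-positive, every 'MNAR' draw of the marginal mean sits at or below the 'MAR' value; wider or positive bounds would shift the explored departures accordingly.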
When user-defined hyperprior values are supplied via the argument prior in the function pattern, the elements of this list (see Arguments)
must be vectors of length two containing the user-provided hyperprior values and must take specific names according to the parameters they are associated with.
Specifically, the names for the parameters indexing the model which are accepted by missingHE are the following:
location parameters \(\alpha_0\) and \(\beta_0\): "mean.prior.e" (effects) and/or "mean.prior.c" (costs)
auxiliary parameters \(\sigma\): "sigma.prior.e" (effects) and/or "sigma.prior.c" (costs)
covariate parameters \(\alpha_j\) and \(\beta_j\): "alpha.prior" (effects) and/or "beta.prior" (costs)
The only exception is the missingness patterns' probability \(\pi\), denoted with "patterns.prior", whose hyperprior values must be provided as a list
formed by two elements. These must be vectors of the same length equal to the number of patterns in the control (first element) and intervention (second element) group.
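As a rough illustration of the structure described above, the sketch below builds such a list in base R and checks its shape. The numeric values are placeholders with no recommended meaning, four patterns per arm is assumed, and check_prior is a hypothetical helper written for this example only; it is not part of missingHE.

```r
# Hyperprior values keyed by the names listed above (values are placeholders).
my_prior <- list(
  "mean.prior.e"   = c(0, 10),
  "sigma.prior.c"  = c(0, 5),
  "alpha.prior"    = c(0, 1),
  # One vector per arm: control first, intervention second,
  # each with one entry per missingness pattern (4 assumed here).
  "patterns.prior" = list(rep(1, 4), rep(1, 4))
)

accepted_names <- c("mean.prior.e", "mean.prior.c",
                    "sigma.prior.e", "sigma.prior.c",
                    "alpha.prior", "beta.prior")

# Hypothetical validator: n_patterns gives the number of patterns in the
# control and intervention arm, in that order.
check_prior <- function(prior, n_patterns) {
  for (nm in names(prior)) {
    value <- prior[[nm]]
    if (nm == "patterns.prior") {
      ok <- is.list(value) && length(value) == 2 &&
        all(lengths(value) == n_patterns)
    } else {
      ok <- nm %in% accepted_names && is.numeric(value) && length(value) == 2
    }
    if (!ok) stop("invalid prior element: ", nm)
  }
  invisible(TRUE)
}

check_prior(my_prior, n_patterns = c(4, 4))
```

A list of this shape would then be passed through the prior argument of pattern; elements that are left out keep their default hyperprior values.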
For each model, random effects can also be specified for each parameter by adding the term + (x | z) to each model formula,
where x is the fixed regression coefficient for which the random effects are also desired and z is the clustering variable across which
the random effects are specified (must be the name of a factor variable in the dataset). Multiple random effects can be specified using the
notation + (x1 + x2 | site) for each covariate that was included in the fixed effects formula. Random intercepts are included by default in the models
if random effects are specified, but they can be removed by adding the term 0 within the random effects formula, e.g. + (0 + x | z).
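The three notations above can be written as ordinary R formula objects, as in the sketch below. The covariates x1 and x2 and the clustering factor site are hypothetical names, and find_bars is a small helper defined here only to show which part of each formula is the random effects term; it is not a missingHE function.

```r
# Random intercept and random slope for x1 across site.
f_slope <- e ~ x1 + x2 + (x1 | site)

# Random effects for several covariates already in the fixed effects part.
f_multi <- e ~ x1 + x2 + (x1 + x2 | site)

# Random slope for x1 only: the 0 removes the default random intercept.
f_no_intercept <- e ~ x1 + x2 + (0 + x1 | site)

# Hypothetical helper: walk the formula and collect every "lhs | rhs" term.
find_bars <- function(expr) {
  if (!is.call(expr)) return(list())
  if (identical(expr[[1]], as.name("|"))) return(list(expr))
  found <- lapply(as.list(expr)[-1], find_bars)
  if (length(found) == 0) return(list())
  do.call(c, found)
}

# Print the random effects term of each formula as text.
for (f in list(f_slope, f_multi, f_no_intercept)) {
  print(vapply(find_bars(f), deparse, character(1)))
}
```

In each case the variable to the right of the bar is the clustering factor, and every variable to its left also appears among the fixed effects.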