
medflex (version 0.6-10)

neModel: Fit a natural effect model

Description

neModel is used to fit a natural effect model on the expanded dataset.

Usage

neModel(
  formula,
  family = gaussian,
  expData,
  xFit,
  se = c("bootstrap", "robust"),
  nBoot = 1000,
  parallel = c("no", "multicore", "snow"),
  ncpus = getOption("boot.ncpus", 1L),
  progress = TRUE,
  ...
)

Value

An object of class "neModel" (which additionally inherits from class "neModelBoot" if the bootstrap is used) consisting of a list of 3 objects:

neModelFit

the fitted natural effect model object (of class "glm") with downwardly biased standard errors

bootRes, vcov

the bootstrap results (of class "boot"; if se = "bootstrap") or the robust variance-covariance matrix (if se = "robust")

terms

the neTerms (internal class) object used. This object is equivalent to the terms object returned by the glm function, but has an additional "vartype" attribute, a list including pointers to the names of the outcome variable (Y), exposure (X), mediator (M), covariates (C) and auxiliary hypothetical variables x and x* (Xexp).

See neModel-methods for methods for neModel objects.

Arguments

formula

a formula object providing a symbolic description of the natural effect model.

family

a description of the error distribution and link function to be used in the model. For glm this can be a character string naming a family function, a family function or the result of a call to a family function. For glm.fit only the third option is supported. (See family for details of family functions.)

expData

the expanded dataset (of class "expData").

xFit

fitted model object representing a model for the exposure (used for inverse treatment (exposure) probability weighting).

se

character string indicating the type of standard errors to be calculated. The default type is based on the bootstrap (see details).

nBoot

number of bootstrap replicates (see R argument of boot).

parallel

(only for bootstrap) The type of parallel operation to be used (if any). If missing, the default is taken from the option "boot.parallel" (and if that is not set, "no").

ncpus

(only for bootstrap) integer: the number of processes to be used in parallel operation; typically one would set this to the number of available CPUs (see details).

progress

(only for bootstrap) logical value indicating whether or not a progress bar should be displayed. Progress bars are automatically disabled for multicore processing.

...

additional arguments (passed to glm).

Bootstrap standard errors

The bootstrap procedure entails refitting all working models on each bootstrap sample, reconstructing the expanded dataset and subsequently refitting the specified natural effect model on this dataset. In order to obtain stable standard errors, the number of bootstrap samples (specified via the nBoot argument) should be relatively large (the default is 1000).

To speed up the bootstrap procedure, parallel processing can be used by specifying the desired type of parallel operation via the parallel argument (for more details, see boot). It is recommended to specify the number of parallel processes (ncpus) explicitly (its default is 1, unless the global option options("boot.ncpus") is set). The function detectCores from the parallel package can be helpful for determining the number of available cores (although this may not always correspond to the number of allowed cores).
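
For instance, a minimal sketch of a parallel bootstrap call (assuming the expanded dataset weightData and the natural effect model formula from the Examples below; object names are illustrative):

ncores <- parallel::detectCores()  # may exceed the number of cores you are allowed to use
weightFitPar <- neModel(UPB ~ att0 * att1 + gender + educ + age, 
                        family = binomial, expData = weightData, 
                        nBoot = 1000, parallel = "multicore", ncpus = ncores)
# note: "multicore" is not available on Windows; use parallel = "snow" there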

Robust standard errors

Robust variance-covariance matrices for the model parameters, based on the sandwich estimator, are calculated using core functions from the sandwich package. Additional details and derivations for the sandwich estimator for natural effect models can be found in the corresponding vignette that can be obtained by the command vignette("sandwich", package = "medflex").

Details

This function is a wrapper for glm, providing unbiased bootstrap (se = "bootstrap", the default) or robust (se = "robust") standard errors for the parameter estimates (see below for more details).

The formula argument needs to be specified in terms of the variables from the expanded dataset (specified in expData) whose corresponding parameters index the direct and indirect effects. Stratum-specific natural effects can be estimated by additionally modeling the relation between the outcome and baseline covariates. If the set of baseline covariates adjusted for in the formula argument is not sufficient to control for confounding (e.g. when fitting a population-average natural effect model), an adequate model for the exposure (conditional on a sufficient set of baseline covariates) should be specified in the xFit argument. Such an exposure model is then used to weight by the reciprocal of the probability (density) of the exposure (i.e. inverse probability weighting) in order to adjust for confounding. Just as for ratio-of-mediator probability weighting (see the paragraph below), this weighting is done internally.
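
As a brief sketch (mirroring the population-average examples below; object names are illustrative):

expFit <- glm(att ~ gender + educ + age, data = UPBdata)  # exposure model given baseline covariates
popFit <- neModel(UPB ~ att0 * att1, family = binomial, 
                  expData = weightData, xFit = expFit)    # weights by the inverse probability of exposure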

Quadratic or higher-order polynomial terms can be included in the formula by making use of the I function or by using the poly function. However, we do not recommend the use of orthogonal polynomials (i.e. using the default argument specification raw = FALSE in poly), as these are not compatible with the neEffdecomp function.
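
For example, a quadratic effect of the hypothetical exposure variable att0 could be specified as follows (a sketch using the expanded dataset impData from the Examples below; impFitQuad is an illustrative name):

impFitQuad <- neModel(UPB ~ att0 + I(att0^2) + att1 + gender + educ + age, 
                      family = binomial, expData = impData, se = "robust")
# equivalently, with raw (non-orthogonal) polynomial terms
impFitQuad <- neModel(UPB ~ poly(att0, 2, raw = TRUE) + att1 + gender + educ + age, 
                      family = binomial, expData = impData, se = "robust")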

In contrast to glm, the expData argument (rather than data argument) requires specification of a data frame that inherits from class "expData", which contains additional information about e.g. the fitted working model, the variable types or terms of this working model and possibly ratio-of-mediator probability weights. The latter are automatically extracted from the expData object and weighting is done internally.

As the default glm standard errors fail to reflect the uncertainty inherent to the working model(s) (i.e. either a model for the mediator or an imputation model for the outcome and possibly a model for the exposure), bootstrap standard errors (using the boot function from the boot package) or robust standard errors are calculated. Bootstrap standard errors are the default; robust standard errors (based on the sandwich estimator) can be requested instead by specifying se = "robust".

References

Lange, T., Vansteelandt, S., & Bekaert, M. (2012). A Simple Unified Approach for Estimating Natural Direct and Indirect Effects. American Journal of Epidemiology, 176(3), 190-195.

Vansteelandt, S., Bekaert, M., & Lange, T. (2012). Imputation Strategies for the Estimation of Natural Direct and Indirect Effects. Epidemiologic Methods, 1(1), Article 7.

Loeys, T., Moerkerke, B., De Smet, O., Buysse, A., Steen, J., & Vansteelandt, S. (2013). Flexible Mediation Analysis in the Presence of Nonlinear Relations: Beyond the Mediation Formula. Multivariate Behavioral Research, 48(6), 871-894.

See Also

neModel-methods, plot.neModel, neImpute, neWeight, neLht, neEffdecomp

Examples

data(UPBdata)

##############################
## weighting-based approach ##
##############################
weightData <- neWeight(negaff ~ att + gender + educ + age, 
                       data = UPBdata)

## stratum-specific natural effects
# bootstrap SE
if (FALSE) {
weightFit1b <- neModel(UPB ~ att0 * att1 + gender + educ + age, 
                       family = binomial, expData = weightData)
summary(weightFit1b)
}
# robust SE
weightFit1r <- neModel(UPB ~ att0 * att1 + gender + educ + age, 
                       family = binomial, expData = weightData, se = "robust")
summary(weightFit1r)

## population-average natural effects
expFit <- glm(att ~ gender + educ + age, data = UPBdata)
# bootstrap SE
if (FALSE) {
weightFit2b <- neModel(UPB ~ att0 * att1, family = binomial, 
                       expData = weightData, xFit = expFit)
summary(weightFit2b)
}
# robust SE
weightFit2r <- neModel(UPB ~ att0 * att1, family = binomial, 
                       expData = weightData, xFit = expFit, se = "robust")
summary(weightFit2r)

###############################
## imputation-based approach ##
###############################
impData <- neImpute(UPB ~ att * negaff + gender + educ + age, 
                    family = binomial, data = UPBdata)

## stratum-specific natural effects
# bootstrap SE
if (FALSE) {
impFit1b <- neModel(UPB ~ att0 * att1 + gender + educ + age, 
                    family = binomial, expData = impData)
summary(impFit1b)
}
# robust SE
impFit1r <- neModel(UPB ~ att0 * att1 + gender + educ + age, 
                    family = binomial, expData = impData, se = "robust")
summary(impFit1r)

## population-average natural effects
# bootstrap SE
if (FALSE) {
impFit2b <- neModel(UPB ~ att0 * att1, family = binomial, 
                    expData = impData, xFit = expFit)
summary(impFit2b)
}
# robust SE
impFit2r <- neModel(UPB ~ att0 * att1, family = binomial, 
                    expData = impData, xFit = expFit, se = "robust")
summary(impFit2r)

# check with vgam (VGAM package)
# library(VGAM)
# weightData <- neWeight(negaff ~ att + gender + educ + age, family = uninormal, data = UPBdata, FUN = vgam)
# impData <- neImpute(UPB ~ att + negaff + gender + educ + age, family = binomialff, data = UPBdata, FUN = vgam)
# weightFit <- neModel(UPB ~ att0 + att1 + gender + educ + age, family = binomial, expData = weightData, nBoot = 2)
# impFit <- neModel(UPB ~ att0 + att1 + gender + educ + age, family = binomial, expData = impData, nBoot = 2)
# summary(weightFit)
# summary(impFit)

# warning!
impFit <- neModel(UPB ~ att0 * att1 + gender + educ + age, 
                  family = binomial, expData = impData, nBoot = 2)

# inverse propensity score weighting
expFit <- glm(att ~ gender + educ + age, data = UPBdata)
impFit <- neModel(UPB ~ att0 + att1, family = binomial, 
                  expData = impData, xFit = expFit, nBoot = 2)
