ipwm: Weighting for Confounding and Joint Misclassification of Exposure and Outcome

Description

ipwm implements a method for estimating the marginal causal odds ratio by constructing weights (modified inverse probability weights) that address both confounding and joint misclassification of exposure and outcome.

Usage

ipwm(
  formulas,
  data,
  outcome_true,
  outcome_mis = NULL,
  exposure_true,
  exposure_mis = NULL,
  nboot = 1000,
  conf_level = 0.95,
  fix_nNAs = FALSE,
  semiparametric = FALSE,
  optim_args = list(method = "BFGS"),
  force_optim = FALSE,
  sp = Inf,
  print = TRUE
)

Arguments

formulas

a list of objects of class formula specifying the probability models for the terms of some factorisation of the joint conditional probability function of exposure_true, exposure_mis, outcome_true and outcome_mis, given covariates

data

data.frame containing the variables named by exposure_true, exposure_mis, outcome_true and outcome_mis, together with the covariates. Missing values (NAs) are allowed in the variables named by exposure_true and outcome_true.

outcome_true

a character string specifying the name of the true outcome variable that is free of misclassification but possibly unknown (NA) for some (but not all) subjects

outcome_mis

a character string specifying the name of the counterpart of outcome_true that is available on all subjects but potentially misclassifies subjects' outcomes. The default (outcome_mis = NULL) indicates absence of outcome misclassification

exposure_true

a character string specifying the name of the true exposure variable that is free of misclassification but possibly unknown (NA) for some (but not all) subjects

exposure_mis

a character string specifying the name of the counterpart of exposure_true that is available on all subjects but potentially misclassifies subjects as exposed or as non-exposed. The default (exposure_mis = NULL) indicates absence of exposure misclassification

nboot

number of bootstrap samples. Setting nboot = 0 results in point estimation only.

conf_level

the desired confidence level of the confidence interval

fix_nNAs

logical indicator specifying whether or not to fix the joint distribution of is.na(exposure_true) and is.na(outcome_true). If TRUE, stratified bootstrap sampling is done according to the missing data pattern.

semiparametric

logical indicator specifying whether or not to parametrically sample exposure_true, exposure_mis, outcome_true and outcome_mis. If semiparametric == TRUE, it is assumed that the missing data pattern is conditionally independent of these variables given covariates. Provided nboot > 0, the missing data pattern and covariates are sampled nonparametrically. semiparametric is ignored if nboot == 0.

optim_args

a list of arguments passed on to optim if it is called. See Details below for more information.

force_optim

logical indicator specifying whether or not to force the optim function to be called

sp

scalar shrinkage parameter in the interval (0, Inf). Values closer to zero result in greater shrinkage of the estimated odds ratio to unity; sp = Inf results in no shrinkage.

print

logical indicator specifying whether or not to print the output.
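
The stratified resampling described under fix_nNAs can be illustrated in base R. The following is only a sketch of the general idea, not the code ipwm runs internally; the toy data frame and the names pattern, idx and boot are hypothetical.

```r
set.seed(1)

# Hypothetical toy data: A (exposure) and Y (outcome) are partially observed
toy <- data.frame(
  A = c(1, NA, 0, 1, NA, 0, 1, 0),
  Y = c(0, 1, NA, 1, NA, 0, NA, 1)
)

# One stratum per observed combination of is.na(A) and is.na(Y)
pattern <- interaction(is.na(toy$A), is.na(toy$Y), drop = TRUE)

# Resample row indices with replacement within each stratum
idx <- unlist(lapply(split(seq_len(nrow(toy)), pattern), function(i) {
  i[sample.int(length(i), replace = TRUE)]
}))
boot <- toy[idx, ]

# The joint distribution of the missingness indicators is unchanged
table(is.na(boot$A), is.na(boot$Y))
```

Because every stratum is resampled to its original size, each bootstrap sample has the same number of subjects with each missing data pattern as the original data.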

Value

ipwm returns an object of class ipwm. The returned object is a list containing the following elements:

logOR

the estimated log odds ratio;

call

the matched function call.

If nboot != 0, the list also contains

a bootstrap estimate of the standard error for the estimator of the log odds ratio;

a bootstrap percentile confidence interval for the log odds ratio.
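
As an illustration of what a bootstrap standard error and percentile confidence interval are, the sketch below computes both from a vector of bootstrap replicates. The replicates are simulated and the names boot_logOR, boot_se and boot_ci are hypothetical; the exact computation inside ipwm (for example, the quantile type) may differ.

```r
set.seed(42)

# Hypothetical bootstrap replicates of a log odds ratio. They are simulated
# here; in practice each one comes from re-estimating on a bootstrap sample.
boot_logOR <- rnorm(1000, mean = 0.5, sd = 0.2)

conf_level <- 0.95
alpha <- 1 - conf_level

# Bootstrap standard error: standard deviation of the replicates
boot_se <- sd(boot_logOR)

# Percentile interval: empirical quantiles of the replicates
boot_ci <- unname(quantile(boot_logOR, probs = c(alpha / 2, 1 - alpha / 2)))

# The same interval on the odds ratio scale
exp(boot_ci)
```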

Details

This function is an implementation of the weighting method described by Penning de Vries et al. (2020). The method defaults to the estimator proposed by Gravel and Platt (2018) in the absence of exposure misclassification.

The function assumes that the exposure or the outcome has a misclassified version. An error is issued when both outcome_mis and exposure_mis are set to NULL.

Provided force_optim = FALSE, ipwm is considerably more efficient when the optim function is not invoked; i.e., when (1) exposure_mis = NULL and the formula for outcome_true does not contain terms involving outcome_mis or exposure_true, (2) outcome_mis = NULL and the formula for exposure_true does not contain terms involving exposure_mis or outcome_true, or (3) all(is.na(data[, exposure_true]) == is.na(data[, outcome_true])) and the formulas for exposure_true and outcome_true do not contain terms involving exposure_mis or outcome_mis. In these cases, ipwm uses iteratively reweighted least squares via the glm function for maximum likelihood estimation. In all other cases, optim_args is passed on to optim for optimisation of the joint likelihood of outcome_true, outcome_mis, exposure_true and exposure_mis.
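
Condition (3) can be checked by hand before calling ipwm. The sketch below mirrors the condition as documented above and is not the internal check; the toy data frame, the formulas f_Y and f_A and the helper names are hypothetical.

```r
# Hypothetical toy data: A and Y are missing for exactly the same subjects;
# B and Z play the role of the misclassified counterparts
toy <- data.frame(
  A  = c(1, NA, 0, NA),
  Y  = c(0, NA, 1, NA),
  B  = c(1, 0, 0, 1),
  Z  = c(0, 1, 1, 0),
  L1 = c(0.2, -1.1, 0.7, 0.3)
)
exposure_true <- "A"; outcome_true <- "Y"
exposure_mis  <- "B"; outcome_mis  <- "Z"

# First part of condition (3): identical missingness on A and Y
same_pattern <- all(is.na(toy[, exposure_true]) == is.na(toy[, outcome_true]))

# Second part: neither formula has B or Z on its right-hand side
f_Y <- Y ~ A + L1
f_A <- A ~ L1
rhs_vars <- c(all.vars(f_Y[[3]]), all.vars(f_A[[3]]))
uses_mis <- any(c(exposure_mis, outcome_mis) %in% rhs_vars)

same_pattern && !uses_mis  # TRUE: condition (3) holds for this toy setup
```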

References

Gravel, C. A., & Platt, R. W. (2018). Weighted estimation for confounded binary outcomes subject to misclassification. Statistics in Medicine, 37(3), 425-436. https://doi.org/10.1002/sim.7522

Penning de Vries, B. B. L., van Smeden, M., & Groenwold, R. H. H. (2020). A weighting method for simultaneous adjustment for confounding and joint exposure-outcome misclassifications. Statistical Methods in Medical Research, 0(0), 1-15. https://doi.org/10.1177/0962280220960172

Examples

data(sim) # simulated data on 10 covariates, exposure A and outcome Y.
# B and Z are the potentially misclassified counterparts of A and Y.
formulas <- list(
  Y ~ A + L1 + L2 + L3 + L4 + L5 + L6 + L7 + L8 + L9 + L10 + B + Z,
  A ~ L1 + L2 + L3 + L4 + L5 + L6 + L7 + L8 + L9 + L10 + B + Z,
  Z ~ L1 + L2 + L3 + L4 + L5 + L6 + L7 + L8 + L9 + L10 + B,
  B ~ L1 + L2 + L3 + L4 + L5 + L6 + L7 + L8 + L9 + L10
)
ipwm_out <- ipwm(
  formulas = formulas,
  data = sim,
  outcome_true = "Y",
  outcome_mis = "Z",
  exposure_true = "A",
  exposure_mis = "B",
  nboot = 200,
  sp = 1e6
)
ipwm_out