This page explains the details of estimating weights from SuperLearner-based propensity scores by setting method = "super" in the call to weightit or weightitMSM. This method can be used with binary, multinomial, and continuous treatments.
In general, this method relies on estimating propensity scores using the SuperLearner algorithm for stacking predictions and then converting those propensity scores into weights using a formula that depends on the desired estimand. For binary and multinomial treatments, one or more binary classification algorithms are used to estimate the propensity scores as the predicted probability of being in each treatment given the covariates. For continuous treatments, a regression algorithm is used to estimate generalized propensity scores as the conditional density of treatment given the covariates.
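As a rough illustration of the workflow described above, the following sketch calls weightit with method = "super" on simulated data. The data and variable names are hypothetical, and the SL.library argument (which selects the candidate learners) is not described on this page; it is assumed here from the SuperLearner interface. The call is guarded so that it only runs when both packages are installed.

```r
# Simulated data; all variable names here are hypothetical.
set.seed(123)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rbinom(n, 1, 0.4))
d$treat <- rbinom(n, 1, plogis(-0.5 + 0.8 * d$x1 + 0.5 * d$x2))

if (requireNamespace("WeightIt", quietly = TRUE) &&
    requireNamespace("SuperLearner", quietly = TRUE)) {
  library(SuperLearner)  # attached so the SL.* wrappers can be found

  # SL.library is an assumption of this sketch: two simple learners
  # (logistic regression and the marginal mean) are stacked.
  W <- WeightIt::weightit(treat ~ x1 + x2, data = d,
                          method = "super", estimand = "ATT",
                          SL.library = c("SL.glm", "SL.mean"))
  summary(W$weights)
}
```

A richer library of learners would normally be used in practice; the two above are chosen only to keep the sketch fast.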
Binary Treatments
For binary treatments, this method estimates the propensity scores using SuperLearner in the SuperLearner package. The following estimands are allowed: ATE, ATT, ATC, ATO, and ATM. The weights for the ATE, ATT, and ATC are computed from the estimated propensity scores using the standard formulas, the weights for the ATO are computed as in Li & Li (2018), and the weights for the ATM (i.e., average treatment effect in the equivalent sample "pair-matched" with calipers) are computed as in Yoshida et al. (2017). Weights can also be computed using marginal mean weighting through stratification for the ATE, ATT, and ATC. See get_w_from_ps for details.
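To make the conversion from propensity scores to weights concrete, here is a minimal base R sketch of the commonly used binary-treatment formulas. The helper ps_to_weights is hypothetical and is not part of WeightIt. The ATO and ATM lines use the usual overlap-weight and matching-weight forms, which is an assumption about what the cited papers reduce to in the binary case.

```r
# Hypothetical helper: converts propensity scores to weights for a binary
# treatment (a = 1 treated, a = 0 control; ps = P(A = 1 | X)).
ps_to_weights <- function(ps, a, estimand = "ATE") {
  switch(estimand,
    ATE = a / ps + (1 - a) / (1 - ps),
    ATT = a + (1 - a) * ps / (1 - ps),
    ATC = a * (1 - ps) / ps + (1 - a),
    ATO = a * (1 - ps) + (1 - a) * ps,
    ATM = pmin(ps, 1 - ps) / (a * ps + (1 - a) * (1 - ps)),
    stop("unknown estimand: ", estimand)
  )
}

ps <- c(0.2, 0.5, 0.8, 0.4)
a  <- c(1,   0,   1,   0)

ps_to_weights(ps, a, "ATE")
ps_to_weights(ps, a, "ATT")
ps_to_weights(ps, a, "ATO")
```

Under the ATT, treated units keep a weight of 1 and control units are weighted by their odds of treatment, which is why only the control weights vary in that case.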
Multinomial Treatments
For multinomial treatments, the propensity scores are estimated using several calls to SuperLearner, one for each treatment group, and the treatment probabilities are normalized to sum to 1. The following estimands are allowed: ATE, ATT, ATO, and ATM. The weights for each estimand are computed using the standard formulas or those mentioned above. Weights can also be computed using marginal mean weighting through stratification for the ATE, ATT, and ATC. See get_w_from_ps for details.
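The one-model-per-group idea and the normalization step can be sketched in base R as follows. Plain logistic regression stands in for SuperLearner here, and the data are simulated, so this is only an illustration of the structure and not the package's internal code.

```r
set.seed(1)
n <- 300
x <- rnorm(n)
treat <- factor(sample(c("a", "b", "c"), n, replace = TRUE))

# One binary model per treatment level (logistic regression stands in
# for SuperLearner; that substitution is an assumption of this sketch).
p_raw <- sapply(levels(treat), function(g) {
  fit <- glm(as.integer(treat == g) ~ x, family = binomial)
  fitted(fit)
})

# Normalize each row so the treatment probabilities sum to 1.
p <- p_raw / rowSums(p_raw)

# ATE weight: inverse of the probability of the treatment actually received.
w_ate <- 1 / p[cbind(seq_len(n), as.integer(treat))]
summary(w_ate)
```

Because the separate one-vs-rest models are fit independently, their predictions need not sum to 1 for a given unit, which is why the normalization is applied before the weights are computed.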
Continuous Treatments
For continuous treatments, the generalized propensity score is estimated using SuperLearner. In addition, kernel density estimation can be used instead of assuming a normal density for the numerator and denominator of the generalized propensity score by setting use.kernel = TRUE. Other arguments to density can be specified to refine the density estimation parameters. plot = TRUE can be specified to plot the density for the numerator and denominator, which can be helpful in diagnosing extreme weights.
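The sketch below illustrates the ratio-of-densities idea in base R, first with normal densities and then with kernel density estimates from density. A linear model stands in for SuperLearner, the data are simulated, and the helper kde_at is hypothetical; the package may standardize or evaluate the densities differently.

```r
set.seed(42)
n <- 500
x <- rnorm(n)
a <- 0.7 * x + rnorm(n)   # continuous treatment

# Conditional mean of treatment (lm stands in for SuperLearner here).
fit <- lm(a ~ x)
r   <- residuals(fit)

# Normal densities: marginal (numerator) and conditional (denominator).
num <- dnorm(a, mean = mean(a), sd = sd(a))
den <- dnorm(r, mean = 0, sd = sd(r))
w_normal <- num / den

# Kernel alternative: estimate each density with density() and evaluate
# it at the observed points by linear interpolation.
kde_at <- function(z) {
  d <- density(z)
  approx(d$x, d$y, xout = z)$y
}
w_kernel <- kde_at(a) / kde_at(r)

summary(w_normal)
summary(w_kernel)
```

Units whose treatment value is far from what the covariates predict have a small denominator and therefore a large weight, which is the situation the density plot is meant to help diagnose.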
Longitudinal Treatments
For longitudinal treatments, the weights are the product of the weights estimated at each time point.
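As a small worked example with made-up numbers, the product across time points can be computed row by row:

```r
# Hypothetical weights for four units at three time points (rows = units).
w_by_time <- cbind(t1 = c(1.2, 0.8, 1.5, 1.0),
                   t2 = c(0.9, 1.1, 2.0, 1.0),
                   t3 = c(1.0, 1.3, 0.5, 1.0))

# Final weight for each unit: product of its time-specific weights.
w_final <- apply(w_by_time, 1, prod)
w_final
```

Because the weights multiply, a moderately large weight at each of several time points can compound into an extreme final weight.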
Sampling Weights
Sampling weights are supported through s.weights in all scenarios.
Missing Data
In the presence of missing data, the following value(s) for missing are allowed:
"ind" (default)
First, for each variable with missingness, a new missingness indicator variable is created which takes the value 1 if the original covariate is NA and 0 otherwise. The missingness indicators are added to the model formula as main effects. The missing values in the covariates are then replaced with 0s. The weight estimation then proceeds with this new formula and set of covariates. The covariates output in the resulting weightit object will be the original covariates with the NAs.
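The missingness-indicator steps described above can be sketched in base R as follows. The data frame and the "_NA" suffix for the indicator columns are hypothetical choices made for this illustration and may not match the names used internally.

```r
covs <- data.frame(age    = c(30, NA, 45, 52),
                   income = c(NA, 40, 55, NA),
                   educ   = c(12, 16, 14, 10))

covs_ind <- covs
for (v in names(covs)) {
  if (anyNA(covs[[v]])) {
    # Indicator: 1 where the original value is NA, 0 otherwise.
    covs_ind[[paste0(v, "_NA")]] <- as.integer(is.na(covs[[v]]))
    # Replace the missing values in the covariate with 0.
    covs_ind[[v]][is.na(covs[[v]])] <- 0
  }
}

covs_ind   # used for estimation; the original covs keeps its NAs
```

Only age and income gain an indicator column here, because educ has no missing values.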