This page explains the details of estimating weights from generalized linear model-based propensity scores by setting method = "ps" in the call to weightit or weightitMSM. This method can be used with binary, multinomial, and continuous treatments.
In general, this method relies on estimating propensity scores with a parametric generalized linear model and then converting those propensity scores into weights using a formula that depends on the desired estimand. For binary and multinomial treatments, a binomial or multinomial regression model is used to estimate the propensity scores as the predicted probability of being in each treatment given the covariates. For continuous treatments, a generalized linear model is used to estimate generalized propensity scores as the conditional density of treatment given the covariates.
For binary treatments, this method estimates the propensity scores using glm. An additional argument is link, which uses the same options as link in family. The default link is "logit", but others, including "probit", are allowed. The following estimands are allowed: ATE, ATT, ATC, ATO, and ATM. The weights for the ATE, ATT, and ATC are computed from the estimated propensity scores using the standard formulas, the weights for the ATO are computed as in Li & Li (2018), and the weights for the ATM (i.e., average treatment effect in the equivalent sample "pair-matched" with calipers) are computed as in Yoshida et al. (2017). When include.obj = TRUE, the returned object is the glm fit.
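The conversion from propensity scores to weights can be sketched in base R. This is a minimal illustration on simulated data, not the package's internal code: the variable names and the helper wdiff are hypothetical, and the formulas are the commonly published ones for each estimand.

```r
# Simulated data; all variable names here are hypothetical.
set.seed(1)
n <- 500
x1 <- rnorm(n)
x2 <- rbinom(n, 1, 0.4)
treat <- rbinom(n, 1, plogis(-0.5 + 0.8 * x1 + 0.5 * x2))

# Propensity scores from a logistic regression (link = "logit")
fit <- glm(treat ~ x1 + x2, family = binomial(link = "logit"))
ps <- fitted(fit)

# Commonly used weighting formulas for each estimand
w_ate <- ifelse(treat == 1, 1 / ps, 1 / (1 - ps))
w_att <- ifelse(treat == 1, 1, ps / (1 - ps))
w_atc <- ifelse(treat == 1, (1 - ps) / ps, 1)
w_ato <- ifelse(treat == 1, 1 - ps, ps)                      # overlap weights
w_atm <- pmin(ps, 1 - ps) / ifelse(treat == 1, ps, 1 - ps)   # matching weights

# Weighted difference in covariate means as a quick balance check
wdiff <- function(x, w) {
  weighted.mean(x[treat == 1], w[treat == 1]) -
    weighted.mean(x[treat == 0], w[treat == 0])
}
sapply(list(ATE = w_ate, ATT = w_att, ATC = w_atc, ATO = w_ato, ATM = w_atm),
       function(w) wdiff(x1, w))
```

With a logit link, the overlap (ATO) weights balance the means of the covariates in the model exactly, so the ATO entry of the balance check should be zero up to numerical error.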
For multinomial treatments, the propensity scores are estimated using multinomial regression from one of a few functions depending on the requested link: for logit ("logit") and probit ("probit") links, mlogit from the mlogit package is used; for the Bayesian probit ("bayes.probit") link, mnp from the MNP package is used; and for the bias-reduced multinomial logistic regression ("br.logit"), brmultinom from the brglm2 package is used. If the treatment variable is an ordered factor, polr from the MASS package is used to fit ordinal regression. Any of the methods allowed in the method argument of polr can be supplied to link. The following estimands are allowed: ATE, ATT, ATC, ATO, and ATM. The weights for each estimand are computed using the standard formulas or those mentioned above. When include.obj = TRUE, the returned object is the fit object from the fitting function used.
For continuous treatments, the generalized propensity score is estimated using linear regression. In addition, kernel density estimation can be used instead of assuming a normal density for the numerator and denominator of the generalized propensity score by setting use.kernel = TRUE. Other arguments to density can be specified to refine the density estimation parameters. plot = TRUE can be specified to plot the density for the numerator and denominator, which can be helpful in diagnosing extreme weights. When include.obj = TRUE, the returned object is the glm fit from the denominator model.
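The numerator and denominator densities can be illustrated with a base R sketch on simulated data. The names below are hypothetical, and the choice of the residual standard deviation as the scale of the normal density is an assumption made for illustration; the package may compute it differently.

```r
# Simulated data; all names are hypothetical.
set.seed(2)
n <- 500
x <- rnorm(n)
a <- 1 + 0.7 * x + rnorm(n)  # continuous treatment

# Denominator model: treatment given covariates
den_fit <- lm(a ~ x)
den_res <- residuals(den_fit)
num_res <- a - mean(a)       # numerator: marginal (intercept-only) model

# Normal densities (assumed here: sd of the residuals as the scale)
w_normal <- dnorm(num_res, 0, sd(num_res)) / dnorm(den_res, 0, sd(den_res))

# Kernel alternative: estimate each residual density with density()
# and evaluate it at the observed residuals by linear interpolation
kernel_dens <- function(r) {
  d <- density(r, n = 10 * length(r))
  approx(d$x, d$y, xout = r)$y
}
w_kernel <- kernel_dens(num_res) / kernel_dens(den_res)

# Weighted treatment-covariate correlation as a rough balance check
wcor <- function(w) cov.wt(cbind(a, x), wt = w / sum(w), cor = TRUE)$cor[1, 2]
c(unweighted = cor(a, x), normal = wcor(w_normal), kernel = wcor(w_kernel))
```

Comparing the two sets of weights, for example with summary(w_normal) and summary(w_kernel), is one way to see whether the normality assumption is driving extreme weights.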
For longitudinal treatments, the weights are the product of the weights estimated at each time point.
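As a rough illustration of the product rule, the sketch below simulates a binary treatment at two time points, estimates ATE weights at each, and multiplies them. The data, variable names, and the helper ate_weights are hypothetical and stand in for what weightitMSM would do with a list of formulas.

```r
# Simulated two-period data; all names are hypothetical.
set.seed(3)
n <- 400
x1 <- rnorm(n)
a1 <- rbinom(n, 1, plogis(0.5 * x1))
x2 <- 0.5 * x1 + 0.3 * a1 + rnorm(n)
a2 <- rbinom(n, 1, plogis(0.5 * x2 + 0.4 * a1))

# ATE weights from a vector of propensity scores
ate_weights <- function(treat, ps) ifelse(treat == 1, 1 / ps, 1 / (1 - ps))

# Each model conditions on the covariate and treatment history so far
ps1 <- fitted(glm(a1 ~ x1, family = binomial))
ps2 <- fitted(glm(a2 ~ x1 + a1 + x2, family = binomial))

w1 <- ate_weights(a1, ps1)
w2 <- ate_weights(a2, ps2)

# Final weight for each unit: product over time points
w_msm <- w1 * w2
summary(w_msm)
```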
Sampling weights are supported through s.weights in all scenarios except for multinomial treatments with link = "bayes.probit". In the supported scenarios, warning messages about non-integer successes may appear; these can be ignored.
Missing data is not directly compatible with estimating propensity scores, so a few extra things happen when NAs are present in the covariates. First, for each variable with missingness, a new missingness indicator variable is created which takes the value 1 if the original covariate is NA and 0 otherwise. The missingness indicators are added to the model formula as main effects. The missing values in the covariates are then replaced with 0s (this value is arbitrary and does not affect estimation). The weight estimation then proceeds with this new formula and set of covariates. The covariates output in the resulting weightit object will be the original covariates with the NAs.
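The missingness-indicator approach can be sketched in base R as follows. The data and the helper fit_with_fill are hypothetical; the point of the sketch is that, because the indicator enters as a main effect, the fill value does not change the fitted propensity scores.

```r
# Simulated data with missing values in one covariate; names are hypothetical.
set.seed(4)
n <- 300
x1 <- rnorm(n)
x2 <- rnorm(n)
treat <- rbinom(n, 1, plogis(0.6 * x1 - 0.4 * x2))
x2[sample(n, 40)] <- NA

# Fit the propensity score model after replacing NAs with `fill`
# and adding a 0/1 missingness indicator as a main effect
fit_with_fill <- function(fill) {
  x2_miss <- as.numeric(is.na(x2))
  x2_fill <- ifelse(is.na(x2), fill, x2)
  fitted(glm(treat ~ x1 + x2_fill + x2_miss, family = binomial))
}

ps_zero  <- fit_with_fill(0)   # the replacement value described above
ps_other <- fit_with_fill(10)  # a different, equally arbitrary value

# Largest difference between the two sets of propensity scores
max(abs(ps_zero - ps_other))
```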
The following additional arguments can be specified:
link
The link used in the generalized linear model for the propensity scores. For binary treatments, link can be any of those allowed by binomial. A br. prefix can be added (e.g., "br.logit"); this changes the fitting method to the bias-corrected generalized linear models implemented in the brglm2 package. For multinomial treatments, link can be "logit", "probit", "bayes.probit", or "br.logit". For ordered treatments, link can be any of those allowed by the method argument of polr. For continuous treatments, link can be any of those allowed by gaussian.
use.kernel
If TRUE, uses kernel density estimation through density to estimate the numerator and denominator densities for the weights with continuous treatments. If FALSE, assumes a normal distribution.
bw, adjust, kernel, n
If use.kernel = TRUE with continuous treatments, these arguments are passed to density. The defaults are the same as those in density except that n is 10 times the number of units in the sample.
plot
If use.kernel = TRUE with continuous treatments, whether to plot the estimated density.
For binary treatments, additional arguments to glm can be specified as well. The method argument in glm is renamed to glm.method. This can be used to supply alternative fitting functions, such as those implemented in the glm2 package. Other arguments to weightit are passed to ... in glm.
Binary treatments
- estimand = "ATO"
Li, F., Morgan, K. L., & Zaslavsky, A. M. (2018). Balancing covariates via propensity score weighting. Journal of the American Statistical Association, 113(521), 390–400. 10.1080/01621459.2016.1260466
- estimand = "ATM"
Li, L., & Greene, T. (2013). A Weighting Analogue to Pair Matching in Propensity Score Analysis. The International Journal of Biostatistics, 9(2). 10.1515/ijb-2012-0030
- Other estimands
Austin, P. C. (2011). An Introduction to Propensity Score Methods for Reducing the Effects of Confounding in Observational Studies. Multivariate Behavioral Research, 46(3), 399–424. 10.1080/00273171.2011.568786
Multinomial treatments
- estimand = "ATO"
Li, F., & Li, F. (2018). Propensity Score Weighting for Causal Inference with Multi-valued Treatments. ArXiv:1808.05339 [Stat]. Retrieved from http://arxiv.org/abs/1808.05339
- estimand = "ATM"
Yoshida, K., Hernández-Díaz, S., Solomon, D. H., Jackson, J. W., Gagne, J. J., Glynn, R. J., & Franklin, J. M. (2017). Matching weights to simultaneously compare three treatment groups: Comparison to three-way matching. Epidemiology (Cambridge, Mass.), 28(3), 387–395. 10.1097/EDE.0000000000000627
- Other estimands
McCaffrey, D. F., Griffin, B. A., Almirall, D., Slaughter, M. E., Ramchand, R., & Burgette, L. F. (2013). A Tutorial on Propensity Score Estimation for Multiple Treatments Using Generalized Boosted Models. Statistics in Medicine, 32(19), 3388–3414. 10.1002/sim.5753
Continuous treatments
- method = "ps"
Robins, J. M., Hernán, M. Á., & Brumback, B. (2000). Marginal Structural Models and Causal Inference in Epidemiology. Epidemiology, 11(5), 550–560.
# NOT RUN {
library("WeightIt")
library("cobalt")
data("lalonde", package = "cobalt")

#Balancing covariates between treatment groups (binary)
(W1 <- weightit(treat ~ age + educ + married +
                  nodegree + re74, data = lalonde,
                method = "ps", estimand = "ATT",
                link = "probit"))
summary(W1)
bal.tab(W1)

#Balancing covariates with respect to race (multinomial)
(W2 <- weightit(race ~ age + educ + married +
                  nodegree + re74, data = lalonde,
                method = "ps", estimand = "ATE"))
summary(W2)
bal.tab(W2)

#Balancing covariates with respect to re75 (continuous)
(W3 <- weightit(re75 ~ age + educ + married +
                  nodegree + re74, data = lalonde,
                method = "ps", use.kernel = TRUE))
summary(W3)
bal.tab(W3)
# }