
CBPS (version 0.23)

CBMSM: Covariate Balancing Propensity Score (CBPS) for Marginal Structural Models

Description

CBMSM estimates propensity scores such that both covariate balance and prediction of treatment assignment are maximized. With longitudinal data, the method returns marginal structural model weights that can be entered directly into a linear model. The method also handles multiple binary treatments administered concurrently.

Usage

CBMSM(
  formula,
  id,
  time,
  data,
  type = "MSM",
  twostep = TRUE,
  msm.variance = "approx",
  time.vary = FALSE,
  init = "opt",
  ...
)

Arguments

formula

A formula of the form treat ~ X. The same covariates are used in each time period. At default values, a single set of coefficients is estimated across all time periods. To allow a different set of coefficients for each time period, set time.vary = TRUE. Data should be sorted by time.

id

A vector which identifies the unit associated with each row of treat and X.

time

A vector which identifies the time period associated with each row of treat and X. All data should be sorted by time.

data

An optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which CBMSM is called. Data should be sorted by time.

type

"MSM" for a marginal structural model, with multiple time periods or "MultiBin" for multiple binary treatments at the same time period.

twostep

Default is TRUE, which uses a two-step estimator that runs substantially faster than continuous updating. Set to FALSE to use the continuous-updating estimator described by Imai and Ratkovic (2014).

msm.variance

Default is "approx", which uses the low-rank approximation of the variance described in Imai and Ratkovic (2014). Set to "full" to use the full variance matrix.

time.vary

Default is FALSE, which uses the same coefficients across all time periods. Set to TRUE to fit a separate set of coefficients for each time period.

init

Default is "opt", which tries both CBPS and logistic regression starting values and keeps whichever achieves the best balance. Other options are "glm" and "CBPS".

...

Other parameters to be passed through to optim()
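
The formula, time and data arguments all ask for data sorted by time. A minimal base R sketch of that preparation step follows; the data frame panel and its columns unit, period, treat and x are hypothetical names used only for illustration, and sorting by unit within each period is an assumption about a convenient layout rather than a documented requirement.

```r
# Hypothetical long-format panel: 3 units observed over 2 periods, rows shuffled.
panel <- data.frame(
  unit   = c("b", "a", "c", "a", "c", "b"),
  period = c(2, 1, 2, 2, 1, 1),
  treat  = c(1, 0, 0, 1, 1, 0),
  x      = c(0.3, -1.2, 0.8, 0.1, 0.5, -0.4)
)

# Sort by time period first, then by unit within each period.
panel_sorted <- panel[order(panel$period, panel$unit), ]
rownames(panel_sorted) <- NULL

# Sanity checks: time is non-decreasing and each unit appears once per period.
stopifnot(!is.unsorted(panel_sorted$period))
stopifnot(all(table(panel_sorted$unit, panel_sorted$period) == 1))

panel_sorted
```

With the data in this order, panel_sorted$unit and panel_sorted$period would be the natural candidates for the id and time arguments.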

Value

weights

The optimal weights.

fitted.values

The fitted propensity score for each observation.

y

The treatment vector used.

x

The covariate matrix.

id

The vector id used in CBMSM.fit.

time

The vector time used in CBMSM.fit.

model

The model frame.

call

The matched call.

formula

The formula supplied.

data

The data argument.

treat.hist

A matrix of the treatment history, with each observation in rows and time in columns.

treat.cum

A vector of the cumulative treatment history, by individual.
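
To make the shape of treat.hist and treat.cum concrete, the following base R sketch builds analogous objects from a long-format treatment vector. It does not reproduce the package's internal code; the variable names are hypothetical, and it assumes that cumulative treatment means the number of treated periods per unit.

```r
# Hypothetical long-format inputs, already sorted by time and then by unit.
id    <- rep(c("a", "b", "c"), times = 2)
time  <- rep(1:2, each = 3)
treat <- c(0, 0, 1, 1, 0, 1)

# One row per unit, one column per time period.
units   <- unique(id)
periods <- sort(unique(time))
treat_hist <- matrix(
  NA_real_,
  nrow = length(units), ncol = length(periods),
  dimnames = list(units, paste0("t", periods))
)

# Fill each (unit, period) cell using a two-column index matrix.
treat_hist[cbind(match(id, units), match(time, periods))] <- treat

# Assumption: cumulative treatment is the count of treated periods per unit.
treat_cum <- rowSums(treat_hist)

treat_hist
treat_cum
```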

Details

Fits covariate balancing propensity scores for marginal structural models.
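
For orientation, the sketch below shows how marginal structural model weights enter a weighted linear model. It uses conventional inverse-probability weights from a pooled logistic regression (glm), not the covariate balancing estimator that CBMSM implements. The data are simulated and every name in the block is hypothetical.

```r
set.seed(1)
n         <- 200  # number of units
n_periods <- 3    # number of time periods

# Long-format data sorted by time and then by unit.
id    <- rep(seq_len(n), times = n_periods)
time  <- rep(seq_len(n_periods), each = n)
x     <- rnorm(n * n_periods)
treat <- rbinom(n * n_periods, size = 1, prob = plogis(0.5 * x))

# Pooled logistic regression for the propensity score (the swapped-in step;
# CBMSM would estimate these scores with balance conditions instead).
ps_fit <- glm(treat ~ x, family = binomial())
p_hat  <- fitted(ps_fit)

# Probability of the treatment each observation actually received.
p_obs <- ifelse(treat == 1, p_hat, 1 - p_hat)

# Unit-level weight: product over periods of the inverse probabilities.
w <- as.numeric(tapply(1 / p_obs, id, prod))

# Treatment history matrix (units in rows, periods in columns) and a
# simulated outcome observed once per unit.
treat_hist <- matrix(treat, nrow = n, ncol = n_periods)
y <- rowSums(treat_hist) + rnorm(n)

# The weights are supplied directly to lm().
msm_fit <- lm(y ~ treat_hist, weights = w)
coef(msm_fit)
```

Swapping w for the weights component of a fitted CBMSM object gives the pattern used in the Examples section.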


References

Imai, Kosuke and Marc Ratkovic. 2014. "Covariate Balancing Propensity Score." Journal of the Royal Statistical Society, Series B (Statistical Methodology). http://imai.princeton.edu/research/CBPS.html

Imai, Kosuke and Marc Ratkovic. 2015. "Robust Estimation of Inverse Probability Weights for Marginal Structural Models." Journal of the American Statistical Association. http://imai.princeton.edu/research/MSM.html

See Also

plot.CBMSM

Examples

# NOT RUN {

##Load Blackwell data

data(Blackwell)

## Quickly fit a short model to test
form0 <- "d.gone.neg ~ d.gone.neg.l1 + camp.length"
fit0<-CBMSM(formula = form0, time=Blackwell$time,id=Blackwell$demName,
			data=Blackwell, type="MSM",  iterations = NULL, twostep = TRUE, 
			msm.variance = "approx", time.vary = FALSE)

# }
# NOT RUN {
## Fitting the models in Imai and Ratkovic (2014)
## Warning: may take a few minutes. Setting time.vary to FALSE
## gives a quicker fit but poorer balance.
## Usually, it is best to use time.vary = TRUE.
form1<-"d.gone.neg ~ d.gone.neg.l1 + d.gone.neg.l2 + d.neg.frac.l3 + 
		camp.length + deminc + base.poll + year.2002 + 
		year.2004 + year.2006 + base.und + office"

## Note that init="glm" gives the published results, but the default is now init="opt"
fit1<-CBMSM(formula = form1, time=Blackwell$time,id=Blackwell$demName,
			data=Blackwell, type="MSM",  iterations = NULL, twostep = TRUE, 
			msm.variance = "full", time.vary = TRUE, init="glm")

fit2<-CBMSM(formula = form1, time=Blackwell$time,id=Blackwell$demName,
			data=Blackwell, type="MSM",  iterations = NULL, twostep = TRUE, 
			msm.variance = "approx", time.vary = TRUE, init="glm")


##Assessing balance

bal1<-balance.CBMSM(fit1)
bal2<-balance.CBMSM(fit2)

##Effect estimation: Replicating Effect Estimates in 
##Table 3 of Imai and Ratkovic (2014)

lm1<-lm(demprcnt[time==1]~fit1$treat.hist,data=Blackwell,
weights=fit1$glm.weights)
lm2<-lm(demprcnt[time==1]~fit1$treat.hist,data=Blackwell,
weights=fit1$weights)
lm3<-lm(demprcnt[time==1]~fit1$treat.hist,data=Blackwell,
weights=fit2$weights)

lm4<-lm(demprcnt[time==1]~fit1$treat.cum,data=Blackwell,
weights=fit1$glm.weights)
lm5<-lm(demprcnt[time==1]~fit1$treat.cum,data=Blackwell,
weights=fit1$weights)
lm6<-lm(demprcnt[time==1]~fit1$treat.cum,data=Blackwell,
weights=fit2$weights)



### Example: Multiple Binary Treatments Administered at the Same Time
n<-200
k<-4
set.seed(1040)
X1<-cbind(1,matrix(rnorm(n*k),ncol=k))

betas.1<-betas.2<-betas.3<-c(2,4,4,-4,3)/5
probs.1<-probs.2<-probs.3<-(1+exp(-X1 %*% betas.1))^-1

treat.1<-rbinom(n=length(probs.1),size=1,probs.1)
treat.2<-rbinom(n=length(probs.2),size=1,probs.2)
treat.3<-rbinom(n=length(probs.3),size=1,probs.3)
treat<-c(treat.1,treat.2,treat.3)
X<-rbind(X1,X1,X1)
time<-c(rep(1,nrow(X1)),rep(2,nrow(X1)),rep(3,nrow(X1)))
id<-c(rep(1:nrow(X1),3))
y<-cbind(treat.1,treat.2,treat.3) %*% c(2,2,2) + 
X1 %*% c(-2,8,7,6,2) + rnorm(n,sd=5)

multibin1<-CBMSM(treat~X,id=id,time=time,type="MultiBin",twostep=TRUE)
summary(lm(y~-1+treat.1+treat.2+treat.3+X1, weights=multibin1$weights))
# }