Match implements a variety of algorithms for multivariate matching
including propensity score, Mahalanobis and inverse variance matching.
The function is intended to be used in conjunction with the MatchBalance
function which determines the extent to which Match has been able to
achieve covariate balance. In order to do propensity score matching, one
should estimate the propensity model before calling Match, and then send
Match the propensity score to use. Match enables a wide variety of
matching options including matching with or without replacement, bias
adjustment, different methods for handling ties, exact and caliper
matching, and a method for the user to fine tune the matches via a
general restriction matrix. Variance estimators include the usual Neyman
standard errors, Abadie-Imbens standard errors, and robust variances
which do not assume a homogeneous causal effect. The GenMatch function
can be used to automatically find balance via a genetic search algorithm
which determines the optimal weight to give each covariate.

Match(Y = NULL, Tr, X, Z = X, V = rep(1, length(Y)), estimand = "ATT", M = 1,
      BiasAdjust = FALSE, exact = NULL, caliper = NULL, replace = TRUE, ties = TRUE,
      CommonSupport = FALSE, Weight = 1, Weight.matrix = NULL, weights = NULL,
      Var.calc = 0, sample = FALSE, restrict = NULL, match.out = NULL,
      distance.tolerance = 1e-05, tolerance = sqrt(.Machine$double.eps),
      version = "standard")
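The description names inverse variance and Mahalanobis matching, and the signature exposes Weight and Weight.matrix. As a rough illustration of what those two weighting schemes mean, the base R sketch below computes a weighted squared distance between two observations under each scheme. The toy data and the helper sq_dist are hypothetical, and the sketch is not the package's internal code; the exact form of the distance used inside Match (for example squared versus square root) is not confirmed here.

```r
# Toy covariate matrix (hypothetical data, for illustration only)
set.seed(1)
X <- cbind(age = rnorm(50, 30, 8), educ = rnorm(50, 11, 2))

# Hypothetical helper: squared distance (x_i - x_j)' W (x_i - x_j)
sq_dist <- function(xi, xj, W) {
  d <- xi - xj
  drop(t(d) %*% W %*% d)
}

# Inverse variance weighting: 1 / var(covariate) on the diagonal
W_inv_var <- diag(1 / apply(X, 2, var))

# Mahalanobis weighting: inverse of the full covariance matrix
W_mahal <- solve(cov(X))

d_iv  <- sq_dist(X[1, ], X[2, ], W_inv_var)
d_mah <- sq_dist(X[1, ], X[2, ], W_mahal)

# Base R's mahalanobis() returns the same squared Mahalanobis distance
all.equal(d_mah, unname(mahalanobis(X[1, ], X[2, ], cov(X))))
```

Both weight matrices are square with as many columns as X, which is the shape a user-supplied Weight.matrix is expected to have.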
Match will r[...]
[...] Var.calc option, which takes precedence.
[...] ties option.
[...] Z matrix.
[...] X. If a logical vector is provided, a logical value should be provided [...]
[...] FALSE, the order of matches generally matters. Matches will be found in the same order as the data are sorted. Thus, the match(es) for the first [...]
[...] ties==TRUE. If, for example, one treated observation matches more than one control observation, the matched dataset will include the multiple matched [...]
[...] caliper option is to be [...]
[...] X. The default value of 1 denotes that weights are equal to the inverse of the variances. 2 denotes the Mahalanobis [...]
[...] X (see the Weight option). This square matrix should have as many columns as the number of columns of the X [...]
[...] Y which provides observation specific weights.
[...] Var.calc=0 which means that homoscedasticity is assumed. For values of Var.calc > 0, robust variances are calculated using Var.calc ma[...]
[...] distance.tolerance are deemed to be equal to zero.
This option can be used to perform a type of optimal m[...]
[...] Match. If this object is provided, then Match will use the matches found by the previous invocation of the function. Hence, Match will run faster. This is u[...]
[...] ties=FALSE or replace=FALSE if the dataset is lar[...]
[...] X consists of either covariates or a known propensity score because it takes into account the uncertainty of the matching procedure. If an estimated propensity score is used, the uncertainty involved in its estimation is not accounted for although the uncertainty of the matching procedure itself still is.
[...] BiasAdjust. If BiasAdjust is not requested, this is the same as est.
[...] weights. Note that the standard error provided by se takes into account the uncertainty of the matching procedure while se.standard does not. Neither se nor se.standard take into account the uncertainty of estimating a propensity score. se.standard does not take into account any BiasAdjust. Summary of both types of standard error results can be requested by setting the full=TRUE flag when using the summary.Match function on the object returned by Match.
[...] Match. Three datasets are included in this list: Y, Tr and X.
[...] index.control can be used to recover the matched dataset produced by Match. For example, the X matrix used by Match can be recovered by rbind(X[index.treated,],X[index.control,]). The user should generally just examine the output of mdata.
[...] index.treated can be used to recover the matched dataset produced by Match. For example, the X matrix used by Match can be recovered by rbind(X[index.treated,],X[index.control,]). The user should generally just examine the output of mdata.
[...] caliper and exact. If no observations were dropped, this index will be NULL.
[...] caliper which was used.
[...] X variables. This object has the same length as the number of covariates in X.
[...] exact function argument.
[...] ndrops.matches, takes into account observation specific weights which the user may have provided via the weights argument.
[...] MatchBalance function which checks if the results of this function have actually achieved balance. The results of this function can be summarized by a call to the summary.Match function.

If one wants to do propensity score matching, one should estimate the propensity model before calling Match, and then place the fitted values in the X matrix (see the provided example).
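The index.treated and index.control components described above can be tried out directly. The sketch below matches on a small covariate matrix (an arbitrary choice made only for illustration), rebuilds the matched X matrix from the two index vectors, and then asks summary for both kinds of standard error via full=TRUE. It assumes the Matching package and its lalonde data are installed.

```r
library(Matching)
data(lalonde)

# A small covariate matrix; the choice of columns is arbitrary
X  <- cbind(age = lalonde$age, educ = lalonde$educ, re74 = lalonde$re74)
Y  <- lalonde$re78
Tr <- lalonde$treat

rr <- Match(Y = Y, Tr = Tr, X = X, M = 1)

# Rebuild the matched covariate matrix from the stored indices
X_matched <- rbind(X[rr$index.treated, ], X[rr$index.control, ])

# Each matched pair contributes one treated row and one control row
length(rr$index.treated) == length(rr$index.control)
dim(X_matched)

# The matched datasets are also stored in mdata (Y, Tr and X)
names(rr$mdata)

# Report the standard error results in full
summary(rr, full = TRUE)
```

In practice mdata is the more convenient route; the index vectors are mainly useful for pulling matched rows of variables that were not passed to Match.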
The GenMatch function can be used to automatically find balance by the
use of a genetic search algorithm which determines the optimal weight to
give each covariate. The object returned by GenMatch can be supplied to
the Weight.matrix option of Match to obtain estimates.
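A minimal sketch of that hand-off is given below. The covariates are an arbitrary choice, and the search settings (pop.size, max.generations, wait.generations) are deliberately tiny so the sketch finishes quickly; they are far too small for real use. GenMatch also needs the rgenoud package to be installed, which is assumed here.

```r
library(Matching)   # GenMatch additionally relies on the rgenoud package
data(lalonde)

X  <- cbind(lalonde$age, lalonde$educ, lalonde$black, lalonde$re74)
Y  <- lalonde$re78
Tr <- lalonde$treat

# Search for covariate weights; settings kept tiny for illustration only
genout <- GenMatch(Tr = Tr, X = X, BalanceMatrix = X, estimand = "ATT",
                   pop.size = 16, max.generations = 5, wait.generations = 2,
                   print.level = 0)

# Supply the GenMatch object to Match through the Weight.matrix option
mout <- Match(Y = Y, Tr = Tr, X = X, estimand = "ATT", Weight.matrix = genout)
summary(mout)
```

The same estimand should be used in both calls so that the weights are searched for and applied to the same matching problem.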
Match is often much faster with large datasets if ties=FALSE or
replace=FALSE, i.e., if matching is done by randomly breaking ties or
without replacement. Also see the Matchby function. It provides a
wrapper for Match which is much faster for large datasets when it can be
used.
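The effect of these two options on the matched dataset can be seen by comparing how many matched rows each setting produces. The sketch below uses the lalonde data and an arbitrary set of covariates; the call to set.seed assumes that the random tie-breaking draws on R's random number generator, which is not confirmed here.

```r
library(Matching)
data(lalonde)

X  <- cbind(lalonde$age, lalonde$educ, lalonde$re74, lalonde$re75)
Y  <- lalonde$re78
Tr <- lalonde$treat

# Default: tied controls are all kept and the matched data are weighted
m_ties <- Match(Y = Y, Tr = Tr, X = X, ties = TRUE)

# Ties broken at random: each treated unit keeps a single control (M = 1)
set.seed(123)   # assumption: tie-breaking uses R's RNG
m_rand <- Match(Y = Y, Tr = Tr, X = X, ties = FALSE)

# Without replacement: a control can be used at most once
m_norep <- Match(Y = Y, Tr = Tr, X = X, replace = FALSE)

c(ties       = length(m_ties$index.control),
  random     = length(m_rand$index.control),
  no.replace = length(m_norep$index.control))
```

The speed gain comes with a cost: matching without replacement depends on the order of the data, and random tie-breaking makes results vary from run to run unless the random state is controlled.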
Three demos are included: GerberGreenImai, DehejiaWahba, and
AbadieImbens. These can be run by calling the demo function such as by
demo(DehejiaWahba).

Sekhon, Jasjeet S. 2006. "Alternative Balance Metrics for Bias Reduction in Matching Methods for Causal Inference." Working Paper.

Abadie, Alberto and Guido Imbens. 2006. "Large Sample Properties of Matching Estimators for Average Treatment Effects." Econometrica 74(1): 235-267.

Diamond, Alexis and Jasjeet S. Sekhon. 2005. "Genetic Matching for Estimating Causal Effects: A General Multivariate Matching Method for Achieving Balance in Observational Studies." Working Paper.

Imbens, Guido. 2004. Matching Software for Matlab and Stata.

summary.Match, GenMatch, MatchBalance, Matchby, balanceMV, balanceUV, qqstats, ks.boot, GerberGreenImai, lalonde

#
# Replication of Dehejia and Wahba psid3 model
#
# Dehejia, Rajeev and Sadek Wahba. 1999. "Causal Effects in Non-Experimental
# Studies: Re-Evaluating the Evaluation of Training Programs." Journal of the
# American Statistical Association 94 (448): 1053-1062.
#
library(Matching)
data(lalonde)
#
# Estimate the propensity model
#
glm1 <- glm(treat ~ age + I(age^2) + educ + I(educ^2) + black +
              hisp + married + nodegr + re74 + I(re74^2) + re75 + I(re75^2) +
              u74 + u75, family = binomial, data = lalonde)
#
# Save data objects
#
X  <- glm1$fitted.values
Y  <- lalonde$re78
Tr <- lalonde$treat
#
# One-to-one matching with replacement (the "M=1" option).
# Estimating the treatment effect on the treated (the "estimand" option defaults to ATT).
#
rr <- Match(Y = Y, Tr = Tr, X = X, M = 1)
summary(rr)
#
# Let's check the covariate balance.
# 'nboots' is set to a small value in the interest of speed.
# Please increase to at least 500 for publication quality p-values.
#
mb <- MatchBalance(treat ~ age + I(age^2) + educ + I(educ^2) + black +
                     hisp + married + nodegr + re74 + I(re74^2) + re75 + I(re75^2) +
                     u74 + u75, data = lalonde, match.out = rr, nboots = 10)
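
As a possible follow-up to the example above, the sketch below adds a caliper to a propensity score match and inspects which observations were dropped. The propensity model is a simplified one chosen for brevity, and the caliper value of 0.05 is arbitrary. The caliper is taken here to be in standard deviation units of X, which is how the package documentation is understood to define it; treat that as an assumption to verify.

```r
library(Matching)
data(lalonde)

# A simplified propensity model (fewer terms than the replication above)
glm2 <- glm(treat ~ age + educ + black + hisp + married + nodegr + re74 + re75,
            family = binomial, data = lalonde)
X  <- glm2$fitted.values
Y  <- lalonde$re78
Tr <- lalonde$treat

# A deliberately tight caliper, so some observations may lose their match
rr_cal <- Match(Y = Y, Tr = Tr, X = X, M = 1, caliper = 0.05)

rr_cal$ndrops          # number of observations dropped
rr_cal$index.dropped   # their row numbers in the original data (NULL if none)
summary(rr_cal)
```

Dropping observations changes the population the estimate refers to, so balance should be rechecked with MatchBalance after any caliper is applied.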