Matchby is a wrapper for the Match function which separates the matching problem into subgroups defined by a factor. This is equivalent to conducting exact matching on each level of a factor. Matches within each level are found as determined by the usual matching options. This function is much faster for large datasets than the Match function itself. For additional speed, consider doing matching without replacement; see the replace option. This function is more limited than the Match function. For example, Matchby cannot be used if the user wishes to provide observation-specific weights.

Matchby(Y, Tr, X, by, estimand = "ATT", M = 1, ties = FALSE, replace = TRUE,
        exact = NULL, caliper = NULL, AI = FALSE, Var.calc = 0,
        Weight = 1, Weight.matrix = NULL, distance.tolerance = 1e-05,
        tolerance = sqrt(.Machine$double.eps), print.level = 1,
        version = "Matchby", ...)

... as.factor(by) defines the grouping, or a list of such factors in which case their interaction is used for the grouping.

... ties option.

... ties==TRUE. If, for example, one treated observation matches more than one control observation, the matched dataset will include the multiple matched ...

... FALSE, the order of matches generally matters. Matches will be found in the same order as the data is sorted. Thus, the match(es) for the first observation will be ...

... X. If a logical vector is provided, a logical value should be provided ...

... Matchby can only calculate AI SEs for ATT. To calculate AI errors with other estimands, ...

... Var.calc=0 which means that homoscedasticity is assumed. For values of Var.calc > 0, robust variances are calculated using Var.calc matches.

... X. The default value of 1 denotes that weights are equal to the inverse of the variances. 2 denotes the Mahalanobis ...

... X; see the Weight option. This square matrix should have as many columns as the number of columns of the X ...

... distance.tolerance are deemed to be equal to zero. This option can be used to perform a type of optimal ...

... Match.

... AI option is TRUE. This standard error has correct coverage if X consists of either covariates or a known propensity score because it takes into account the uncertainty of the matching procedure. If an estimated propensity score is used, the uncertainty involved in its estimation is not accounted for although the uncertainty of the matching procedure itself still is.

... index.control can be used to recover the matched dataset produced by Matchby. For example, the X matrix used by Matchby can be recovered by rbind(X[index.treated,],X[index.control,]).

... index.treated can be used to recover the matched dataset produced by Matchby. For example, the Y matrix for the matched dataset can be recovered by c(Y[index.treated],Y[index.control]).

... Match which was used.

Matchby
is much faster for large datasets than Match. However, Matchby implements only a subset of the functionality of Match. For example, the restrict option cannot be used, Abadie-Imbens standard errors are not provided and bias adjustment cannot be requested.

Matchby is a wrapper for the Match function which separates the matching problem into subgroups defined by a factor. This is equivalent to exact matching on each level of the factor, and the way in which matches are found within each level is determined by the usual matching options.
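
The idea of splitting the problem by a factor and matching within each level can be sketched in base R. This is only an illustration on made-up data, not the package's implementation: the helper match_in_group is hypothetical, and it uses a plain nearest-neighbour rule on a single covariate (one-to-one, with replacement) in place of the full set of matching options.

```r
set.seed(1)
n  <- 12
by <- factor(rep(c("a", "b"), each = n / 2))   # grouping factor
Tr <- rep(c(1, 1, 0, 0, 0, 0), times = 2)      # 2 treated, 4 controls per group
X  <- round(runif(n), 2)                       # a single covariate

# Split the row indices by the grouping factor
groups <- split(seq_len(n), by)

# Hypothetical helper: within one group, pair every treated unit with its
# nearest control on X (ties go to the first control found)
match_in_group <- function(idx) {
  treated  <- idx[Tr[idx] == 1]
  controls <- idx[Tr[idx] == 0]
  nearest  <- vapply(treated, function(i) {
    controls[which.min(abs(X[controls] - X[i]))]
  }, integer(1))
  data.frame(index.treated = treated, index.control = nearest)
}

matches <- do.call(rbind, lapply(groups, match_in_group))

# Every matched pair lies within a single level of 'by'
all(by[matches$index.treated] == by[matches$index.control])
```

Because each group is searched separately, a treated unit can never be paired with a control from another level of the factor, which is what makes the procedure equivalent to exact matching on that factor.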

Note that by default ties=FALSE, although the default for the Match and GenMatch functions is TRUE. This is done because randomly breaking ties in large datasets often results in a great speedup. For additional speed, consider matching without replacement, which is often much faster when the dataset is large; see the replace option.
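
As a rough illustration of the two settings, the sketch below runs Matchby on the lalonde data with and without replacement and records the elapsed time of each call. The propensity score model is a shortened, illustrative specification, and on a dataset this small any timing difference may be negligible; the speedup described above concerns large datasets.

```r
library(Matching)
data(lalonde)

# Illustrative propensity score model (an assumption, chosen for brevity)
ps <- fitted(glm(treat ~ age + educ + hisp + married + nodegr + re74 + re75,
                 family = binomial, data = lalonde))

# Defaults: ties broken at random (ties = FALSE), matching with replacement
t.rep <- system.time(
  rr.rep <- Matchby(Y = lalonde$re78, Tr = lalonde$treat, X = ps,
                    by = lalonde$black, print.level = 0)
)

# Matching without replacement; the order of the data now matters
t.norep <- system.time(
  rr.norep <- Matchby(Y = lalonde$re78, Tr = lalonde$treat, X = ps,
                      by = lalonde$black, replace = FALSE, print.level = 0)
)

# Elapsed seconds for each call
c(with.replacement    = t.rep[["elapsed"]],
  without.replacement = t.norep[["elapsed"]])
```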

There will be slight differences in the matches produced by Matchby and Match because of how the covariates are weighted. When the data is broken up into separate groups (via the by option), Mahalanobis distance and inverse variance will imply different weights than when the data is taken as a whole.

Sekhon, Jasjeet S. 2006. "Alternative Balance Metrics for Bias Reduction in Matching Methods for Causal Inference." Working Paper.

Abadie, Alberto and Guido Imbens. 2006. "Large Sample Properties of Matching Estimators for Average Treatment Effects." Econometrica 74(1): 235-267.

Diamond, Alexis and Jasjeet S. Sekhon. 2005. "Genetic Matching for Estimating Causal Effects: A General Multivariate Matching Method for Achieving Balance in Observational Studies." Working Paper.

Imbens, Guido. 2004. Matching Software for Matlab and Stata.

Match, summary.Matchby, GenMatch, MatchBalance, balanceMV, balanceUV, qqstats, ks.boot, GerberGreenImai, lalonde

#
# Match exactly by racial groups and then match using the propensity score
# within racial groups
#
library(Matching)
data(lalonde)

#
# Estimate the propensity score
#
glm1 <- glm(treat ~ age + I(age^2) + educ + I(educ^2) +
              hisp + married + nodegr + re74 + I(re74^2) + re75 + I(re75^2) +
              u74 + u75, family = binomial, data = lalonde)

#
# Save data objects
#
X  <- glm1$fitted.values
Y  <- lalonde$re78
Tr <- lalonde$treat

# One-to-one matching with replacement (the "M=1" option) after exactly
# matching on race using the 'by' option. Estimating the treatment
# effect on the treated (the "estimand" option defaults to ATT).
rr <- Matchby(Y = Y, Tr = Tr, X = X, by = lalonde$black, M = 1)
summary(rr)

# Let's check the covariate balance.
# 'nboots' is set to a small value in the interest of speed.
# Please increase it to at least 500 for publication-quality p-values.
mb <- MatchBalance(treat ~ age + I(age^2) + educ + I(educ^2) + black +
                     hisp + married + nodegr + re74 + I(re74^2) + re75 + I(re75^2) +
                     u74 + u75, data = lalonde, match.out = rr, nboots = 10)
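
The index.treated and index.control components described earlier can be used to rebuild the matched dataset. The following is a minimal, self-contained sketch; the propensity score model is a shortened, illustrative specification rather than the one used above. Since X is a vector here rather than a matrix, c() takes the place of rbind().

```r
library(Matching)
data(lalonde)

# Illustrative propensity score model (an assumption, chosen for brevity)
glm1 <- glm(treat ~ age + educ + hisp + married + nodegr + re74 + re75,
            family = binomial, data = lalonde)
X  <- fitted(glm1)
Y  <- lalonde$re78
Tr <- lalonde$treat

rr <- Matchby(Y = Y, Tr = Tr, X = X, by = lalonde$black, M = 1,
              print.level = 0)

# Recover the matched covariate and outcome values
X.matched <- c(X[rr$index.treated], X[rr$index.control])
Y.matched <- c(Y[rr$index.treated], Y[rr$index.control])

# Matching was done within levels of 'by', so each matched pair
# should share the same value of lalonde$black
all(lalonde$black[rr$index.treated] == lalonde$black[rr$index.control])
```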