In matchit(), setting method = "full" performs optimal full matching, which is a form of subclassification wherein all units, both treatment and control (i.e., the "full" sample), are assigned to a subclass and receive at least one match. The matching is optimal in the sense that the sum of the absolute distances between the treated and control units in each subclass is as small as possible. The method relies on and is a wrapper for optmatch::fullmatch().
Advantages of optimal full matching are that the matching order does not need to be specified, units do not need to be discarded, and large within-subclass distances are less likely than with standard subclassification. The primary output of full matching is a set of matching weights that can be applied to the matched sample; in this way, full matching can be seen as a robust alternative to propensity score weighting, robust in the sense that the propensity score model does not need to be correct to estimate the treatment effect without bias.

Note: with large samples, the optimization may fail or run very slowly; one can try using method = "quick" instead, which also performs full matching but can be much faster.
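As a rough illustration of how the matching weights might be used, the sketch below runs optimal full matching on the lalonde data shipped with MatchIt, extracts the matched data with match.data(), and fits a weighted regression. The covariate set and the simple lm() outcome model are assumptions chosen for illustration, not a recommended analysis, and the optmatch package must be installed.

```r
library(MatchIt)
data("lalonde", package = "MatchIt")

# Optimal full matching on a logistic-regression propensity score
m.full <- matchit(treat ~ age + educ + race + married + nodegree +
                    re74 + re75,
                  data = lalonde, method = "full")

# match.data() returns the data with 'weights' and 'subclass' columns;
# no units are dropped here because full matching retains every unit
md <- match.data(m.full)

# Illustrative weighted outcome model using the matching weights
fit <- lm(re78 ~ treat, data = md, weights = weights)
coef(fit)["treat"]
```

The standard errors reported by lm() do not account for the matching; the MatchIt vignette on estimating effects discusses suitable variance estimators.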
This page details the allowable arguments with method = "full". See matchit() for an explanation of what each argument means in a general context and how it can be specified.

Below is how matchit() is used for optimal full matching:
matchit(formula,
        data = NULL,
        method = "full",
        distance = "glm",
        link = "logit",
        distance.options = list(),
        estimand = "ATT",
        exact = NULL,
        mahvars = NULL,
        antiexact = NULL,
        discard = "none",
        reestimate = FALSE,
        s.weights = NULL,
        caliper = NULL,
        std.caliper = TRUE,
        verbose = FALSE,
        ...)
formula: a two-sided formula object containing the treatment and covariates to be used in creating the distance measure used in the matching. This formula will be supplied to the functions that estimate the distance measure.

data: a data frame containing the variables named in formula. If not found in data, the variables will be sought in the environment.

method: set here to "full".
distance: the distance measure to be used. See distance for allowable options. Can be supplied as a distance matrix.

link: when distance is specified as a method of estimating propensity scores, an additional argument controlling the link function used in estimating the distance measure. See distance for allowable options with each option.

distance.options: a named list containing additional arguments supplied to the function that estimates the distance measure as determined by the argument to distance.
estimand: a string containing the desired estimand. Allowable options include "ATT", "ATC", and "ATE". The estimand controls how the weights are computed; see the Computing Weights section at matchit() for details.
exact: for which variables exact matching should take place.

mahvars: for which variables Mahalanobis distance matching should take place when distance corresponds to a propensity score (e.g., for caliper matching or to discard units for common support). If specified, the distance measure will not be used in matching.

antiexact: for which variables anti-exact matching should take place. Anti-exact matching is processed using optmatch::antiExactMatch().
discard: a string containing a method for discarding units outside a region of common support. Only allowed when distance corresponds to a propensity score.

reestimate: if discard is not "none", whether to re-estimate the propensity score in the remaining sample prior to matching.
s.weights: the variable containing sampling weights to be incorporated into propensity score models and balance statistics.

caliper: the width(s) of the caliper(s) used for caliper matching. Calipers are processed by optmatch::caliper(). See Notes and Examples.

std.caliper: logical; when calipers are specified, whether they are in standard deviation units (TRUE) or raw units (FALSE).

verbose: logical; whether information about the matching process should be printed to the console.
...: additional arguments passed to optmatch::fullmatch(). Allowed arguments include min.controls, max.controls, omit.fraction, mean.controls, tol, and solver. See the optmatch::fullmatch() documentation for details. In general, tol should be set to a low number (e.g., 1e-7) to get a more precise solution.

The arguments replace, m.order, and ratio are ignored with a warning.
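Arguments passed through ... can be used to restrict the composition of the subclasses. The following is a minimal sketch, assuming the lalonde data and illustrative bounds of at most two treated units per control (min.controls = 1/2) and at most five controls per treated unit (max.controls = 5). These bounds are arbitrary choices for demonstration, not recommendations.

```r
library(MatchIt)
data("lalonde", package = "MatchIt")

# Restrict subclass composition and tighten the solver tolerance;
# min.controls, max.controls, and tol are forwarded to optmatch::fullmatch()
m.restricted <- matchit(treat ~ age + educ + race + married + nodegree +
                          re74 + re75,
                        data = lalonde, method = "full",
                        min.controls = 1/2, max.controls = 5,
                        tol = 1e-7)

# Tabulate control (0) and treated (1) counts within each subclass
md.restricted <- match.data(m.restricted)
tab <- table(md.restricted$subclass, md.restricted$treat)
head(tab)
```

Tighter bounds generally trade some closeness of matches for less variable weights; whether that trade is worthwhile depends on the data.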
All outputs described in matchit() are returned with method = "full" except for match.matrix. This is because matching strata are not indexed by treated units as they are in some other forms of matching. When include.obj = TRUE in the call to matchit(), the output of the call to optmatch::fullmatch() will be included in the output. When exact is specified, this will be a list of such objects, one for each stratum of the exact variables.
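The outputs described above can be inspected directly. This is a small sketch, assuming the lalonde data and an arbitrary covariate set; it checks that match.matrix is absent and looks at the subclass assignments and the stored optmatch object (found in the obj component of the matchit() output).

```r
library(MatchIt)
data("lalonde", package = "MatchIt")

# Request that the underlying optmatch::fullmatch() result be kept
m.obj <- matchit(treat ~ age + educ + re74 + re75,
                 data = lalonde, method = "full",
                 include.obj = TRUE)

# No match.matrix is returned for full matching
is.null(m.obj$match.matrix)

# Subclass membership for each unit
head(m.obj$subclass)

# The stored result of the call to optmatch::fullmatch()
class(m.obj$obj)
```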
Mahalanobis distance matching can be done one of two ways:

If no propensity score needs to be estimated, distance should be set to "mahalanobis", and Mahalanobis distance matching will occur using all the variables in formula. Arguments to discard and mahvars will be ignored, and a caliper can only be placed on named variables. For example, to perform simple Mahalanobis distance matching, the following could be run:
matchit(treat ~ X1 + X2, method = "nearest",
        distance = "mahalanobis")
With this code, the Mahalanobis distance is computed using X1 and X2, and matching occurs on this distance. The distance component of the matchit() output will be empty.

If a propensity score needs to be estimated for any reason, e.g., for common support with discard or for creating a caliper, distance should be whatever method is used to estimate the propensity score or a vector of distance measures, i.e., it should not be "mahalanobis". Use mahvars to specify the variables used to create the Mahalanobis distance. For example, to perform Mahalanobis distance matching within a propensity score caliper, the following could be run:
matchit(treat ~ X1 + X2 + X3, method = "nearest",
        distance = "glm", caliper = .25,
        mahvars = ~ X1 + X2)
With this code, X1, X2, and X3 are used to estimate the propensity score (using the "glm" method, which by default is logistic regression), which is used to create a matching caliper. The actual matching occurs on the Mahalanobis distance computed only using X1 and X2, which are supplied to mahvars. Units whose propensity score difference is larger than the caliper will not be paired, and some treated units may therefore not receive a match. The estimated propensity scores will be included in the distance component of the matchit() output. See Examples.

In a manuscript, be sure to cite the following paper if using matchit() with method = "full":
Hansen, B. B., & Klopfer, S. O. (2006). Optimal Full Matching and Related Designs via Network Flows. Journal of Computational and Graphical Statistics, 15(3), 609–627. doi:10.1198/106186006X137047
For example, a sentence might read:
Optimal full matching was performed using the MatchIt package (Ho, Imai, King, & Stuart, 2011) in R, which calls functions from the optmatch package (Hansen & Klopfer, 2006).
Theory is also developed in the following article:
Hansen, B. B. (2004). Full Matching in an Observational Study of Coaching for the SAT. Journal of the American Statistical Association, 99(467), 609–618. doi:10.1198/016214504000000647
matchit() for a detailed explanation of the inputs and outputs of a call to matchit().
optmatch::fullmatch(), which is the workhorse.
method_optimal for optimal pair matching, which is a special case of optimal full matching, and which relies on similar machinery. Results from method = "optimal" can be replicated with method = "full" by setting min.controls, max.controls, and mean.controls to the desired ratio.
method_quick for fast generalized full matching, which is very similar to optimal full matching but can be dramatically faster at the expense of optimality and is less customizable.
if (requireNamespace("optmatch", quietly = TRUE)) {
  library("MatchIt")
  data("lalonde")

  # Optimal full PS matching
  m.out1 <- matchit(treat ~ age + educ + race + nodegree +
                      married + re74 + re75, data = lalonde,
                    method = "full")
  m.out1
  summary(m.out1)

  # Optimal full Mahalanobis distance matching within a PS caliper
  m.out2 <- matchit(treat ~ age + educ + race + nodegree +
                      married + re74 + re75, data = lalonde,
                    method = "full", caliper = .01,
                    mahvars = ~ age + educ + re74 + re75)
  m.out2
  summary(m.out2, un = FALSE)

  # Optimal full Mahalanobis distance matching within calipers
  # of 500 on re74 and re75
  m.out3 <- matchit(treat ~ age + educ + re74 + re75,
                    data = lalonde, distance = "mahalanobis",
                    method = "full",
                    caliper = c(re74 = 500, re75 = 500),
                    std.caliper = FALSE)
  m.out3
  summary(m.out3, addlvariables = ~race + nodegree + married,
          data = lalonde, un = FALSE)
}