Mahalanobis Distance Matching
Mahalanobis distance matching can be done one of two ways:
1) If no propensity score needs to be estimated, distance should be set to "mahalanobis", and Mahalanobis distance matching will occur on all the variables in formula. Arguments to discard and mahvars will be ignored, and a caliper can only be placed on named variables. For example, to perform simple Mahalanobis distance matching, the following could be run:
matchit(treat ~ X1 + X2, method = "nearest",
distance = "mahalanobis")
With this code, the Mahalanobis distance is computed using X1 and X2, and matching occurs on this distance. The distance component of the matchit output will be empty.
2) If a propensity score needs to be estimated for any reason, e.g., for common support with discard or for creating a caliper, distance should be whatever method is used to estimate the propensity score or a vector of distance measures, i.e., it should not be "mahalanobis". Use mahvars to specify the variables used to create the Mahalanobis distance. For example, to perform Mahalanobis distance matching within a propensity score caliper, the following could be run:
matchit(treat ~ X1 + X2 + X3, method = "nearest",
distance = "glm", caliper = .25,
mahvars = ~ X1 + X2)
With this code, X1, X2, and X3 are used to estimate the propensity score (using the "glm" method, which by default is logistic regression), which is used to create a matching caliper. The actual matching occurs on the Mahalanobis distance computed only using X1 and X2, which are supplied to mahvars. Units whose propensity score difference is larger than the caliper will not be paired, and some treated units may therefore not receive a match. The estimated propensity scores will be included in the distance component of the matchit output. See Examples.
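The two approaches above can be sketched in base R without calling matchit at all. This is only an illustration of the idea, not the package's internal code: the toy data are made up, the Mahalanobis distance is assumed to be scaled by the full-sample covariance matrix, the caliper is assumed to be 0.25 standard deviations of the propensity score, and each treated unit simply takes its closest admissible control (i.e., with replacement), which is simpler than a real matching algorithm.

```r
# Toy data: covariate names mirror the examples above, values are simulated.
set.seed(2)
n <- 40
dat <- data.frame(X1 = rnorm(n), X2 = rnorm(n), X3 = rnorm(n))
dat$treat <- rep(c(1, 0, 0, 0), times = n / 4)  # 10 treated, 30 controls

t_idx <- which(dat$treat == 1)
c_idx <- which(dat$treat == 0)

# Way 1: Mahalanobis distance on X1 and X2 only.
# Assumption: the distance is scaled by the full-sample covariance matrix.
M <- as.matrix(dat[, c("X1", "X2")])
S <- cov(M)
maha <- sapply(c_idx, function(j) {
  # mahalanobis() returns squared distances, hence the sqrt()
  sqrt(mahalanobis(M[t_idx, , drop = FALSE], center = M[j, ], cov = S))
})
# maha has one row per treated unit and one column per control unit.

# Way 2: estimate a propensity score from X1, X2, and X3 (logistic regression)
# and use it only to forbid pairs that are too far apart.
ps <- fitted(glm(treat ~ X1 + X2 + X3, data = dat, family = binomial))
# Assumption: the caliper is expressed in standard deviations of the score.
width <- 0.25 * sd(ps)
too_far <- abs(outer(ps[t_idx], ps[c_idx], "-")) > width

maha_cal <- maha
maha_cal[too_far] <- Inf  # pairs outside the caliper can never be chosen

# Closest admissible control for each treated unit; NA if none is in range.
nearest <- apply(maha_cal, 1, function(row) {
  if (all(is.infinite(row))) NA_integer_ else c_idx[which.min(row)]
})
```

Any NA in nearest corresponds to a treated unit that receives no match because every control falls outside its caliper.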
When sampling weights are supplied through the s.weights argument, the covariance matrix of the covariates used in the Mahalanobis distance is weighted by the sampling weights.
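To see what weighting the covariance matrix does to the distance, the following base R sketch compares an unweighted and a weighted version. The weights here are simulated, and using stats::cov.wt is an assumption made for illustration; the exact weighting formula used internally by matchit may differ.

```r
set.seed(3)
X <- cbind(X1 = rnorm(50), X2 = rnorm(50))
sw <- runif(50, min = 0.5, max = 2)  # hypothetical sampling weights

S_unweighted <- cov(X)
S_weighted <- cov.wt(X, wt = sw)$cov  # cov.wt() rescales wt to sum to 1

# Distance between the first two units under each covariance matrix.
# mahalanobis() returns the squared distance, hence the sqrt().
d_unweighted <- sqrt(mahalanobis(X[1, ], center = X[2, ], cov = S_unweighted))
d_weighted <- sqrt(mahalanobis(X[1, ], center = X[2, ], cov = S_weighted))
```

The covariate values are unchanged; only the scaling matrix differs, so the same pair of units can end up closer or farther apart once the weights are taken into account.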
Estimand
The estimand argument controls whether control units are selected to be matched with treated units (estimand = "ATT") or treated units are selected to be matched with control units (estimand = "ATC"). The "focal" group (e.g., the treated units for the ATT) is typically made to be the smaller treatment group, and a warning will be thrown if it is not set that way unless replace = TRUE. Setting estimand = "ATC" is equivalent to swapping all treated and control labels for the treatment variable. When estimand = "ATC", the default m.order is "smallest", and the match.matrix component of the output will have the names of the control units as the rownames and be filled with the names of the matched treated units (opposite to when estimand = "ATT"). Note that the argument supplied to estimand doesn't necessarily correspond to the estimand actually targeted; it is merely a switch to trigger which treatment group is considered "focal".
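The role of the focal group can be illustrated with a small stand-alone function. nn_match below is a hypothetical helper written for this sketch, not part of the package: it matches on a single score, with replacement, and returns a one-column matrix shaped like match.matrix, with focal units as rownames.

```r
# Hypothetical helper: 1:1 nearest neighbor matching with replacement
# on a single named score vector.
nn_match <- function(treat, score, estimand = c("ATT", "ATC")) {
  estimand <- match.arg(estimand)
  focal <- if (estimand == "ATT") 1 else 0  # which group gets matched *to*
  focal_idx <- which(treat == focal)
  other_idx <- which(treat != focal)
  matched <- vapply(focal_idx, function(i) {
    other_idx[which.min(abs(score[i] - score[other_idx]))]
  }, integer(1))
  # Rows are focal units; entries are the names of their matches.
  matrix(names(score)[matched], ncol = 1,
         dimnames = list(names(score)[focal_idx], NULL))
}

treat <- c(1, 0, 0, 1, 0)
score <- c(a = 0.80, b = 0.75, c = 0.30, d = 0.40, e = 0.45)

mm_att <- nn_match(treat, score, "ATT")  # rows: treated units a and d
mm_atc <- nn_match(treat, score, "ATC")  # rows: control units b, c, and e

# Swapping the treatment labels and asking for the ATT gives the same result.
mm_swapped <- nn_match(1 - treat, score, "ATT")
```

In this toy version, mm_atc and mm_swapped are identical, which mirrors the statement that estimand = "ATC" amounts to swapping the treated and control labels.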
Variable Ratio Matching
matchit can perform variable ratio "extremal" matching as described by Ming and Rosenbaum (2000). This method tends to result in better balance than fixed ratio matching at the expense of some precision. When ratio > 1, rather than requiring all treated units to receive ratio matches, each treated unit is assigned a value that corresponds to the number of control units it will be matched to. These values are controlled by the arguments min.controls and max.controls, which correspond to α and β, respectively, in Ming and Rosenbaum (2000), and trigger variable ratio matching to occur. Some treated units will receive min.controls matches and others will receive max.controls matches (and one unit may have an intermediate number of matches); how many units are assigned each number of matches is determined by the algorithm described in Ming and Rosenbaum (2000, p. 119). ratio controls how many total control units will be matched: n1 * ratio control units will be matched, where n1 is the number of treated units, yielding the same total number of matched controls as fixed ratio matching does.
Variable ratio matching cannot be used with Mahalanobis distance matching. The calculation of the number of control units each treated unit will be matched to occurs without consideration of caliper or discard. ratio does not have to be an integer but must be greater than 1 and less than n0/n1, where n0 and n1 are the number of control and treated units, respectively. Setting ratio = n0/n1 performs a crude form of full matching where all control units are matched. If min.controls is not specified, it is set to 1 by default. min.controls must be less than ratio and max.controls must be greater than ratio. See Examples below for an example of their use.
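The constraints on ratio, min.controls, and max.controls can be collected into a small checker. check_variable_ratio is a hypothetical helper written for this sketch and is not part of the package; it treats ratio = n0/n1 as the allowed limiting case, following the full matching remark above, and it reports problems rather than reproducing the package's own error messages.

```r
# Hypothetical helper: returns a character vector of violated constraints
# (empty if the settings are consistent with the rules described above).
check_variable_ratio <- function(n1, n0, ratio, min.controls = 1, max.controls) {
  problems <- character(0)
  if (ratio <= 1) {
    problems <- c(problems, "ratio must be greater than 1")
  }
  if (ratio > n0 / n1) {
    problems <- c(problems, "ratio must not exceed n0/n1")
  }
  if (min.controls >= ratio) {
    problems <- c(problems, "min.controls must be less than ratio")
  }
  if (max.controls <= ratio) {
    problems <- c(problems, "max.controls must be greater than ratio")
  }
  problems
}

# Example sizes: 185 treated and 429 control units, so n0/n1 is about 2.32.
ok <- check_variable_ratio(n1 = 185, n0 = 429, ratio = 2, max.controls = 3)
bad <- check_variable_ratio(n1 = 185, n0 = 429, ratio = 3, max.controls = 4)

# Total number of matched controls implied by a valid setting: n1 * ratio.
total_controls <- 185 * 2

# A corresponding call might look like the following (not run here):
# matchit(treat ~ X1 + X2, method = "nearest", distance = "glm",
#         ratio = 2, min.controls = 1, max.controls = 3)
```

With these sizes, ratio = 2 passes every check while ratio = 3 is flagged, because there are not enough control units to give every treated unit three matches on average.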