mdist: Creates matrices of mdists (distances) between observations for matching

Description

A generic function, with several supplied methods, for creating distance matrices between observations to be used in the matching process. Given these matrices, pairmatch() or fullmatch() can determine optimal matches.

Usage

mdist(x, structure.fmla = NULL, ...)

Arguments

x

The object to use as the basis for forming the mdist matrix. Methods exist for formulas, functions, and generalized linear models.

structure.fmla

A formula denoting the treatment variable on the left-hand side and an optional grouping expression on the right-hand side. For example, z ~ 1 indicates no grouping, while z ~ s subsets the data, computing distances only within the groups defined by s.

...

Additional method arguments. Most methods require a 'data' argument.

Value

An object of class optmatch.dlist, which is suitable to be given as the distance argument to fullmatch or pairmatch. For more information, see pscore.dist.

Details

The mdist generic provides three ways to construct an mdist matrix (or a list of mdist matrices), one for each supplied method: function, glm, and formula.

The mdist.function method takes a function of two arguments. When called, this function will receive the treatment observations as the first argument and the control observations as the second argument. As an example, the following computes the absolute differences between values of t1 for treatment units (here, nuclear plants with pr==1) and controls (here, plants with pr==0), returning the result as a distance matrix: sdiffs <- function(treatments, controls) { abs(outer(treatments$t1, controls$t1, `-`)) }
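To see what such a function returns, the sketch below calls sdiffs by hand using base R only. The data frame toy and its values are hypothetical stand-ins for nuclearplants, and the manual split into treated and control rows is an assumption about what mdist does internally before calling the function.

```r
# Hypothetical toy data: 'pr' marks treatment, 't1' is the covariate.
toy <- data.frame(
  pr = c(1, 1, 0, 0, 0),
  t1 = c(10, 14, 11, 13, 20),
  row.names = c("A", "B", "C", "D", "E")
)

# Same function as in the text: absolute differences on t1.
sdiffs <- function(treatments, controls) {
  abs(outer(treatments$t1, controls$t1, `-`))
}

# Split by hand (assumed to mirror what mdist does with pr ~ 1).
treated  <- toy[toy$pr == 1, ]
controls <- toy[toy$pr == 0, ]

# One row per treated unit, one column per control unit.
d <- sdiffs(treated, controls)
dimnames(d) <- list(rownames(treated), rownames(controls))
d
```

The result has treated units in rows and controls in columns, which is the orientation the function is expected to produce.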

The mdist.function method is similar to the earlier optmatch function makedist, although the interface is a bit different. The mdist.formula method computes the squared Mahalanobis distance between observations using the supplied formula. In addition to the distance formula (the first argument), this method can also take a structure formula to denote strata in the observations, e.g. ~ s would group the observations by the factor s.
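As a rough illustration of a squared Mahalanobis distance between treated and control units, the following base R sketch uses stats::mahalanobis on simulated data. The data frame toy, the variables z, x1 and x2, and the choice of the whole-sample covariance matrix are all assumptions made for the example; the covariance estimate that mdist.formula actually uses may differ.

```r
set.seed(42)

# Hypothetical data: 4 treated units (z = 1) and 8 controls (z = 0).
toy <- data.frame(
  z  = rep(c(1, 0), times = c(4, 8)),
  x1 = rnorm(12),
  x2 = rnorm(12)
)

X <- as.matrix(toy[, c("x1", "x2")])
S <- cov(X)  # assumption: covariance estimated from the full sample

treated  <- X[toy$z == 1, , drop = FALSE]
controls <- X[toy$z == 0, , drop = FALSE]

# mahalanobis() returns squared distances from each row of 'controls'
# to 'center'; apply() stacks the results column-wise, hence the t().
d2 <- t(apply(treated, 1, function(tr) {
  mahalanobis(controls, center = tr, cov = S)
}))

dim(d2)  # treated units in rows, controls in columns
```

A stratified version would repeat the same computation within each level of the grouping factor and keep one matrix per stratum.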

The mdist.glm method takes an object of class glm as its first argument. It computes the deviations between observations using the mad function. See pscore.dist for more information.
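One plausible reading of this, sketched in base R: take the fitted linear predictor from the glm, rescale it by a pooled mad-based spread, and form absolute treated-versus-control differences. The simulated data, the variable names, and the pooling formula are assumptions for illustration, not a description of the package internals.

```r
set.seed(1)

# Hypothetical data: treatment probability depends on a single covariate x.
n <- 60
toy <- data.frame(x = rnorm(n))
toy$z <- rbinom(n, 1, plogis(0.8 * toy$x))

fit <- glm(z ~ x, family = binomial(), data = toy)
lp  <- predict(fit, type = "link")  # linear predictor on the logit scale
is_treated <- toy$z == 1

# Assumed scaling: pool the within-group mad of the linear predictor.
n1 <- sum(is_treated)
n0 <- sum(!is_treated)
pooled <- sqrt(((n1 - 1) * mad(lp[is_treated])^2 +
                (n0 - 1) * mad(lp[!is_treated])^2) / (n1 + n0 - 2))

# Absolute differences in scaled propensity scores, treated by control.
d <- abs(outer(lp[is_treated], lp[!is_treated], `-`)) / pooled
dim(d)
```

Using mad rather than sd makes the scale less sensitive to a few extreme fitted values, which is a common reason to prefer it for propensity scores.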

References

P. R. Rosenbaum and D. B. Rubin (1985), Constructing a control group using multivariate matched sampling methods that incorporate the propensity score, The American Statistician, 39, 33-38.

Examples


data(nuclearplants)

### A propensity score distance:
aGlm <- glm(pr ~ . - (pr + cost), family = binomial(), data = nuclearplants)
mdist(aGlm)

### A Mahalanobis distance:
mdist(pr ~ t1 + t2, data = nuclearplants)

### Absolute difference on a scalar covariate:

sdiffs <- function(treatments, controls) {
  abs(outer(treatments$t1, controls$t1, `-`))
}

(absdist <- mdist(sdiffs, structure.fmla = pr ~ 1, data = nuclearplants))

### Using pairmatch on the scalar example:
pairmatch(absdist)
