pscore.dist: Assemble propensity distances and prepare for matching

Description

Extracts scores from a fitted propensity scoring model, assembling them into a discrepancy matrix (or matrices) from which pairmatch() or fullmatch() can determine optimal matches.

Usage

pscore.dist(glmobject, structure.fmla = NULL, standardization.scale=sd)

Arguments

glmobject

A fitted propensity modeling object, produced by a call to glm() or, say, bayesglm() (from the arm package) or brglm() (from the brglm package).

structure.fmla

Optional formula argument specifying subclasses within which matches are to be performed. If omitted, no subclassification is done. If it is given, the RHS of this formula gives variables on which to stratify the sample prior to matching.

standardization.scale

Scalar-valued function of a vector argument, or NULL; defaults to sd. Determines how propensity distances are scaled; see Details.

Value

Object of class optmatch.dlist, which is suitable to be given as the distance argument to fullmatch() or pairmatch().

Specifically, a list of matrices, one for each subclass defined by the interaction of variables appearing on the right-hand side of structure.fmla. Each of these is a number-of-treatments by number-of-controls matrix of propensity distances.

The distances are differences of the linear predictor from the propensity model, rather than differences of estimated probabilities, which avoids compression of estimated propensities near 0 and 1 (Rosenbaum and Rubin, 1985). By default they are scaled by the pooled SD of propensity scores in the treatment and control groups, so that a caliper of .25 pooled SDs on the propensity score can be coded as value/(value<=.25); see the examples.

The list also carries some metadata as attributes; these are not of direct interest to the user but are useful to fullmatch() and pairmatch().
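To make the scaling and the caliper coding concrete, here is a minimal base-R sketch of how one such matrix could be computed by hand. It does not call pscore.dist(). The simulated data, the names z, x, lp, dmat and dcal, the use of absolute differences, and the degrees-of-freedom-weighted pooling formula are all assumptions made for illustration and may differ from what the function does internally.

```r
set.seed(1)
n <- 40
x <- rnorm(n)
z <- rbinom(n, 1, plogis(0.5 * x))   # hypothetical treatment indicator
fit <- glm(z ~ x, family = binomial())

lp <- fit$linear.predictors          # propensity scores on the logit scale
trt <- z == 1

# Assumed pooling: df-weighted average of the two group variances
pooled.sd <- sqrt(((sum(trt) - 1) * sd(lp[trt])^2 +
                   (sum(!trt) - 1) * sd(lp[!trt])^2) / (n - 2))

# Treatments-by-controls matrix of absolute differences, in pooled-SD units
dmat <- abs(outer(lp[trt], lp[!trt], "-")) / pooled.sd

# Caliper of .25 pooled SDs: dividing by FALSE (0) turns excluded pairs into Inf
dcal <- dmat / (dmat <= 0.25)
```

Pairs within the caliper keep their distance unchanged, while pairs outside it receive an infinite distance, which is how the value/(value<=.25) idiom rules them out as matches.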

Details

glmobject need not be the result of a call to glm(). It should be a list with elements: y, a vector that is positive for treatment subjects and nonpositive for controls; linear.predictors, containing the propensity scores; and data, the data frame from which the propensities were estimated.

The purpose of giving a structure.fmla argument is to speed up large problems by restricting matching to within subclasses. Variables appearing on its right-hand side are interacted to create the subclasses; the same variables should also appear on the RHS of the formula used to specify the propensity model.
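The following base-R sketch illustrates the three list elements and how interacting the right-hand-side variables yields one matrix per subclass. It is an illustration only: the data frame dat, the variables g and h, the formula sfmla and the stand-in obj are hypothetical, and whether pscore.dist() accepts exactly such a stand-in is not verified here.

```r
set.seed(2)
dat <- data.frame(
  z = rep(c(1, 0), times = 10),
  x = rnorm(20),
  g = rep(c("a", "b"), each = 10),
  h = rep(c("u", "v"), each = 5, times = 2)
)
fit <- glm(z ~ x + g + h, family = binomial(), data = dat)

# A stand-in carrying only the three elements described above
obj <- list(y = fit$y,
            linear.predictors = fit$linear.predictors,
            data = fit$data)

# Subclasses from interacting the right-hand-side variables of, say, ~ g + h
sfmla <- ~ g + h
strata <- interaction(obj$data[all.vars(sfmla)], drop = TRUE)

# One treatments-by-controls matrix of linear-predictor differences per subclass
dlist <- lapply(split(seq_along(obj$y), strata), function(i) {
  lp <- obj$linear.predictors[i]
  trt <- obj$y[i] > 0
  abs(outer(lp[trt], lp[!trt], "-"))
})
```

Each subclass matrix is much smaller than a single matrix over the whole sample would be, which is where the speed-up for large problems comes from.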

If non-null, argument standardization.scale should be a scalar-valued function of a vector argument. It is applied separately to the treatment and control propensity scores to determine their scale; propensity distances are then scaled by a pooling of these two values. Besides the default sd, another good choice is mad, which is robust to outliers but gives results similar to sd in their absence. If NULL, no scaling is applied, and distances are in terms of the linear propensity score, that is, the logits of the conditional probabilities.

Internally, the function makes use of makedist(), so it keeps track of metadata useful for matching, as described on the help page for that function.
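The effect of the choice of scale function can be sketched in base R as follows. The helper pooled.scale() is hypothetical and not part of optmatch, and its degrees-of-freedom-weighted pooling rule is an assumption; the help text above only says that the two group scales are pooled.

```r
# Hypothetical helper, not part of optmatch: pools the per-group scales,
# assuming a df-weighted average of their squares
pooled.scale <- function(lp, trt, scale.fn = sd) {
  if (is.null(scale.fn)) return(1)   # NULL: leave distances on the logit scale
  s.t <- scale.fn(lp[trt])
  s.c <- scale.fn(lp[!trt])
  sqrt(((sum(trt) - 1) * s.t^2 + (sum(!trt) - 1) * s.c^2) / (length(lp) - 2))
}

set.seed(3)
trt <- rep(c(TRUE, FALSE), c(15, 25))
lp <- c(rnorm(15, mean = 0.5), rnorm(25, mean = -0.5))  # simulated linear predictors
lp.out <- replace(lp, 1, 25)   # same scores with one gross outlier

s.sd   <- pooled.scale(lp, trt, sd)
s.mad  <- pooled.scale(lp, trt, mad)
s.none <- pooled.scale(lp, trt, NULL)

# Ratio of the scale with the outlier to the scale without it, per scale function
ratio.sd  <- pooled.scale(lp.out, trt, sd) / s.sd
ratio.mad <- pooled.scale(lp.out, trt, mad) / s.mad
```

In this simulation a single extreme score inflates the sd-based scale far more than the mad-based one, so a caliper stated in sd units would become correspondingly looser.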

References

P. R. Rosenbaum and D. B. Rubin (1985), Constructing a control group using multivariate matched sampling methods that incorporate the propensity score, The American Statistician, 39, 33-38.

Examples


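library(optmatch)
data(nuclearplants)
# Propensity model for pr, using all other variables except cost
psm <- glm(pr~.-(pr+cost), family=binomial(), data=nuclearplants)
psd <- pscore.dist(psm) # Propensity distances, one matrix since no structure.fmla
fullmatch(psd)
fullmatch(caliper(.25, psd)) # Propensity matching with calipers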
