Calculates several imbalance measures for the original and matched data sets
imbalance(group, data, drop=NULL, breaks = NULL, weights, grouping = NULL)
An object of class imbalance, which is a list with the following two elements:
Table of imbalance measures
The global L1 measure of imbalance
group: the group variable
data: the data
drop: a vector of variable names in the data frame to ignore
breaks: a list of vectors of cutpoints used to calculate the L1 measure. See Details.
weights: weights
grouping: named list, each element of which is a list of groupings for a single categorical variable. See Details.
Stefano Iacus, Gary King, and Giuseppe Porro
This function calculates several imbalance measures. For numeric variables, the difference in means (under the column statistic), the difference in quantiles and the L1 measure are calculated. For categorical variables, the L1 measure and the Chi-squared distance (under the column statistic) are calculated. Column type reports either (diff) or (Chi2) to indicate the type of statistic being calculated.
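The univariate statistics described above can be sketched in base R. This is only an illustration on made-up data, not the package's internal code: the variable names, the treated-minus-control sign convention, the choice of quantiles (minimum, quartiles, maximum) and the use of chisq.test without continuity correction are all assumptions.

```r
# Toy data (hypothetical): a 0/1 group indicator, one numeric and one
# categorical covariate.
group <- c(1, 1, 1, 1, 0, 0, 0, 0)
age   <- c(30, 35, 40, 45, 20, 25, 30, 35)
educ  <- factor(c("hs", "hs", "col", "col", "hs", "hs", "hs", "col"))

# Numeric covariate: difference in means between the two groups
# (sign convention assumed: group 1 minus group 0).
diff_means <- mean(age[group == 1]) - mean(age[group == 0])

# Numeric covariate: differences in quantiles (assumed set of quantiles).
probs <- c(0, 0.25, 0.5, 0.75, 1)
diff_quant <- quantile(age[group == 1], probs) - quantile(age[group == 0], probs)

# Categorical covariate: chi-squared statistic of the group-by-level table.
# suppressWarnings() hides the small-sample warning raised by the toy data.
chi2 <- suppressWarnings(
  unname(chisq.test(table(group, educ), correct = FALSE)$statistic)
)

diff_means
diff_quant
chi2
```

In this sketch the (diff) type corresponds to diff_means and the (Chi2) type to chi2.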
If breaks is not specified, the Scott automated bin calculation is used (which coarsens less than Sturges, which is used in cem). Please refer to the cem help page. In this case, the resulting breaks are used to calculate the L1 measure.
This function also calculates the global L1 imbalance measure.
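As a rough illustration of the global L1 measure, the sketch below cross-tabulates already coarsened covariates and takes half the sum of the absolute differences between the relative cell frequencies of the two groups. The function name l1_global and the toy data are hypothetical; weights and automatic coarsening are ignored here, so this is not the package's implementation.

```r
# Hypothetical helper: global L1 for two groups, given covariates that have
# already been coarsened into discrete bins (one column per covariate).
l1_global <- function(group, coarsened) {
  # One cell label per observation, built from all covariates jointly.
  cell <- interaction(coarsened, drop = TRUE)
  tab <- table(cell, group)
  # Relative frequencies within each group (each column sums to 1).
  freq <- prop.table(tab, margin = 2)
  # Half the sum of absolute differences between the two frequency profiles.
  unname(sum(abs(freq[, 1] - freq[, 2])) / 2)
}

# Toy data (hypothetical).
group <- c(1, 1, 1, 1, 0, 0, 0, 0)
coarsened <- data.frame(
  x = c("a", "a", "b", "b", "a", "b", "b", "b"),
  y = c("u", "u", "u", "v", "u", "u", "v", "v")
)
l1 <- l1_global(group, coarsened)
l1
```

The value lies between 0 (identical multivariate histograms) and 1 (no overlap between the two groups).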
If breaks is missing, the default rule to calculate cutpoints is Scott's rule.
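To get a feel for the two binning rules, the following sketch compares the number of bins suggested by Scott's and Sturges' rules using the base R helpers nclass.scott and nclass.Sturges, and then builds cutpoints from the Scott count. Turning the bin count into equally spaced cutpoints over the range of the variable is an assumption made for illustration; the simulated variable x is hypothetical.

```r
set.seed(1)
x <- rnorm(500)  # hypothetical numeric covariate

# Number of bins suggested by each rule (base R, package grDevices).
n_scott   <- nclass.scott(x)
n_sturges <- nclass.Sturges(x)

# Assumed construction: equally spaced cutpoints spanning the range of x.
breaks_scott <- seq(min(x), max(x), length.out = n_scott + 1)

# Coarsen x with these cutpoints; include.lowest keeps the minimum in a bin.
x_coarse <- cut(x, breaks = breaks_scott, include.lowest = TRUE)

c(scott = n_scott, sturges = n_sturges)
table(x_coarse)
```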
The grouping option is a list where each element is itself a list. For example, suppose that the variable quest1 has the possible levels "no answer", NA, "negative", "neutral", "positive" and you want to collect ("no answer", NA, "neutral") into a single group; then the grouping argument should contain list(quest1=list(c("no answer", NA, "neutral"))). Or, if you have a discrete variable elements with values 1:10 and you want to collect it into the groups "1:3,NA", "4", "5:9", "10", you specify the following list in grouping: list(elements=list(c(1:3,NA), 5:9)). Values not defined in the grouping are left as they are. If cutpoints and groupings are defined for the same variable, the groupings take precedence and the corresponding cutpoints are set to NULL.
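The effect of a grouping specification can be mimicked in base R as follows. The helper apply_grouping is hypothetical and only illustrates the recoding idea: values listed together are collapsed into one level, and everything else is left unchanged. The labels given to the merged levels are an assumption and need not match those produced by the package.

```r
# Hypothetical helper: collapse the values listed in each element of
# `groups` into a single level; unlisted values are left as they are.
apply_grouping <- function(x, groups) {
  x <- as.character(x)
  out <- x
  for (g in groups) {
    label <- paste(g, collapse = ",")       # assumed label for the merged level
    out[x %in% as.character(g)] <- label    # %in% also matches NA against NA
  }
  factor(out)
}

# First example from the text: merge "no answer", NA and "neutral".
quest1 <- c("no answer", NA, "negative", "neutral", "positive", "neutral")
grouped_q <- apply_grouping(quest1, list(c("no answer", NA, "neutral")))
levels(grouped_q)

# Second example: groups 1:3 plus NA, and 5:9; 4 and 10 stay on their own.
elements <- c(1:10, NA)
grouped_e <- apply_grouping(elements, list(c(1:3, NA), 5:9))
table(grouped_e)
```

The second call leaves four levels, corresponding to the four groups listed in the text.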
See the L1.meas help page for details.
Iacus, King, Porro (2011) doi:10.1198/jasa.2011.tm09599
Iacus, King, Porro (2012) doi:10.1093/pan/mpr013
Iacus, King, Porro (2019) doi:10.1017/pan.2018.29
library(cem)
data(LL)
todrop <- c("treated","re78")
# imbalance in the original (unmatched) data
imbalance(LL$treated, LL, drop=todrop)
# cem match: automatic bin choice
mat <- cem(treatment="treated", data=LL, drop="re78")