method_energy: Energy Balancing

Description

This page explains the details of estimating weights using energy balancing by setting method = "energy" in the call to weightit() or weightitMSM(). This method can be used with binary and multinomial treatments.

In general, this method estimates weights by minimizing an energy statistic related to covariate balance. For binary and multinomial treatments, this statistic is the energy distance between treatment groups, a multivariate distance between distributions. The optimization is performed by code written for WeightIt that calls osqp() from the osqp package. This method may be slow or memory-intensive for large datasets.
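To make the energy distance concrete, here is a minimal base R sketch of the generic two-sample weighted energy distance (twice the mean between-group distance, minus the mean within-group distance of each group). The function name energy_distance() and the simulated data are hypothetical, and this is not necessarily the exact objective that weightit() minimizes for a given estimand.

```r
# Weighted energy distance between two groups (V-statistic form).
# d: distances between all units; treat: 0/1 vector; w: nonnegative weights.
energy_distance <- function(d, treat, w = rep(1, length(treat))) {
  d <- as.matrix(d)
  in1 <- treat == 1
  in0 <- treat == 0
  # Normalize the weights to sum to 1 within each group
  w1 <- w[in1] / sum(w[in1])
  w0 <- w[in0] / sum(w[in0])
  between <- sum(outer(w1, w0) * d[in1, in0, drop = FALSE])
  within1 <- sum(outer(w1, w1) * d[in1, in1, drop = FALSE])
  within0 <- sum(outer(w0, w0) * d[in0, in0, drop = FALSE])
  2 * between - within1 - within0
}

# Simulated example: two covariates, treated group shifted by 1
set.seed(123)
x <- cbind(rnorm(40), rnorm(40))
treat <- rep(c(0, 1), each = 20)
x[treat == 1, ] <- x[treat == 1, ] + 1
d <- dist(x)  # Euclidean distances, default arguments

energy_distance(d, treat)  # unweighted energy distance between the groups
```

Energy balancing searches for the weights w that make a statistic of this kind as small as possible, which is why the full distance matrix is needed and why memory use grows quickly with sample size.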

Binary Treatments

For binary treatments, this method estimates the weights with osqp(), using the formulas described by Huling and Mak (2020). The following estimands are allowed: ATE, ATT, and ATC.

Multinomial Treatments

For multinomial treatments, this method estimates the weights with osqp(), using the formulas described by Huling and Mak (2020). The following estimands are allowed: ATE and ATT.

Continuous Treatments

Continuous treatments are not currently supported.

Longitudinal Treatments

For longitudinal treatments, the weights are the product of the weights estimated at each time point. This method is not guaranteed to yield optimal balance at each time point. NOTE: the use of energy balancing with longitudinal treatments has not been validated!
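As a small illustration of how the longitudinal weights are combined, the sketch below multiplies per-time-point weights across columns. The weight values are made up for illustration; in practice each column would come from a separate energy balancing fit at that time point.

```r
# Hypothetical weights for four units, estimated separately at three time points
w_by_time <- cbind(t1 = c(1.2, 0.8, 1.0, 1.5),
                   t2 = c(0.9, 1.1, 1.3, 0.7),
                   t3 = c(1.0, 1.4, 0.6, 1.1))

# The final weight for each unit is the product across time points
w_final <- apply(w_by_time, 1, prod)
w_final
```

Because each factor is optimized on its own, nothing in the product forces balance at any single time point, which is the caveat noted above.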

Sampling Weights

Sampling weights are supported through s.weights in all scenarios. In some cases, sampling weights will cause the optimization to fail due to lack of convexity or infeasible constraints.

Missing Data

In the presence of missing data, the following value for the missing argument is allowed:

"ind" (default): First, for each variable with missingness, a new missingness indicator variable is created which takes the value 1 if the original covariate is NA and 0 otherwise. The missingness indicators are added to the model formula as main effects. The missing values in the covariates are then replaced with 0s (this value is arbitrary and does not affect estimation). The weight estimation then proceeds with this new formula and set of covariates. The covariates output in the resulting weightit object will be the original covariates with the NAs.
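The missingness indicator approach can be sketched in base R as follows. The data frame and the "_NA" suffix for the indicator names are hypothetical; the internal naming used by weightit() may differ.

```r
# Hypothetical covariates with missing values in age and educ
covs <- data.frame(age     = c(25, NA, 40, 31),
                   educ    = c(12, 10, NA, 16),
                   married = c(1, 0, 0, 1))

# Variables that contain at least one NA
has_na <- names(covs)[vapply(covs, anyNA, logical(1))]

for (v in has_na) {
  # 1 if the original value is NA, 0 otherwise ("_NA" suffix is an assumption)
  covs[[paste0(v, "_NA")]] <- as.integer(is.na(covs[[v]]))
  # Replace the missing values with 0 (an arbitrary fill value)
  covs[[v]][is.na(covs[[v]])] <- 0
}

covs
```

The filled-in covariates and their indicators would then both enter the balancing step as main effects.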

Additional Arguments

For binary and multinomial treatments, the following additional arguments can be specified:

improved: logical; whether to use the improved energy balancing weights as described by Huling and Mak (2020) when estimand = "ATE". This involves optimizing balance not only between each treatment group and the overall sample, but also between each pair of treatment groups. Huling and Mak (2020) found that the improved energy balancing weights generally outperformed standard energy balancing. Default is TRUE; set to FALSE to use the standard energy balancing weights instead (not recommended).
dist.mat: a numeric distance matrix to be used instead of the default distance matrix computed by weightit(), which uses dist() with default arguments. Note that some distance matrices can cause the R session to abort due to a bug within osqp, so this argument should be used with caution. A distance matrix must be a square, symmetric, numeric matrix with zeros along the diagonal and a row and column for each unit. Can also be supplied as the output of a call to dist().
lambda: a positive numeric scalar used to penalize the square of the weights. This value divided by the total sample size is added to the diagonal of the quadratic part of the loss function. Higher values favor weights with less variability. Note this is distinct from the lambda value described in Huling and Mak (2020), which penalizes the complexity of individual treatment rules rather than the weights.
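The requirements on dist.mat and the effect of lambda can be illustrated with a short base R sketch. The helper is_valid_dist(), the simulated covariates, the choice to standardize them with scale(), and the matrix P standing in for the quadratic part of the loss are all assumptions made for illustration, not WeightIt internals.

```r
set.seed(1)
x <- matrix(rnorm(30), ncol = 3)  # 10 units, 3 hypothetical covariates

# A custom distance matrix: Euclidean distances on standardized covariates
d <- as.matrix(dist(scale(x)))

# Check the documented requirements: square, symmetric, numeric, zero diagonal
is_valid_dist <- function(d) {
  is.matrix(d) && is.numeric(d) && nrow(d) == ncol(d) &&
    isSymmetric(unname(d)) && all(diag(d) == 0)
}
is_valid_dist(d)

# lambda: lambda / n is added to the diagonal of the quadratic part of the loss
n <- nrow(d)
lambda <- 0.5
P <- -d                           # placeholder for the quadratic part
P_pen <- P + diag(lambda / n, n)  # penalized version
diag(P_pen)
```

A matrix that passes such a check could then be supplied as dist.mat = d in the call to weightit(), with one row and column per unit in the data.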

The moments argument works differently with method = "energy" than with other methods. When unspecified or set to zero, energy balancing weights are estimated as described by Huling and Mak (2020). When moments is set to an integer larger than 0, additional balance constraints on the requested moments of the covariates are also included, guaranteeing exact moment balance on these covariates while minimizing the energy distance of the weighted sample. For binary and multinomial treatments, this involves exact balance on the means of the entered covariates.
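One way to check the effect of moments is to compare weighted covariate means across groups. In the sketch below, weighted_group_means() is a hypothetical helper, and the weightit() call runs only if WeightIt and cobalt are installed; as with the other examples on this page, the optimization may not converge on every run.

```r
# Hypothetical helper: weighted mean of x within each treatment group
weighted_group_means <- function(x, treat, w) {
  vapply(split(seq_along(x), treat),
         function(i) weighted.mean(x[i], w[i]), numeric(1))
}

if (requireNamespace("WeightIt", quietly = TRUE) &&
    requireNamespace("cobalt", quietly = TRUE)) {
  data("lalonde", package = "cobalt")
  # moments = 1 requests exact balance on the covariate means
  W <- WeightIt::weightit(treat ~ age + educ + married + nodegree + re74,
                          data = lalonde, method = "energy",
                          estimand = "ATE", moments = 1)
  # Weighted group means of age, for comparison with the full-sample mean
  print(weighted_group_means(lalonde$age, lalonde$treat, W$weights))
  print(mean(lalonde$age))
}
```

bal.tab() from cobalt gives a fuller balance summary across all covariates at once.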

Additional Outputs

obj: When include.obj = TRUE, the output of the call to solve_osqp(), which contains the dual variables and convergence information.

References

Huling, J. D., & Mak, S. (2020). Energy Balancing of Covariate Distributions. ArXiv:2004.13962 [Stat]. https://arxiv.org/abs/2004.13962

Examples

library("WeightIt")
library("cobalt")
data("lalonde", package = "cobalt")

# Examples may not converge on the first run, but may after several runs

# Balancing covariates between treatment groups (binary)
(W1 <- weightit(treat ~ age + educ + married +
                  nodegree + re74, data = lalonde,
                method = "energy", estimand = "ATE"))
summary(W1)
bal.tab(W1)

# Balancing covariates with respect to race (multinomial)
(W2 <- weightit(race ~ age + educ + married +
                  nodegree + re74, data = lalonde,
                method = "energy", estimand = "ATT",
                focal = "black"))
summary(W2)
bal.tab(W2)