tian_transf(formula, data, subset, na.action = na.pass, method = c("undersample", "oversample", "none"), standardize = TRUE, cts = FALSE)
trt() must be used in the model equation to identify the binary treatment variable. For example, if the treatment is represented by a variable named treat, then the right-hand side of the formula must include the term + trt(treat). The default is na.action = na.pass.
The covariates $x$ supplied on the right-hand side of the model formula are transformed as $w = z \cdot T/2$, where $T \in \{-1, 1\}$ is the treatment indicator and $z$ is the matrix of standardized $x$ variables.
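The transformation described above can be sketched in base R. This is only an illustration on hypothetical toy data, not the package's implementation: the names x, treat, T_ind, z and w are made up for the example, and standardizing with scale() (centering and unit-variance scaling) is an assumption, since the exact standardization used by tian_transf is not spelled out here.

```r
# Hypothetical toy data: 6 observations, 3 covariates
set.seed(42)
x <- matrix(rnorm(18), nrow = 6,
            dimnames = list(NULL, c("X1", "X2", "X3")))
treat <- c(1, 0, 1, 0, 1, 0)   # binary treatment coded 0/1

# Recode the treatment so that T takes values in {-1, 1}
T_ind <- ifelse(treat == 1, 1, -1)

# Standardize the covariates (assumption: center and scale each column)
z <- scale(x)

# Modified covariates: row i of z is multiplied by T_i / 2
# (T_ind has one entry per row, so R recycles it down each column)
w <- z * (T_ind / 2)
```

Rows belonging to treated observations keep the sign of the standardized covariates, while rows belonging to control observations have it flipped; in both cases the magnitude is halved.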
If cts = TRUE, factors included in the formula are converted to dummy variables in a special way that is more appropriate when the returned model frame is used to fit a penalized regression. In this case, contrasts used for factors are given by penalized regression contrasts from the penalized package. Unordered factors are turned into as many dummy variables as the factor has levels, except when the number of levels is 2, in which case it returns a single contrast. This ensures a symmetric treatment of all levels and guarantees that the fit does not depend on the ordering of the levels. See help(contr.none) in the penalized package. Ordered factors are turned into dummy variables that code for the difference between successive levels (one dummy less than the number of levels). See help(contr.diff) in the penalized package.
If the data has an equal number of control and treated observations, then method = "none" should be used. Otherwise, one of the other two methods should be used.
If method = "undersample", a random sample without replacement is drawn from the treatment class (i.e., treated or control) with the majority of observations, such that the returned data frame has balanced treated/control proportions.
If method = "oversample", a random sample with replacement is drawn from the treatment class with the minority of observations, such that the returned data frame has balanced treated/control proportions.
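The two balancing strategies can be illustrated with a small base R sketch. The helper balance_sketch() and the toy data frame are hypothetical and are not part of the uplift package; in particular, drawing the whole minority class afresh with replacement is one reading of the description above, and the package may implement it differently.

```r
# Hypothetical helper, not the uplift implementation
balance_sketch <- function(data, treat_col,
                           method = c("undersample", "oversample")) {
  method <- match.arg(method)
  # Row indices of each treatment class
  idx <- split(seq_len(nrow(data)), data[[treat_col]])
  sizes <- lengths(idx)
  if (method == "undersample") {
    # Shrink the majority class: sample without replacement
    target <- min(sizes)
    keep <- lapply(idx, function(i) {
      if (length(i) > target) i[sample.int(length(i), target)] else i
    })
  } else {
    # Grow the minority class: sample with replacement
    target <- max(sizes)
    keep <- lapply(idx, function(i) {
      if (length(i) < target) {
        i[sample.int(length(i), target, replace = TRUE)]
      } else {
        i
      }
    })
  }
  data[unlist(keep, use.names = FALSE), , drop = FALSE]
}

# Toy data with 7 treated and 3 control observations
set.seed(1)
toy <- data.frame(y = rnorm(10), treat = c(rep(1, 7), rep(0, 3)))
under <- balance_sketch(toy, "treat", "undersample")
over  <- balance_sketch(toy, "treat", "oversample")
table(under$treat)
table(over$treat)
```

With this toy data, undersampling leaves 3 rows in each class and oversampling leaves 7 rows in each class.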
Guelman, L., Guillen, M., and Perez-Marin, A.M. (2013). Optimal personalized treatment rules for marketing interventions: A review of methods, a new proposal, and an insurance case study. Submitted.
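library(uplift)
set.seed(1)
# Simulate a data set with 1000 observations and 20 predictors
dd <- sim_pte(n = 1000, p = 20, rho = 0, sigma = sqrt(2), beta.den = 4)
# Recode the treatment indicator to 0/1
dd$treat <- ifelse(dd$treat == 1, 1, 0)
# Transform the covariates; no resampling is applied
dd2 <- tian_transf(y ~ X1 + X2 + X3 + trt(treat), data = dd, method = "none")
head(dd2)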