Compute nonparametric estimates of the chosen variable importance parameter, correcting for the use of data-adaptive techniques to estimate the conditional means only when such a correction is necessary.
cv_vimp_point_est(
full,
reduced,
y,
folds,
weights = rep(1, length(y)),
type = "r_squared",
na.rm = FALSE
)
fitted values from a regression of the outcome on the full set of covariates; a list of length V, where each object is a set of predictions on the validation data.
fitted values from a regression of the fitted values from the full regression on the reduced set of covariates; a list of length V, where each object is a set of predictions on the validation data.
the outcome.
a list of outer and inner folds (outer for hypothesis testing, inner for cross-validation)
weights for the computed influence curve (e.g., inverse probability weights for coarsened-at-random settings)
which parameter are you estimating (defaults to r_squared, for R-squared-based variable importance)?
logical; should NAs be removed in computation? (defaults to FALSE)
The estimated variable importance for the given group of left-out covariates.
See the paper by Williamson, Gilbert, Simon, and Carone for more details on the mathematics behind this function and the definition of the parameter of interest.
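To make the shape of the full and reduced arguments concrete, here is a minimal base R sketch of a fold-averaged difference in R-squared. It is an illustration under stated assumptions, not this function's implementation: the helper naive_cv_r2_importance and the integer vector fold_id are hypothetical names, fold_id stands in for the nested folds list described above, and the sketch applies neither the weights nor any correction for data-adaptive estimation.

```r
# Hypothetical helper, NOT part of the package: a plain plug-in estimate of
# R-squared-based importance, averaged over V validation folds.
naive_cv_r2_importance <- function(full, reduced, y, fold_id) {
  V <- length(full)
  per_fold <- vapply(seq_len(V), function(v) {
    y_v <- y[fold_id == v]                      # outcomes in validation fold v
    denom <- mean((y_v - mean(y_v))^2)          # empirical variance of y in fold v
    r2_full <- 1 - mean((y_v - full[[v]])^2) / denom
    r2_reduced <- 1 - mean((y_v - reduced[[v]])^2) / denom
    r2_full - r2_reduced                        # importance of the left-out covariates
  }, numeric(1))
  mean(per_fold)
}

# Simulated data (assumption): x2 is the covariate whose importance we assess.
set.seed(1)
n <- 200
V <- 4
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- x1 + 2 * x2 + rnorm(n)
fold_id <- sample(rep(seq_len(V), length.out = n))
dat <- data.frame(y = y, x1 = x1, x2 = x2)

# Build the two lists of length V: predictions on each validation fold.
full <- vector("list", V)
reduced <- vector("list", V)
for (v in seq_len(V)) {
  train <- dat[fold_id != v, ]
  valid <- dat[fold_id == v, ]
  fit_full <- lm(y ~ x1 + x2, data = train)
  # Reduced fit: regress the full-regression fitted values on the reduced covariates.
  train$full_fit <- fitted(fit_full)
  fit_reduced <- lm(full_fit ~ x1, data = train)
  full[[v]] <- predict(fit_full, newdata = valid)
  reduced[[v]] <- predict(fit_reduced, newdata = valid)
}

est <- naive_cv_r2_importance(full, reduced, y, fold_id)
```

Linear models are used here only to keep the sketch short; the full and reduced lists could come from any regression technique, as long as each element holds predictions on the corresponding validation fold.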