
knockoff (version 0.3.6)

stat.stability_selection: Importance statistics based on stability selection

Description

Computes the difference statistic $$W_j = |Z_j| - |\tilde{Z}_j|$$ where \(Z_j\) and \(\tilde{Z}_j\) measure the importance of the j-th variable and its knockoff, respectively, based on the stability of their selection upon subsampling of the data.

Usage

stat.stability_selection(X, X_k, y, fitfun = stabs::lars.lasso, ...)

Value

A vector of statistics \(W\) of length p.
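As a small illustration of how the returned vector relates to the importance measures, the sketch below applies the difference formula from the Description to made-up numbers. The vector Z and its values are hypothetical; it is assumed here to hold the p importances of the original variables followed by the p importances of their knockoffs.

```r
# Hypothetical importance values for p = 3 variables:
# first the originals (Z_1..Z_3), then their knockoffs (Z~_1..Z~_3).
p = 3
Z = c(0.9, 0.2, 0.6,   # originals
      0.1, 0.4, 0.6)   # knockoffs

# W_j = |Z_j| - |Z~_j|, one value per original variable
orig = 1:p
W = abs(Z[orig]) - abs(Z[orig + p])
print(W)
```

A large positive W_j suggests the original variable is selected more stably than its knockoff; values near zero or negative give no such evidence.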

Arguments

X

n-by-p matrix of original variables.

X_k

n-by-p matrix of knockoff variables.

y

response vector (length n)

fitfun

a function that takes the arguments x and y as above, and additionally the number of variables q to include in each model. The function must fit the model and return a logical vector that indicates which variables were selected (among the q selected variables). The name of the function should be prefixed by 'stabs::'.

...

additional arguments specific to 'stabs' (see Details).

Details

This function uses the stabs package to compute variable selection stability. The selection stability of the j-th variable is defined as its probability of being selected upon random subsampling of the data. The default method for selecting variables in each subsampled dataset is lars.lasso.
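The notion of selection stability described above can be sketched in base R without the stabs package. This is only an illustration of the idea, not the package's implementation: the selector select_top_q (keep the q columns most correlated with y), the number of subsamples B, and the half-sample size are all assumptions made for the example.

```r
set.seed(1)
n = 100; p = 10
X = matrix(rnorm(n * p), n)
y = 3 * X[, 1] + 3 * X[, 2] + rnorm(n)   # only variables 1 and 2 matter

# Toy selector (an assumption, not lars.lasso): flag the q columns
# with the largest absolute correlation with the response.
select_top_q = function(x, y, q) {
  score = abs(cor(x, y))[, 1]
  seq_len(ncol(x)) %in% order(score, decreasing = TRUE)[seq_len(q)]
}

B = 50   # number of random subsamples
hits = replicate(B, {
  idx = sample(n, floor(n / 2))          # half-sample without replacement
  select_top_q(X[idx, , drop = FALSE], y[idx], q = 2)
})

# Fraction of subsamples in which each variable was selected
stability = rowMeans(hits)
print(round(stability, 2))
```

Each entry of stability estimates the probability that the corresponding variable is selected on a random subsample, which is the quantity this statistic compares between a variable and its knockoff.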

For a complete list of the available additional arguments, see stabsel.

See Also

Other statistics: stat.forward_selection(), stat.glmnet_coefdiff(), stat.glmnet_lambdadiff(), stat.lasso_coefdiff_bin(), stat.lasso_coefdiff(), stat.lasso_lambdadiff_bin(), stat.lasso_lambdadiff(), stat.random_forest(), stat.sqrt_lasso()

Examples

library(knockoff)
set.seed(2022)
p=50; n=50; k=15
mu = rep(0,p); Sigma = diag(p)
X = matrix(rnorm(n*p),n)
nonzero = sample(p, k)
beta = 3.5 * (1:p %in% nonzero)
y = X %*% beta + rnorm(n)
knockoffs = function(X) create.gaussian(X, mu, Sigma)

# Basic usage with default arguments
result = knockoff.filter(X, y, knockoffs=knockoffs,
                         statistic=stat.stability_selection)
print(result$selected)
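A possible variation is to pass a different fit function from the stabs package through a small wrapper. The sketch below assumes that the knockoff, stabs and lars packages are installed and that stabs::lars.stepwise is available as an alternative to the default; the wrapper name stepwise_stat is hypothetical.

```r
library(knockoff)

set.seed(2022)
p = 50; n = 50; k = 15
mu = rep(0, p); Sigma = diag(p)
X = matrix(rnorm(n * p), n)
nonzero = sample(p, k)
beta = 3.5 * (1:p %in% nonzero)
y = as.vector(X %*% beta + rnorm(n))
knockoffs = function(X) create.gaussian(X, mu, Sigma)

# Wrapper (hypothetical name) that swaps in another 'stabs' fit function
stepwise_stat = function(X, X_k, y)
  stat.stability_selection(X, X_k, y, fitfun = stabs::lars.stepwise)

result = knockoff.filter(X, y, knockoffs = knockoffs,
                         statistic = stepwise_stat)
print(result$selected)
```

The wrapper keeps the three-argument form (X, X_k, y) used when a statistic is handed to knockoff.filter, while fixing the extra argument fitfun.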

