MFSIS (version 0.3.0)

CAS: Category-Adaptive Variable Screening for Ultra-High Dimensional Heterogeneous Categorical Data

Description

A category-adaptive screening procedure for ultra-high dimensional heterogeneous categorical data that detects category-specific important covariates. The procedure is model-free: it does not require specifying a regression model. It is adaptive in the sense that the set of active variables is allowed to vary across categories, which makes it flexible enough to accommodate heterogeneity.
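To make the idea of category-specific active sets concrete, here is a minimal base R sketch. It is an illustration only: the helper `screen_by_category` and its utility (the absolute standardized mean of each covariate within a category) are assumptions made for this example and are not the statistic that CAS computes.

```r
# Illustrative only: a simple category-specific marginal screen in base R.
# The utility below is a stand-in chosen for clarity, NOT the CAS statistic.
set.seed(1)
n <- 150
p <- 20
X <- matrix(rnorm(n * p), n, p)
Y <- rep(1:3, each = n / 3)
X[Y == 1, 1] <- X[Y == 1, 1] + 3  # covariate 1 is shifted in category 1
X[Y == 2, 2] <- X[Y == 2, 2] + 3  # covariate 2 is shifted in category 2

# Hypothetical helper: rank covariates separately within each category
screen_by_category <- function(X, Y, k) {
  Xs <- scale(X)  # center and scale each column over all observations
  cats <- sort(unique(Y))
  out <- lapply(cats, function(r) {
    util <- abs(colMeans(Xs[Y == r, , drop = FALSE]))
    order(util, decreasing = TRUE)[seq_len(k)]
  })
  names(out) <- cats
  out
}

top <- screen_by_category(X, Y, k = 1)
top
```

In this toy setup the top-ranked covariate differs between categories 1 and 2, which is the kind of heterogeneity a category-adaptive procedure is meant to capture.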

Usage

CAS(X, Y, nsis)

Value

The labels of the nsis predictors with the largest screening statistics among all predictors, i.e. the selected active set.

Arguments

X

The design matrix of dimensions n * p. Each row is an observation vector.

Y

The response vector of dimension n * 1.

nsis

Number of predictors recruited by CAS. The default is n/log(n).
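Note that n/log(n) is generally not an integer. The short sketch below computes it for n = 100 and truncates it explicitly. Whether CAS rounds a non-integer nsis internally is not stated here, so the use of floor() and the variable names are assumptions made for illustration.

```r
n <- 100
nsis_default <- n / log(n)       # roughly 21.7, not a whole number
nsis_int <- floor(nsis_default)  # explicit integer screening size (assumed choice)
nsis_int
```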

Author

Xuewei Cheng <xwcheng@hunnu.edu.cn>

References

Pan, R., Wang, H., and Li, R. (2016). Ultrahigh-dimensional multiclass linear discriminant analysis by pairwise sure independence screening. Journal of the American Statistical Association, 111(513):169–179.

Examples

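library(MFSIS)

# Simulate data with GendataLGM from MFSIS
n <- 100
p <- 200
rho <- 0.5
data <- GendataLGM(n, p, rho)
# Bind the first and second list elements, then name the last column "Y"
data <- cbind(data[[1]], data[[2]])
colnames(data) <- c(paste0("X", seq_len(ncol(data) - 1)), "Y")
data <- as.matrix(data)
# Split into the design matrix X and the response Y
X <- data[, 1:(ncol(data) - 1)]
Y <- data[, ncol(data)]
# Screen with the default number of predictors, n / log(n)
A <- CAS(X, Y, n / log(n))
A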