MFSIS (version 0.3.0)

MDCSIS: Martingale Difference Correlation and Its Use in High-Dimensional Variable Screening

Description

A new metric, the so-called martingale difference correlation, measures the departure from conditional mean independence between a scalar response variable V and a vector predictor variable U. This metric is a natural extension of the distance correlation proposed by Szekely, Rizzo, and Bakirov (2007), which measures the dependence between V and U. The martingale difference correlation and its empirical counterpart inherit a number of desirable features of distance correlation and sample distance correlation, such as algebraic simplicity and elegant theoretical properties.
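To make the empirical counterpart concrete, here is a minimal sketch of a sample martingale difference correlation for a single scalar predictor. It follows the double-centering form of the sample statistic as described in Shao and Zhang (2014); the function name `mdc_sketch` is hypothetical, and this is not the code used inside MDCSIS.

```r
# Sketch only: sample martingale difference correlation of y given one predictor x.
# `mdc_sketch` is a hypothetical name, not a function of the MFSIS package.
mdc_sketch <- function(x, y) {
  n <- length(y)
  a <- as.matrix(dist(x))                  # pairwise distances |x_k - x_l|
  # Double-center the distance matrix: subtract row and column means, add grand mean
  A <- a - outer(rowMeans(a), rep(1, n)) - outer(rep(1, n), colMeans(a)) + mean(a)
  yc <- y - mean(y)                        # centered response
  B <- -outer(yc, yc)                      # double-centered |y_k - y_l|^2 / 2
  mdd2 <- mean(A * B)                      # squared sample martingale difference divergence
  dvar2 <- mean(A * A)                     # squared sample distance variance of x
  denom <- sqrt(mean(yc^2)^2 * dvar2)      # normalizing constant
  if (denom == 0) return(0)                # constant x or constant y
  sqrt(max(mdd2, 0) / denom)               # guard against tiny negative rounding error
}

# Usage: the conditional mean of y depends on x through x^2
set.seed(1)
x <- rnorm(200)
y <- x^2 + rnorm(200, sd = 0.1)
mdc_sketch(x, y)
```

The statistic lies between 0 and 1, and it is 0 in this sketch when either input is constant. Unlike distance correlation, it is not symmetric in its arguments: it targets dependence of the conditional mean of the response on the predictor.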

Usage

MDCSIS(X, Y, nsis = (dim(X)[1])/log(dim(X)[1]))

Value

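The labels of the nsis top-ranked predictors, that is, those with the largest martingale difference correlation with the response. These form the selected active set.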

Arguments

X

The design matrix of dimensions n * p. Each row is an observation vector.

Y

The response vector of dimension n * 1.

nsis

Number of predictors recruited by MDCSIS. The default is n/log(n).
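The default n/log(n) is generally not a whole number (for n = 100 it is roughly 21.7). The sketch below illustrates one way a top-ranked subset could be selected from a vector of screening utilities. The `scores` vector is random and purely hypothetical, and rounding down with `floor()` is an assumption; this page does not state how MDCSIS handles a fractional nsis.

```r
# Hypothetical screening utilities for p = 200 predictors (random, for illustration only)
set.seed(1)
n <- 100
p <- 200
scores <- runif(p)

nsis <- n / log(n)    # default from the Usage section; not an integer for n = 100
k <- floor(nsis)      # assumption: round down to a whole number of predictors

# Indices of the k predictors with the largest utility, best first
selected <- order(scores, decreasing = TRUE)[seq_len(k)]
selected
```

Passing an explicit integer for nsis avoids any ambiguity about rounding.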

Author

Xuewei Cheng xwcheng@hunnu.edu.cn

References

Szekely, G. J., M. L. Rizzo, and N. K. Bakirov (2007). Measuring and testing dependence by correlation of distances. The Annals of Statistics 35(6), 2769–2794.

Shao, X. and J. Zhang (2014). Martingale difference correlation and its use in high-dimensional variable screening. Journal of the American Statistical Association 109(507), 1302–1318.

Examples

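library(MFSIS)

# Simulate data from a linear model with n observations and p predictors
n <- 100
p <- 200
rho <- 0.5
data <- GendataLM(n, p, rho, error = "gaussian")
# Combine predictors and response into one matrix with named columns
data <- cbind(data[[1]], data[[2]])
colnames(data) <- c(paste0("X", 1:(ncol(data) - 1)), "Y")
data <- as.matrix(data)
# Split back into the design matrix X and the response Y
X <- data[, 1:(ncol(data) - 1)]
Y <- data[, ncol(data)]
# Screen the predictors, recruiting n / log(n) of them
A <- MDCSIS(X, Y, n / log(n))
A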
