
kernDeepStackNet (version 2.0.2)

rdcSubset: Randomized dependence coefficients score on given subset

Description

Variable pre-selection scoring for kernel deep stacking networks (KDSN). Estimates the randomized dependence coefficient (RDC) score for a given subset of variables.

Usage

rdcSubset(binCode, x, y, k=20, s=1/6, f=sin, seedX=NULL, seedY=NULL, 
  rdcRep=1, trans0to1=TRUE)

Arguments

binCode

Specifies which subset of the covariates is used to explain the responses (binary vector). A one includes the corresponding variable and a zero excludes it.

x

Covariates data (numeric matrix).

y

Responses (numeric matrix).

k

Number of random features (integer scalar).

s

Variance of the random weights. Default is 1/6.

f

Non-linear transformation function. Default is sin.

seedX

Random number seed for the normally distributed weights of the covariates (integer scalar). By default (NULL) the weights are drawn without a fixed seed.

seedY

Random number seed for the normally distributed weights of the responses (integer scalar). By default (NULL) the weights are drawn without a fixed seed.

rdcRep

Number of RDC repetitions. All repetitions are averaged per variable to give more robust estimates. Default is one repetition.

trans0to1

Should the design matrix and response be transformed to the interval [0, 1]? (logical scalar). If the data are already available in this form, the score can be evaluated much faster.
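As a small illustration of the binCode argument, an inclusion vector can be built from the indices of the columns to keep. This is only a sketch in base R: the names p, keep and xSub are hypothetical, and selecting columns with binCode == 1 is an assumption about how the vector is interpreted, not a description of the package internals.

```r
# Hypothetical setup: 10 covariates, of which columns 1 to 3 are kept
p <- 10
keep <- c(1, 2, 3)

# Binary inclusion vector: 1 = include, 0 = exclude
binCode <- as.integer(seq_len(p) %in% keep)

# Toy covariate matrix, used only to show the resulting column subset
x <- matrix(seq_len(5 * p), nrow = 5, ncol = p)
xSub <- x[, binCode == 1, drop = FALSE]
dim(xSub)
```

Using drop = FALSE keeps xSub a matrix even when only one variable is selected.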

Value

RDC score (numeric scalar).

Details

Covariates are ranked according to their dependence with the response variable.
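To make the score more concrete, the following is a minimal sketch of a randomized dependence coefficient in base R, loosely following the short R listing in the cited paper by Lopez-Paz et al. (2013). The function name rdcSketch and the demo objects xDemo, yDemo and scoreDemo are hypothetical. The scaling of the random weights by s and the rank-based transformation are assumptions taken from that listing, so the package's own implementation (seeding, repetitions, trans0to1, reduced canonical correlation) may well differ.

```r
# Sketch of a randomized dependence coefficient (not the package code)
rdcSketch <- function(x, y, k = 20, s = 1/6, f = sin) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  n <- nrow(x)
  # Rank transform of each column to (0, 1], plus an intercept column
  u <- cbind(apply(x, 2, function(col) rank(col) / n), 1)
  v <- cbind(apply(y, 2, function(col) rank(col) / n), 1)
  # k random linear projections with normally distributed weights
  u <- s / ncol(u) * u %*% matrix(rnorm(ncol(u) * k), nrow = ncol(u))
  v <- s / ncol(v) * v %*% matrix(rnorm(ncol(v) * k), nrow = ncol(v))
  # Largest canonical correlation between the non-linear features
  cancor(f(u), f(v))$cor[1]
}

# Hypothetical demo data: the response depends on the first covariate only
set.seed(1)
xDemo <- matrix(rnorm(200 * 3), ncol = 3)
yDemo <- xDemo[, 1]^3 + rnorm(200)
scoreDemo <- rdcSketch(xDemo, yDemo)
scoreDemo
```

Because the weights are random, the value changes from run to run unless a seed is set, which is presumably why the function offers seedX, seedY and averaging over rdcRep repetitions.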

References

David Lopez-Paz, Philipp Hennig and Bernhard Schoelkopf (2013), The Randomized Dependence Coefficient, Proceedings of Neural Information Processing Systems (NIPS) 26, Stateline, Nevada, USA, C.J.C. Burges, L. Bottou, M. Welling, Z. Ghahramani and K.Q. Weinberger (eds.)

See Also

rdcPart, cancorRed, rdcVarOrder, rdcVarSelSubset

Examples

#############################
# Cubic noisy association

# Generate 10 covariates
library(mvtnorm)
set.seed(3489)
X <- rmvnorm(n=200, mean=rep(0, 10))

# Generate responses based on some covariates
set.seed(-239247)
y <- 0.5*X[, 1]^3 - 2*X[, 2]^2 + X[, 3] - 1 + rnorm(200)

# Score of true subset
scoreTrue <- rdcSubset(binCode=c(rep(1, 3), rep(0, 7)), 
  x=X, y=y, seedX=1:10, seedY=-(1:10), rdcRep=10)
scoreTrue

# Only unnecessary variables
scoreFalse <- rdcSubset(binCode=c(rep(0, 3), rep(1, 7)), 
  x=X, y=y, seedX=1:10, seedY=-(1:10), rdcRep=10)
scoreFalse

# Two important variables and some non-causal variables
scoreMix <- rdcSubset(binCode=c(1, 0, 1, rep(0, 3), rep(1, 4)), 
  x=X, y=y, seedX=1:10, seedY=-(1:10), rdcRep=10)
scoreMix
