gsim: GSIM for binary data

Description

The function gsim performs prediction using Lambert-Lacroix and Peyre's GSIM algorithm.

Usage

gsim(Xtrain, Ytrain, Xtest=NULL, Lambda, hA, hB=NULL, NbIterMax=50)

Value

A list with the following components:

Ytest: the ntest vector containing the predicted labels for the observations from Xtest.
beta: the p vector giving the projection direction estimated.
hB: the value of hB used in step B of GSIM (the value given by the user, or the value estimated by the plug-in method if the argument was equal to NULL).
DeletedCol: the vector containing the column numbers of Xtrain for which the corresponding predictor variable has zero variance. If there is no such column, DeletedCol=NULL.
Cvg: the 0-1 value indicating convergence of the algorithm (1 for convergence, 0 otherwise).
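
The components listed above can be inspected together after a call. The following is a minimal sketch in base R, assuming only the documented component names (Ytest, hB, DeletedCol, Cvg); the helper summarise_gsim and the mock result are hypothetical and are not part of plsgenomics.

```r
# Hypothetical helper (not part of plsgenomics): summarise a gsim()-style result list.
summarise_gsim <- function(res, Ytrue = NULL) {
  out <- list(
    converged = isTRUE(res$Cvg == 1),     # Cvg is documented as 1 for convergence
    hB        = res$hB,                   # bandwidth actually used in step B
    n_deleted = length(res$DeletedCol)    # length(NULL) is 0
  )
  # Compare predictions with known labels only when both are available.
  if (!is.null(Ytrue) && !is.null(res$Ytest)) {
    out$n_errors  <- sum(res$Ytest != Ytrue)
    out$confusion <- table(predicted = res$Ytest, observed = Ytrue)
  }
  out
}

# Mock result with the documented components (made-up values, for illustration only).
mock_res <- list(Ytest = c(1, 2, 2, 1), beta = c(0.5, -0.2, 0.1),
                 hB = 0.3, DeletedCol = NULL, Cvg = 1)
s <- summarise_gsim(mock_res, Ytrue = c(1, 2, 1, 1))
s$converged
s$n_errors
```

Checking Cvg before reading Ytest or beta is a reasonable habit, since estimates from a run that stopped at NbIterMax without converging may be unreliable.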

Arguments

Xtrain: a (ntrain x p) data matrix of predictors. Xtrain must be a matrix. Each row corresponds to an observation and each column to a predictor variable.
Ytrain: a ntrain vector of responses. Ytrain must be a vector. Ytrain is a {1,2}-valued vector and contains the response variable for each observation.
Xtest: a (ntest x p) matrix containing the predictors for the test data set. Xtest may also be a vector of length p (corresponding to only one test observation). If Xtest is not equal to NULL, then the prediction step is made for these new predictor variables.
Lambda: a positive real value. Lambda is the ridge regularization parameter.
hA: a strictly positive real value. hA is the bandwidth for GSIM step A.
hB: a strictly positive real value. hB is the bandwidth for GSIM step B. If hB is equal to NULL, then the value of hB is chosen using a plug-in method.
NbIterMax: a positive integer. NbIterMax is the maximal number of iterations in the Newton-Raphson parts.
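
The constraints on the arguments can be checked before calling gsim. The sketch below is a hypothetical pre-flight check written in base R; check_gsim_inputs is not part of plsgenomics, and it reads "positive" for Lambda as allowing zero and "strictly positive" for hA and hB as excluding it, which is an assumption about the wording above.

```r
# Hypothetical pre-flight check (not part of plsgenomics), mirroring the documented constraints.
check_gsim_inputs <- function(Xtrain, Ytrain, Xtest = NULL, Lambda, hA,
                              hB = NULL, NbIterMax = 50) {
  stopifnot(is.matrix(Xtrain))
  stopifnot(is.vector(Ytrain), length(Ytrain) == nrow(Xtrain))
  stopifnot(all(Ytrain %in% c(1, 2)))            # labels must be coded 1 and 2
  if (!is.null(Xtest)) {
    # Xtest is either a (ntest x p) matrix or a single observation of length p
    p_test <- if (is.matrix(Xtest)) ncol(Xtest) else length(Xtest)
    stopifnot(p_test == ncol(Xtrain))
  }
  stopifnot(is.numeric(Lambda), length(Lambda) == 1, Lambda >= 0)  # assumption: zero allowed
  stopifnot(is.numeric(hA), length(hA) == 1, hA > 0)
  if (!is.null(hB)) stopifnot(is.numeric(hB), length(hB) == 1, hB > 0)
  stopifnot(NbIterMax >= 1, NbIterMax == round(NbIterMax))
  invisible(TRUE)
}

set.seed(1)
X <- matrix(rnorm(20), nrow = 5)
Y <- c(1, 2, 2, 1, 2)
ok <- check_gsim_inputs(X, Y, Xtest = X[1, ], Lambda = 10, hA = 50)

# Labels coded 0/1 instead of 1/2 are rejected.
bad <- tryCatch(check_gsim_inputs(X, c(0, 1, 1, 0, 1), Lambda = 10, hA = 50),
                error = function(e) "rejected")
```

A common source of trouble is a response coded 0/1; recoding it as Ytrain + 1 gives the {1,2} coding described above.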

Author

Sophie Lambert-Lacroix (http://membres-timc.imag.fr/Sophie.Lambert/) and Julie Peyre (https://membres-ljk.imag.fr/Julie.Peyre/).

Details

The columns of the data matrices Xtrain and Xtest need not be standardized, since standardization is performed by the function gsim as a preliminary step before the algorithm is run.
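
To illustrate what column standardization and zero-variance columns look like in practice, here is a minimal base R sketch. It shows one common convention (center and scale with the training means and standard deviations, and set aside constant columns); this is an assumption for illustration and not necessarily what gsim does internally. The function standardize_with_train is hypothetical.

```r
# Illustrative only: one common way to standardize, not necessarily gsim()'s internal code.
standardize_with_train <- function(Xtrain, Xtest = NULL) {
  sds     <- apply(Xtrain, 2, sd)
  deleted <- which(sds == 0)                        # constant (zero-variance) columns
  keep    <- setdiff(seq_len(ncol(Xtrain)), deleted)
  mu      <- colMeans(Xtrain[, keep, drop = FALSE])
  sXtrain <- scale(Xtrain[, keep, drop = FALSE], center = mu, scale = sds[keep])
  sXtest  <- NULL
  if (!is.null(Xtest)) {
    # A single test observation given as a vector becomes a one-row matrix.
    if (!is.matrix(Xtest)) Xtest <- matrix(Xtest, nrow = 1)
    # Test data are transformed with the training means and standard deviations.
    sXtest <- scale(Xtest[, keep, drop = FALSE], center = mu, scale = sds[keep])
  }
  list(sXtrain = sXtrain, sXtest = sXtest,
       DeletedCol = if (length(deleted) > 0) deleted else NULL)
}

# Toy data: the second column is constant.
Xtr <- cbind(c(1, 2, 3, 4), c(5, 5, 5, 5), c(2, 4, 6, 8))
out <- standardize_with_train(Xtr, Xtest = c(2.5, 5, 5))
out$DeletedCol
round(colMeans(out$sXtrain), 10)
```

In this toy example the constant second column is reported in DeletedCol, and the remaining training columns end up with mean 0 and standard deviation 1.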

The procedure described in Lambert-Lacroix and Peyre (2006) is used to estimate the projection direction beta. When Xtest is not equal to NULL, the procedure also predicts the labels for the test observations.

References

S. Lambert-Lacroix, J. Peyre (2006). Local likelihood regression in generalized linear single-index models with applications to microarray data. Computational Statistics and Data Analysis, vol 51, n 3, 2091-2113.

Examples


# load plsgenomics library
library(plsgenomics)

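# load Colon data
data(Colon)

# draw a random learning set: 12 observations with label 2 and 8 with label 1
IndexLearn <- c(sample(which(Colon$Y == 2), 12), sample(which(Colon$Y == 1), 8))

Xtrain <- Colon$X[IndexLearn, ]
Ytrain <- Colon$Y[IndexLearn]
Xtest <- Colon$X[-IndexLearn, ]

# preprocess data
resP <- preprocess(Xtrain = Xtrain, Xtest = Xtest, Threshold = c(100, 16000),
                   Filtering = c(5, 500), log10.scale = TRUE, row.stand = TRUE)

# perform prediction by GSIM (hB = NULL: bandwidth chosen by the plug-in method)
res <- gsim(Xtrain = resP$pXtrain, Ytrain = Ytrain, Xtest = resP$pXtest,
            Lambda = 10, hA = 50, hB = NULL)

# convergence indicator (1 for convergence, 0 otherwise)
res$Cvg
# number of misclassified test observations
sum(res$Ytest != Colon$Y[-IndexLearn])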
