RPEnsemble (version 0.5)

RPChooseSS: A sample splitting version of RPChoose

Description

Chooses the best projection based on an estimate of the test error of the classifier trained on (XTrain, YTrain). The estimate is the number of errors made on the validation set (XVal, YVal).

Usage

RPChooseSS(XTrain, YTrain, XVal, YVal, XTest, d, B2 = 100, base = "LDA",  
k = c(3, 5), projmethod = "Haar", ...)

Arguments

XTrain

An n by p matrix containing the training data feature vectors

YTrain

A vector of length n of the classes (either 1 or 2) of the training data

XVal

An n.val by p matrix containing the validation data feature vectors

YVal

A vector of length n.val of the classes (either 1 or 2) of the validation data

XTest

An n.test by p matrix of the test data feature vectors

d

The lower dimension of the image space of the projections

B2

The block size, i.e. the number of random projections from which the best one is chosen

base

The base classifier: one of "knn", "LDA", "QDA" or "other"

k

The options for k if base = "knn"

projmethod

Either "Haar", "Gaussian" or "axis"

...

Optional further arguments if base = "other"

Value

Returns a vector of length n.val + n.test: the first n.val entries are the estimated classes of the validation set, the last n.test are the estimated classes of the test set.

Details

Maps the data using B2 random projections. For each projection, the validation set is classified using the training set, and the projection yielding the smallest number of errors over the validation set is retained. The validation set and test set are then classified using the chosen projection.
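The procedure described above can be sketched in a few lines of R. This is only an illustrative sketch and not the RPEnsemble source: the function choose_projection_sketch, the helper make_data and the simulated data are hypothetical names introduced here, the base classifier is fixed to lda from the MASS package, and each projection is assumed to be drawn by orthonormalising a Gaussian matrix with a QR decomposition.

```r
library(MASS)  # provides lda(); MASS is a recommended package shipped with R

# Hypothetical helper (not part of RPEnsemble): pick the best of B2 random
# projections by validation error, then classify validation and test data.
choose_projection_sketch <- function(XTrain, YTrain, XVal, YVal, XTest, d, B2 = 10) {
  p <- ncol(XTrain)
  # Predicted classes as integers rather than factor levels
  classify <- function(fit, X) as.integer(as.character(predict(fit, X)$class))
  best_err <- Inf
  best_A <- NULL
  for (b in seq_len(B2)) {
    # Assumption: a p x d matrix with orthonormal columns, obtained from the
    # QR decomposition of a Gaussian matrix, stands in for a "Haar" projection
    A <- qr.Q(qr(matrix(rnorm(p * d), p, d)))
    fit <- lda(XTrain %*% A, grouping = YTrain)
    err <- sum(classify(fit, XVal %*% A) != YVal)  # errors on the validation set
    if (err < best_err) {
      best_err <- err
      best_A <- A
    }
  }
  # Refit with the retained projection and classify validation, then test data
  fit <- lda(XTrain %*% best_A, grouping = YTrain)
  c(classify(fit, XVal %*% best_A), classify(fit, XTest %*% best_A))
}

# Synthetic two-class data, made up purely for illustration
set.seed(1)
p <- 20
make_data <- function(n) {
  y <- rep(1:2, length.out = n)
  # Class 2 is shifted by 1 in every coordinate
  x <- matrix(rnorm(n * p), n, p) + outer(y - 1, rep(1, p))
  list(x = x, y = y)
}
Train <- make_data(60)
Val <- make_data(60)
Test <- make_data(60)

out <- choose_projection_sketch(Train$x, Train$y, Val$x, Val$y, Test$x, d = 2, B2 = 10)
sum(out[1:60] != Val$y)     # validation errors
sum(out[61:120] != Test$y)  # test errors
```

Because the retained projection is the one with the fewest validation errors, the validation error count is an optimistic estimate; the test error gives a fairer picture of the chosen projection.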

References

Cannings, T. I. and Samworth, R. J. (2017) Random-projection ensemble classification, J. Roy. Statist. Soc., Ser. B. (with discussion), 79, 959-1035.

See Also

RPParallel, RPChoose, lda, qda, knn

Examples

# NOT RUN {
library(RPEnsemble)
set.seed(100)
# Simulate training (n = 50), validation (n = 50) and test (n = 100) sets
Train <- RPModel(1, 50, 100, 0.5)
Validate <- RPModel(1, 50, 100, 0.5)
Test <- RPModel(1, 100, 100, 0.5)
# Choose the best of B2 = 5 projections into d = 2 dimensions
Choose.out5 <- RPChooseSS(XTrain = Train$x, YTrain = Train$y, XVal = Validate$x,
  YVal = Validate$y, XTest = Test$x, d = 2, B2 = 5, base = "QDA", projmethod = "Haar")
# Same again with B2 = 10 projections
Choose.out10 <- RPChooseSS(XTrain = Train$x, YTrain = Train$y, XVal = Validate$x,
  YVal = Validate$y, XTest = Test$x, d = 2, B2 = 10, base = "QDA", projmethod = "Haar")
# Validation errors: entries 1 to 50 of the output
sum(Choose.out5[1:50] != Validate$y)
sum(Choose.out10[1:50] != Validate$y)
# Test errors: entries 51 to 150 of the output
sum(Choose.out5[51:150] != Test$y)
sum(Choose.out10[51:150] != Test$y)
# }
