Learn R Programming

bootsPLS (version 1.1.2)

CI.prediction: Compute Confidence Intervals (CI) for test samples

Description

Compute Confidence Intervals (CI) for test samples based on random subsamplings

Usage

CI.prediction(object,X,Y,signature,ncomp,many,
            subsampling.matrix,ratio,X.test,level.CI,save.file)

Arguments

object

a `spls.constraint' object, as one resulting from fit.model. If object is missing: X, Y, signature are needed.

X

Only used if object is missing. Input matrix of dimension n * p; each row is an observation vector.

Y

Only used if object is missing. Factor with at least q>2 levels.

signature

Only used if object is missing. A list containing which variables are to be kept on each component.

ncomp

Only used if object is missing. How many component do you want to include in the sPLS-DA analysis?

many

How many subsamplings do you want to do? Default is 100

subsampling.matrix

Optional matrix of many columns. Gives the samples to subsample as an internal learning set.

ratio

Number between 0 and 1. It is the proportion of the n samples that are put aside and considered as an internal testing set. The (1-ratio)*n samples are used as a training set and the kCV fold cross validation is performed on them. Default is 0.3

X.test

Test matrix.

level.CI

A 1- level.CI% confidence interval is calculated.

save.file

Save the outputs of the functions in save.file.Rdata.

Value

CI

A (1- level.CI)% confidence interval is returned for each samples in X.test

Y.hat.test

A four dimensional array. The two first dimensions are an estimation of the dummy matrix obtained from Y (size n * number of sample types). The third dimension is relative to the number of components ncomp. The fourth dimension concerns the number of subsamplings.

ClassifResult

A 5-dimensional array. The two first dimensions consists in the confusion matrix. The third dimension is relative to the number of components ncomp. The fourth dimension concerns the number of subsamplings. The fifth and last dimension is relative to the different distances "max.dist", "centroids.dist" and "mahalanobis.dist".

loadings.X

A 3-dimensional array. Loadings vector of X, for each component and each subsampling.

prediction.X

A 4-dimensional array of size n*many*ncomp*3. Gives the prediction for the chosen method of all the samples, either in the internal learning set or the internal testing set. The last dimension is relative to the different distances "max.dist", "centroids.dist" and "mahalanobis.dist".

prediction.X.test

A 4-dimensional array of size nrow(X.test)*many*ncomp*3. Gives the prediction for the chosen method of all the test samples in X.test. The last dimension is relative to the different distances "max.dist", "centroids.dist" and "mahalanobis.dist".

learning.sample

Matrix of size n*many. Gives the samples that have been used in the internal training set over the many replications. These samples have the value 1, the others 0.

coeff

A list of means.X, sigma.X, means.Y and sigma.Y. Means and variances for the variables of X and the columns of the dummy matrix obtained from Y, each row is a subsampling.

data

A list of the input data X, Y and of signature, which is a list containing the variables kept on each component.

Details

This function can work with a `spls.constraint' object or with the input data (X, Y, signature). See examples below to see the difference in use.

See Also

fit.model, prediction

Examples

Run this code
# NOT RUN {
data(MSC)
X=MSC$X
Y=MSC$Y


# with a bootsPLS object
boot=bootsPLS(X=X,Y=Y,ncomp=3,many=5,kCV=5)
fit=fit.model(boot,ncomp=3)

CI=CI.prediction(fit)
CI=CI.prediction(fit,X.test=X)
plot(CI)

lapply(CI$CI$'comp.1',head)
lapply(CI$CI$'comp.2',head)
lapply(CI$CI$'comp.3',head)


# without a spls.constraint object. X,Y and signature are needed
# the results should be similar
#(not the same because of the random subsamplings,
# exactly the same if subsampling.matrix is an input)
signature=fit$data$signature
CI=CI.prediction(X=X,Y=Y,signature=signature)
CI=CI.prediction(X=X,Y=Y,signature=signature,X.test=X)

lapply(CI$CI$'comp.1',head)
lapply(CI$CI$'comp.2',head)
lapply(CI$CI$'comp.3',head)
# }

Run the code above in your browser using DataLab