oos.linproj: OOS : Linear Projection

Description

Arguably the simplest approach to out-of-sample extension is linear regression, even when the original embedding is not linear. It solves $$\textrm{min}_{\beta} \|X_{old} \beta - Y_{old}\|_2^2$$ and uses the estimate $\hat{\beta}$ to obtain $$Y_{new} = X_{new} \hat{\beta}$$. Because the result depends on the preprocessing chosen for the original embedding, trfinfo must be taken from the original model you trained.
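
The least-squares step described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper name linproj_sketch is hypothetical, and plain column centering stands in for whatever preprocessing trfinfo actually records.

```r
# Hypothetical helper (not part of the package): least-squares projection.
linproj_sketch <- function(Xold, Yold, Xnew) {
  # Assumption: centering by the training means stands in for the
  # preprocessing that trfinfo would describe.
  mu     <- colMeans(Xold)
  Xc_old <- sweep(Xold, 2, mu)   # subtract training means from training data
  Xc_new <- sweep(Xnew, 2, mu)   # apply the SAME shift to the new data
  # Solve min_beta ||Xc_old %*% beta - Yold||_2^2 via QR decomposition.
  beta_hat <- qr.solve(Xc_old, Yold)
  Xc_new %*% beta_hat            # Ynew = Xnew %*% beta_hat
}

# Toy data: the embedding is an exact linear map, so it should be recovered.
set.seed(1)
X    <- matrix(rnorm(100 * 3), nrow = 100, ncol = 3)
Xold <- X[1:80, ]
Xnew <- X[81:100, ]
B    <- matrix(c(1, 0, 2, 0, 1, -1), nrow = 3)   # true 3 x 2 linear map
Yold <- sweep(Xold, 2, colMeans(Xold)) %*% B
Ynew <- linproj_sketch(Xold, Yold, Xnew)
dim(Ynew)   # 20 2
```

Note that the new data are shifted by the training means rather than their own, which mirrors why the preprocessing information has to come from the original model.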

Usage

oos.linproj(Xold, Yold, trfinfo, Xnew)

Arguments

Xold

an $(n\times p)$ matrix of data in original high-dimensional space.

Yold

an $(n\times ndim)$ matrix of data in reduced-dimensional space.

trfinfo

a list containing transformation information generated from manifold learning algorithms. See also aux.preprocess for more details.

Xnew

an $(m\times p)$ matrix for out-of-sample extension.

Value

a named list containing

Ynew: an $(m\times ndim)$ matrix whose rows are embedded observations.

Examples

# NOT RUN {
## generate sample data and separate them
set.seed(46556)  # set the seed first so the generated data are reproducible too
X = aux.gensamples(n=500)
idxselect  = sample(1:500,20)

Xold = X[setdiff(1:500,idxselect),]  # remaining 480 observations (96%) for training
Xnew = X[idxselect,]                 # 20 selected observations (4%) for testing

## run PCA for train data
training = do.pca(Xold,ndim=2,preprocess="whiten")
Yold     = training$Y       # embedded data points
oldinfo  = training$trfinfo # preprocessing information

## perform out-of-sample extension
output  = oos.linproj(Xold, Yold, oldinfo, Xnew)
Ynew    = output$Ynew

## let's compare via visualization
xx = c(-2,2) # range of axis 1 for compact visualization
yy = c(-2,2) # range of axis 2 for compact visualization
mm = "black=train / red=test data" # figure title

## visualize
opar <- par(no.readonly=TRUE)
plot(Yold, type="p", xlim=xx, ylim=yy, main=mm, xlab="axis 1", ylab="axis 2")
points(Ynew[,1], Ynew[,2], lwd=3, col="red")
par(opar)
# }