Rdimtools (version 1.0.4)

oos.linproj: OOS : Linear Projection

Description

Perhaps the simplest approach to out-of-sample extension is linear regression, which can be applied even when the original embedding is not linear. It solves $$\textrm{min}_{\beta} \|X_{old} \beta - Y_{old}\|_2^2$$ and uses the estimate \(\hat{\beta}\) to compute $$Y_{new} = X_{new} \hat{\beta}.$$ Because the result depends on how the original data were preprocessed, trfinfo must be taken from the model you originally trained.
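The least-squares step can be illustrated in base R. This is only a minimal sketch of the idea, not the package's implementation: it ignores the preprocessing that trfinfo carries, and the simulated data and the matrix W are hypothetical, introduced purely for illustration. When \(X_{old}\) has full column rank, the minimizer is \(\hat{\beta} = (X_{old}^\top X_{old})^{-1} X_{old}^\top Y_{old}\), which qr.solve computes through a QR decomposition.

```r
# Sketch of the regression behind a linear out-of-sample extension.
# Assumption: no preprocessing; all data below are simulated for illustration.
set.seed(1)
n <- 100; p <- 5; ndim <- 2; m <- 10

Xold <- matrix(rnorm(n * p), n, p)        # training data, n x p
W    <- matrix(rnorm(p * ndim), p, ndim)  # hypothetical linear embedding map
Yold <- Xold %*% W                        # training embedding, n x ndim
Xnew <- matrix(rnorm(m * p), m, p)        # new observations, m x p

# Solve min_beta ||Xold %*% beta - Yold||_2^2 (least squares via QR)
beta_hat <- qr.solve(Xold, Yold)          # p x ndim coefficient matrix

# Map the new observations into the embedding space
Ynew <- Xnew %*% beta_hat                 # m x ndim
```

Because Yold here is an exactly linear function of Xold, beta_hat recovers W up to numerical error. For a nonlinear embedding the fit would only be an approximation.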

Usage

oos.linproj(Xold, Yold, trfinfo, Xnew)

Arguments

Xold

an \((n\times p)\) matrix of data in original high-dimensional space.

Yold

an \((n\times ndim)\) matrix of data in reduced-dimensional space.

trfinfo

a list containing transformation information generated from manifold learning algorithms. See also aux.preprocess for more details.

Xnew

an \((m\times p)\) matrix for out-of-sample extension.

Value

a named list containing

Ynew

an \((m\times ndim)\) matrix whose rows are embedded observations.

Examples

# NOT RUN {
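## generate sample data and split them into training and test sets
set.seed(46556)                      # set the seed first so the generated data are reproducible too
X = aux.gensamples(n=500)
idxselect  = sample(1:500,20)        # indices of 20 held-out points

Xold = X[setdiff(1:500,idxselect),]  # remaining 480 points for training
Xnew = X[idxselect,]                 # 20 held-out points for testing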

## run PCA for train data
training = do.pca(Xold,ndim=2,preprocess="whiten")
Yold     = training$Y       # embedded data points
oldinfo  = training$trfinfo # preprocessing information

## perform out-of-sample extension
output  = oos.linproj(Xold, Yold, oldinfo, Xnew)
Ynew    = output$Ynew

## let's compare via visualization
xx = c(-2,2) # range of axis 1 for compact visualization
yy = c(-2,2) # range of axis 2 for compact visualization
mm = "black=train / red=test data" # figure title

## visualize
opar <- par(no.readonly=TRUE)
plot(Yold, type="p", xlim=xx, ylim=yy, main=mm, xlab="axis 1", ylab="axis 2")
points(Ynew[,1], Ynew[,2], lwd=3, col="red")
par(opar)
# }
