do.olpp: Orthogonal Locality Preserving Projection

Description

Orthogonal Locality Preserving Projection (OLPP) is a variant of do.lpp that extracts orthogonal basis functions, which reconstruct the data in a more intuitive fashion. It applies PCA as a preprocessing step and computes only one eigenvector at each iteration, so it may emit warning messages about solving a near-singular system of linear equations.
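
The orthogonality constraint can be written out as a sequence of constrained problems. This is a sketch of the general OLPP formulation, adapted to this page's convention that rows of \(X\) are observations; the symbols \(W\), \(D\), \(L\) and \(a_k\) are introduced here for illustration and are not part of the function's interface.

\[
a_1 = \arg\min_{a}\; a^\top X^\top L X a
\quad \text{subject to} \quad a^\top X^\top D X a = 1,
\]
\[
a_k = \arg\min_{a}\; a^\top X^\top L X a
\quad \text{subject to} \quad a^\top X^\top D X a = 1,
\;\; a^\top a_j = 0 \;\; (j = 1,\dots,k-1).
\]

Here \(X\) is the preprocessed data, \(W\) is the \((n\times n)\) weight matrix of the neighborhood graph, \(D\) is diagonal with \(D_{ii} = \sum_j W_{ij}\), and \(L = D - W\). Because each \(a_k\) depends on the vectors found before it, the basis is built one vector at a time.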

Usage

do.olpp(
  X,
  ndim = 2,
  type = c("proportion", 0.1),
  symmetric = c("union", "intersect", "asymmetric"),
  weight = TRUE,
  preprocess = c("center", "scale", "cscale", "decorrelate", "whiten"),
  t = 1
)

Arguments

X

an \((n\times p)\) matrix or data frame whose rows are observations.

ndim

an integer-valued target dimension.

type

a vector specifying how the neighborhood graph is constructed. The following types are supported: c("knn",k), c("enn",radius), and c("proportion",ratio). Default is c("proportion",0.1), which connects each point to roughly the nearest 1/10 of all data points. See also aux.graphnbd for more details.

symmetric

one of "intersect", "union" or "asymmetric" is supported. Default is "union". See also aux.graphnbd for more details.
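
The effect of type and symmetric can be illustrated with a small base R sketch. It is not the code used by aux.graphnbd: knn_graph is a hypothetical helper, and the rule for turning a proportion into a neighbor count (rounding) is an assumption.

```r
# Sketch only: build a k-nearest-neighbour graph and symmetrize it.
# `knn_graph` is a hypothetical helper, not part of Rdimtools.
knn_graph <- function(X, k, symmetric = c("union", "intersect", "asymmetric")) {
  symmetric <- match.arg(symmetric)
  n <- nrow(X)
  D <- as.matrix(dist(X))          # pairwise Euclidean distances
  diag(D) <- Inf                   # a point is not its own neighbour
  A <- matrix(FALSE, n, n)
  for (i in seq_len(n)) {
    A[i, order(D[i, ])[seq_len(k)]] <- TRUE   # k closest points to i
  }
  switch(symmetric,
         union      = A | t(A),    # edge if either point selects the other
         intersect  = A & t(A),    # edge only if both select each other
         asymmetric = A)           # keep the directed graph as is
}

X <- as.matrix(iris[1:30, 1:4])
k <- max(1, round(0.1 * nrow(X)))  # assumed rounding rule for a 0.1 proportion
G_union <- knn_graph(X, k, "union")
G_inter <- knn_graph(X, k, "intersect")
```

Every edge of the "intersect" graph is also an edge of the "union" graph, so "intersect" yields the sparser of the two.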

weight

TRUE to perform LPP on a weighted graph, or FALSE otherwise.

preprocess

an additional option for preprocessing the data. See aux.preprocess for details.

t

bandwidth for heat kernel in \((0,\infty)\).
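
As a rough illustration of how the bandwidth shapes edge weights, the base R sketch below evaluates a heat kernel on pairwise distances. The exact form exp(-d^2 / t) is an assumption about the kernel's scaling, and heat_weights is a hypothetical helper rather than a package function.

```r
# Sketch only: heat-kernel edge weights for bandwidth t.
# The form exp(-d^2 / t) is an assumption about the kernel's scaling.
heat_weights <- function(X, t = 1) {
  D2 <- as.matrix(dist(X))^2       # squared Euclidean distances
  exp(-D2 / t)
}

X  <- as.matrix(iris[1:10, 1:4])
W1 <- heat_weights(X, t = 1)       # default bandwidth
W9 <- heat_weights(X, t = 9)       # larger t: weights decay more slowly
```

A larger bandwidth makes distant neighbors count almost as much as close ones, while a small bandwidth concentrates the weight on the nearest points.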

Value

a named list containing

Y: an \((n\times ndim)\) matrix whose rows are embedded observations.
projection: a \((p\times ndim)\) matrix whose columns are the basis vectors for projection.
trfinfo: a list containing information for out-of-sample prediction.
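
The sketch below shows how a projection matrix with orthonormal columns is typically used. It runs in base R and does not call do.olpp: the matrix P is a stand-in taken from PCA, and it assumes that the embedding is the preprocessed data multiplied by the projection and that preprocessing is plain centering.

```r
# Sketch only: a stand-in for the returned `projection`, taken from PCA
# so the example runs in base R; do.olpp itself is not called here.
X  <- as.matrix(iris[, 1:4])
mu <- colMeans(X)
Xc <- sweep(X, 2, mu)                              # "center" preprocessing
P  <- prcomp(Xc, center = FALSE)$rotation[, 1:2]   # (p x ndim), orthonormal columns
Y  <- Xc %*% P                                     # (n x ndim) embedding

# Orthonormal columns mean t(P) %*% P is the identity matrix.
ortho_err <- max(abs(crossprod(P) - diag(2)))

# A new observation is centered with the training mean, then projected.
x_new <- c(5.0, 3.4, 1.5, 0.2)
y_new <- (x_new - mu) %*% P
```

The same check, crossprod(projection) compared against the identity, can be applied to the projection returned by do.olpp to confirm that its columns are orthogonal.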

References

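Cai D, He X, Han J, Zhang H (2006). "Orthogonal Laplacianfaces for Face Recognition." IEEE Transactions on Image Processing, 15(11), 3608-3614.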

Examples

# NOT RUN {
## load the package and use iris data
library(Rdimtools)
data(iris)
set.seed(100)
subid = sample(1:150, 50)
X     = as.matrix(iris[subid,1:4])
label = as.factor(iris[subid,5])

## connect 10% and 25% of the data, respectively, for graph construction
output1 <- do.olpp(X,ndim=2,type=c("proportion",0.10))
output2 <- do.olpp(X,ndim=2,type=c("proportion",0.25))

## Visualize
#  In theory, it should show two separated groups of data
opar <- par(no.readonly=TRUE)
par(mfrow=c(1,2))
plot(output1$Y, col=label, pch=19, main="OLPP::10% connected")
plot(output2$Y, col=label, pch=19, main="OLPP::25% connected")
par(opar)
# }