qkIsomap: qKernel Isometric Feature Mapping

Description

Computes the Isomap embedding as introduced in 2000 by Tenenbaum, de Silva and Langford.

Usage

# S4 method for matrix
qkIsomap(x, kernel = "rbfbase", qpar = list(sigma = 0.1, q = 0.9),
dims = 2, k, mod = FALSE, plotResiduals = FALSE, verbose = TRUE, na.action = na.omit, ...)
# S4 method for cndkernmatrix
qkIsomap(x, dims = 2, k, mod = FALSE, plotResiduals = FALSE,
verbose = TRUE, na.action = na.omit, ...)
# S4 method for qkernmatrix
qkIsomap(x, dims = 2, k, mod = FALSE, plotResiduals = FALSE,
verbose = TRUE, na.action = na.omit, ...)

Arguments

x

the input data: an N x D matrix (N samples, D features), or a kernel matrix of class cndkernmatrix or qkernmatrix.

kernel

the kernel function used in training and predicting. This parameter can be set to any function of class kernel that computes a kernel function value between two vector arguments. qkerntool provides the most popular kernel functions, which can be selected by setting the kernel parameter to one of the following strings:

rbfbase Radial Basis qkernel function "Gaussian"
nonlbase Non Linear qkernel function
laplbase Laplacian qkernel function
ratibase Rational Quadratic qkernel function
multbase Multiquadric qkernel function
invbase Inverse Multiquadric qkernel function
wavbase Wave qkernel function
powbase Power qkernel function
logbase Log qkernel function
caubase Cauchy qkernel function
chibase Chi-Square qkernel function
studbase Generalized T-Student qkernel function
nonlcnd Non Linear cndkernel function
polycnd Polynomial cndkernel function
rbfcnd Radial Basis cndkernel function "Gaussian"
laplcnd Laplacian cndkernel function
anocnd ANOVA cndkernel function
raticnd Rational Quadratic cndkernel function
multcnd Multiquadric cndkernel function
invcnd Inverse Multiquadric cndkernel function
wavcnd Wave cndkernel function
powcnd Power cndkernel function
logcnd Log cndkernel function
caucnd Cauchy cndkernel function
chicnd Chi-Square cndkernel function
studcnd Generalized T-Student cndkernel function

The kernel parameter can also be set to a user defined function of class kernel by passing the function name as an argument.

qpar

the list of hyper-parameters (kernel parameters). This is a list which contains the parameters to be used with the kernel function. Valid parameters for existing kernels are:

sigma, q for the Radial Basis qkernel function "rbfbase", the Laplacian qkernel function "laplbase" and the Cauchy qkernel function "caubase".
alpha, q for the Non Linear qkernel function "nonlbase".
c, q for the Rational Quadratic qkernel function "ratibase", the Multiquadric qkernel function "multbase" and the Inverse Multiquadric qkernel function "invbase".
theta, q for the Wave qkernel function "wavbase".
d, q for the Power qkernel function "powbase", the Log qkernel function "logbase" and the Generalized T-Student qkernel function "studbase".
alpha for the Non Linear cndkernel function "nonlcnd".
d, alpha, c for the Polynomial cndkernel function "polycnd".
gamma for the Radial Basis cndkernel function "rbfcnd", the Laplacian cndkernel function "laplcnd" and the Cauchy cndkernel function "caucnd".
d, sigma for the ANOVA cndkernel function "anocnd".
c for the Rational Quadratic cndkernel function "raticnd", the Multiquadric cndkernel function "multcnd" and the Inverse Multiquadric cndkernel function "invcnd".
theta for the Wave cndkernel function "wavcnd".
d for the Power cndkernel function "powcnd", the Log cndkernel function "logcnd" and the Generalized T-Student cndkernel function "studcnd".

Hyper-parameters for user defined kernels can be passed through the qpar parameter as well.
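
The parameter names listed above can be collected into a lookup table to check a qpar list before calling qkIsomap. The following is only a sketch in base R: qpar_names and check_qpar are hypothetical helpers, not part of qkerntool, and "chibase" and "chicnd" are left out because the list above does not name their parameters.

```r
# Hyper-parameter names per built-in kernel, transcribed from the list above.
# qpar_names and check_qpar are illustrative helpers, NOT qkerntool functions.
qpar_names <- list(
  rbfbase  = c("sigma", "q"), laplbase = c("sigma", "q"), caubase  = c("sigma", "q"),
  nonlbase = c("alpha", "q"),
  ratibase = c("c", "q"), multbase = c("c", "q"), invbase = c("c", "q"),
  wavbase  = c("theta", "q"),
  powbase  = c("d", "q"), logbase = c("d", "q"), studbase = c("d", "q"),
  nonlcnd  = "alpha",
  polycnd  = c("d", "alpha", "c"),
  rbfcnd   = "gamma", laplcnd = "gamma", caucnd = "gamma",
  anocnd   = c("d", "sigma"),
  raticnd  = "c", multcnd = "c", invcnd = "c",
  wavcnd   = "theta",
  powcnd   = "d", logcnd = "d", studcnd = "d"
)

# TRUE if the names in qpar match the documented ones for this kernel,
# FALSE if they differ, NA if the kernel is not in the table.
check_qpar <- function(kernel, qpar) {
  expected <- qpar_names[[kernel]]
  if (is.null(expected)) return(NA)
  setequal(names(qpar), expected)
}

ok_rati    <- check_qpar("ratibase", list(c = 1, q = 0.8))  # names match
bad_rbf    <- check_qpar("rbfbase", list(gamma = 1))        # wrong names
unknown_kf <- check_qpar("chibase", list())                 # not in the table
```

The check compares names only; it does not validate the numeric ranges of the values.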

dims

vector containing the target space dimension(s)

k

number of neighbours

mod

use modified Isomap algorithm

plotResiduals

show a plot with the residuals between the high and the low dimensional data

verbose

show a summary of the embedding procedure at the end

na.action

A function to specify the action to be taken if NAs are found. The default action is na.omit, which leads to rejection of cases with missing values on any required variable. An alternative is na.fail, which causes an error if NA cases are found. (NOTE: If given, this argument must be named.)

...

additional parameters
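
The two na.action choices described above, na.omit and na.fail, are base R functions, so their effect on a data matrix can be checked directly. A minimal sketch with a made-up 4 x 3 matrix (the data and variable names are illustrative only):

```r
# A 4 x 3 matrix (filled column by column) with one missing value in row 3.
x <- matrix(c(1, 2, NA, 4,
              5, 6, 7, 8,
              9, 10, 11, 12), ncol = 3)

# na.omit drops every row that contains an NA.
kept <- na.omit(x)

# na.fail signals an error as soon as any NA is present.
failed <- tryCatch({ na.fail(x); FALSE }, error = function(e) TRUE)
```

Here kept has three rows left, and failed records that na.fail stopped with an error.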

Value

qkIsomap returns an S4 object with the following slots:

prj

an N x dim matrix (N samples, dim features) with the reduced input data (a list of several matrices if more than one dimension was specified).

dims

the dimension of the target space.

Residuals

the residual variances for all dimensions.

eVal

the corresponding eigenvalues.

eVec

the corresponding eigenvectors.

cndkernf

the kernel function used.

kcall

the formula of the function called.

All slots of the object can be accessed with accessor functions of the same name, e.g. prj(object).

Details

qkIsomap is a nonlinear dimension reduction technique that preserves global properties of the data: the low dimensional embedding is chosen so that the geodesic distances between all samples are preserved as well as possible. This R version is based on the Matlab implementation by Tenenbaum and uses Floyd's algorithm to compute the shortest-path distances on the neighbourhood graph when calculating the geodesic distances. A modified version of the original Isomap algorithm is included (mod = TRUE), which respects both nearest and farthest neighbours. To estimate the intrinsic dimension of the data, the function can plot the residuals between the high and the low dimensional data for a given range of dimensions (plotResiduals = TRUE).
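
The steps named above (neighbourhood graph, Floyd's algorithm, embedding, residuals) can be sketched in base R. This is not the qkerntool source: it assumes plain Euclidean distances instead of a qkernel or cndkernel distance, uses cmdscale for the embedding, and takes the residual variance as 1 minus the squared correlation between geodesic and embedded distances, which is an assumption about how the residuals are defined. The function isomap_sketch and the semicircle data are made up for illustration.

```r
# Illustrative base-R Isomap sketch; NOT the qkerntool implementation.
isomap_sketch <- function(x, k, dims = 2) {
  n <- nrow(x)
  d <- as.matrix(dist(x))                      # pairwise Euclidean distances

  # k-nearest-neighbour graph, symmetrised; Inf marks "no edge"
  g <- matrix(Inf, n, n)
  for (i in seq_len(n)) {
    nn <- setdiff(order(d[i, ]), i)[seq_len(k)]  # k closest points, excluding i
    g[i, nn] <- d[i, nn]
    g[nn, i] <- d[nn, i]
  }
  diag(g) <- 0

  # Floyd's algorithm: allow each vertex m in turn as an intermediate stop
  for (m in seq_len(n)) {
    g <- pmin(g, outer(g[, m], g[m, ], "+"))
  }
  if (any(is.infinite(g))) stop("neighbourhood graph is disconnected; increase k")

  # classical multidimensional scaling on the geodesic distances
  y <- cmdscale(g, k = dims)

  # residual variance between geodesic and embedded distances (assumed definition)
  res <- 1 - cor(as.vector(g), as.vector(as.matrix(dist(y))))^2
  list(prj = y, geodesic = g, residual = res)
}

# 50 points on a semicircle: a one-dimensional curve embedded in 2-D
t <- seq(0, pi, length.out = 50)
x <- cbind(cos(t), sin(t))
out <- isomap_sketch(x, k = 3, dims = 1)

out$geodesic[1, 50]   # close to the arc length pi, not the straight-line distance 2
out$residual          # small, since one dimension describes the curve well
```

Running the sketch for several values of dims and plotting the residuals against the dimension gives the kind of curve used to judge the intrinsic dimension.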

References

Tenenbaum, J. B., de Silva, V. and Langford, J. C. (2000). "A global geometric framework for nonlinear dimensionality reduction." Science, 290(5500), 2319-2323. Matlab code is available at http://waldron.stanford.edu/~isomap/

Examples


# NOT RUN {
  # example using the iris data
  library(qkerntool)
  set.seed(1)  # make the random train/test split reproducible
  data(iris)
  testset <- sample(1:150, 20)
  train <- as.matrix(iris[-testset, -5])
  labeltrain <- as.integer(iris[-testset, 5])
  test <- as.matrix(iris[testset, -5])
  # Rational Quadratic qkernel: ratibase(c = 1, q = 0.8)
  d_low <- qkIsomap(train, kernel = "ratibase", qpar = list(c = 1, q = 0.8),
                    dims = 2, k = 5, plotResiduals = TRUE)
  # plot the data projection on the components
  plot(prj(d_low), col = labeltrain,
       xlab = "1st Principal Component", ylab = "2nd Principal Component")

  # inspect the slots through their accessor functions
  prj(d_low)
  dims(d_low)
  Residuals(d_low)
  eVal(d_low)
  eVec(d_low)
  kcall(d_low)
  cndkernf(d_low)
# }
