Rdimtools

Rdimtools is an R package for dimension reduction (DR), including feature selection and manifold learning, and for intrinsic dimension estimation (IDE). We aim to build one of the most comprehensive toolboxes available online; the current version delivers 141 DR algorithms and 17 IDE methods.

The philosophy is simple: the more we have at hand, the better we can play.

Our logo characterizes the foundational nature of multivariate data analysis: like blind people examining an elephant, we wrangle the data and piece together an idea of what it looks like from the partial information each algorithm provides.

Installation

You can install a release version from CRAN:

install.packages("Rdimtools")

or the development version from GitHub:

## install.packages("devtools")
devtools::install_github("kyoustat/Rdimtools")

Minimal Example : Dimension Reduction

Here is an example of dimension reduction on the famous iris dataset. Principal Component Analysis (do.pca), Laplacian Score (do.lscore), and Diffusion Maps (do.dm) are compared; they represent the families of linear reduction, feature selection, and nonlinear reduction algorithms, respectively.

# load the library
library(Rdimtools)

# load the data
X   = as.matrix(iris[,1:4])
lab = as.factor(iris[,5])

# run 3 algorithms mentioned above
mypca = do.pca(X, ndim=2)
mylap = do.lscore(X, ndim=2)
mydfm = do.dm(X, ndim=2, bandwidth=10)

# visualize
par(mfrow=c(1,3))
plot(mypca$Y, pch=19, col=lab, xlab="axis 1", ylab="axis 2", main="PCA")
plot(mylap$Y, pch=19, col=lab, xlab="axis 1", ylab="axis 2", main="Laplacian Score")
plot(mydfm$Y, pch=19, col=lab, xlab="axis 1", ylab="axis 2", main="Diffusion Maps")
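
For readers who want to see what a two-dimensional embedding such as mypca$Y contains, the sketch below computes a comparable PCA projection with base R's prcomp(). It is only an illustration, assuming centering without scaling, and it does not claim to reproduce the internals or default preprocessing of do.pca. The names fit, Y, and var_explained are made up for this example.

```r
# Base-R sketch of a 2-dimensional PCA embedding (not Rdimtools code).
X   <- as.matrix(datasets::iris[, 1:4])
lab <- as.factor(datasets::iris[, 5])

# center the columns and keep the original units (assumption: no scaling)
fit <- prcomp(X, center = TRUE, scale. = FALSE)

# scores on the first two principal components: one row per observation
Y <- fit$x[, 1:2]

# proportion of total variance captured by each component
var_explained <- fit$sdev^2 / sum(fit$sdev^2)

print(dim(Y))
print(round(var_explained, 3))
```

Y can be passed to plot() in the same way as mypca$Y, for instance plot(Y, pch=19, col=lab).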

Minimal Example : Dimension Estimation

Swiss Roll is a classic example of a 2-dimensional manifold embedded in ℝ³ and one of 11 famous model-based samples available from the aux.gensamples() function. Given the ground truth that d = 2, let’s apply several methods for intrinsic dimension estimation.

# generate sample data
set.seed(100)
roll = aux.gensamples(dname="swiss")

# we will compare 5 methods (out of 17 methods from version 1.0.0)
vecd = rep(0,5)
vecd[1] = est.Ustat(roll)$estdim       # convergence rate of U-statistic on manifold
vecd[2] = est.correlation(roll)$estdim # correlation dimension
vecd[3] = est.made(roll)$estdim        # manifold-adaptive dimension estimation
vecd[4] = est.mle1(roll)$estdim        # MLE with Poisson process
vecd[5] = est.twonn(roll)$estdim       # minimal neighborhood information

# let's visualize
plot(1:5, vecd, type="b", ylim=c(1.5,2.5), 
     main="true dimension is d=2",
     xaxt="n",xlab="",ylab="estimated dimension")
xtick = seq(1,5,by=1)
axis(side=1, at=xtick, labels = FALSE)
text(x=xtick,  par("usr")[3], 
     labels = c("Ustat","correlation","made","mle1","twonn"), pos=1, xpd = TRUE)

We can observe that all 5 methods we tested estimated the intrinsic dimension around d = 2. It should be noted that the estimated dimension may not be integer-valued due to characteristics of each method.
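
To illustrate why such estimates are real-valued, here is a small base-R sketch of a two-nearest-neighbor estimator in a simplified maximum-likelihood form, d = N / Σ log(r2/r1), where r1 and r2 are each point's distances to its first and second nearest neighbors. It runs on hypothetical toy data (a flat plane in ℝ³) and is not the implementation behind est.twonn. All variable names are invented for this example.

```r
# Base-R sketch of a two-nearest-neighbor dimension estimate (not Rdimtools code).
set.seed(1)
n <- 1000

# toy data: a flat 2-dimensional plane embedded in R^3 (true d = 2)
u <- runif(n)
v <- runif(n)
X <- cbind(u, v, u + v)

# pairwise Euclidean distances; exclude each point's distance to itself
D <- as.matrix(dist(X))
diag(D) <- Inf

# ratio of second to first nearest-neighbor distance for every point
mu <- apply(D, 1, function(r) {
  s <- sort(r)[1:2]
  s[2] / s[1]
})

# simplified maximum-likelihood estimate of the intrinsic dimension
d_hat <- n / sum(log(mu))
print(d_hat)
```

The result is a ratio of sums of logarithms, so it is close to 2 but generally not an integer.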

Acknowledgements

The logo icon is made by Freepik from www.flaticon.com. The rotating Swiss Roll image is taken from Dinoj Surendran’s website.

Version: 1.0.6

License: MIT + file LICENSE

Last Published: April 18th, 2021

Monthly Downloads: 794

Functions in Rdimtools (1.0.6)

est.boxcount - Box-counting Dimension
aux.shortestpath - Find shortest path using Floyd-Warshall algorithm
est.Ustat - ID Estimation with Convergence Rate of U-statistic on Manifold
est.correlation - Correlation Dimension
aux.graphnbd - Construct Nearest-Neighborhood Graph
est.clustering - Intrinsic Dimension Estimation via Clustering
aux.preprocess - Preprocessing the data
aux.kernelcov - Build a centered kernel matrix K
aux.gensamples - Generate model-based samples
aux.pkgstat - Show the number of functions for Rdimtools
est.nearneighbor1 - Intrinsic Dimension Estimation with Near-Neighbor Information
do.cscore - Constraint Score
do.mifs - Mutual Information for Selecting Features
est.twonn - Intrinsic Dimension Estimation by a Minimal Neighborhood Information
do.nrsr - Non-convex Regularized Self-Representation
do.disr - Diversity-Induced Self-Representation
est.mindml - MINDml
do.cscoreg - Constraint Score using Spectral Graph
do.lspe - Locality and Similarity Preserving Embedding
est.mindkl - MiNDkl
est.incisingball - Intrinsic Dimension Estimation with Incising Ball
est.danco - Intrinsic Dimensionality Estimation with DANCo
est.mle1 - Maximum Likelihood Estimation with Poisson Process
iris - 'Iris' data
do.wdfs - Worst-Case Discriminative Feature Selection
est.nearneighbor2 - Near-Neighbor Information with Bias Correction
do.lasso - Least Absolute Shrinkage and Selection Operator
do.procrustes - Feature Selection using PCA and Procrustes Analysis
do.spufs - Structure Preserving Unsupervised Feature Selection
do.lscore - Laplacian Score
do.rsr - Regularized Self-Representation
do.elpp2 - Enhanced Locality Preserving Projection (2013)
do.mcfs - Multi-Cluster Feature Selection
do.eslpp - Extended Supervised Locality Preserving Projection
do.extlpp - Extended Locality Preserving Projection
est.gdistnn - Intrinsic Dimension Estimation based on Manifold Assumption and Graph Distance
est.packing - Intrinsic Dimension Estimation using Packing Numbers
est.mle2 - Maximum Likelihood Estimation with Poisson Process and Bias Correction
est.pcathr - PCA Thresholding with Accumulated Variance
do.asi - Adaptive Subspace Iteration
do.anmm - Average Neighborhood Margin Maximization
do.fa - Exploratory Factor Analysis
do.bpca - Bayesian Principal Component Analysis
do.ugfs - Unsupervised Graph-based Feature Selection
do.uwdfs - Uncorrelated Worst-Case Discriminative Feature Selection
do.specs - Supervised Spectral Feature Selection
do.specu - Unsupervised Spectral Feature Selection
do.udfs - Unsupervised Discriminative Features Selection
do.cnpe - Complete Neighborhood Preserving Embedding
do.crp - Collaborative Representation-based Projection
do.ldakm - Combination of LDA and K-means
do.dne - Discriminant Neighborhood Embedding
do.dagdne - Double-Adjacency Graphs-based Discriminant Neighborhood Embedding
do.lde - Local Discriminant Embedding
do.lpfda - Locality Preserving Fisher Discriminant Analysis
do.lpmip - Locality-Preserved Maximum Information Projection
est.made - Manifold-Adaptive Dimension Estimation
do.lsda - Locality Sensitive Discriminant Analysis
do.isoproj - Isometric Projection
do.lpca2006 - Locally Principal Component Analysis by Yang et al. (2006)
do.lpe - Locality Pursuit Embedding
do.lsir - Localized Sliced Inverse Regression
do.lspp - Local Similarity Preserving Projection
do.enet - Elastic Net Regularization
do.kudp - Kernel-Weighted Unsupervised Discriminant Projection
do.lda - Linear Discriminant Analysis
do.lltsa - Linear Local Tangent Space Alignment
do.odp - Orthogonal Discriminant Projection
do.kmvp - Kernel-Weighted Maximum Variance Projection
do.mmsd - Multiple Maximum Scatter Difference
do.lmds - Landmark Multidimensional Scaling
do.fscore - Fisher Score
do.olda - Orthogonal Linear Discriminant Analysis
do.cca - Canonical Correlation Analysis
do.ldp - Locally Discriminating Projection
do.lea - Locally Linear Embedded Eigenspace Analysis
do.pls - Partial Least Squares
do.pflpp - Parameter-Free Locality Preserving Projection
do.spc - Supervised Principal Component Analysis
do.spca - Sparse Principal Component Analysis
do.ssldp - Semi-Supervised Locally Discriminant Projection
do.mmc - Maximum Margin Criterion
do.spp - Sparsity Preserving Projection
do.mds - (Classical) Multidimensional Scaling
do.npca - Nonnegative Principal Component Analysis
do.npe - Neighborhood Preserving Embedding
do.rpcag - Robust Principal Component Analysis via Geometric Median
do.rndproj - Random Projection
do.modp - Modified Orthogonal Discriminant Projection
do.mfa - Marginal Fisher Analysis
do.slpp - Supervised Locality Preserving Projection
do.udp - Unsupervised Discriminant Projection
do.slpe - Supervised Locality Pursuit Embedding
do.mlie - Maximal Local Interclass Embedding
do.kmfa - Kernel Marginal Fisher Analysis
do.keca - Kernel Entropy Component Analysis
do.mve - Minimum Volume Embedding
do.klde - Kernel Local Discriminant Embedding
do.kmmc - Kernel Maximum Margin Criterion
do.mvu - Maximum Variance Unfolding / Semidefinite Embedding
do.mmp - Maximum Margin Projection
do.ulda - Uncorrelated Linear Discriminant Analysis
do.dve - Distinguishing Variance Embedding
do.lsdf - Locality Sensitive Discriminant Feature
oos.linproj - OOS: Linear Projection
do.cge - Constrained Graph Embedding
do.sammc - Semi-Supervised Adaptive Maximum Margin Criterion
do.idmap - Interactive Document Map
do.bmds - Bayesian Multidimensional Scaling
do.rsir - Regularized Sliced Inverse Regression
do.mvp - Maximum Variance Projection
do.msd - Maximum Scatter Difference
do.opls - Orthogonal Partial Least Squares
do.nolpp - Nonnegative Orthogonal Locality Preserving Projection
do.pca - Principal Component Analysis
do.sdlpp - Sample-Dependent Locality Preserving Projection
do.fastmap - FastMap
do.lsls - Locality Sensitive Laplacian Score
do.ammc - Adaptive Maximum Margin Criterion
do.adr - Adaptive Dimension Reduction
do.isomap - Isometric Feature Mapping
do.olpp - Orthogonal Locality Preserving Projection
do.nonpp - Nonnegative Orthogonal Neighborhood Preserving Projections
do.sir - Sliced Inverse Regression
do.ispe - Isometric Stochastic Proximity Embedding
do.elde - Exponential Local Discriminant Embedding
do.dspp - Discriminative Sparsity Preserving Projection
do.lle - Locally Linear Embedding
do.iltsa - Improved Local Tangent Space Alignment
do.lapeig - Laplacian Eigenmaps
do.fssem - Feature Subset Selection using Expectation-Maximization
do.lfda - Local Fisher Discriminant Analysis
do.llp - Local Learning Projections
do.ica - Independent Component Analysis
do.lpp - Locality Preserving Projection
do.lqmi - Linear Quadratic Mutual Information
do.kpca - Kernel Principal Component Analysis
do.lisomap - Landmark Isometric Feature Mapping
do.plp - Piecewise Laplacian-based Projection (PLP)
do.llle - Local Linear Laplacian Eigenmaps
do.rpca - Robust Principal Component Analysis
do.splapeig - Supervised Laplacian Eigenmaps
do.sammon - Sammon Mapping
do.spmds - Spectral Multidimensional Scaling
do.phate - Potential of Heat Diffusion for Affinity-based Transition Embedding
do.kqmi - Kernel Quadratic Mutual Information
do.nnp - Nearest Neighbor Projection
do.ppca - Probabilistic Principal Component Analysis
do.tsne - t-distributed Stochastic Neighbor Embedding
do.sda - Semi-Supervised Discriminant Analysis
do.rlda - Regularized Linear Discriminant Analysis
do.save - Sliced Average Variance Estimation
oos.linear - Out-Of-Sample Prediction for Linear Methods
do.cisomap - Conformal Isometric Feature Mapping
do.ree - Robust Euclidean Embedding
do.onpp - Orthogonal Neighborhood Preserving Projections
do.crda - Curvilinear Distance Analysis
do.dm - Diffusion Maps
do.crca - Curvilinear Component Analysis
do.klfda - Kernel Local Fisher Discriminant Analysis
do.ksda - Kernel Semi-Supervised Discriminant Analysis
do.klsda - Kernel Locality Sensitive Discriminant Analysis
do.lamp - Local Affine Multidimensional Projection
do.ltsa - Local Tangent Space Alignment
do.mmds - Metric Multidimensional Scaling
do.sne - Stochastic Neighbor Embedding
do.spe - Stochastic Proximity Embedding