rrcovHD: Robust Multivariate Methods for High Dimensional Data

The package rrcovHD provides robust multivariate methods for high dimensional data including outlier detection (Filzmoser and Todorov (2013) doi:10.1016/j.ins.2012.10.017), robust sparse PCA (Croux et al. (2013) doi:10.1080/00401706.2012.727746, Todorov and Filzmoser (2013) doi:10.1007/978-3-642-33042-1_31), robust PLS (Todorov and Filzmoser (2014) doi:10.17713/ajs.v43i4.44), and robust sparse classification (Ortner et al. (2020) doi:10.1007/s10618-019-00666-8).

Installation

The rrcovHD package is on CRAN (The Comprehensive R Archive Network) and the latest release can be easily installed using the command

install.packages("rrcovHD")
library(rrcovHD)

Building from source

To install the latest stable development version from GitHub, you can pull this repository and install it using

## install.packages("remotes")
remotes::install_github("valentint/rrcovHD")

The first line is commented out; if remotes is not yet installed, uncomment it to install remotes first.

Example

This basic example lets you check that the package is properly installed:


library(rrcovHD)
#> Loading required package: rrcov
#> Loading required package: robustbase
#> Scalable Robust Estimators with High Breakdown Point (version 1.7-5)
#> Robust Multivariate Methods for High Dimensional Data (version 0.2-7)

data(pottery)
dim(pottery)        # 27 observations in 2 classes, 6 variables
#> [1] 27  7
head(pottery)
#>     SI   AL   FE  MG   CA   TI origin
#> 1 55.8 14.0 10.2 4.9  5.0 0.88  Attic
#> 2 51.2 12.5 10.1 4.4  4.8 0.86  Attic
#> 3 57.1 14.0  8.3 6.4 11.2 0.75  Attic
#> 4 53.8 13.1  9.3 4.9  6.6 0.81  Attic
#> 5 59.4 14.8  9.8 5.5  5.4 0.89  Attic
#> 6 56.2 14.0  9.9 4.9  5.4 0.89  Attic

## Build the SIMCA model. Use RSimca for a robust version
rs <- RSimca(origin~., data=pottery)
rs
#> Call:
#> RSimca(origin ~ ., data = pottery)
#> 
#> Prior Probabilities of Groups:
#>     Attic  Eritrean 
#> 0.4814815 0.5185185 
#> 
#> Pca objects for Groups:
#> 
#> Call:
#> PcaHubert(x = class, k = k[i], kmax = kmax[i], trace = trace)
#> Importance of components:
#>                           PC1    PC2
#> Standard deviation     5.2672 0.8564
#> Proportion of Variance 0.7186 0.1804
#> Cumulative Proportion  0.7186 0.8990
#> 
#> Call:
#> PcaHubert(x = class, k = k[i], kmax = kmax[i], trace = trace)
#> Importance of components:
#>                           PC1
#> Standard deviation     3.2934
#> Proportion of Variance 0.8102
#> Cumulative Proportion  0.8102
summary(rs)
#> 
#> Call:
#> RSimca(formula = origin ~ ., data = pottery)
#> 
#> Prior Probabilities of Groups:
#>     Attic  Eritrean 
#> 0.4814815 0.5185185 
#> 
#> Pca objects for Groups:
#> 
#> Call:
#> PcaHubert(x = class, k = k[i], kmax = kmax[i], trace = trace)
#> Importance of components:
#>                           PC1    PC2
#> Standard deviation     5.2672 0.8564
#> Proportion of Variance 0.7186 0.1804
#> Cumulative Proportion  0.7186 0.8990
#> 
#> Call:
#> PcaHubert(x = class, k = k[i], kmax = kmax[i], trace = trace)
#> Importance of components:
#>                           PC1
#> Standard deviation     3.2934
#> Proportion of Variance 0.8102
#> Cumulative Proportion  0.8102
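
Once a model is fitted, it can be used to classify observations. The following is a minimal sketch, not a definitive recipe. It assumes that predict() called without new data re-classifies the training sample, and that the resulting "PredictSimca" object stores the predicted labels in a slot named classification. Check the help pages of your installed version before relying on either assumption. CSimca is used alongside RSimca only to compare the classical and the robust fit.

```r
library(rrcovHD)
data(pottery)

## Robust and classical SIMCA fits on the same data
rs <- RSimca(origin ~ ., data = pottery)
cs <- CSimca(origin ~ ., data = pottery)

## Assumption: predict() without new data uses the training sample
pr <- predict(rs)
pc <- predict(cs)
pr

## Assumption: predicted labels are stored in the 'classification' slot
table(actual = pottery$origin, robust = pr@classification)
table(actual = pottery$origin, classical = pc@classification)
```

The two tables cross-tabulate the known origin against the predicted class, which gives a quick view of how the robust and classical models differ on this small data set.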

Community guidelines

Report issues and request features

If you experience any bugs or issues or if you have any suggestions for additional features, please submit an issue via the Issues tab of this repository. Please have a look at existing issues first to see if your problem or feature request has already been discussed.

Contribute to the package

If you want to contribute to the package, you can fork this repository and create a pull request after implementing the desired functionality.

Ask for help

If you need help using the package, or if you are interested in collaborations related to this project, please get in touch with the package maintainer.

Monthly Downloads

1,234

Version

0.3-1

License

GPL (>= 3)

Last Published

August 17th, 2024

Functions in rrcovHD (0.3-1)

Outlier-class

Class "Outlier" -- a base class for outlier identification
OutlierPCOut-class

Class "OutlierPCOut" - Outlier identification in high dimensions using the PCOUT algorithm
OutlierPCDist-class

Class "OutlierPCDist" - Outlier identification in high dimensions using the PCDIST algorithm
OutlierMahdist-class

Class "OutlierMahdist" - Outlier identification using robust (Mahalanobis) distances based on robust multivariate location and covariance matrix
RSimca

Robust classification in high dimensions based on the SIMCA method
PredictSosDisc-class

Class "PredictSosDisc" - prediction of "SosDisc" objects
OutlierSign1

Outlier identification in high dimensions using the SIGN1 algorithm
OutlierPCOut

Outlier identification in high dimensions using the PCOUT algorithm
SPcaGrid

Sparse Robust Principal Components based on Projection Pursuit (PP): GRID search Algorithm
OutlierSign2-class

Class "OutlierSign2" - Outlier identification in high dimensions using the SIGN2 algorithm
OutlierSign1-class

Class "OutlierSign1" - Outlier identification in high dimensions using the SIGN1 algorithm
SosDisc-class

Class "SosDisc" - virtual base class for all classic and robust SosDisc classes representing the results of the robust and sparse multigroup classification by the optimal scoring approach
SosDiscClassic-class

Class "SosDiscClassic" - sparse multigroup classification by the optimal scoring approach
RSimca-class

Class "RSimca" - robust classification in high dimensions based on the SIMCA method
SummarySimca-class

Class "SummarySimca" - summary of "Simca" objects
Simca-class

Class "Simca" - virtual base class for all classic and robust SIMCA classes representing classification in high dimensions based on the SIMCA method
OutlierSign2

Outlier identification in high dimensions using the SIGN2 algorithm
SPcaGrid-class

Class "SPcaGrid" - Sparse Robust PCA using PP - GRID search Algorithm
SummarySosDisc-class

Class "SummarySosDisc" - summary of "SosDisc" objects
PredictSimca-class

Class "PredictSimca" - prediction of "Simca" objects
SosDiscRobust-class

Class "SosDiscRobust" - robust and sparse multigroup classification by the optimal scoring approach
SosDiscRobust

Robust and sparse multigroup classification by the optimal scoring approach
getWeight-methods

Accessor methods to the essential slots of Outlier and its subclasses
kibler

1985 Auto Imports Database
CSimca-class

Class "CSimca" - classification in high dimensions based on the (classical) SIMCA method
CSimca

Classification in high dimensions based on the (classical) SIMCA method
OutlierMahdist

Outlier identification using robust (Mahalanobis) distances based on robust multivariate location and covariance matrix
OutlierPCDist

Outlier identification in high dimensions using the PCDIST algorithm
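
As a brief illustration of the outlier identification functions listed above, the following sketch runs OutlierPCOut and reads the result through accessor functions. It rests on several assumptions: that the hemophilia data set (with grouping factor gr) is available from the rrcov package that rrcovHD loads, that getDistance() and getFlag() are accessors exported for "Outlier" objects, and that a flag of 0 marks an observation identified as an outlier, following the rrcov convention. Treat it as a starting point and consult the help pages for the exact arguments.

```r
library(rrcovHD)

## Assumption: 'hemophilia' (two groups, factor 'gr') comes from rrcov
data(hemophilia)

## Group-wise outlier identification with the PCOUT algorithm
obj <- OutlierPCOut(gr ~ ., data = hemophilia)
obj

## Accessors assumed to be shared by the Outlier subclasses
d  <- getDistance(obj)   # distance (outlyingness) per observation
fl <- getFlag(obj)       # 0/1 flag per observation

## Assumption: 0 marks an observation flagged as an outlier
which(fl == 0)
```

The same calling pattern should apply to OutlierMahdist, OutlierPCDist, OutlierSign1 and OutlierSign2, since all of them return objects derived from the "Outlier" base class.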