rchemo - Dimension reduction, Regression and Discrimination for Chemometrics

rchemo is a package for data exploration and prediction, with a focus on high-dimensional data and chemometrics.

The package was initially designed around partial least squares (PLS) regression and discrimination models and their variants, in particular locally weighted PLS models (LWPLS) (e.g. https://doi.org/10.1002/cem.3209). It has since been expanded to many other methods for analyzing high-dimensional data.

The name rchemo reflects the package's orientation toward chemometrics, but most of the provided methods are generic and can be applied in other domains.

Functions such as transform, predict, coef and summary are available. Tuning the predictive models is facilitated by generic functions gridscore (validation dataset) and gridcv (cross-validation). Faster versions are also available for models based on latent variables (LVs) (gridscorelv and gridcvlv) and ridge regularization (gridscorelb and gridcvlb).
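As an illustration of the tuning functions described above, here is a minimal sketch that scores PLSR models with 0 to 5 latent variables on a validation dataset. The data are simulated, and the argument names (score, fun, nlv, verb) and the use of msep and plskern are written as recalled from the package help pages, so treat them as assumptions and check ?gridscorelv before relying on them.

```r
library(rchemo)

set.seed(1)
n <- 50; p <- 8
# Simulated training data (illustration only, not a real spectral dataset)
Xtrain <- matrix(rnorm(n * p), ncol = p)
ytrain <- rnorm(n)
# Simulated validation data
Xval <- matrix(rnorm(20 * p), ncol = p)
yval <- rnorm(20)

# Score PLSR models with 0 to 5 latent variables on the validation set,
# using the mean squared error of prediction as the criterion
res <- gridscorelv(Xtrain, ytrain, Xval, yval,
                   score = msep, fun = plskern, nlv = 0:5, verb = FALSE)
res
```

The result is expected to hold one row per number of latent variables, so the row with the lowest score points to a candidate model dimension.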

All the functions have a help page with a documented example.

NOTE: This repository replaces the previous rchemo repository, which is now archived.

News

Click HERE to see what changed in the previous versions.

or write in the R console

news(package = "rchemo")

Installation

Using RStudio is recommended for installation and usage.

rchemo can be installed from CRAN, the official R package repository.

It can also be installed from the ChemHouse GitHub repository using the following steps:

1. Install package 'remotes' from CRAN

Use the RStudio menu

or write in the R console

install.packages("remotes")

2. Install package 'rchemo'

a) Most recent version

Write in the R console

remotes::install_github("ChemHouse-group/rchemo", dependencies = TRUE)

If the following question appears during installation:

"These packages have more recent versions available.
Which would you like to update?"

it is recommended to skip the updates (usually choice 3 = None).

b) Any given tagged version

e.g. with tag "v0.1-1", write in the R console

remotes::install_github("ChemHouse-group/rchemo@v0.1-1", dependencies = TRUE)

Usage

Write in the R console

library(rchemo)
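Once the package is loaded, a typical session fits a model and predicts on new data. The following is a minimal sketch on simulated data, assuming that plskern takes the number of latent variables through nlv and that predict returns a list with a pred element, as recalled from the package help pages. Check ?plskern for the exact interface.

```r
library(rchemo)

set.seed(1)
n <- 50; p <- 8
# Simulated data (illustration only)
Xtrain <- matrix(rnorm(n * p), ncol = p)
ytrain <- rnorm(n)
Xtest <- Xtrain[1:3, , drop = FALSE]
ytest <- ytrain[1:3]

# Fit a PLSR model with 3 latent variables
fm <- plskern(Xtrain, ytrain, nlv = 3)

# Predict the 3 test observations and compute the prediction error
pred <- predict(fm, Xtest)$pred
rmsep(pred, ytest)
```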

How to cite

Brandolini-Bunlon M., Jallais B., Roger J.M., Lesnoff M., 2023. R package rchemo: Dimension Reduction, Regression and Discrimination for Chemometrics. https://github.com/ChemHouse-group/rchemo.

Install

install.packages('rchemo')

Monthly Downloads: 250

Version: 0.1-3

License: GPL-3

Maintainer: Marion Brandolini-Bunlon

Last Published: September 11th, 2024

Functions in rchemo (0.1-3)

eposvd - External parameter orthogonalization (EPO)
forages - forages
kpca - KPCA
knnr - KNN-R
getknn - KNN selection
gridscore - Tuning of predictive models on a validation dataset
headm - Display of the first part of a data set
krbf - Kernel functions
interpl - Resampling of spectra by interpolation methods
knnda - KNN-DA
gridcv - Cross-validation
lwplsr - KNN-LWPLSR
lwplsrda_agg - Aggregation of KNN-LWPLSDA models with different numbers of LVs
kplsrda - KPLSR-DA models
lmr - Linear regression models
krrda - KRR-DA models
krr - KRR (LS-SVMR)
lmrda - LMR-DA models
kplsr - KPLSR Models
lda - LDA and QDA
locw - Locally weighted models
matW - Between and within covariance matrices
mavg - Smoothing by moving average
odis - Orthogonal distances from a PCA or PLS score space
octane - octane
mbplsr_mbplsda_allsteps - MBPLSR or MBPLSDA analysis steps
mbplsrda - multi-block PLSDA models
mbplsr - multi-block PLSR algorithms
orthog - Orthogonalization of a matrix to another matrix
plotjit - Jittered plot
plotscore - Plotting error rates
pinv - Moore-Penrose pseudo-inverse of a matrix
plsrda - PLSDA models
lwplsr_agg - Aggregation of KNN-LWPLSR models with different numbers of LVs
lwplsrda - KNN-LWPLS-DA Models
ozone - ozone
plotxna - Plotting Missing Data in a Matrix
plsrda_agg - PLSDA with aggregation of latent variables
pcasvd - PCA algorithms
plotxy - 2-d scatter plot
plotsp - Plotting spectra
rrda - RR-DA models
plsr_agg - PLSR with aggregation of latent variables
rr - Linear Ridge Regression
plskern - PLSR algorithms
rmgap - Removing vertical gaps in spectra
savgol - Savitzky-Golay smoothing
sampks - Kennard-Stone sampling
plsr_plsda_allsteps - PLSR or PLSDA analysis steps
sampdp - Duplex sampling
sampcla - Within-class sampling
segmkf - Segments for cross-validation
mse - Residuals and prediction error rates
sopls - Block dimension reduction by SO-PLS
scordis - Score distances (SD) in a PCA or PLS score space
snv - Standard normal variate transformation (SNV)
soplsr_soplsda_allsteps - SOPLSR or SOPLSDA analysis steps
sourcedir - Source R functions in a directory
vip - Variable Importance in Projection (VIP)
soplsrda - Block dimension reduction by SO-PLS-DA
summ - Description of the quantitative variables of a data set
wdist - Distance-based weights
svmr - SVM Regression and Discrimination
xfit - Matrix fitting from a PCA or PLS model
transform - Generic transform function
selwold - Heuristic selection of the dimension of a latent variable model with Wold's criterion
aicplsr - AIC and Cp for Univariate PLSR Models
aggmean - Centers of classes
blockscal - Block autoscaling
checkdupl - Duplicated rows in datasets
cglsr - CG Least Squares Models
dderiv - Derivation by finite difference
covsel - CovSel
cassav - cassav
asdgap - asdgap
dmnorm - Multivariate normal probability density
dkrr - Direct KRR Models
dummy - Table of dummy variables
dfplsr_cg - Degrees of freedom of Univariate PLSR Models
checkna - Find and count NA values in a dataset
dkplsr - Direct KPLSR Models
detrend - Polynomial de-trend transformation
dtagg - Summary statistics of data subsets
fda - Factorial discriminant analysis
euclsq - Matrix of distances