
mldr.resampling

A collection of state-of-the-art multilabel resampling algorithms. These algorithms aim to reduce label imbalance in multilabel datasets.
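
To make "balance" concrete, the sketch below computes two imbalance measures commonly used in the multilabel literature, IRLbl (imbalance ratio per label) and MeanIR (their average), on a made-up label matrix. It uses base R only; the toy data and the variable names are assumptions for illustration and are not part of mldr.resampling.

```r
# Toy label matrix: 6 instances (rows) x 3 labels (columns), 1 = label active.
labels <- matrix(
  c(1, 0, 0,
    1, 0, 0,
    1, 1, 0,
    1, 0, 0,
    1, 1, 0,
    1, 0, 1),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("common", "medium", "rare"))
)

# Number of instances in which each label is active.
label_counts <- colSums(labels)

# IRLbl: count of the most frequent label divided by the count of each label.
irlbl <- max(label_counts) / label_counts

# MeanIR: average IRLbl over all labels; a value of 1 means perfect balance.
mean_ir <- mean(irlbl)

print(label_counts)
print(irlbl)
print(mean_ir)
```

A resampling algorithm is doing its job when these ratios move closer to 1 after it runs, either because rare labels gained instances (oversampling) or because frequent labels lost some (undersampling).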

Installation

Use install.packages to install mldr.resampling and its dependencies:

install.packages("mldr.resampling")

Alternatively, you can install it from GitHub with install_github from the devtools package:

devtools::install_github("madr0008/mldr.resampling")

Building from source

Use build from the devtools package to build the package from source:

devtools::build(args = "--compact-vignettes=gs+qpdf")

Usage and examples

The package provides an interface function, resample, that runs one or several algorithms on a given mldr dataset. It can be called as follows:

library(mldr.resampling)

resample(birds, c("MLSOL", "MLeNN"), P=30, k=5, TH=0.4)

For more examples and a detailed explanation of the available functions, please refer to the documentation.

Monthly Downloads

244

Version

0.2.3

License

MIT + file LICENSE

Maintainer

Miguel Ángel Dávila

Last Published

August 22nd, 2023

Functions in mldr.resampling (0.2.3)

initTypes

Auxiliary function used by MLSOL. Assigns a type to each instance-label pair of the dataset

MLTL

Multilabel approach for the Tomek Link undersampling algorithm (MLTL)

generateInstanceMLSOL

Auxiliary function used by MLSOL. Creates a synthetic sample based on two other samples, taking into account their types

MLSMOTE

Synthetic oversampling of multilabel instances (MLSMOTE)

executeAlgorithm

Auxiliary function used by resample. It executes an algorithm, given as a string, and stores the resulting MLD in an ARFF file

getAllNeighbors2

Auxiliary function used by MLeNN and MLTL. Gets the kNN of every instance in a dataset, computed against a subset of the remaining instances

getAllNeighbors

Auxiliary function used by MLSOL and MLUL. Computes the kNN of every instance in a dataset

getNN

Auxiliary function used to compute the neighbors of an instance

LPRUS

Randomly deletes instances with majority labelsets

newSample

Auxiliary function used by MLSMOTE. Creates a synthetic sample based on the attribute and label values of its neighbors

getNumCores

Get the number of cores available for parallel computing

MLRUS

Randomly deletes instances with majority labels

getAllReverseNeighbors

Auxiliary function used by MLUL. For each instance in the dataset, given the neighbors structure, it computes its reverse nearest neighbors

getS

Auxiliary function used by MLSOL and MLUL. For non-outlier instances, it aggregates the values of C, taking into account the global class imbalance

getU

Auxiliary function used by MLUL. It computes the influence of each instance with respect to its reverse neighbors

getC

Auxiliary function used by MLSOL and MLUL. For each instance in the dataset and for each label, it computes the proportion of neighbors whose class is opposite to that of the instance itself

getV

Auxiliary function used by MLUL. It calculates, for each instance, how important it is in the dataset

calculateTableVDM

Auxiliary function used to precompute a lookup table that makes VDM calculation faster

calculateDistances

Auxiliary function used to calculate the distances between an instance and the ones with a specific active label. Euclidean distance is calculated for numeric attributes, and VDM for non-numeric ones

resample

Interface function of the package. It executes one or several algorithms, given as strings, and stores the resulting MLDs in ARFF files

getW

Auxiliary function used by MLSOL and MLUL. For non-outlier instances, it aggregates the values of S for each label

setNumCores

Set the number of cores available for parallel computing

setParallel

Enable/Disable parallel computing

vdm

Auxiliary function used to calculate the Value Difference Metric (VDM) between two instances considering their non-numeric attributes

LPROS

Randomly clones instances with minority labelsets

MLSOL

Multi-label oversampling based on local label imbalance (MLSOL)

MLUL

Multi-label undersampling based on local label imbalance (MLUL)

REMEDIAL

Decouples highly imbalanced labels

adjustedHammingDist

Auxiliary function used by MLeNN. Computes the Hamming distance between two instances

MLeNN

Multilabel edited Nearest Neighbor (MLeNN)

MLROS

Randomly clones instances with minority labels

MLRkNNOS

Reverse-nearest neighborhood based oversampling for imbalanced, multi-label datasets
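
As a rough illustration of the simplest idea in the list above, random cloning of instances with minority labelsets (the idea behind LPROS), here is a simplified base R sketch. It is not the package's implementation: the helper name lp_oversample_sketch, the toy data frame, and the rule "minority means fewer occurrences than the mean labelset count" are assumptions made for this example.

```r
set.seed(42)

# Toy dataset: one numeric feature and two binary labels.
toy <- data.frame(
  x  = c(0.1, 0.4, 0.5, 0.7, 0.9, 1.2, 1.5, 1.8),
  l1 = c(1, 1, 1, 1, 1, 1, 0, 0),
  l2 = c(0, 0, 0, 0, 0, 1, 1, 1)
)
label_cols <- c("l1", "l2")

# Hypothetical helper (not part of mldr.resampling): adds roughly P percent
# more rows by cloning instances whose labelset is rarer than average.
lp_oversample_sketch <- function(data, label_cols, P) {
  # Encode each row's labelset as a string, e.g. "1-0".
  labelset <- do.call(paste, c(data[label_cols], sep = "-"))
  counts <- table(labelset)

  # Minority labelsets: those appearing fewer times than the mean count.
  minority <- names(counts)[counts < mean(counts)]
  candidates <- which(labelset %in% minority)

  n_new <- floor(nrow(data) * P / 100)
  if (length(candidates) == 0 || n_new == 0) return(data)

  # Sample positions within `candidates` (with replacement) so that a single
  # candidate row is not misread by sample() as a range 1:n.
  picked <- candidates[sample.int(length(candidates), n_new, replace = TRUE)]
  rbind(data, data[picked, , drop = FALSE])
}

balanced <- lp_oversample_sketch(toy, label_cols, P = 50)
print(nrow(balanced))
print(table(do.call(paste, c(balanced[label_cols], sep = "-"))))
```

Here the majority labelset "1-0" is never cloned, so its share of the dataset shrinks as the four cloned rows are appended. The undersampling counterparts in the list work in the opposite direction, removing rows from majority labelsets or labels instead of adding rows to minority ones.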