FRESA.CAD

Feature Selection Algorithms for Computer Aided Diagnosis.

Set of functions for: Conditioning, Feature Selection, Machine Learning, Cross-Validation, and Visual Evaluation

Overview

The design of diagnostic or prognostic multivariate models via the selection of significantly discriminant features is complex.

FRESA.CAD provides functions for data conditioning, feature selection, machine learning, benchmarking, visualization, and reporting.

| Category | Function(s) | Purpose |
| --- | --- | --- |
| Conditioning/Preprocessing | nearestNeighborImpute() | Impute missing values |
| Conditioning/Preprocessing | FRESAScale() | Data Scale/Normalization |
| Conditioning/Preprocessing | featureAdjustment() | Adjust variables removing collinearity |
| Conditioning/Preprocessing | IDeA()/ILAA() | Multicollinearity Mitigation |
| Feature Selection | uniRankVar() | Univariate Analysis |
| Feature Selection | BSWiMS.model() | Linear Model Subset Selection |
| Feature Selection | univariate_BinEnsemble() | Ensemble Select Top Features |
| Feature Selection | univariate... | Filter Select Top Features ... |
| Machine Learning | BSWiMS.model() | Bootstrap Modeling |
| Machine Learning | filteredFit() | Pipeline ML: Scale/Filter/Transform/Learn |
| Machine Learning | HLCM()/HLCM_EM() | Latent-Class Based Modeling |
| Machine Learning | GMVECluster() | Unsupervised Clustering via GMVE |
| Benchmarking / Evaluation | randomCV() | Random Holdout Validation |
| Benchmarking / Evaluation | BinaryBenchmark() | Binary Model Evaluation |
| Benchmarking / Evaluation | OrdinalBenchmark() | Ordinal Model Evaluation |
| Benchmarking / Evaluation | CoxBenchmark() | Cox-based Model Evaluation |
| Visualization / Reporting | RRPlot() | Survival Model Evaluation |
| Visualization / Reporting | predictionStats_binary() | Report Cross Validation Results Binary |
| Visualization / Reporting | predictionStats_Ordinal() | Report Cross Validation Results Ordinal |
| Visualization / Reporting | predictionStats_survival() | Report Cross Validation Results Survival |

Besides the functions listed above, the package provides predictors and wrappers for common machine learning methods, along with many other auxiliary functions.
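The categories in the table can be read as consecutive stages of one workflow: condition the data, select features, fit a model, and evaluate it on held-out samples. The following is a minimal base-R sketch of that idea on simulated data. It does not call FRESA.CAD and is not the package's implementation; the simulated data, the hypothetical helper `rank_features`, the Wilcoxon ranking, and the choice of keeping two features are all assumptions made for illustration.

```r
set.seed(42)

# Simulated data (assumption): 200 samples, 6 features, only f1 and f2 carry signal
n <- 200
outcome <- rbinom(n, 1, 0.5)
features <- as.data.frame(matrix(rnorm(n * 6), ncol = 6))
names(features) <- paste0("f", 1:6)
features$f1 <- features$f1 + 1.0 * outcome
features$f2 <- features$f2 - 0.8 * outcome

# Random holdout split: 140 training samples, 60 test samples
train_idx <- sample(n, size = 140)

# 1. Conditioning: z-score every feature using training-set statistics only
mu <- colMeans(features[train_idx, ])
sdev <- apply(features[train_idx, ], 2, sd)
scaled <- as.data.frame(scale(features, center = mu, scale = sdev))

# 2. Feature selection: rank features by Wilcoxon p-value on the training set
# (rank_features is a hypothetical helper, not a FRESA.CAD function)
rank_features <- function(x, y) {
  p <- sapply(x, function(col) wilcox.test(col[y == 1], col[y == 0])$p.value)
  sort(p)
}
ranking <- rank_features(scaled[train_idx, ], outcome[train_idx])
selected <- names(ranking)[1:2]

# 3. Learning: logistic regression on the selected features
train <- cbind(outcome = outcome[train_idx], scaled[train_idx, selected])
model <- glm(outcome ~ ., data = train, family = binomial)

# 4. Evaluation: accuracy on the held-out samples
test <- scaled[-train_idx, selected]
prob <- predict(model, newdata = test, type = "response")
accuracy <- mean((prob > 0.5) == (outcome[-train_idx] == 1))

print(selected)
print(accuracy)
```

Scaling and ranking use only the training rows so that the held-out accuracy is not influenced by the test samples.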

Installation

You can install the official release of the package from CRAN using:

install.packages("FRESA.CAD")

To install the development version from GitHub, use:

# Install 'devtools' package if you haven't already
install.packages("devtools")

# Install the package from GitHub
devtools::install_github("https://github.com/joseTamezPena/FRESA.CAD")
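If the install step is part of a script that runs repeatedly, it can help to check whether a package is already available before installing it. The sketch below uses only base R; `ensure_package` is a hypothetical helper introduced here for illustration and is not part of FRESA.CAD.

```r
# Hypothetical helper: report whether a package is available and,
# if requested, install it from CRAN only when it is missing.
ensure_package <- function(pkg, install = FALSE) {
  available <- requireNamespace(pkg, quietly = TRUE)
  if (!available && install) {
    install.packages(pkg)
    available <- requireNamespace(pkg, quietly = TRUE)
  }
  available
}

# 'stats' ships with R, so this check succeeds without installing anything
has_stats <- ensure_package("stats")
print(has_stats)
```

With this helper, calling `ensure_package("FRESA.CAD", install = TRUE)` would install the CRAN release only when the package is not already present.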

Usage

#Load the package
library(FRESA.CAD)

#Surv() is used below; it is provided by the survival package
library(survival)

#For comprehensive evaluation of confusion tables
library("epiR")

# Example usage

data(stagec,package = "rpart")
options(na.action = 'na.pass')
dataCancer <- cbind(pgstat = stagec$pgstat,
                        pgtime = stagec$pgtime,
                        as.data.frame(
                          model.matrix(Surv(pgtime,pgstat) ~ .,stagec))[-1])

#Impute missing values
dataCancerImputed <- nearestNeighborImpute(dataCancer)
data(cancerVarNames)

UniRankFeaturesRaw <- univariateRankVariables(variableList = cancerVarNames,
                                                  formula = "pgstat ~ 1+pgtime",
                                                  Outcome = "pgstat",
                                                  data = dataCancer, 
                                                  categorizationType = "Raw", 
                                                  type = "LOGIT", 
                                                  rankingTest = "zIDI",
                                                  description = "Description",
                                                  uniType="Binary")
print(UniRankFeaturesRaw)

# A simple BSWiMS model

BSWiMSModel <- BSWiMS.model(formula = Surv(pgtime, pgstat) ~ 1, dataCancerImputed)
#The list of all models of the bootstrap forward selection 
print(BSWiMSModel$forward.selection.list)

#With FRESA.CAD we can do a leave-one-out using the list of models
pm <- ensemblePredict(BSWiMSModel$forward.selection.list,
                          dataCancer,predictType = "linear",type="LOGIT",Outcome="pgstat")

#Plotting the ROC with 95% CI
pm <- plotModels.ROC(cbind(dataCancer$pgstat,
                               pm$ensemblePredict),
                     main=("LOO Forward Selection Median Predict"))

#The plotModels.ROC provides the diagnosis confusion matrix.
summary(epi.tests(pm$predictionTable))
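The last steps of the example combine three ideas: a median prediction taken over a list of models, a ROC analysis, and a confusion table. The following base-R sketch illustrates those ideas on simulated data. It is not how `ensemblePredict`, `plotModels.ROC`, or `epi.tests` are implemented; the data, the 25 bootstrap models, the 0.5 threshold, and the hypothetical helper `auc` are assumptions, and the performance shown is in-sample rather than leave-one-out.

```r
set.seed(7)

# Simulated binary-outcome data (assumption, unrelated to stagec)
n <- 150
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- rbinom(n, 1, plogis(0.9 * x1 - 0.7 * x2))
dat <- data.frame(y = y, x1 = x1, x2 = x2)

# Fit one logistic model per bootstrap resample
models <- lapply(1:25, function(i) {
  boot <- dat[sample(n, replace = TRUE), ]
  glm(y ~ x1 + x2, data = boot, family = binomial)
})

# Ensemble prediction: per-sample median of the individual model predictions
pred_matrix <- sapply(models, predict, newdata = dat, type = "response")
median_pred <- apply(pred_matrix, 1, median)

# AUC through the rank-sum (Mann-Whitney) identity
# (auc is a hypothetical helper, not a FRESA.CAD function)
auc <- function(score, label) {
  r <- rank(score)
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Confusion table at a 0.5 threshold, with sensitivity and specificity
predicted <- factor(median_pred > 0.5, levels = c(TRUE, FALSE))
actual <- factor(y == 1, levels = c(TRUE, FALSE))
confusion <- table(predicted, actual)
sensitivity <- confusion["TRUE", "TRUE"] / sum(confusion[, "TRUE"])
specificity <- confusion["FALSE", "FALSE"] / sum(confusion[, "FALSE"])

print(auc(median_pred, y))
print(confusion)
```

Taking the median rather than the mean keeps a few poorly fitted bootstrap models from dominating the combined prediction.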
    

More examples of FRESA.CAD usage can be found at: https://rpubs.com/J_Tamez

Contributing

Contributions are welcome! If you'd like to contribute to this project, please follow these guidelines:

- Fork the repository.

- Create a new branch: git checkout -b feature/new-feature.

- Make your changes and commit them: git commit -m 'Add new feature'.

- Push to the branch: git push origin feature/new-feature.

- Submit a pull request.

License

This project is licensed under the LGPL (>= 2).

Contact

For any questions or feedback, feel free to contact us at:

Email: jose.tamezpena@tec.mx

Twitter: @tamezpena

Monthly Downloads: 414

Version: 3.4.8

License: LGPL (>= 2)

Last Published: June 25th, 2024

Functions in FRESA.CAD (3.4.8)

- EmpiricalSurvDiff: Estimate the LR value and its associated p-values
- FRESA.Model: Automated model selection
- FRESA.CAD-package: FeatuRE Selection Algorithms for Computer-Aided Diagnosis (FRESA.CAD)
- BSWiMS.model: BSWiMS model selection
- CalibrationProbPoissonRisk: Baseline hazard and interval time Estimations
- BESS: CV BeSS fit
- cancerVarNames: Data frame used in several examples of this package
- ClustClass: Hybrid Hierarchical Modeling
- ForwardSelection.Model.Res: NeRI-based feature selection procedure for linear, logistic, or Cox proportional hazards regression models
- baggedModel: Get the bagged model from a list of models
- ForwardSelection.Model.Bin: IDI/NRI-based feature selection procedure for linear, logistic, and Cox proportional hazards regression models
- multivariate_BinEnsemble: Multivariate Filters
- FRESAScale: Data frame normalization
- TUNED_SVM: Tuned SVM
- GMVECluster: Set Clustering using the Generalized Minimum Volume Ellipsoid (GMVE)
- GMVEBSWiMS: Hybrid Hierarchical Modeling with GMVE and BSWiMS
- RRPlot: Plot and Analysis of Indices of Risk
- HLCM: Latent class based modeling of binary outcomes
- IDeA: Decorrelation of data frames
- barPlotCiError: Bar plot with error bars
- metric95ci: Estimators and 95CI
- backVarElimination_Bin: IDI/NRI-based backwards variable elimination
- backVarElimination_Res: NeRI-based backwards variable elimination
- getVar.Bin: Analysis of the effect of each term of a binary classification model by analysing its reclassification performance
- predict.CLUSTER_CLASS: Predicts ClustClass outcome
- filteredFit: A generic pipeline of Feature Selection, Transformation, Scale and fit
- plotModels.ROC: Plot test ROC curves of each cross-validation model
- getLatentCoefficients: Derived Features of the UPLTM transform
- getVar.Res: Analysis of the effect of each term of a linear regression model by analysing its residuals
- crossValidationFeatureSelection_Bin: IDI/NRI-based selection of a linear, logistic, or Cox proportional hazards regression model from a set of candidate variables
- plot.FRESA_benchmark: Plot the results of the model selection benchmark
- getSignature: Returns a CV signature template
- getKNNpredictionFromFormula: Predict classification using KNN
- ppoisGzero: Probability of more than zero events
- predict.BAGGS: Predicts baggedModel bagged models
- clusterISODATA: Cluster Clustering using the Isodata Approach
- nearestNeighborImpute: nearest neighbor NA imputation
- NAIVE_BAYES: Naive Bayes Modeling
- LM_RIDGE_MIN: Ridge Linear Models
- crossValidationFeatureSelection_Res: NeRI-based selection of a linear, logistic, or Cox proportional hazards regression model from a set of candidate variables
- bootstrapValidation_Res: Bootstrap validation of regression models
- bootstrapVarElimination_Res: NeRI-based backwards variable elimination with bootstrapping
- predict.FRESA_FILTERFIT: Predicts filteredFit models
- calBinProb: Calibrates Predicted Binary Probabilities
- bootstrapVarElimination_Bin: IDI/NRI-based backwards variable elimination with bootstrapping
- featureAdjustment: Adjust each listed variable to the provided set of covariates
- FilterUnivariate: Univariate Filters
- listTopCorrelatedVariables: List the variables that are highly correlated with each other
- jaccardMatrix: Jaccard Index of two labeled sets
- ensemblePredict: The median prediction from a list of models
- predict.FRESA_SVM: Predicts TUNED_SVM models
- predict.FRESA_NAIVEBAYES: Predicts NAIVE_BAYES models
- predictionStats: Prediction Evaluation
- predict.GMVE: Predicts GMVECluster clusters
- predict.FRESAsignature: Predicts CVsignature models
- predict.FRESA_RIDGE: Predicts LM_RIDGE_MIN models
- plot.bootstrapValidation_Bin: Plot ROC curves of bootstrap results
- predict.fitFRESA: Linear or probabilistic prediction
- randomCV: Cross Validation of Prediction Models
- mRMR.classic_FRESA: FRESA.CAD wrapper of mRMRe::mRMR.classic
- predict.FRESA_GLMNET: Predicts GLMNET fitted objects
- plot.bootstrapValidation_Res: Plot ROC curves of bootstrap results
- reportEquivalentVariables: Report the set of variables that will perform an equivalent IDI discriminant function
- uniRankVar: Univariate analysis of features (additional values returned)
- summary.fitFRESA: Returns the summary of the fit
- residualForFRESA: Return residuals from prediction
- summaryReport: Report the univariate analysis, the cross-validation analysis and the correlation analysis
- update.uniRankVar: Update the univariate analysis using new data
- univariateRankVariables: Univariate analysis of features
- predict.GMVE_BSWiMS: Predicts GMVEBSWiMS outcome
- updateModel.Bin: Update the IDI/NRI-based model using new data or new threshold values
- KNN_method: KNN Setup for KNN prediction
- bootstrapValidation_Bin: Bootstrap validation of binary classification models
- improvedResiduals: Estimate the significance of the reduction of predicted residuals
- trajectoriesPolyFeatures: Extract the per patient polynomial Coefficients of a feature trajectory
- rankInverseNormalDataFrame: rank-based inverse normal transformation of the data
- heatMaps: Plot a heat map of selected variables
- timeSerieAnalysis: Fit the listed time series variables to a given model
- GLMNET: GLMNET fit with feature selection
- updateModel.Res: Update the NeRI-based model using new data or new threshold values
- nearestCentroid: Class Label Based on the Minimum Mahalanobis Distance
- modelFitting: Fit a model to the data
- benchmarking: Compare performance of different model fitting/filtering algorithms
- predict.LogitCalPred: Predicts calibrated probabilities
- signatureDistance: Distance to the signature template
- summary.bootstrapValidation_Bin: Generate a report of the results obtained using the bootstrapValidation_Bin function
- predict.FRESAKNN: Predicts class::knn models
- predict.FRESA_HLCM: Predicts BOOST_BSWiMS models
- predict.FRESA_BESS: Predicts BESS models
- getMedianSurvCalibratedPrediction: Binary Predictions Calibration of Random CV
- CVsignature: Cross-validated Signature