SuperML

The goal of SuperML is to provide scikit-learn's standard fit, predict, transform interface for building machine learning models in R. It is built on top of recent R packages that provide optimized ways of training machine learning models.

Installation

You can install the latest stable CRAN version (recommended) using:

install.packages("superml")
# or, to install all dependencies at once:
install.packages("superml", dependencies = TRUE)

You can install the development version of superml from GitHub with:

# install.packages("devtools")
devtools::install_github("saraswatmks/superml")

Description

In superml, every machine learning algorithm is called a trainer. The following trainers are currently available:

  • LMTrainer: used to train linear, logistic, ridge, lasso models
  • KNNTrainer: K-Nearest Neighbour Models
  • KMeansTrainer: KMeans Model
  • NBTrainer: Naive Bayes Model
  • SVMTrainer: SVM Model
  • RFTrainer: Random Forest Model
  • XGBTrainer: XGBoost Model

In addition, the following utilities support common modeling tasks:

  • CountVectorizer: Create Bag of Words model
  • TfIdfVectorizer: Create TF-IDF feature model
  • LabelEncoder: Convert categorical features to numeric
  • GridSearchCV: For hyperparameter optimization
  • RandomSearchCV: For hyperparameter optimization
  • kFoldMean: Target encoding
  • smoothMean: Target encoding
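
As an illustration of what "target encoding" means here, the following is a minimal base R sketch of smoothed target-mean encoding. The helper `smooth_target_mean` and its `weight` argument are hypothetical names written for this example; they are not superml's `smoothMean`, whose exact signature and formula may differ.

```r
# Hypothetical helper (not part of superml): smoothed target-mean encoding.
# Each category is replaced by a blend of its own target mean and the
# global target mean; `weight` controls how strongly small categories
# are pulled toward the global mean.
smooth_target_mean <- function(category, target, weight = 10) {
  global_mean <- mean(target)
  n_k <- tapply(target, category, length)   # rows per category
  mean_k <- tapply(target, category, mean)  # target mean per category
  smoothed <- (n_k * mean_k + weight * global_mean) / (n_k + weight)
  # Map the per-category values back onto the original rows.
  as.numeric(smoothed[as.character(category)])
}

toy <- data.frame(
  city = c("a", "a", "a", "b"),
  bought = c(1, 1, 0, 1)
)
toy$city_encoded <- smooth_target_mean(toy$city, toy$bought, weight = 2)
print(toy)
```

With `weight = 2`, category "a" (three rows, mean 2/3) is encoded as 0.7 and category "b" (one row, mean 1) as about 0.833, both pulled toward the global mean of 0.75.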

To compute text similarity, the following functions are available:

  • bm_25: Computes BM25 matching scores
  • dot: Computes dot product between two vectors
  • dotmat: Computes dot product between vector & matrix
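
To make the dot-product entries above concrete, here is a small base R sketch of the underlying computation. The helpers `dot_product` and `dot_against_rows` are hypothetical and written only for this example; they are not superml's `dot` and `dotmat`, whose arguments and normalisation behaviour are not shown here.

```r
# Hypothetical helpers written in base R; they are not superml's dot/dotmat.

# Dot product between two numeric vectors of equal length.
dot_product <- function(a, b) sum(a * b)

# Dot product of one vector against every row of a matrix.
dot_against_rows <- function(v, m) as.numeric(m %*% v)

query <- c(1, 0, 2)
docs <- rbind(
  doc1 = c(1, 0, 2),
  doc2 = c(0, 3, 0),
  doc3 = c(2, 1, 1)
)

dot_product(query, docs["doc3", ])  # 1*2 + 0*1 + 2*1 = 4

scores <- dot_against_rows(query, docs)
names(scores) <- rownames(docs)
print(scores)  # doc1 = 5, doc2 = 0, doc3 = 4
```

A higher score means the document vector shares more weight with the query; `doc2` has no terms in common with the query and scores 0.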

Usage

Any machine learning model can be trained using the following steps:

data(iris)
library(superml)

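# random forest with 100 trees
rf <- RFTrainer$new(n_estimators = 100)
# fit takes the data frame and the name of the target column
rf$fit(iris, "Species")
# predict on a data frame (here, the training data itself)
pred <- rf$predict(iris)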
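
The example above predicts on the same rows it was trained on. A hedged sketch of a simple hold-out evaluation follows, using only the `RFTrainer` calls shown above. The split size and seed are arbitrary choices, and the sketch assumes that `predict` returns one class label per row, which should be checked against the package documentation.

```r
library(superml)
data(iris)

# Hold out 50 of the 150 rows for evaluation (arbitrary split and seed).
set.seed(42)
train_idx <- sample(nrow(iris), 100)
train <- iris[train_idx, ]
test <- iris[-train_idx, ]

# Same calls as in the usage example: fit on a data frame plus target name.
rf <- RFTrainer$new(n_estimators = 100)
rf$fit(train, "Species")

# Assumption: predict returns one class label per row of `test`.
pred <- rf$predict(test)

# Share of held-out rows whose predicted label matches the true label.
accuracy <- mean(as.character(pred) == as.character(test$Species))
print(accuracy)
```

Evaluating on held-out rows gives a less optimistic estimate than scoring the training data.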

Documentation

The documentation can be found here: SuperML Documentation

Contributions & Support

SuperML is my ambitious effort to help people train machine learning models in R as easily as they do in Python. I encourage you to use this library and to report bugs and suggest features through the project's GitHub issues.

Monthly Downloads

463

Version

0.5.7

License

GPL-3 | file LICENSE

Last Published

February 18th, 2024

Functions in superml (0.5.7)

  • GridSearchCV: Grid Search CV
  • RandomSearchCV: Random Search CV
  • RFTrainer: Random Forest Trainer
  • LabelEncoder: Label Encoder
  • KNNTrainer: K Nearest Neighbours Trainer
  • CountVectorizer: Count Vectorizer
  • KMeansTrainer: K-Means Trainer
  • Counter: Calculate count of values in a list or vector
  • NBTrainer: Naive Bayes Trainer
  • LMTrainer: Linear Models Trainer
  • bm25: Best Matching (BM25), deprecated
  • dotmat: Dot product similarity between a vector and matrix
  • check_package: Internal function
  • createFolds: Internal function
  • XGBTrainer: Extreme Gradient Boosting Trainer
  • smoothMean: smoothMean Calculator
  • dot: Dot product similarity in vectors
  • cla_train: cla_train
  • TfIdfVectorizer: TF-IDF (Term Frequency Inverse Document Frequency) Vectorizer
  • bm_25: BM25 Matching
  • sort_index: sort_index
  • kFoldMean: kFoldMean Calculator
  • normalise1d: normalise1d
  • normalise2d: normalise2d
  • testdata: Internal function
  • reg_train: reg_train