superml (version 0.4.0)

TfIdfVectorizer: TF-IDF (Term Frequency-Inverse Document Frequency) Vectorizer

Description

Provides an easy way to create a tf-idf feature matrix in R. It exposes fit and transform methods (similar to sklearn) for generating tf-idf features.

Usage

TfIdfVectorizer

Arguments

Format

R6Class object.

Usage

For usage details see Methods, Arguments and Examples sections.

tf_object = TfIdfVectorizer$new(max_df=1, min_df=1, max_features=1, smooth_idf=TRUE)
tf_object$fit(sentences)
tf_matrix = tf_object$transform(sentences)
tf_matrix = tf_object$fit_transform(sentences) ## alternate

Methods

$new()

Initialises an instance of the vectorizer.

$fit()

Learns the count-vectorizer encodings from the input and stores them on the object; it does not return anything.

$transform()

Returns the tf-idf matrix, based on the encodings learned in the fit method.

$fit_transform()

Fits on the input and returns the tf-idf matrix in a single call (an alternative to calling fit and then transform).
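
The split between fit and transform can be illustrated with a minimal base-R sketch. This is not superml's implementation: the helper names (tokenize, sketch_fit, sketch_transform), the tokenisation rule, the use of raw counts as term frequency, the smoothed idf formula log((1 + n) / (1 + df)) + 1, and the absence of row normalisation are all assumptions made for illustration.

```r
# Illustrative only: hypothetical helpers, not part of superml.

# Lower-case, strip punctuation, split on spaces (assumed tokenisation rule).
tokenize <- function(sentences) {
  cleaned <- gsub("[^a-z0-9_ ]", "", tolower(sentences))
  lapply(strsplit(cleaned, " +"), function(tok) tok[nzchar(tok)])
}

# "fit": learn the vocabulary and a smoothed idf weight per term.
sketch_fit <- function(sentences) {
  tokens <- tokenize(sentences)
  vocab <- sort(unique(unlist(tokens)))
  # Document frequency: number of sentences that contain each term.
  df <- vapply(vocab, function(term) {
    sum(vapply(tokens, function(tok) term %in% tok, logical(1)))
  }, numeric(1))
  n <- length(sentences)
  list(vocab = vocab, idf = log((1 + n) / (1 + df)) + 1)
}

# "transform": count terms against the learned vocabulary, then weight by idf.
# Terms that were not seen during fit are ignored.
sketch_transform <- function(model, sentences) {
  tokens <- tokenize(sentences)
  counts <- do.call(rbind, lapply(tokens, function(tok) {
    as.numeric(table(factor(tok, levels = model$vocab)))
  }))
  colnames(counts) <- model$vocab
  sweep(counts, 2, model$idf, `*`)
}

train <- c("alone in the dark", "many mothers in the lot")
model <- sketch_fit(train)

new_docs <- c("dark dark lot", "unseen words only")
m <- sketch_transform(model, new_docs)
print(round(m, 3))
```

In this sketch the second new sentence maps to a row of zeros because none of its words were in the fitted vocabulary, which is why fitting and transforming are kept as separate steps: the encodings come from one set of sentences and can then be applied to another.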

Examples

# NOT RUN {
df <- data.frame(sents = c('i am alone in dark.',
                           'mother_mary a lot',
                           'alone in the dark?',
                           'many mothers in the lot....'))
tf <- TfIdfVectorizer$new(smooth_idf = TRUE, min_df = 0.3)
tf_features <- tf$fit_transform(df$sents)
# }
