superml (version 0.4.0)

CountVectorizer: Count Vectorizer

Description

Creates a CountVectorizer model. Given a list of text, it builds a bag-of-words (BOW) model and returns a data frame of BOW features.
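
To make the idea of bag-of-words features concrete, here is a minimal base R sketch. It is not superml's implementation: the toy sentences, the whitespace tokenizer, and the variable names are assumptions made for illustration only.

```r
# Hypothetical toy input; not taken from the package.
sentences <- c("the cat sat", "the dog sat down", "the cat")

# Lowercase and split on whitespace (an assumed, simplistic tokenizer).
tokens <- strsplit(tolower(sentences), "\\s+")

# Vocabulary: sorted unique tokens across all documents.
vocab <- sort(unique(unlist(tokens)))

# One row per document, one column per vocabulary word; cells hold counts.
counts <- t(vapply(tokens,
                   function(tk) tabulate(match(tk, vocab), nbins = length(vocab)),
                   integer(length(vocab))))
colnames(counts) <- vocab

bow <- as.data.frame(counts)
print(bow)
```

Each row of `bow` describes one sentence by how often each vocabulary word occurs in it; word order is discarded.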

Usage

CountVectorizer

Arguments

Format

R6Class object.

Usage

For usage details see Methods, Arguments and Examples sections.

bst = CountVectorizer$new(min_df=1, max_df=1, max_features=1)
bst$fit(sentences)
bst$fit_transform(sentences)
bst$transform(sentences)

Methods

$new()

Initialises an instance of the vectorizer

$fit()

Learns the bag of words (the vocabulary) from the input text and stores it in the object

$transform()

Returns a bag-of-words matrix based on the encodings learned in the fit method

$fit_transform()

Fits and transforms the input in a single step and returns a bag-of-words matrix
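
The split between fitting and transforming can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: `tokenize`, `toy_fit`, and `toy_transform` are hypothetical helpers, and dropping unseen words at transform time is assumed here as the usual bag-of-words convention.

```r
# Hypothetical helpers for illustration; these are not superml functions.
tokenize <- function(x) strsplit(tolower(x), "[^a-z_]+")

# "fit": learn and remember the vocabulary from the training text.
toy_fit <- function(sentences) {
  sort(unique(unlist(tokenize(sentences))))
}

# "transform": count only the words that are in the learned vocabulary.
toy_transform <- function(vocab, sentences) {
  m <- t(vapply(tokenize(sentences), function(tk) {
    idx <- match(tk, vocab)
    idx <- idx[!is.na(idx)]          # words unseen during fit are dropped
    tabulate(idx, nbins = length(vocab))
  }, integer(length(vocab))))
  colnames(m) <- vocab
  m
}

train <- c("alone in the dark", "many mothers in the lot")
test  <- c("alone in the light")

vocab    <- toy_fit(train)              # learned once, on training data
test_bow <- toy_transform(vocab, test)  # reused on new data
print(test_bow)
```

Because the vocabulary is fixed at fit time, the transformed test data has the same columns as the training data, which is what a downstream model expects.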

Examples

# NOT RUN {
library(superml)

df <- data.frame(sents = c('i am alone in dark.','mother_mary a lot',
                           'alone in the dark?',
                           'many mothers in the lot....'))

# fit and transform the entire data in one go
bw <- CountVectorizer$new(min_df = 0.3)
tf_features <- bw$fit_transform(df$sents)

# fit on the entire data, then transform separately
# (the same transform call applies to train and test sets)
bw <- CountVectorizer$new()
bw$fit(df$sents)
tf_features <- bw$transform(df$sents)
# }