text2vec (version 0.5.0)

LatentSemanticAnalysis: Latent Semantic Analysis model

Description

Creates a latent semantic analysis (LSA) model. See https://en.wikipedia.org/wiki/Latent_semantic_analysis for details.

Usage

LatentSemanticAnalysis

LSA

Format

R6Class object.

Usage

For usage details see Methods, Arguments and Examples sections.

lsa = LatentSemanticAnalysis$new(n_topics, method = c("randomized", "irlba"))
lsa$fit_transform(x, ...)
lsa$transform(x, ...)
lsa$components

Methods

$new(n_topics, method = c("randomized", "irlba"))

Creates an LSA model with n_topics latent topics, using the SVD backend selected by method.

$fit_transform(x, ...)

Fits the model to an input sparse matrix x (preferably in dgCMatrix format), then transforms x to the latent space.

$transform(x, ...)

Transforms new data x to the latent space of an already fitted model.
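
To illustrate what fitting and transforming mean here, the following is a minimal base R sketch of LSA as a truncated SVD. It is not the text2vec implementation: the toy matrix, the variable names, and the choice to scale document coordinates by the full singular values are assumptions made for illustration, and text2vec may scale its output differently.

```r
# Toy document-term matrix: 4 documents x 5 terms (made-up counts).
dtm = matrix(c(2, 1, 0, 0, 0,
               1, 2, 0, 0, 1,
               0, 0, 3, 1, 0,
               0, 0, 1, 2, 0),
             nrow = 4, byrow = TRUE)

k = 2                          # number of latent topics (assumed value)
s = svd(dtm, nu = k, nv = k)   # keep only the first k singular vectors

# Analogue of fit_transform(): document coordinates in the latent space,
# here taken as U_k %*% D_k (one common convention).
doc_embedding = s$u %*% diag(s$d[1:k], nrow = k)

# Analogue of transform(): project unseen documents onto the same
# right singular vectors (the term-to-topic directions).
new_docs = matrix(c(1, 1, 0, 0, 0), nrow = 1)
new_embedding = new_docs %*% s$v

# Projecting the training matrix itself reproduces doc_embedding.
same = isTRUE(all.equal(dtm %*% s$v, doc_embedding))
```

The last line checks that projecting the training documents with the fitted term directions gives the same coordinates as the fit itself, which is why new documents must share the column (term) layout of the training matrix.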

Arguments

lsa

An LSA object.

x

An input document-term matrix, preferably in dgCMatrix format.

n_topics

integer, the desired number of latent topics.

method

character, one of c("randomized", "irlba"). Defines the underlying SVD algorithm. For very large data, "randomized" is usually faster and more accurate.

...

Arguments to internal functions. Most useful with fit_transform(), which passes them on to the irlba or svdr function used as the SVD backend.

Examples

# NOT RUN {
library(text2vec)
data("movie_review")
N = 100
tokens = movie_review$review[1:N] %>% tolower %>% word_tokenizer
dtm = create_dtm(itoken(tokens), hash_vectorizer())
n_topics = 10
lsa_1 = LatentSemanticAnalysis$new(n_topics)
d1 = lsa_1$fit_transform(dtm)
# the same, but wrapped with S3 methods
d2 = fit_transform(dtm, lsa_1)

# }
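
The example above only fits the model. As a hedged sketch of the transform() method on unseen documents, the code below fits on the first 100 reviews and projects the next 10. The helper tokenize and the variable names are hypothetical; the sketch assumes text2vec is installed and relies on hash_vectorizer() giving both matrices the same column layout.

```r
library(text2vec)
data("movie_review")

# Hypothetical helper: lower-case and tokenize raw text.
tokenize = function(txt) word_tokenizer(tolower(txt))

# A hashing vectorizer maps terms to a fixed number of columns, so the
# training and new matrices share the same term dimension.
vectorizer = hash_vectorizer()
dtm_train = create_dtm(itoken(tokenize(movie_review$review[1:100])), vectorizer)
dtm_new = create_dtm(itoken(tokenize(movie_review$review[101:110])), vectorizer)

lsa = LatentSemanticAnalysis$new(n_topics = 10)
d_train = lsa$fit_transform(dtm_train)  # fit on training documents
d_new = lsa$transform(dtm_new)          # project unseen documents
```

Each row of d_new corresponds to one new review and each column to one latent topic.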