textmineR

Functions for Text Mining and Topic Modeling

Copyright 2021 by Thomas W. Jones

An aid for text mining in R, with a syntax that should be familiar to experienced R users. It also implements various functions related to topic modeling, making it a good topic modeling workbench.

textmineR was created with three principles in mind:

  1. Maximize interoperability within R's ecosystem
  2. Scale well in terms of object storage and computation time
  3. Use syntax that is idiomatic to R

Please see the vignettes for more information on how to get started.
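
As a rough orientation before reading the vignettes, the sketch below runs a minimal workflow: build a document term matrix, inspect term frequencies, fit a small LDA model, and list the top terms per topic. The toy corpus and the settings (`k = 2`, 200 iterations, `cpus = 1`) are assumptions chosen only to make the example run quickly; they are not recommendations from the package author.

```r
library(textmineR)

# Hypothetical toy corpus; the names become document IDs (DTM row names)
docs <- c(
  doc1 = "Topic models summarize large collections of text documents.",
  doc2 = "Sparse matrices store document term counts efficiently.",
  doc3 = "Text mining in R often starts with a document term matrix.",
  doc4 = "Latent Dirichlet Allocation is a popular topic model."
)

# Build a sparse document term matrix (one row per document)
dtm <- CreateDtm(
  doc_vec = docs,
  doc_names = names(docs),
  ngram_window = c(1, 1),  # unigrams only
  verbose = FALSE,
  cpus = 1                 # assumption: single core is enough for a toy corpus
)

# Term frequencies and document frequencies as a data frame
tf <- TermDocFreq(dtm)

# Fit a small LDA model; k and the iteration counts are illustrative only
set.seed(42)
model <- FitLdaModel(dtm = dtm, k = 2, iterations = 200, burnin = 100)

# model$theta holds P(topic | document); model$phi holds P(term | topic)
top_terms <- GetTopTerms(phi = model$phi, M = 3)
print(top_terms)
```

Because the DTM is an ordinary sparse matrix and the fitted model exposes plain numeric matrices, the results can be passed to other R packages without conversion, which is the interoperability principle in practice.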

Note: there's a lot going on with textmineR at the moment, including adding functionality based on original research.

Install

install.packages('textmineR')

Monthly Downloads

1,138

Version

3.0.5

License

MIT + file LICENSE

Maintainer

Last Published

June 28th, 2021

Functions in textmineR (3.0.5)

CalcTopicModelR2

Calculate the R-squared of a topic model.
GetProbableTerms

Get cluster labels using a "more probable" method of terms
CalcProbCoherence

Probabilistic coherence of topics
FitLsaModel

Fit a topic model using Latent Semantic Analysis
FitCtmModel

Fit a Correlated Topic Model
textmineR-deprecated

Deprecated functions in package textmineR.
FitLdaModel

Fit a Latent Dirichlet Allocation topic model
posterior.lda_topic_model

Draw from the posterior of an LDA topic model
predict.ctm_topic_model

Predict method for Correlated topic models (CTM)
CalcJSDivergence

Calculate Jensen-Shannon Divergence
CreateTcm

Convert a character vector to a term co-occurrence matrix.
nih

Abstracts and metadata from NIH research grants awarded in 2014
Dtm2Docs

Convert a DTM to a Character Vector of documents
textmineR

textmineR
Dtm2Tcm

Turn a document term matrix into a term co-occurrence matrix
Dtm2Lexicon

Turn a document term matrix into a list for LDA Gibbs sampling
posterior

Posterior methods for topic models
LabelTopics

Get some topic labels using a "more probable" method of terms
CalcLikelihood

Calculate the log likelihood of a document term matrix given a topic model
SummarizeTopics

Summarize topics in a topic model
TmParallelApply

TermDocFreq

Get term frequencies and document frequencies from a document term matrix.
CreateDtm

Convert a character vector to a document term matrix.
Internals

Internal helper functions for textmineR
Cluster2TopicModel

Represent a document clustering as a topic model
predict.lsa_topic_model

Predict method for LSA topic models
GetTopTerms

Get Top Terms for each topic from a topic model
predict.lda_topic_model

Get predictions from a Latent Dirichlet Allocation model
update

Update methods for topic models
update.lda_topic_model

Update a Latent Dirichlet Allocation topic model with new data
CalcGamma

Calculate a matrix whose rows represent P(topic_i|tokens)
CalcHellingerDist

Calculate Hellinger Distance
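
For readers unfamiliar with the two distribution-comparison measures listed above, the following base R sketch computes the textbook Hellinger distance and Jensen-Shannon divergence between two probability vectors. The helper names (`hellinger`, `kl`, `js_divergence`) are hypothetical, and the choice of log base 2 is an assumption; this illustrates the definitions only and is not the package's implementation of `CalcHellingerDist` or `CalcJSDivergence`.

```r
# Hellinger distance between probability vectors p and q (ranges from 0 to 1)
hellinger <- function(p, q) {
  sqrt(sum((sqrt(p) - sqrt(q))^2)) / sqrt(2)
}

# Kullback-Leibler divergence in bits; terms with p == 0 contribute nothing
kl <- function(p, q) {
  keep <- p > 0
  sum(p[keep] * log2(p[keep] / q[keep]))
}

# Jensen-Shannon divergence: average KL divergence to the midpoint distribution
js_divergence <- function(p, q) {
  m <- (p + q) / 2
  0.5 * kl(p, m) + 0.5 * kl(q, m)
}

# Two hypothetical topic distributions over a three-term vocabulary
p <- c(0.5, 0.5, 0.0)
q <- c(0.0, 0.5, 0.5)

h_pq  <- hellinger(p, q)
js_pq <- js_divergence(p, q)
print(c(hellinger = h_pq, jensen_shannon = js_pq))
```

Both measures are symmetric and bounded, which makes them convenient for comparing rows of a topic model's term or document distributions.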