Note: a newer version (3.0.5) of this package is available.

textmineR

Functions for Text Mining and Topic Modeling

Copyright 2019 by Thomas W. Jones

An aid for text mining in R, with a syntax that should feel more familiar to experienced R users. It also implements various functions related to topic modeling, making it a good topic modeling workbench.

textmineR was created with three principles in mind:

  1. Maximize interoperability within R's ecosystem
  2. Scalable in terms of object storage and computation time
  3. Syntax that is idiomatic to R

Please see the vignettes for more information on how to get started.
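
As a rough orientation before reading the vignettes, the following is a minimal sketch of a typical workflow using functions from the index on this page: build a document term matrix, inspect term frequencies, fit an LDA model, and list top terms. The argument values (`k = 10`, `iterations = 200`, `burnin = 150`, `M = 5`) are illustrative assumptions rather than recommendations, and the example assumes the bundled `nih_sample` data with its `ABSTRACT_TEXT` and `APPLICATION_ID` columns.

```r
# Quick-start sketch; all tuning values below are illustrative assumptions.
library(textmineR)

# Sample of NIH grant abstracts and metadata shipped with the package
data(nih_sample)

# Build a sparse document term matrix (unigrams only, default stopwords)
dtm <- CreateDtm(doc_vec = nih_sample$ABSTRACT_TEXT,
                 doc_names = nih_sample$APPLICATION_ID,
                 ngram_window = c(1, 1),
                 verbose = FALSE,
                 cpus = 1)

# Term frequencies and document frequencies for each term in the DTM
tf <- TermDocFreq(dtm = dtm)

# Fit a small LDA topic model; the seed makes the Gibbs sampler repeatable
set.seed(12345)
model <- FitLdaModel(dtm = dtm,
                     k = 10,
                     iterations = 200,
                     burnin = 150)

# Top 5 terms per topic: a matrix with one column per topic
top_terms <- GetTopTerms(phi = model$phi, M = 5)
print(top_terms)
```

Because the document term matrix is stored as a sparse matrix from the Matrix package, it can usually be passed to other R packages that accept sparse matrices, which is in keeping with the interoperability principle above.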

Note: there's a lot going on with textmineR at the moment, including adding functionality based on original research.

Install

install.packages('textmineR')

Monthly Downloads

1,142

Version

3.0.4

License

MIT + file LICENSE

Last Published

April 18th, 2019

Functions in textmineR (3.0.4)

FitLsaModel
Fit a topic model using Latent Semantic Analysis

textmineR
textmineR

textmineR-deprecated
Deprecated functions in package textmineR.

CalcLikelihood
Calculate the log likelihood of a document term matrix given a topic model

SummarizeTopics
Summarize topics in a topic model

predict.ctm_topic_model
Predict method for Correlated topic models (CTM)

LabelTopics
Get some topic labels using a "more probable" method of terms

posterior.lda_topic_model
Draw from the posterior of an LDA topic model

GetTopTerms
Get Top Terms for each topic from a topic model

Internals
Internal helper functions for textmineR

CalcTopicModelR2
Calculate the R-squared of a topic model.

FitLdaModel
Fit a Latent Dirichlet Allocation topic model

nih
Abstracts and metadata from NIH research grants awarded in 2014

predict.lsa_topic_model
Predict method for LSA topic models

predict.lda_topic_model
Get predictions from a Latent Dirichlet Allocation model

GetProbableTerms
Get cluster labels using a "more probable" method of terms

posterior
Posterior methods for topic models

Dtm2Lexicon
Turn a document term matrix into a list for LDA Gibbs sampling

Dtm2Tcm
Turn a document term matrix into a term co-occurrence matrix

TermDocFreq
Get term frequencies and document frequencies from a document term matrix.

TmParallelApply

update
Update methods for topic models

update.lda_topic_model
Update a Latent Dirichlet Allocation topic model with new data

Cluster2TopicModel
Represent a document clustering as a topic model

Dtm2Docs
Convert a DTM to a Character Vector of documents

CreateDtm
Convert a character vector to a document term matrix.

CalcHellingerDist
Calculate Hellinger Distance

CalcProbCoherence
Probabilistic coherence of topics

CalcGamma
Calculate a matrix whose rows represent P(topic_i|tokens)

CreateTcm
Convert a character vector to a term co-occurrence matrix.

CalcJSDivergence
Calculate Jensen-Shannon Divergence

FitCtmModel
Fit a Correlated Topic Model