cleanNLP (version 3.1.0)

cleanNLP-package: cleanNLP: A Tidy Data Model for Natural Language Processing

Description

Provides a set of fast tools for converting a textual corpus into a set of normalized tables. Several NLP backends are supported, and the output of each is standardized into the same normalized format. Options include stringi (very fast, but provides tokenization only), udpipe (fast, supports many languages, includes part-of-speech tags and dependencies), and spacy (Python backend; includes named entity recognition).

Details

Once the package is set up, run one of cnlp_init_stringi, cnlp_init_spacy, or cnlp_init_udpipe to load the desired NLP backend. After the initialization function returns, call cnlp_annotate to run the annotation engine over a corpus of text. The package vignettes provide more detailed set-up information.
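The two-step workflow described above can be sketched as follows. This is a minimal illustration, not the package's official example: it assumes the cleanNLP and stringi packages are installed, the stringi backend is chosen only because it needs no model download, and the document identifiers ("d1", "d2") and the variable names docs and anno are made up for the sketch.

```r
# Sketch only: assumes cleanNLP and stringi are installed.
library(cleanNLP)

# Step 1: load a backend (stringi provides tokenization only).
cnlp_init_stringi()

# A tiny corpus with one document per row; the ids are hypothetical.
docs <- data.frame(
  doc_id = c("d1", "d2"),
  text = c("This is a sentence.", "Here is something else to parse!"),
  stringsAsFactors = FALSE
)

# Step 2: run the annotation engine; verbose = FALSE suppresses progress messages.
anno <- cnlp_annotate(docs, verbose = FALSE)

# The result is a list of normalized tables.
names(anno)

# Token-level table (one row per token) and document-level table.
head(anno$token)
anno$document

# Tokens link back to their documents through doc_id.
table(anno$token$doc_id)
```

Switching backends should only require calling a different cnlp_init_* function before cnlp_annotate; the udpipe and spacy backends need additional set-up, as covered in the package vignettes.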

Examples

if (FALSE) {
library(cleanNLP)

# load the annotation engine
cnlp_init_stringi()

# build a small corpus, one document per row
input <- data.frame(
 text=c(
   "This is a sentence.",
   "Here is something else to parse!"
 ),
 stringsAsFactors=FALSE
)

# annotate your text
anno <- cnlp_annotate(input)
}