tm.plugin.lexisnexis (version 1.4.1)

tm.plugin.lexisnexis-package: A plug-in for the tm text mining framework to import articles from LexisNexis

Description

This package provides a tm Source to create corpora from articles exported from the LexisNexis content provider as HTML files.

Details

Typical usage is to create a corpus from an HTML file exported from LexisNexis (here called myLexisNexisArticles.html). Setting language = NA allows the language of each article to be set automatically from the information provided by LexisNexis:

    # Import corpus
    source <- LexisNexisSource("myLexisNexisArticles.html")
    corpus <- Corpus(source, readerControl = list(language = NA))

    # See how many articles were imported
    corpus

    # See the contents of the first article and its meta-data
    inspect(corpus[1])
    meta(corpus[[1]])

Currently, only HTML files saved in English and French are supported. Please send the maintainer examples of LexisNexis files in your language if you want it to be supported.
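
When the corpus is created with language = NA, it can be useful to check which language was recorded for each article. The following is a minimal sketch rather than part of the package: article_languages is a hypothetical helper, and it assumes that each document carries a "language" meta-data tag readable through tm's meta() function.

```r
library(tm)

# Hypothetical helper (not provided by tm or tm.plugin.lexisnexis):
# returns the "language" meta-data tag of every document in a corpus,
# or NA for documents where the tag is missing.
article_languages <- function(corpus) {
  vapply(corpus, function(doc) {
    lang <- meta(doc, "language")
    if (length(lang) == 1) as.character(lang) else NA_character_
  }, character(1))
}
```

Assuming corpus is the object created in the example above, table(article_languages(corpus), useNA = "ifany") would then count the articles per recorded language.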

See LexisNexisSource for more details and real examples.

References

http://www.lexisnexis.com/