tm.plugin.factiva (version 1.8.1)

tm.plugin.factiva-package: A plug-in for the tm text mining framework to import articles from Factiva

Description

This package provides a tm Source to create corpora from articles exported from Dow Jones's Factiva content provider as XML or HTML files.

Author

Milan Bouchet-Valat <nalimilan@club.fr>

Details

Typical usage is to create a corpus from an XML or HTML file exported from Factiva (here called myFactivaArticles.xml). Setting language=NA allows the language of each article to be set automatically from the information provided by Factiva:


    # Import corpus
    source <- FactivaSource("myFactivaArticles.xml")
    corpus <- Corpus(source, list(language=NA))

    # See how many articles were imported
    corpus

    # See the contents of the first article and its meta-data
    inspect(corpus[1])
    meta(corpus[[1]])

Currently, only HTML files saved in French are supported. Please send the maintainer examples of Factiva files in your language if you want it to be supported.

See FactivaSource for more details and real examples.

References

https://global.factiva.com/