tm.plugin.alceste (version 1.1.1)

AlcesteSource: Alceste Source

Description

Construct a source for an input containing a set of texts saved in the Alceste format in a single text file.

Usage

AlcesteSource(x, encoding = "auto")

Value

An object of class AlcesteSource, which extends the class Source, representing a set of articles from Alceste.

Arguments

x

Either a character string identifying the file, or a connection.

encoding

A character string: if non-empty declares the encoding used when reading the file, so the character data can be re-encoded. See the ‘Encoding’ section of the help for file. The default, “auto”, uses stri_enc_detect to try to guess the encoding; this may fail, in which case the native encoding is used.
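The "auto" behaviour described above can be approximated as follows. This is a minimal sketch, not the package's actual code: the helper name guess_encoding and the 1024-byte sample size are assumptions made for illustration. It relies on stringi::stri_enc_detect when the stringi package is installed, and returns "" (the native encoding) when detection is unavailable or yields nothing.

```r
# Hypothetical helper: guess a file's encoding, falling back to the
# native encoding ("") when no guess can be made.
guess_encoding <- function(path, n_bytes = 1024L) {
  # Without stringi there is nothing to detect with: use the native encoding
  if (!requireNamespace("stringi", quietly = TRUE)) return("")
  # Read only the first bytes of the file as raw data
  bytes <- readBin(path, what = "raw", n = n_bytes)
  # stri_enc_detect returns a list of data frames ordered by confidence
  guess <- stringi::stri_enc_detect(bytes)[[1]]$Encoding[1]
  if (is.null(guess) || is.na(guess)) "" else guess
}

# Write a small UTF-8 sample file (made-up content) and guess its encoding
tmp <- tempfile(fileext = ".txt")
sample_text <- enc2utf8("**** *auteur_dupont\nUn \u00e9t\u00e9 tr\u00e8s chaud.")
writeLines(sample_text, tmp, useBytes = TRUE)
enc <- guess_encoding(tmp)
print(enc)
```

When the guess looks wrong for a given file, passing the encoding explicitly, for example AlcesteSource(file, encoding = "UTF-8"), avoids detection altogether.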

Author

Milan Bouchet-Valat

Details

Several texts are saved in a single Alceste-formatted file. Texts are separated by lines starting with “***” or digits, followed by starred variables (see the link under See Also). These variables are set as document meta-data that can be accessed via the meta function.

Currently, “theme” lines starting with “-*” are ignored.
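To make the layout concrete, here is a base R sketch of how such a file could be split into texts and variables. It is an illustration only and does not reproduce how readAlceste parses files: the sample lines, the variable names (author, year), the helper parse_header, and the assumption that each variable is written as *name_value are all hypothetical.

```r
# Made-up sample in the layout described above
lines <- c(
  "**** *author_smith *year_2010",
  "First text, line one.",
  "-*theme_intro",
  "First text, line two.",
  "0002 *author_jones *year_2012",
  "Second text."
)

# Header lines start with "***" or digits; theme lines start with "-*"
is_header <- grepl("^(\\*\\*\\*|[0-9]+)", lines)
is_theme  <- grepl("^-\\*", lines)

# Each line belongs to the text opened by the most recent header
doc_id <- cumsum(is_header)

# Hypothetical helper: turn "*name_value" tokens into a named vector
parse_header <- function(h) {
  tokens <- regmatches(h, gregexpr("\\*[^ *_]+_[^ ]+", h))[[1]]
  values <- sub("^\\*[^_]+_", "", tokens)
  setNames(values, sub("^\\*([^_]+)_.*$", "\\1", tokens))
}

docs <- lapply(split(seq_along(lines), doc_id), function(idx) {
  body <- idx[-1]                # everything after the header line
  body <- body[!is_theme[body]]  # theme lines are dropped
  list(meta = parse_header(lines[idx[1]]),
       text = paste(lines[body], collapse = "\n"))
})

str(docs)
```

In this sketch the starred variables end up in the meta element of each text, which mirrors the role played by meta on a real corpus.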

See Also

https://image-zafar.com/sites/default/files/telechargements/formatage_alceste.pdf (in French) about the Alceste format

readAlceste for the function actually parsing individual articles.

getSources to list available sources.

Examples

    library(tm)
    file <- system.file("texts", "alceste_test.txt", 
                        package = "tm.plugin.alceste")
    corpus <- Corpus(AlcesteSource(file))

    # See the contents of the documents
    inspect(corpus)

    # See meta-data associated with first article
    meta(corpus[[1]])
