Read in an article exported from Europresse in the HTML format.
readEuropresseHTML1(elem, language, id)
readEuropresseHTML2(elem, language, id)
A PlainTextDocument with the contents of the article and the available meta-data set.
elem: A list with the named element content, which must hold the document to be read in.
language: A character vector giving the text's language. If set to NA, the language will automatically be set to the value reported in the document (which is usually correct).
id: A character vector representing a unique identification string for the returned text document.
Milan Bouchet-Valat
readEuropresseHTML1 reads documents in the old format, while readEuropresseHTML2 reads documents in the new one. EuropresseSource automatically chooses the correct reader based on the structure of the file.
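
Since EuropresseSource selects the reader itself, the two reader functions rarely need to be called directly. The following is a minimal sketch of that workflow, not an excerpt from the package: the helper read_europresse and the file name europresse_export.html are hypothetical, and it assumes that the tm and tm.plugin.europresse packages are installed. It returns NULL when the packages or the file are missing.

```r
# Hypothetical helper: read one Europresse HTML export into a tm corpus.
# Returns NULL if the required packages or the file are not available.
read_europresse <- function(path) {
  have_pkgs <- requireNamespace("tm", quietly = TRUE) &&
    requireNamespace("tm.plugin.europresse", quietly = TRUE)
  if (!have_pkgs || !file.exists(path)) {
    return(NULL)
  }
  suppressPackageStartupMessages({
    library(tm)
    library(tm.plugin.europresse)
  })
  # EuropresseSource picks the reader matching the structure of the file.
  src <- EuropresseSource(path)
  # language = NA asks the reader to use the language reported in each article.
  VCorpus(src, readerControl = list(language = NA))
}

corpus <- read_europresse("europresse_export.html")  # hypothetical file name
if (!is.null(corpus)) {
  # Meta-data set by the reader for the first article.
  print(meta(corpus[[1]]))
}
```

Passing language = NA through readerControl overrides the default language of the corpus constructor, so that each document keeps the language reported in the export.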
See getReaders to list available reader functions.