tm (version 0.5-9.1)

readDOC: Read In a MS Word Document

Description

Return a function which reads in a Microsoft Word document and extracts its text.

Usage

readDOC(AntiwordOptions = "", ...)

Arguments

AntiwordOptions
Options passed on to antiword.
...
Arguments for the generator function.

Value

A function with the signature elem, language, id:

elem
A list with the named element uri of type character which must hold a valid file name.
language
A character vector giving the text's language.
id
A character vector representing a unique identification string for the returned text document.

The function returns a PlainTextDocument representing the text in content.
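
As an illustration, the returned reader can also be called by hand with the signature described above. This is only a sketch: it assumes that tm (as documented here) and antiword are installed, the file name report.doc is a hypothetical placeholder, and the antiword option -w 0 (no line wrapping) is an assumption about your antiword build. Nothing is read unless all of these are available.

```r
# Hypothetical input file; replace with the path to a real .doc file.
doc_path <- "report.doc"
doc <- NULL

# Only attempt the read when tm, antiword and the file are all available.
if (requireNamespace("tm", quietly = TRUE) &&
    nzchar(unname(Sys.which("antiword"))) &&
    file.exists(doc_path)) {
  # Step 1: call the generator to obtain the actual reader function.
  # "-w 0" is an assumed antiword option, passed through unchanged.
  reader <- tm::readDOC(AntiwordOptions = "-w 0")

  # Step 2: call the reader with the documented signature (elem, language, id).
  doc <- reader(elem = list(uri = doc_path), language = "en", id = "report")
  print(doc)
}
```

In everyday use the reader is more likely to be handed to a corpus constructor than called directly; the direct call is shown here only to make the two steps visible.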

Details

Formally, this function is a function generator: it returns a function (which reads in a text document) with a well-defined signature, and that returned function can still access the arguments passed to the generator (e.g., options to antiword) via lexical scoping.
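
The function generator pattern can be sketched in base R as follows. This is not tm's implementation: make_line_reader and its max_lines argument are hypothetical names chosen for illustration, and the sketch reads a plain-text file with readLines instead of calling antiword. It only shows how an argument given to the generator stays visible to the returned function, which keeps the fixed signature elem, language, id.

```r
# Hypothetical generator that mimics the pattern; it is not tm's source code.
make_line_reader <- function(max_lines = -1L, ...) {
  # max_lines is captured by the function returned below (lexical scoping).
  function(elem, language, id) {
    text <- readLines(elem$uri, n = max_lines, warn = FALSE)
    list(content = text, language = language, id = id)
  }
}

# Demonstrate on a temporary plain-text file.
tmp <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line", "third line"), tmp)

# The generator is configured once; the reader is then called per document.
read_two <- make_line_reader(max_lines = 2L)
result <- read_two(elem = list(uri = tmp), language = "en", id = "demo")
print(result$content)
```

The returned function never receives max_lines as an argument, yet it reads only two lines, because the value lives in the environment in which the reader was created.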

Note that this MS Word reader needs the tool antiword installed and accessible on your system. antiword can convert documents from Microsoft Word versions 2, 6, 7, 97, 2000, 2002 and 2003 to plain text, and is available from http://www.winfield.demon.nl/.
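
One way to check whether antiword is accessible before using the reader is to look it up on the search path with base R's Sys.which. This is a minimal sketch, assuming the executable is named antiword on your system; the variable names are illustrative.

```r
# Sys.which() returns "" when the program is not found on the search path.
antiword_path <- Sys.which("antiword")
antiword_available <- nzchar(unname(antiword_path))

if (antiword_available) {
  message("antiword found at: ", antiword_path)
} else {
  message("antiword not found; install it before using readDOC.")
}
```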

See Also

getReaders to list available reader functions.