parseLatex: Experimental Functions to Work with LaTeX Code

Description

The parseLatex function parses LaTeX source, producing a structured object; deparseLatex reverses the process. The latexToUtf8 function takes a "LaTeX" object and converts a number of the macros it contains into the corresponding UTF-8 characters.

Usage

parseLatex(text, filename = deparse(substitute(text)),
           verbose = FALSE,
           verbatim = c("verbatim", "verbatim*",
                        "Sinput", "Soutput"))
deparseLatex(x, dropBraces = FALSE)
latexToUtf8(x)

Arguments

text

A character vector containing LaTeX source code.

filename

A filename to use in syntax error messages.

verbose

If TRUE, print debug error messages.

verbatim

A character vector containing the names of LaTeX environments holding verbatim text.

x

A "LaTeX" object.

dropBraces

Drop unnecessary braces when displaying a "LaTeX" object.

Value

The parseLatex() function returns a recursive object of class "LaTeX". Each of the entries in this object will have a "latex_tag" attribute identifying its syntactic role.

The deparseLatex() function returns a single element character vector, possibly containing embedded newlines.

The latexToUtf8() function returns a modified version of the "LaTeX" object that was passed to it.
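
As a rough illustration of these return values, the sketch below parses a short string and inspects the result. It assumes the functions are loaded from the tools package; the helper name get_tag is hypothetical and not part of the documented interface.

```r
library(tools)

# Parse a short fragment containing an accent macro.
latex <- parseLatex("fa\\c{c}ile")

# The result should be a recursive object of class "LaTeX".
print(class(latex))

# get_tag is a hypothetical helper that reads the "latex_tag"
# attribute attached to each top-level entry.
get_tag <- function(entry) attr(entry, "latex_tag")
tags <- vapply(unclass(latex), get_tag, character(1))
print(tags)

# deparseLatex() gives back a single character string.
out <- deparseLatex(latex)

# latexToUtf8() returns another "LaTeX" object, which can be deparsed in turn.
converted <- latexToUtf8(latex)
out_utf8 <- deparseLatex(converted)
```

Printing tags is a quick way to see how the parser split the input before deciding how to post-process it.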

Details

The parser does not recognize all legal LaTeX code, only relatively simple examples. It does not associate arguments with macros; that needs to be done after parsing, using knowledge of the definition of each macro. The function is mainly intended for processing the simple LaTeX code used in bibliographic references, not fully general LaTeX documents.
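
Because arguments are not attached to macros, a caller has to pair them up after parsing. The following is a minimal sketch of one way to do that for the top-level entries only. It assumes that macros carry the tag "MACRO" and braced groups the tag "BLOCK", and that a macro's argument is the entry immediately following it; neither assumption is stated on this page, and real macros may take zero or several arguments.

```r
library(tools)

latex <- parseLatex("fa\\c{c}ile")

# Collect the syntactic tag of every top-level entry.
tags <- vapply(unclass(latex),
               function(entry) attr(entry, "latex_tag"),
               character(1))

# Assumption: a macro directly followed by a braced block takes
# that block as its (single) argument.
is_macro  <- tags == "MACRO"
has_block <- c(tags[-1] == "BLOCK", FALSE)
idx <- which(is_macro & has_block)

# Build a list of macro/argument pairs from the matching positions.
pairs <- lapply(idx, function(i) {
  list(macro = as.character(latex[[i]]),
       arg   = deparseLatex(latex[[i + 1]]))
})

str(pairs)
```

A fuller treatment would look up each macro's arity in a table of known definitions rather than assuming one braced argument.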

Verbatim text is allowed in two forms: the \verb macro (with single character delimiters), and environments whose names are listed in the verbatim argument.
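
Both forms can be tried with a short sketch such as the one below. The environment name "mycode" is hypothetical and is used only to show how an extra name can be passed through the verbatim argument.

```r
library(tools)

# Form 1: the \verb macro with a single-character delimiter.
inline <- parseLatex("Use \\verb|a_b| here.")
cat(deparseLatex(inline), "\n")

# Form 2: an environment listed in the verbatim argument.
# "mycode" is a made-up environment name for illustration.
src <- "\\begin{mycode}\nx_y & z\n\\end{mycode}"
block <- parseLatex(src, verbatim = c("verbatim", "mycode"))
cat(deparseLatex(block), "\n")
```

Note that supplying verbatim replaces the default list, so any default names that are still needed have to be repeated in the call.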

Examples


# NOT RUN {
latex <- parseLatex("fa\\c{c}ile")
deparseLatex(latexToUtf8(latex))
# }
