docxtractr (version 0.6.5)

read_docx: Read in a Word document for table extraction

Description

Reads a Word document from a local file path or a URL pointing to a .docx file. A .doc file is also accepted as input if LibreOffice is installed (see https://www.libreoffice.org/ for more information and downloads).

Usage

read_docx(path, track_changes = NULL)

Arguments

path

local file path or URL of the Word document

track_changes

NULL by default. If not NULL, must be one of "accept" or "reject", which will accept all or reject all tracked changes, respectively. NOTE: this functionality relies on the pandoc utility being available. Both the system PATH and the RSTUDIO_PANDOC environment variable are checked (RStudio ships with a copy of pandoc). If no pandoc binary is found, a warning is issued and the document is read without accepting or rejecting any tracked changes. The original Word document is not modified, and this feature works only with .docx files.
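Because track_changes depends on pandoc, it can help to check for the binary before asking for changes to be accepted or rejected. The following is a minimal sketch using base R only. The helper name pandoc_available is hypothetical and not part of docxtractr, and the lookup inside read_docx may differ in detail from what is shown here.

```r
# Hypothetical helper (not part of docxtractr): reports whether a pandoc
# binary can be located on the system PATH or in the directory named by
# the RSTUDIO_PANDOC environment variable.
pandoc_available <- function() {
  # Sys.which() returns "" when the program is not found on PATH
  on_path <- nzchar(Sys.which("pandoc")[[1]])

  # RSTUDIO_PANDOC, when set, is assumed here to name a directory
  rstudio_dir <- Sys.getenv("RSTUDIO_PANDOC")
  in_rstudio <- nzchar(rstudio_dir) &&
    any(file.exists(file.path(rstudio_dir, c("pandoc", "pandoc.exe"))))

  on_path || in_rstudio
}

# Request tracked-change handling only when pandoc appears to be present;
# otherwise fall back to the default (NULL).
track <- if (pandoc_available()) "accept" else NULL
```

The resulting value can then be passed as read_docx(path, track_changes = track), which avoids the warning when pandoc is missing.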

Examples

# NOT RUN {
doc <- read_docx(system.file("examples/data.docx", package="docxtractr"))
class(doc)

doc <- read_docx(
  system.file("examples/trackchanges.docx", package="docxtractr"),
  track_changes = "accept"
)

# }
# NOT RUN {
# from a URL
budget <- read_docx("http://rud.is/dl/1.DOCX")
# }