docxtractr (version 0.6.5)

Extract Data Tables and Comments from 'Microsoft' 'Word' Documents

Description

'Microsoft Word' 'docx' files provide an 'XML' structure that is fairly straightforward to navigate, especially for 'Word' tables and comments. Tools are provided to determine table count and structure, to determine comment count, and to extract and clean tables and comments from 'Microsoft Word' 'docx' documents. There is also nascent support for '.doc' and '.pptx' files.
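
As a rough illustration of the workflow described above, the sketch below reads a document, counts its tables and comments, and extracts every table. `summarise_docx()` is a hypothetical helper written for this example and is not part of the package. The guard against documents with no tables is a defensive assumption, not documented package behaviour.

```r
# Minimal sketch: summarise the tables and comments in a Word document.
# summarise_docx() is a hypothetical helper, not a docxtractr function.
# Functions are called with the docxtractr:: prefix, so the package must be
# installed but does not need to be attached.
summarise_docx <- function(path) {
  doc <- docxtractr::read_docx(path)

  n_tables <- docxtractr::docx_tbl_count(doc)
  n_comments <- docxtractr::docx_cmnt_count(doc)

  # Only attempt extraction when at least one table was found
  # (defensive assumption for table-free documents).
  tables <- if (n_tables > 0) {
    docxtractr::docx_extract_all_tbls(doc)
  } else {
    list()
  }

  list(n_tables = n_tables, n_comments = n_comments, tables = tables)
}

# Usage, with a placeholder path:
# info <- summarise_docx("path/to/report.docx")
# info$n_tables
# info$tables[[1]]
```

Returning a plain list keeps the counts and the extracted tables together, so later steps such as `assign_colnames` or `mcga` can be applied to individual tables as needed.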

Install

install.packages('docxtractr')

Monthly Downloads

804

Version

0.6.5

License

MIT + file LICENSE

Last Published

July 5th, 2020

Functions in docxtractr (0.6.5)

%>%
Pipe operator

docxtractr
Extract Data Tables and Comments from 'Microsoft' 'Word' Documents

mcga
Make Column Names Great Again

read_docx
Read in a Word document for table extraction

set_libreoffice_path
Point to Local soffice.exe File

convert_to_pdf
Convert a Document (usually PowerPoint) to a PDF

print.docx
Display information about the document

assign_colnames
Make a specific row the column names for the specified data.frame

docx_extract_all_cmnts
Extract all comments from a Word document

docx_extract_all_tbls
Extract all tables from a Word document

docx_cmnt_count
Get number of comments in a Word document

docx_extract_tbl
Extract a table from a Word document

docx_describe_cmnts
Returns information about the comments in the Word document

docx_tbl_count
Get number of tables in a Word document

docx_describe_tbls
Returns a description of all the tables in the Word document

docx_extract_all
Extract all tables from a Word document