This function takes: a character vector or factor containing the names of one or
more R data frames; A character vector or factor containing the name
of a single list which contains one or more R data frames; a single
list object containing one or more data frames; a
single data frame object and exports them to an Excel 2003 or 2007
spreadsheet file. Each data frame will be written to a separate
worksheet in the same Excel file.
The order of the worksheets created in the Excel file will match the
order of the entries in x
.
The actual creation of the Excel file is performed by Perl scripts
called WriteXLS.pl (for XLS files) and WriteXLSX.pl (for XLSX files),
which are included with this package.
Note that the named Excel file, if it already exists, will be
overwritten and no warning is given. In addition, if the file exists
and is open by another application (eg. Excel, OO.org, LibreOffice, etc.) you will
likely get an error message regarding the inability to open the file
and/or that the file is already in use by another application or
user. Errors can also occur if the file has been marked as
read-only or if your access rights do not allow you to overwrite the
file or write to the folder you have indicated in the path to the file.
There is an intermediate step, where the R data frames are first written
to CSV files using writeLines
with argument useBytes = TRUE
before being written
to the Excel file by the relevant Perl script. tempdir
is used to determine the
current R session temporary directory and a new sub-directory called “WriteXLS”
will be created there. The CSV files will be written to that directory and
both the files and the directory will be deleted prior to the function
terminating normally using on.exit
. It is possible that these
will remain in place if this function terminates abnormally or is aborted
prior to completion.
Since as.character
is used to coerce data frame column content to character
vectors prior to export, data types supported by as.character
will be exported to their
character representation correctly. For other data types, it is recommended
that you first coerce them to character columns, formatted as you require,
and then use WriteXLS
to create the Excel file.
All of the CSV files will be created prior to the creation of the Excel file
as the Perl script will loop over them as part of the process. Thus,
sufficient free disk space must be available for these files and the Excel
file at the same time. Note, importantly, that in the course of creating XLSX files,
which are ZIP compressed XML files, Perl will require free disk space that is some
multiple of the size of the resultant XLSX file, in the course of creating the
XML files before compression into the XLSX file. Thus, additional free disk space,
in addition to the CSV files noted above, will also be required temporarily.
A text file called "SheetNames.txt" will be created in the same
temporary directory as the CSV files. This file will contain the sheet
names, one per line and will be used by the Perl script to name the
worksheets in the Excel file.
Each worksheet will be named using either the names in
SheetNames
, the names of the data frames in x
or the
names of the list elements if x
is a list (up to the first 31
characters, which is an Excel limitation). If any the data frame names
specified in x
are longer than 31 characters, they will be
truncated to 31 characters. SheetNames
if specified, will be
checked to make sure that all of the entries are less than or equal to
31 characters. If not, an error message will be displayed.
Note that the order of the names in SheetNames
MUST match the
order of the data frames named in x
.
Note that the worksheets must have unique names. Thus, if
SheetNames
is NULL
, the data frame names will be checked
to be sure that they are unique up through the first 31 characters. If
SheetNames
is specified, the entries will be checked to be sure
that they are unique. If not, an error message will be displayed.
Note that the following characters are not allowed for Excel worksheet
names: []:*?/\
The data frame column names will be exported "as is" and will be the first
row in the corresponding worksheet, if col.names = TRUE
.
UTF-8 encoded content in the data frame should be properly exported by
default. If you are operating in a 'latin1' based locale (also known
as iso-8859-1) or a Windows CP-1252 locale, set Encoding
to 'latin1' or 'cp1252', respectively.
As of version 5.0.0, Unicode character based content is better supported if
running on a Windows based operating system. This is achieved by generating
the intermediate CSV files using writeLines(..., useBytes = TRUE)
,
which should avoid the re-encoding of the content of the CSV files into the
operating locale of the relevant computer. Set Encoding
to 'UTF-8' in
this case to preserve the Unicode content in the Excel file. Since
Encoding
applies to all data frames and columns therein being
exported to the same Excel file, if mixed character encodings are present,
this may cause errors in the character representations in the Excel file.
If you note errors from Perl regarding characters not mapping to Unicode,
or illegal characters in UTF-8, try setting Encoding
to 'latin1'.
If mixed encodings are present, you may want to explore the use of iconv
to standardize the data to a single encoding.
If one or more of the data frame columns have been assigned a
comment attribute using the comment
function, these will
be used to create a worksheet cell comment in this first row of
the worksheet for each column with this attribute. These can serve to
provide descriptive information for each column. This will work for
both the Excel 2003 and 2007 file formats.
Values containing leading and/or trailing zeroes in data frame content will
be treated, by default, as text, such that these zeroes will be preserved in the Excel
worksheet content, rather than being coerced to numeric values and being stripped, which is
the default Excel behavior. These values will be left-justified in the cells, rather than
being right-justified, and there may be a mix of justification in a single column, based
upon the data present.
If AllText = TRUE
, all data frame contents, including numeric values, will be
written to the Excel file as text, and will be left-justified. There may be content in
the source data frame(s), that contain leading and/or trailing zeros or other pre-formatted
content which would otherwise be modified by Excel, and which is the default Excel behavior.
Thus, this option is provided to fully override the default behavior in Excel. Note that if
you open the Excel file after creation, you may note little colored triangles in the upper
left hand corner of the relevant cells, indicating that a number is stored as text. This is
expected and if you manually edit the cell content, it may revert to a number, losing any desired
formatting.
If verbose = TRUE
, messages will be output during key steps in the process of creating
the Excel file and intermediate steps. In the case of the Perl scripts parsing the intermediate CSV files,
CSV file line numbers will be output to hopefully assist in the debugging of possible content problems that may
compromise the integrity of the resultant Excel file. Note that the line numbers are 0 based, rather than 1 based,
and will include an additional line for the column names if col.names = TRUE
. An additional line is
also included to be able to handle any comment attributes, and you will observe a repeat of either line 0,
if col.names = FALSE
or line 1, if col.names = TRUE
, as a result. The actual source data frame
content will begin on the subsequent line.
If ReadOnly = TRUE
, each worksheet will be set to Protected (read only) mode to prevent the inadvertent
editing of the contents when the file is opened in Excel or a compatible application. This is not the same as
the file being password protected and encrypted, which is not supported. The contents of the worksheets will be
readable and copyable, but not editable. Each worksheet will show a lock icon on the worksheet tab. Protected mode
can be disabled by the user by right-clicking on each worksheet tab and selecting 'Unprotect Sheet' in the menu.
Note that as of version 6.0.0, more robust handling of embedded newline ('\n') and carriage return ('\r')
characters, double quotes, and two character sequences of a backslash character ('\') followed by another character,
within a data frame value has been implemented.
If you have installed the RTools utilities under Windows from CRAN, version 4.x or greater, RTools includes a Cygwin
based Perl distribution from the MSYS2 project, which emulates Linux/Unix behavior and will be used by default.
This causes a conflict in reading the temporary text files created here, as Linux/Unix uses LF character for line endings,
rather than a CRLF character sequence, which is the default under Windows, and will result in an error. Starting with
WriteXLS version 6.7.0, this has been resolved, if you should elect to use the RTools Perl distribution.
If there any issues parsing the lines in the CSV file, such that the number of fields in the current line is different
than the prior line, perhaps due to atypical character content, an error will be issued. If this occurs, you may wish
to set verbose = TRUE
in order to obtain additional debugging information to help identify where in the CSV
file the error may be located. You may need to pre-process the data in the source data frame (e.g. using gsub
)
in order to resolve this issue. While this handling is now more robust, the user is strongly encouraged to verify the
content of the resultant Excel file to ensure that the integrity of the exported data frame(s) is maintained and that
all source data have been written correctly to the worksheet(s). The user is strongly encouraged to be aware of any
data frame content that includes escaped character sequences or other atypical content, and to consider pre-processing
these as may be apropos, prior to using this function.
Note that arguments AdjWidth
, AutoFilter
, BoldHeaderRow
, ReadOnly
, FreezeRow
,
FreezeCol
, Encoding
, row.names
, col.names
and AllText
will apply to ALL
worksheets exported.