readTable: Reading Excel tables from a workbook

Description

Reads Excel tables (Office 2007+) from a '>workbook.

Usage

# S4 method for workbook,numeric
readTable(object, sheet, table, header, rownames, colTypes, forceConversion, 
dateTimeFormat, check.names, useCachedValues, keep, drop, simplify, readStrategy)
# S4 method for workbook,character
readTable(object, sheet, table, header, rownames, colTypes, forceConversion, 
dateTimeFormat, check.names, useCachedValues, keep, drop, simplify, readStrategy)

Arguments

object

The '>workbook to use

sheet

The index or name of the worksheet on which to look for the specified table

table

The name of the table to read

header

The argument header specifies if the first row should be interpreted as column names. The default value is TRUE.

rownames

Index (numeric) or name (character) of column that should be used as row names. The corresponding column will be removed from the data set. Defaults to NULL which means that no row names are applied.

colTypes

Column types to use when reading in the data. Specified as a character vector of the corresponding type names (see XLC; XLC$DATA_TYPE.<?>). You may also use R class names such as numeric, character, logical and POSIXt. The types are applied in the given order to the columns - elements are recycled if necessary. Defaults to character(0) meaning that column types are determined automatically (see the Note section for more information). By default, type conversions are only applied if the specified column type is a more generic type (e.g. from Numeric to String) - otherwise NA is returned. The forceConversion flag can be set to force conversion into less generic types where possible.

forceConversion

logical specifying if conversions to less generic types should be forced. Defaults to FALSE meaning that if a column is specified to be of a certain type via the colTypes argument and a more generic type is detected in the column, then NA will be returned (example: column is specified to be DateTime but a more generic String is found). Specifying forceConversion = TRUE will try to enforce a conversion - if it succeeds the corresponding (converted) value will be returned, otherwise NA. See the Note section for some additional information.

dateTimeFormat

Date/time format used when doing date/time conversions. Defaults to getOption("XLConnect.dateTimeFormat"). This should be a POSIX format specifier according to strptime although not all specifications have been implemented yet - the most important ones however are available.

check.names

logical specifying if column names of the resulting data.frame should be checked to ensure that they are syntactically valid valid variable names and are not duplicated. See the check.names argument of data.frame. Defaults to TRUE.

useCachedValues

logical specifying whether to read cached formula results from the workbook instead of re-evaluating them. This is particularly helpful in cases for reading data produced by Excel features not supported in XLConnect like references to external workbooks. Defaults to FALSE, which means that formulas will be evaluated by XLConnect.

keep

List of column names or indices to be kept in the output data frame. It is possible to specify either keep or drop, but not both at the same time. Defaults to NULL. If a vector is passed as argument, it will be wrapped into a list. This list gets replicated to match the length of the other arguments.

drop

List of column names or indices to be dropped in the output data frame. It is possible to specify either keep or drop, but not both at the same time. Defaults to NULL. If a vector is passed as argument, it will be wrapped into a list. This list gets replicated to match the length of the other arguments.

simplify

logical specifying if the result should be simplified, e.g. in case the data.frame would only have one row or one column (and data types match). Simplifying here is identical to calling unlist on the otherwise resulting data.frame (using use.names = FALSE). The default is FALSE.

readStrategy

character specifying the reading strategy to use. Currently supported strategies are:

"default" (default): Can handle all supported data types incl. date/time values and can deal directly with missing value identifiers (see setMissingValue)
"fast": Increased read performance. Date/time values are read as numeric (number of days since 1900-01-01; fractional days represent hours, minutes, and seconds) and only blank cells are recognized as missing (missing value identifiers as set in setMissingValue are ignored)

References

Overview of Excel tables http://office.microsoft.com/en-001/excel-help/overview-of-excel-tables-HA010048546.aspx

Examples

Run this code

# NOT RUN {
# mtcars xlsx file from demoFiles subfolder of package XLConnect
demoExcelFile <- system.file("demoFiles/mtcars.xlsx", package = "XLConnect")

# Load workbook
wb <- loadWorkbook(demoExcelFile)

# Read table 'MtcarsTable' from sheet 'mtcars_table'
data <- readTable(wb, sheet = "mtcars_table", table = "MtcarsTable")
# }

Run the code above in your browser using DataLab

Description

Usage

Arguments

References

See Also

Examples