readDataFrame.TabularTextFileSet: Reads the tabular data from all files as data frames

Description

Reads the tabular data from each file in the set as a data frame and, by default, combines the results into a single data frame.

Usage

# S3 method for TabularTextFileSet
readDataFrame(this, ..., combineBy=function(x) Reduce(rbind, x), verbose=FALSE)

Value

Returns the value returned by combineBy, which with the default combineBy is a single data.frame. If combineBy=NULL, a named list of data.frame:s is returned instead.

Arguments

...: Arguments passed to readDataFrame() as called for each TabularTextFile of the file set.
combineBy: A function that takes a list of data.frame:s and combines them. The default is to stack them into a single data.frame. If NULL, the list is not combined.
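
To make the role of combineBy concrete, the following is a minimal sketch in base R only. It does not call readDataFrame(); the two small data frames and the function tagWithSource() are hypothetical stand-ins for the per-file results and for a user-supplied combiner.

```r
# Hypothetical per-file results: a named list of data frames,
# similar in shape to what is described for combineBy=NULL
dfs <- list(
  fileA = data.frame(x = 1:2, y = c("a", "b")),
  fileB = data.frame(x = 3:4, y = c("c", "d"))
)

# The documented default combiner: stack all data frames row-wise
combineBy <- function(x) Reduce(rbind, x)
stacked <- combineBy(dfs)
print(stacked)

# A hypothetical custom combiner that records which list element
# each row came from before stacking
tagWithSource <- function(x) {
  x <- Map(function(df, name) cbind(source = name, df), x, names(x))
  out <- Reduce(rbind, x)
  rownames(out) <- NULL
  out
}
withSource <- tagWithSource(dfs)
print(withSource)
```

A function such as tagWithSource() could presumably be passed as combineBy, since the argument is described as any function that takes a list of data.frame:s; that usage is an assumption and is not shown in the original examples.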

Author

Henrik Bengtsson

Examples

# Set up a file set consisting of all *.dat tab-delimited files
# in a particular directory
path <- system.file("exData/dataSetA,original", package="R.filesets")
ds <- TabularTextFileSet$byPath(path, pattern="[.]dat$")
print(ds)


# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Read data frames from each of the files
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
dataList <- lapply(ds, readDataFrame)
print(dataList)

rows <- c(3:5, 8, 2)
dataList <- lapply(ds, readDataFrame, rows=rows)
print(dataList)



# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Read common columns and stack into one data frame
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
colNames <- Reduce(intersect, lapply(ds, getColumnNames))
cat("Common column names:\n")
print(colNames)

# Read the *common* columns "as is" (hence 'NA')
colClasses <- rep(NA, times=length(colNames))
names(colClasses) <- colNames
cat("Column class patterns:\n")
print(colClasses)

data <- readDataFrame(ds, colClasses=colClasses)
print(data)


# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Translate column names on the fly
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
lapply(ds, FUN=setColumnNamesTranslator, function(names, ...) toupper(names))
data <- readDataFrame(ds, colClasses=c("(X|Y)"="integer", "CHAR"="character"))
print(data)
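
The common-column step in the examples above (Reduce(intersect, ...) followed by a named vector of NA column classes) can be tried in isolation. The sketch below uses base R only; the column names in colsPerFile are hypothetical and stand in for what getColumnNames() would return for each file.

```r
# Hypothetical column names of three files; order and extras differ
colsPerFile <- list(
  c("x", "y", "char"),
  c("y", "x", "char", "extra"),
  c("x", "char", "y")
)

# Keep only the names present in every file.
# intersect() preserves the order of its first argument.
colNames <- Reduce(intersect, colsPerFile)
print(colNames)

# Named vector of NA:s, one per common column, meaning "read as is"
colClasses <- rep(NA, times = length(colNames))
names(colClasses) <- colNames
print(colClasses)
```

Because the names of colClasses select the columns, any column that is missing from at least one file ("extra" here) is left out of the selection.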