Reading files in which the number of separators (eg tabulation) varies from line to line poses problems for traditional file-reading methods like read.table.
This function reads each line independently and then parses all separators therein. The first line is assumed to be column-headers.
Finally, all data will be returned in a matrix adapted to the line with the most separators; if the number of column-headers is insufficient, new (unique) column-headers will be generated.
Thus, the lines may contain different numbers of elements. Empty elements (ie tabular fields) will always get added to the right of the data read, and their content will be as defined by the argument emptyFields (default NA).
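The line-by-line approach described above can be sketched in a few lines of base R. This is only an illustration and not the wrMisc implementation: the helper name readRaggedSketch and the "X" + column-index naming of missing headers are assumptions made for this example (the real function re-uses the last column-name and adds a counter).

```r
# Hypothetical helper (NOT the wrMisc code): parse ragged lines into a matrix
readRaggedSketch <- function(lines, sep = "\t", emptyFields = NA) {
  # split each line independently on the separator
  fields <- strsplit(lines, sep, fixed = TRUE)
  # the widest line defines the number of columns
  nCol <- max(lengths(fields))
  # pad shorter lines on the right so that every line has nCol elements
  padded <- lapply(fields, function(x) c(x, rep(emptyFields, nCol - length(x))))
  # all lines after the first one are data
  mat <- do.call(rbind, padded[-1])
  # the first line holds the column-headers; fill in missing ones (assumed naming scheme)
  hdr <- padded[[1]]
  miss <- is.na(hdr) | hdr == ""
  hdr[miss] <- paste0("X", which(miss))
  colnames(mat) <- make.unique(hdr)
  mat
}

# header with 2 fields, data lines with 3, 2 and 4 fields
txt <- c("id\tname", "1\tAnn\tx", "2\tBob", "3\tCid\ty\tz")
res <- readRaggedSketch(txt)
dim(res)
colnames(res)
```

The resulting matrix is of type character; any conversion to numeric would be a separate step.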
readVarColumns(
fiName,
path = NULL,
sep = "\t",
header = TRUE,
emptyFields = NA,
refCo = NULL,
supNa = NULL,
silent = FALSE,
callFrom = NULL
)
This function returns a matrix (character or numeric)
fiName: (character) file-name
path: (character) optional path
sep: (character) separator (between columns)
header: (logical) indicating whether the file contains the names of the variables as its first line
emptyFields: (NA or character) missing headers will be replaced by the content of 'emptyFields'; if NA, the last column-name will be re-used and a counter added
refCo: (integer) custom choice of the column to be used as row-names (by default the 1st text-column is used)
supNa: (character) base for constructing names for columns without names (+ counter starting at 2); by default the column-name to the left of the 1st column without a column-name
silent: (logical) suppress messages
callFrom: (character) allows easier tracking of messages produced
Note, this function assumes one line of header and at least one line of data! Note, for numeric data the decimal separator is assumed to be US-style (ie '.'). Note, it is assumed that any fields missing from the complete tabular view are missing on the right (ie at the end of the line)!
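To illustrate the note on decimal separators: base R also expects '.' when converting text to numbers, so values written with a decimal comma need to be converted first. The following is a minimal sketch using base R only; it does not call readVarColumns, and the example values are made up.

```r
# made-up example values using a decimal comma
txtVal <- c("1,5", "2,25")

# direct conversion fails: the result is NA (with a warning, suppressed here)
direct <- suppressWarnings(as.numeric(txtVal))

# replacing the comma by '.' first allows the conversion
fixed <- as.numeric(sub(",", ".", txtVal, fixed = TRUE))

direct
fixed
```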
For regular 'complete' data, see read.table and its argument flush.
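For comparison, the sketch below shows how read.table behaves when a line has fewer fields than the header. It uses the argument fill (not flush, which is the one named above); fill is a documented read.table argument that pads short lines. The inline text is made up for this example.

```r
# made-up tab-separated text: the 2nd data line lacks its last field
txt <- "a\tb\tc\n1\t2\t3\n4\t5\n6\t7\t8"

# without fill, read.table stops with an error on the short line
noFill <- tryCatch(read.table(text = txt, sep = "\t", header = TRUE),
                   error = function(e) "error")

# with fill=TRUE the short line is padded with NA on the right
withFill <- read.table(text = txt, sep = "\t", header = TRUE, fill = TRUE)
withFill
```

Note that read.table determines the number of columns from the first five lines of input only, so lines further down with more fields than that are not handled in the same way; this is the situation readVarColumns is meant for.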
path1 <- system.file("extdata",package="wrMisc")
fiNa <- "Names1.tsv"
datAll <- readVarColumns(fiName=file.path(path1,fiNa))
str(datAll)