These functions create, merge and expand BlockwiseData objects for holding in-memory or disk-backed blockwise data. Blockwise here means that the data is too large to be loaded or processed in one piece and is therefore split into blocks that can be handled one by one in a divide-and-conquer manner.
newBlockwiseData(
  data,
  external = FALSE,
  fileNames = NULL,
  doSave = external,
  recordAttributes = TRUE,
  metaData = list())

mergeBlockwiseData(...)

addBlockToBlockwiseData(
  bwData,
  blockData,
  external = bwData$external,
  blockFile = NULL,
  doSave = external,
  recordAttributes = !is.null(bwData$attributes),
  metaData = NULL)
data: A list in which each component carries the data of a single block.
external: Logical: should the data be disk-backed (TRUE) or in-memory (FALSE)?
fileNames: When external is TRUE, this argument must be a character vector of the same length as data, giving the names of the files to which the data will be saved or in which the data is already located.
doSave: Logical: should the data be saved? If this is FALSE, it is the user's responsibility to ensure that the files supplied in fileNames already exist and contain the expected data.
recordAttributes: Logical: should the attributes of the given data be recorded within the object?
metaData: A list giving any additional metadata for data that should be attached to the object.
bwData: An existing BlockwiseData object.
blockData: A vector, matrix or array carrying the data of a single block.
blockFile: Name of the file where the data contained in blockData should be saved.
...: One or more objects of class BlockwiseData.
All three functions return a list with the class set to "BlockwiseData", containing the following components:
external: A copy of the input argument external.
data: If external is TRUE, an empty list; otherwise a copy of the input data.
fileNames: A copy of the input argument fileNames.
lengths: A vector of lengths (results of length) of the elements of data.
attributes: If the input recordAttributes is TRUE, a list with one component per block (component of data); each component is in turn a list of the attributes of that component of data.
metaData: A copy of the input metaData.
The definition of BlockwiseData should be considered experimental and may change in the future.
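To make the layout of the returned list easier to picture, here is a minimal base-R sketch that assembles a list of the same general shape for the in-memory case. The helper name sketchBlockwiseData is hypothetical and this is not the package's implementation; the component names follow the description of the return value and should be treated as assumptions, given that the class definition is experimental.

```r
# Hypothetical helper (NOT part of the package): builds a list shaped like
# the return value described above, for in-memory data only.
sketchBlockwiseData <- function(data, recordAttributes = TRUE, metaData = list()) {
  stopifnot(is.list(data))
  out <- list(
    external = FALSE,                             # this sketch never writes to disk
    data = data,                                  # copy of the input blocks
    fileNames = NULL,                             # no files for in-memory data
    lengths = vapply(data, length, integer(1)),   # length() of each block
    attributes = if (recordAttributes) lapply(data, attributes) else NULL,
    metaData = metaData
  )
  class(out) <- "BlockwiseData"
  out
}

# Two toy blocks: a 2 x 3 matrix and a 3 x 4 matrix.
blocks <- list(matrix(1:6, nrow = 2), matrix(1:12, nrow = 3))
bw <- sketchBlockwiseData(blocks, metaData = list(source = "toy example"))

bw$lengths                  # one length per block
names(bw$attributes[[1]])   # attributes recorded for the first block
```

Note that lengths stores the total number of elements of each block, not its dimensions; the dimensions are available only through the recorded attributes.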
Several functions in this package use the concept of blockwise, or "divide-and-conquer", analysis. The BlockwiseData class is meant to hold the blockwise data, or all necessary information about blockwise data that is saved in disk files.
The data can be stored in disk files (one file per block) or in memory. In-memory storage is provided so that the same code can be used both for smaller (single-block) data, where disk storage could slow down operations, and for larger data sets, where disk storage and block-by-block analysis are necessary.
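The two storage modes can be sketched as follows. This example assumes that the WGCNA package, which provides newBlockwiseData, is installed; the toy blocks and the temporary file names are made up for illustration.

```r
library(WGCNA)

# Two toy blocks of random data (illustrative only).
blocks <- list(matrix(rnorm(20), nrow = 4), matrix(rnorm(30), nrow = 6))

# In-memory storage: the blocks are kept inside the object.
bwMem <- newBlockwiseData(blocks, external = FALSE)

# Disk-backed storage: one file name per block. doSave = TRUE asks the
# function to write the blocks to those files.
files <- c(tempfile(fileext = ".RData"), tempfile(fileext = ".RData"))
bwDisk <- newBlockwiseData(blocks, external = TRUE,
                           fileNames = files, doSave = TRUE)

bwMem$external        # storage mode of the in-memory object
bwDisk$external       # storage mode of the disk-backed object
length(bwDisk$data)   # documented to be an empty list when external is TRUE

# Remove the temporary files created for this example.
unlink(files)
```

Code written against the object does not need to know which mode was chosen, which is the point of offering both.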
Other functions on BlockwiseData:
BD.getData for retrieving data;
BD.actualFileNames for retrieving the names of files containing data;
BD.nBlocks for retrieving the number of blocks;
BD.blockLengths for retrieving block lengths;
BD.getMetaData for retrieving metadata;
BD.checkAndDeleteFiles for deleting files of an unneeded object.
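A short sketch of how some of these accessors might be used on an in-memory object, assuming the WGCNA package is installed. The argument name blocks passed to BD.getData is an assumption not stated on this page; consult the help pages of the individual functions for their exact signatures.

```r
library(WGCNA)

# Toy in-memory object with two blocks (illustrative data only).
bw <- newBlockwiseData(list(matrix(1:6, nrow = 2), matrix(1:12, nrow = 3)))

BD.nBlocks(bw)        # number of blocks
BD.blockLengths(bw)   # length of each block

# Retrieve the data of the first block; 'blocks' is an assumed argument name.
firstBlock <- BD.getData(bw, blocks = 1)
```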