These functions create, merge and expand BlockwiseData objects for holding in-memory or disk-backed blockwise data. Blockwise here means that the data is too large to be loaded or processed in one piece and is therefore split into blocks that can be handled one by one in a divide-and-conquer manner.
newBlockwiseData(
  data,
  external = FALSE,
  fileNames = NULL,
  doSave = external,
  recordAttributes = TRUE,
  metaData = list())

mergeBlockwiseData(...)

addBlockToBlockwiseData(
  bwData,
  blockData,
  external = bwData$external,
  blockFile = NULL,
  doSave = external,
  recordAttributes = !is.null(bwData$attributes),
  metaData = NULL)
data: A list in which each component carries the data of a single block.
external: Logical: should the data be disk-backed (TRUE) or in-memory (FALSE)?
fileNames: When external is TRUE, this argument must be a character vector of the same length as data, giving the names of the files to which the data should be saved or in which the data is already stored.
doSave: Logical: should data be saved? If this is FALSE, it is the user's responsibility to ensure the files supplied in fileNames already exist and contain the expected data.
recordAttributes: Logical: should attributes of the given data be recorded within the object?
metaData: A list giving any additional meta-data for data that should be attached to the object.
bwData: An existing BlockwiseData object.
blockData: A vector, matrix or array carrying the data of a single block.
blockFile: File name where data contained in blockData should be saved.
...: One or more objects of class BlockwiseData.
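As a rough illustration of how the three functions fit together, the sketch below builds a one-block object, appends a second block, and merges the result with another object. The data values and variable names (bw1, bw2, bw3, merged) are made up for the example, and it assumes the WGCNA package is installed; treat it as a minimal sketch rather than a recommended workflow.

```r
library(WGCNA)

# Start with a single in-memory block (hypothetical example values).
bw1 <- newBlockwiseData(list(1:5))

# Append a second block; 'external' defaults to the setting of bw1 (in-memory).
bw2 <- addBlockToBlockwiseData(bw1, blockData = 6:8)

# A separate object that will be merged with the first one.
bw3 <- newBlockwiseData(list(c(10, 20)))

# Combine the two objects into one BlockwiseData object.
merged <- mergeBlockwiseData(bw2, bw3)

# Inspect the number of blocks and their lengths.
BD.nBlocks(merged)
BD.blockLengths(merged)
```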
All three functions return a list with the class set to "BlockwiseData", containing the following components:
external: Copy of the input argument external.
data: If external is TRUE, an empty list, otherwise a copy of the input data.
fileNames: Copy of the input argument fileNames.
lengths: A vector of lengths (results of length) of elements of data.
attributes: If input recordAttributes is TRUE, a list with one component per block (component of data); each component is in turn a list of attributes of that component of data.
metaData: A copy of the input metaData.
The definition of BlockwiseData should be considered experimental and may change in
the future.
Several functions in this package use the concept of blockwise, or "divide-and-conquer", analysis. The BlockwiseData class is meant to hold the blockwise data, or all necessary information about blockwise data that is saved in disk files.
The data can be stored in disk files (one file per block) or in memory. In-memory storage is provided so that the same code can be used both for smaller (single-block) data, where disk storage could slow down operations, and for larger data sets, where disk storage and block-by-block analysis are necessary.
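The two storage modes described above could be set up along the following lines. This is only a sketch: the block contents, the variable names and the temporary file names are assumptions made for the example, and the WGCNA package is assumed to be installed.

```r
library(WGCNA)

# Two small blocks of made-up data.
blocks <- list(matrix(1:6, nrow = 2), matrix(7:12, nrow = 2))

# In-memory storage: the data are kept inside the object itself.
bwMem <- newBlockwiseData(blocks, external = FALSE)

# Disk-backed storage: one file per block. The file names are illustrative
# and point into the session's temporary directory.
files <- file.path(tempdir(), paste0("block", seq_along(blocks), ".RData"))
bwDisk <- newBlockwiseData(blocks, external = TRUE,
                           fileNames = files, doSave = TRUE)

# Both objects have the same class, so downstream code can treat them alike.
sameClass <- inherits(bwMem, "BlockwiseData") && inherits(bwDisk, "BlockwiseData")

# With doSave = TRUE the block files should now exist on disk.
filesWritten <- file.exists(files)

# Clean up the temporary files created for this example.
unlink(files)
```

Because external and doSave are ordinary arguments, the same calling code can switch between the two modes by changing a single flag.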
Other functions on BlockwiseData:
BD.getData for retrieving data;
BD.actualFileNames for retrieving file names of files containing data;
BD.nBlocks for retrieving the number of blocks;
BD.blockLengths for retrieving block lengths;
BD.getMetaData for retrieving metadata;
BD.checkAndDeleteFiles for deleting files of an unneeded object.
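A brief sketch of how some of the accessor functions listed above might be used on a small in-memory object follows. The data and variable names are hypothetical, only default arguments are used, and the WGCNA package is assumed to be installed.

```r
library(WGCNA)

# A small in-memory object with two blocks of made-up data.
bw <- newBlockwiseData(list(1:4, 5:10))

# Number of blocks held by the object.
nBlocks <- BD.nBlocks(bw)

# Length of each block.
blockLengths <- BD.blockLengths(bw)

# Retrieve the stored data for all blocks.
allData <- BD.getData(bw)
```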