Set up the design matrix X as a big.matrix
object based on external
massive data file stored on disk that cannot be fullly loaded into memory.
The data file must be a well-formated ASCII-file, and contains only one
single type. Current version only supports double
type. Other
restrictions about the data file are described in
biglasso-package
. This function reads the massive data, and
creates a big.matrix
object. By default, the resulting
big.matrix
is file-backed, and can be shared across processors or
nodes of a cluster.
setupX(
filename,
dir = getwd(),
sep = ",",
backingfile = paste0(unlist(strsplit(filename, split = "\\."))[1], ".bin"),
descriptorfile = paste0(unlist(strsplit(filename, split = "\\."))[1], ".desc"),
type = "double",
...
)
A big.matrix
object corresponding to a file-backed
big.matrix
. It's ready to be used as the design matrix X
in
biglasso
and cv.biglasso
.
The name of the data file. For example, "dat.txt".
The directory used to store the binary and descriptor files
associated with the big.matrix
. The default is current working
directory.
The field separator character. For example, "," for comma-delimited files (the default); "\t" for tab-delimited files.
The binary file associated with the file-backed
big.matrix
. By default, its name is the same as filename
with
the extension replaced by ".bin".
The descriptor file used for the description of the
file-backed big.matrix
. By default, its name is the same as
filename
with the extension replaced by ".desc".
The data type. Only "double" is supported for now.
Additional arguments that can be passed into function
read.big.matrix
.
Yaohui Zeng and Patrick Breheny
Maintainer: Yaohui Zeng <yaohui.zeng@gmail.com>
For a data set, this function needs to be called only one time to set up the
big.matrix
object with two backing files (.bin, .desc) created in
current working directory. Once set up, the data can be "loaded" into any
(new) R session by calling attach.big.matrix(discriptorfile)
.
This function is a simple wrapper of
read.big.matrix
. See
read.big.matrix
and the package
bigmemory for more
details.
biglasso
, cv.ncvreg
## see the example in "biglasso-package"
Run the code above in your browser using DataLab