For convenience, the default formatType = "10x" directly fits the structure of cellranger output. formatType = "anndata" works for the current AnnData H5AD file specification (see Details). If a customized H5 file structure is presented, any of rawData, indicesName, indptrName, genesName, and barcodesName should be specified accordingly to override the formatType preset.
Make a copy of any H5AD file before passing it in: rliger functions write to the file, after which it can no longer be read back into Python. This will be fixed in the future.
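As a rough sketch of overriding the preset for a customized file: the file path and every internal HDF5 path below are hypothetical placeholders, not a layout that any tool is known to produce. The call is skipped unless rliger is installed and that file exists.

```r
# Hypothetical file and hypothetical internal HDF5 paths; replace with your own.
customPath <- "path/to/custom.h5"
customNames <- list(
    rawData = "counts/data",
    indicesName = "counts/indices",
    indptrName = "counts/indptr",
    genesName = "counts/genes",
    barcodesName = "counts/barcodes"
)

if (requireNamespace("rliger", quietly = TRUE) && file.exists(customPath)) {
    # Work on a copy, since rliger functions write to the file they open.
    workPath <- tempfile(fileext = ".h5")
    file.copy(from = customPath, to = workPath)
    # Explicitly given paths override the formatType preset.
    ld <- do.call(
        rliger::createH5LigerDataset,
        c(list(h5file = workPath), customNames)
    )
}
```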
createH5LigerDataset(
h5file,
formatType = "10x",
rawData = NULL,
normData = NULL,
scaleData = NULL,
barcodesName = NULL,
genesName = NULL,
indicesName = NULL,
indptrName = NULL,
anndataX = "X",
modal = c("default", "rna", "atac", "spatial", "meth"),
featureMeta = NULL,
...
)
Value: H5-based ligerDataset object.
h5file: Filename of an H5 file.
formatType: Select preset of H5 file structure. Default "10x". Alternatively, "anndata" is supported for H5AD files.
rawData, indicesName, indptrName: The paths in an H5 file for the raw sparse matrix data. These three types of data stand for the x, i, and p slots of a dgCMatrix-class object. Default NULL uses the formatType preset.
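To make the x, i, p correspondence concrete, the sketch below builds a small dgCMatrix with the Matrix package and prints the three slots. The matrix values are made up for illustration; in an H5 file, the three paths would point at datasets holding vectors of the same kind.

```r
library(Matrix)

# 3 x 3 matrix with three non-zero entries: (1,1) = 5, (3,1) = 2, (2,3) = 7
m <- sparseMatrix(
    i = c(1, 3, 2),
    j = c(1, 1, 3),
    x = c(5, 2, 7),
    dims = c(3, 3)
)

m@x  # non-zero values in column-major order: 5 2 7
m@i  # 0-based row index of each value: 0 2 1
m@p  # 0-based offset where each column starts, then the total count: 0 2 2 3
```

Column 2 is empty, which is why the second and third entries of the p vector are equal.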
normData: The path in an H5 file for the "x" vector of the normalized sparse matrix. Default NULL.
scaleData: The path in an H5 file for the Group that contains the information for constructing the sparse matrix of the scaled data. Default NULL.
genesName, barcodesName: The paths in an H5 file for the gene names and cell barcodes. Default NULL uses the formatType preset.
anndataX: The HDF5 path to the raw count data in an H5AD file. See Details. Default "X".
modal: Name of modality for this dataset. Currently supported options are "default", "rna", "atac", "spatial" and "meth". Default "default".
featureMeta: Data frame for feature metadata. Default NULL.
...: Additional slot data. See ligerDataset for details. Given values are placed directly into the corresponding slots.
For an H5AD file written from an AnnData object, formatType = "anndata" lets the function infer the proper structure. However, a typical AnnData-based analysis tends to update the adata.X attribute in place, and there is no standard or enforced convention for where the raw count data, as needed by LIGER, is stored. Therefore, the argument anndataX is exposed for specifying this location. The default value "X" looks for adata.X. If the raw data is stored in a layer, e.g. adata.layers['count'], then use anndataX = "layers/count". If it is stored in adata.raw.X, then use anndataX = "raw/X". If your AnnData object does not retain the raw counts, you will have to go back to the Python workflow to insert them at the desired location in the object and rewrite the H5AD file, or start from the upstream source files from which the AnnData object was originally created.
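A minimal sketch of the H5AD case, assuming a file whose raw counts live in adata.layers['count']. The file path is a hypothetical placeholder, and the call is skipped unless rliger is installed and that file exists.

```r
# Hypothetical path; replace with your own H5AD file.
h5adPath <- "path/to/adata.h5ad"
# Location of the raw counts inside the file. Use "X" for adata.X
# or "raw/X" for adata.raw.X instead, depending on how the file was written.
countPath <- "layers/count"

if (requireNamespace("rliger", quietly = TRUE) && file.exists(h5adPath)) {
    # Work on a copy so the original stays readable from Python.
    workPath <- tempfile(fileext = ".h5ad")
    file.copy(from = h5adPath, to = workPath)
    ld <- rliger::createH5LigerDataset(
        workPath,
        formatType = "anndata",
        anndataX = countPath
    )
}
```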
h5Path <- system.file("extdata/ctrl.h5", package = "rliger")
tempPath <- tempfile(fileext = ".h5")
file.copy(from = h5Path, to = tempPath)
ld <- createH5LigerDataset(tempPath)