The SummarizedExperiment class is a matrix-like container where rows represent features of interest (e.g. genes, transcripts, exons) and columns represent samples (with sample data summarized as a DataFrame). A SummarizedExperiment object contains one or more assays, each represented by a matrix-like object of numeric or other mode.
Note that SummarizedExperiment is the parent of the RangedSummarizedExperiment class, which means that all the methods documented below also work on a RangedSummarizedExperiment object.
## Constructor
# See ?RangedSummarizedExperiment for the constructor function.
## Accessors
assayNames(x, ...)
assayNames(x, ...) <- value
assays(x, ..., withDimnames=TRUE)
assays(x, ..., withDimnames=TRUE) <- value
assay(x, i, ...)
assay(x, i, ...) <- value
rowData(x, ...)
rowData(x, ...) <- value
colData(x, ...)
colData(x, ...) <- value
#dim(x)
#dimnames(x)
#dimnames(x) <- value
## Quick colData access
"$"(x, name)
"$"(x, name) <- value
"[["(x, i, j, ...)
"[["(x, i, j, ...) <- value
## Subsetting
"["(x, i, j, ..., drop=TRUE)
"["(x, i, j) <- value
## Combining
"cbind"(..., deparse.level=1)
"rbind"(..., deparse.level=1)
## Arguments
x: A SummarizedExperiment object.
...: For assay, ... may contain withDimnames, which is forwarded to assays. For rowData, arguments passed through ... are forwarded to mcols. For cbind and rbind, ... contains the SummarizedExperiment objects to be combined. For other accessors, ignored.
i, j: For assay and assay<-, i is an integer or numeric scalar; see Details for additional constraints. For [,SummarizedExperiment and [,SummarizedExperiment<-, i and j are subscripts that can act to subset the rows and columns of x, that is, the matrix elements of assays. For [[,SummarizedExperiment and [[<-,SummarizedExperiment, i is a scalar index (e.g., character(1) or integer(1)) into a column of colData.
name: A symbol representing the name of a column of colData.
withDimnames: A logical(1), indicating whether dimnames should be applied to extracted assay elements. Setting withDimnames=FALSE increases the speed and memory efficiency with which assays are extracted. Passing withDimnames=FALSE through the assays<- replacement form allows efficient complex assignments (e.g., when updating the names of assays, names(assays(x, withDimnames=FALSE)) = ... is more efficient than names(assays(x)) = ...); it does not influence actual assignment of dimnames to assays.
drop: A logical(1), ignored by these methods.
deparse.level: See ?base::cbind for a description of this argument.
## Constructor
SummarizedExperiment instances are constructed using the SummarizedExperiment function documented in ?RangedSummarizedExperiment.
## Accessors
In the following, x is a SummarizedExperiment object.
assays(x), assays(x) <- value: Get or set the assays. value is a list or SimpleList, each element of which is a matrix with the same dimensions as x.
assay(x, i), assay(x, i) <- value: A convenient alternative (to assays(x)[[i]], assays(x)[[i]] <- value) to get or set the ith (default first) assay element. value must be a matrix of the same dimension as x, and with dimension names NULL or consistent with those of x.
assayNames(x), assayNames(x) <- value: Get or set the names of assay() elements.
rowData(x), rowData(x) <- value: Get or set the row data. value is a DataFrame object. Row names of value must be NULL or consistent with the existing row names of x.
colData(x), colData(x) <- value: Get or set the column data. value is a DataFrame object. Row names of value must be NULL or consistent with the existing column names of x.
metadata(x), metadata(x) <- value: Get or set the experiment data. value is a list with arbitrary content.
dim(x): Get the dimensions (features of interest x samples) of x.
dimnames(x), dimnames(x) <- value: Get or set the dimension names. value is usually a list of length 2, containing elements that are either NULL or vectors of appropriate length for the corresponding dimension. value can be NULL, which removes dimension names. This method implies that rownames, rownames<-, colnames, and colnames<- are all available.
## Subsetting
In the following, x is a SummarizedExperiment object.
x[i,j], x[i,j] <- value: Create or replace a subset of x. i, j can be numeric, logical, character, or missing. value must be a SummarizedExperiment object with dimensions, dimension names, and assay elements consistent with the subset x[i,j] being replaced.
Additional subsetting accessors provide convenient access to colData columns:
x$name, x$name <- value: Access or replace column name in x.
x[[i, ...]], x[[i, ...]] <- value: Access or replace column i in x.
## Combining
In the following, ... are SummarizedExperiment objects to be combined.
cbind(...): cbind combines objects with the same features of interest but different samples (columns in assays). The colnames in colData(SummarizedExperiment) must match across the objects, or an error is thrown. Duplicate columns of rowData(SummarizedExperiment) must contain the same data. Data in assays are combined by name matching; if all assay names are NULL, matching is by position. A mixture of names and NULL throws an error. metadata from all objects are combined into a list with no name checking.
rbind(...): rbind combines objects with the same samples but different features of interest (rows in assays). The colnames in rowData(SummarizedExperiment) must match across the objects, or an error is thrown. Duplicate columns of colData(SummarizedExperiment) must contain the same data. Data in assays are combined by name matching; if all assay names are NULL, matching is by position. A mixture of names and NULL throws an error. metadata from all objects are combined into a list with no name checking.
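The accessors, quick colData access, and subsetting operators described above can be tried on a small object. The following is a minimal sketch, assuming the SummarizedExperiment package is installed; the toy matrix, the sample and gene names, and the Batch column are made up for illustration.

```r
library(SummarizedExperiment)

# Toy data: 4 features x 3 samples (all names and values are made up).
counts <- matrix(1:12, nrow = 4)
se <- SummarizedExperiment(
    assays = SimpleList(counts = counts),
    colData = DataFrame(Treatment = c("ChIP", "Input", "ChIP"),
                        row.names = c("A", "B", "C")))
rownames(se) <- paste0("gene", 1:4)

# Quick colData access: `$` and `[[` both reach into colData(se).
se$Treatment
se[["Treatment"]]
se$Batch <- c(1L, 1L, 2L)   # adds a colData column named Batch

# Subsetting with a logical column index and a character row index.
chip <- se[, se$Treatment == "ChIP"]
dim(chip)
sub <- se[c("gene1", "gene3"), ]
rownames(sub)

# assay() applies the object's dimnames by default; withDimnames=FALSE
# skips that step and returns the assay as stored.
m_named <- assay(se, "counts")
m_plain <- assay(se, "counts", withDimnames = FALSE)

# Rename the assay through the assayNames() replacement form.
assayNames(se) <- "raw"
assayNames(se)
```

Subsetting the object keeps the assays, rowData, and colData in step, which is the main reason to subset the container rather than its parts.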
## Implementation and Extension
SummarizedExperiment is implemented as an S4 class and can be extended in the usual way, using contains="SummarizedExperiment" in the new class definition. In addition, the representation of the assays slot of SummarizedExperiment is as a virtual class Assays. This allows derived classes (contains="Assays") to easily implement alternative requirements for the assays, e.g., backed by file-based storage like NetCDF or the ff package, while re-using the existing SummarizedExperiment class without modification. See Assays for more information.
The current assays slot is implemented as a reference class that has copy-on-change semantics. This means that modifying non-assay slots does not copy the (large) assay data, and at the same time the user is not surprised by reference-based semantics. Updates to non-assay slots are very fast; updating the assays slot itself can be 5x or more faster than with an S4 instance in the slot. One useful technique when working with the assay or assays function is to pass withDimnames=FALSE, which benefits speed and memory use by not copying dimnames from the row- and colData elements to each assay.
## Details
The SummarizedExperiment class is meant for numeric and other data types derived from a sequencing experiment. The structure is rectangular like a matrix, but with additional annotations on the rows and columns, and with the possibility to manage several assays simultaneously.
The rows of a SummarizedExperiment object represent features of interest. Information about these features is stored in a DataFrame object, accessible using the function rowData. The DataFrame must have as many rows as there are rows in the SummarizedExperiment object, with each row of the DataFrame providing information on the feature in the corresponding row of the SummarizedExperiment object. Columns of the DataFrame represent different attributes of the features of interest, e.g., gene or transcript IDs.
Each column of a SummarizedExperiment object represents a sample. Information about the samples is stored in a DataFrame, accessible using the function colData, described above. The DataFrame must have as many rows as there are columns in the SummarizedExperiment object, with each row of the DataFrame providing information on the sample in the corresponding column of the SummarizedExperiment object. Columns of the DataFrame represent different sample attributes, e.g., tissue of origin. Columns of the DataFrame can themselves be annotated (via the mcols function). Column names typically provide a short identifier unique to each sample.
A SummarizedExperiment object can also contain information about the overall experiment, for instance the lab in which it was conducted or the publications with which it is associated. This information is stored as a list object, accessible using the metadata function. The form of the data associated with the experiment is left to the discretion of the user.
The SummarizedExperiment container is appropriate for matrix-like data. The data are accessed using the assays function, described above. This returns a SimpleList object. Each element of the list must itself be a matrix (of any mode) and must have dimensions that are the same as the dimensions of the SummarizedExperiment in which they are stored. Row and column names of each matrix must either be NULL or match those of the SummarizedExperiment during construction. It is convenient for the elements of the SimpleList of assays to be named.
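The dimension constraints described above can be checked directly. This is a minimal sketch, assuming the SummarizedExperiment package is installed; the gene IDs, tissue labels, sample names, and the metadata entry are hypothetical values chosen for illustration.

```r
library(SummarizedExperiment)

nrows <- 5; ncols <- 3
counts <- matrix(seq_len(nrows * ncols), nrows)

# Two named assays with identical dimensions, plus row, column and
# experiment-level annotation (all values are illustrative).
se <- SummarizedExperiment(
    assays = SimpleList(counts = counts, logcounts = log2(counts + 1)),
    rowData = DataFrame(gene_id = paste0("ENSG", seq_len(nrows))),
    colData = DataFrame(tissue = c("liver", "brain", "liver"),
                        row.names = paste0("sample", 1:3)),
    metadata = list(lab = "hypothetical lab"))

assayNames(se)

# One rowData row per feature, one colData row per sample.
nrow(rowData(se)) == nrow(se)
nrow(colData(se)) == ncol(se)
metadata(se)$lab

# An assay whose dimensions differ from those of the object is refused.
bad <- tryCatch({
    assay(se, "bad") <- matrix(0, nrows + 1, ncols)
    "accepted"
}, error = function(e) "rejected")
bad
```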
## See Also
The metadata and mcols accessors in the S4Vectors package.
## Examples
library(SummarizedExperiment)
nrows <- 200; ncols <- 6
counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
colData <- DataFrame(Treatment=rep(c("ChIP", "Input"), 3),
                     row.names=LETTERS[1:6])
se0 <- SummarizedExperiment(assays=SimpleList(counts=counts),
                            colData=colData)
se0
dim(se0)
dimnames(se0)
assayNames(se0)
head(assay(se0))
assays(se0) <- endoapply(assays(se0), asinh)
head(assay(se0))
rowData(se0)
colData(se0)
se0[, se0$Treatment == "ChIP"]
## cbind() combines objects with the same features of interest
## but different samples:
se1 <- se0
se2 <- se1[,1:3]
colnames(se2) <- letters[1:ncol(se2)]
cmb1 <- cbind(se1, se2)
dim(cmb1)
dimnames(cmb1)
## rbind() combines objects with the same samples but different
## features of interest:
se1 <- se0
se2 <- se1[1:50,]
rownames(se2) <- letters[1:nrow(se2)]
cmb2 <- rbind(se1, se2)
dim(cmb2)
dimnames(cmb2)
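As a further illustration, extending the class with contains="SummarizedExperiment" might look like the sketch below. It assumes the SummarizedExperiment package is installed; the class name MyExperiment and its platform slot are hypothetical and not part of the package.

```r
library(SummarizedExperiment)

# Hypothetical derived class adding one extra slot.
setClass("MyExperiment",
    contains = "SummarizedExperiment",
    slots = c(platform = "character"))

se <- SummarizedExperiment(
    assays = SimpleList(counts = matrix(1:6, nrow = 2)),
    colData = DataFrame(row.names = c("a", "b", "c")))

# Standard S4 initialization: an unnamed instance of the parent class
# fills the inherited slots; the new slot is supplied by name.
mine <- new("MyExperiment", se, platform = "hypothetical-array")

# Inherited methods keep working on the derived class.
is(mine, "SummarizedExperiment")
dim(mine)
colnames(mine[, 2:3])
```

Because the parent class's accessors and subsetting methods are inherited, a derived class only has to define what is specific to its extra slots.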