| SummarizedExperiment-class {GenomicRanges} | R Documentation |
WARNING: The SummarizedExperiment class described here is deprecated and being
replaced with the RangedSummarizedExperiment
class defined in the new SummarizedExperiment package.
Please make sure to install the SummarizedExperiment package before
you attempt to use the SummarizedExperiment() constructor function.
Note that this will return a
RangedSummarizedExperiment instance instead
of a SummarizedExperiment instance.
The SummarizedExperiment class is a matrix-like container where rows
represent ranges of interest (as a GRanges or
GRangesList-class) and columns represent samples (with
sample data summarized as a DataFrame-class). A
SummarizedExperiment contains one or more assays, each
represented by a matrix-like object of numeric or other mode.
## Constructors

SummarizedExperiment(assays, ...)

## Accessors

assayNames(x, ...)
assayNames(x, ...) <- value
assays(x, ..., withDimnames=TRUE)
assays(x, ..., withDimnames=TRUE) <- value
assay(x, i, ...)
assay(x, i, ...) <- value
rowRanges(x, ...)
rowRanges(x, ...) <- value
colData(x, ...)
colData(x, ...) <- value
exptData(x, ...)
exptData(x, ...) <- value
## S4 method for signature 'SummarizedExperiment'
dim(x)
## S4 method for signature 'SummarizedExperiment'
dimnames(x)
## S4 replacement method for signature 'SummarizedExperiment,NULL'
dimnames(x) <- value
## S4 replacement method for signature 'SummarizedExperiment,list'
dimnames(x) <- value

## colData access

## S4 method for signature 'SummarizedExperiment'
x$name
## S4 replacement method for signature 'SummarizedExperiment,ANY'
x$name <- value
## S4 method for signature 'SummarizedExperiment,ANY,missing'
x[[i, j, ...]]
## S4 replacement method for signature 'SummarizedExperiment,ANY,missing,ANY'
x[[i, j, ...]] <- value

## rowRanges access
## see 'GRanges compatibility', below

## Subsetting

## S4 method for signature 'SummarizedExperiment'
x[i, j, ..., drop=TRUE]
## S4 replacement method for signature 'SummarizedExperiment,ANY,ANY,SummarizedExperiment'
x[i, j] <- value
## S4 method for signature 'SummarizedExperiment'
subset(x, subset, select, ...)

## Combining

## S4 method for signature 'SummarizedExperiment'
cbind(..., deparse.level=1)
## S4 method for signature 'SummarizedExperiment'
rbind(..., deparse.level=1)

## Coercion

## S4 method for signature 'SummarizedExperiment'
updateObject(object, ..., verbose=FALSE)
## S4 method for signature 'ExpressionSet,SummarizedExperiment'
coerce(from, to = "SummarizedExperiment", strict = TRUE)
## S4 method for signature 'SummarizedExperiment,ExpressionSet'
coerce(from, to = "ExpressionSet", strict = TRUE)
assays |
See |
... |
For For For For other accessors, ignored. |
verbose |
A |
x, object |
An instance of SummarizedExperiment. |
i, j |
For For For |
subset |
An expression which, when evaluated in the
context of rowRanges(x), selects the rows to keep. |
select |
An expression which, when evaluated in the
context of colData(x), selects the columns (samples) to keep. |
name |
A symbol representing the name of a column of
colData(x). |
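withDimnames |
A logical(1) indicating whether dimnames should be copied from the row and column data to the extracted assay elements. |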
drop |
A |
value |
An instance of a class specified in the S4 method signature or as outlined in ‘Details’. |
deparse.level |
See ?base::cbind. |
from |
the object to be coerced |
to |
the class to coerce to |
strict |
logical flag. If 'TRUE', the returned object must be strictly from the target class. |
The SummarizedExperiment class is meant for numeric and other
data types derived from a sequencing experiment. The structure is
rectangular like a matrix, but with additional annotations on
the rows and columns, and with the possibility to manage several
assays simultaneously.
The rows of a SummarizedExperiment instance represent ranges
(in genomic coordinates) of interest. The ranges of interest are
described by a GRanges-class or a
GRangesList-class instance, accessible using the
rowRanges function, described below. The GRanges and
GRangesList classes contain sequence (e.g., chromosome) name,
genomic coordinates, and strand information. Each range can be
annotated with additional data; this data might be used to describe
the range or to summarize results (e.g., statistics of differential
abundance) relevant to the range. Rows may or may not have row names;
they often will not.
Each column of a SummarizedExperiment instance represents a
sample. Information about the samples is stored in a
DataFrame-class, accessible using the function
colData, described below. The DataFrame must have as
many rows as there are columns in the SummarizedExperiment,
with each row of the DataFrame providing information on the
sample in the corresponding column of the
SummarizedExperiment. Columns of the DataFrame represent
different sample attributes, e.g., tissue of origin. Columns of
the DataFrame can themselves be annotated (via the
mcols function). Column names typically provide a short
identifier unique to each sample.
A SummarizedExperiment can also contain information about the
overall experiment, for instance the lab in which it was conducted,
the publications with which it is associated, etc. This information is
stored as a SimpleList-class, accessible using
the exptData function. The form of the data associated with the
experiment is left to the discretion of the user.
The SummarizedExperiment is appropriate for matrix-like
data. The data are accessed using the assays function,
described below. This returns a SimpleList-class instance. Each
element of the list must itself be a matrix (of any mode) and must
have dimensions that are the same as the dimensions of the
SummarizedExperiment in which they are stored. Row and column
names of each matrix must either be NULL or match those of the
SummarizedExperiment during construction. It is convenient for
the elements of the SimpleList of assays to be named.
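The structure described above can be sketched with a small, made-up data set. This is only an illustration: it assumes the SummarizedExperiment package is installed (so, as the warning at the top of this page notes, the constructor returns a RangedSummarizedExperiment), and the coordinates, sample names, and tissue labels are invented for the example.

```r
# Assumption: the Bioconductor package SummarizedExperiment is installed.
library(SummarizedExperiment)

set.seed(1)
nrows <- 6
ncols <- 4

# One assay: a numeric matrix with one row per range and one column per sample.
counts <- matrix(rpois(nrows * ncols, lambda = 10), nrow = nrows,
                 dimnames = list(NULL, paste0("sample", 1:ncols)))

# Row annotation: one genomic range per row (hypothetical coordinates).
ranges <- GRanges(seqnames = rep(c("chr1", "chr2"), each = 3),
                  ranges = IRanges(start = seq(100, by = 100, length.out = nrows),
                                   width = 50),
                  strand = rep("+", nrows))

# Column annotation: one DataFrame row per sample, named like the assay columns.
samples <- DataFrame(tissue = c("liver", "liver", "brain", "brain"),
                     row.names = colnames(counts))

se <- SummarizedExperiment(assays = SimpleList(counts = counts),
                           rowRanges = ranges, colData = samples)
dim(se)  # ranges x samples
```

Naming the assay ("counts" here) is optional but makes later access by name possible.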
The SummarizedExperiment class has the following slots; this
detail of class structure is not relevant to the user.
exptData: A SimpleList-class instance containing information about the overall experiment.
rowData: A GRanges-class instance defining the
ranges of interest and associated metadata. WARNING: The accessor
for this slot is rowRanges, not rowData!
colData: A DataFrame-class instance describing the samples and associated metadata.
assays: A SimpleList-class instance, each element of which is a matrix summarizing data associated with the corresponding range and sample.
Instances are constructed using the SummarizedExperiment
function with arguments outlined above.
Package version 1.9.59 introduced a new way of representing
‘assays’. If you have a serialized instance x of a
SummarizedExperiment (e.g., from using the save function
with a version of GenomicRanges prior to 1.9.59), it should be updated
by invoking x <- updateObject(x).
as(from, "SummarizedExperiment"): Creates a
SummarizedExperiment object from an ExpressionSet object.
as(from, "ExpressionSet"): Creates an
ExpressionSet object from a SummarizedExperiment object.
The following data mappings are used for coercion between
ExpressionSet and SummarizedExperiment.
ExpressionSet | SummarizedExperiment |
assayData | assays |
featureData | rowData |
phenoData | colData |
experimentData, annotation, protocolData | colData |
If the SummarizedExperiment being coerced uses GRanges to store
its range data, that data will be included in the featureData of the
ExpressionSet.
Because ExpressionSet objects require an assay named ‘exprs’, if
the SummarizedExperiment object being coerced does not have an assay
named ‘exprs’, the first assay will be renamed and a warning will be
issued.
In the following code snippets, x is a
SummarizedExperiment instance.
assays(x), assays(x) <- value: Get or set the
assays. value is a list or SimpleList, each
element of which is a matrix with the same dimensions as
x.
assay(x, i), assay(x, i) <- value: A convenient
alternative (to assays(x)[[i]], assays(x)[[i]] <-
value) to get or set the ith (default first) assay
element. value must be a matrix of the same dimension as
x, and with dimension names NULL or consistent with
those of x.
assayNames(x), assayNames(x) <- value: Get or
set the names of assay() elements.
rowRanges(x), rowRanges(x) <- value: Get or set the
row data. value is a GenomicRanges instance. Row
names of value must be NULL or consistent with the existing
row names of x.
colData(x), colData(x) <- value: Get or set the
column data. value is a DataFrame instance. Row
names of value must be NULL or consistent with the existing
column names of x.
exptData(x), exptData(x) <- value: Get or set
the experiment data. value is a list or
SimpleList instance, with arbitrary content.
dim(x): Get the dimensions (ranges x samples) of the
SummarizedExperiment.
dimnames(x), dimnames(x) <- value: Get or set
the dimension names. value is usually a list of length 2,
containing elements that are either NULL or vectors of
appropriate length for the corresponding dimension. value
can be NULL, which removes dimension names. This method
implies that rownames, rownames<-, colnames,
and colnames<- are all available.
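A short sketch of the accessors listed above, again assuming the SummarizedExperiment package is installed and using invented data. exptData is left out on purpose, since this sketch runs against the replacement package rather than the deprecated class.

```r
# Assumption: the Bioconductor package SummarizedExperiment is installed.
library(SummarizedExperiment)

m <- matrix(1:6, nrow = 3, dimnames = list(c("r1", "r2", "r3"), c("a", "b")))
gr <- GRanges("chr1", IRanges(start = c(1, 11, 21), width = 5))
names(gr) <- rownames(m)
se <- SummarizedExperiment(assays = list(counts = m), rowRanges = gr,
                           colData = DataFrame(group = c("ctl", "trt"),
                                               row.names = colnames(m)))

assayNames(se)          # names of the assay elements
assay(se, "counts")     # one assay, by name
assay(se)               # default: the first assay

# Add a second assay with the same dimensions as the object.
assays(se)[["logcounts"]] <- log2(assay(se, "counts") + 1)

rowRanges(se)           # the ranges of interest
colData(se)$group       # a sample-level column

# Replace the dimension names of the whole object in one step.
dimnames(se) <- list(c("g1", "g2", "g3"), c("s1", "s2"))

# Skip copying the dimnames onto the extracted matrix.
raw <- assay(se, "counts", withDimnames = FALSE)
```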
Many GRanges-class and
GRangesList-class operations are supported on
‘SummarizedExperiment’ and derived instances, using
rowRanges.
Supported operations include: compare,
countOverlaps, coverage,
disjointBins, distance,
distanceToNearest, duplicated,
end, end<-, findOverlaps,
flank, follow, granges,
isDisjoint, match, mcols,
mcols<-, narrow, nearest,
order, overlapsAny, precede,
ranges,
ranges<-, rank, resize,
restrict, seqinfo,
seqinfo<-, seqnames,
shift,
sort, split, relistToClass,
start, start<-,
strand, strand<-,
subsetByOverlaps, width,
width<-.
Not all GRanges-class operations are supported, because
they do not make sense for ‘SummarizedExperiment’ objects
(e.g., length, name, as.data.frame, c, splitAsList), involve
non-trivial combination or splitting of rows (e.g., disjoin, gaps,
reduce, unique), or have not yet been implemented (Ops, map, window,
window<-).
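As an illustration of the range operations listed above, the sketch below applies a few of them directly to the container. It assumes the SummarizedExperiment package is installed; the ranges and the query are invented.

```r
# Assumption: the Bioconductor package SummarizedExperiment is installed.
library(SummarizedExperiment)

gr <- GRanges(c("chr1", "chr1", "chr2"),
              IRanges(start = c(1, 101, 1), width = 50))
se <- SummarizedExperiment(assays = list(counts = matrix(1:6, nrow = 3)),
                           rowRanges = gr)

# Range accessors are forwarded to rowRanges(se).
seqnames(se)
start(se)
width(se)

# Keep only the rows whose range overlaps a (hypothetical) query region.
query <- GRanges("chr1", IRanges(start = 90, end = 120))
hits <- subsetByOverlaps(se, query)
dim(hits)
```

Only the second range (chr1:101-150) overlaps the query, so the result keeps a single row and all samples.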
In the code snippets below, x is a SummarizedExperiment
instance.
x[i,j], x[i,j] <- value: Create or replace a
subset of x. i, j can be numeric,
logical, character, or missing. value
must be a SummarizedExperiment instance with dimensions,
dimension names, and assay elements consistent with the subset
x[i,j] being replaced.
subset(x, subset, select): Create a subset of x
using an expression subset referring to columns of
rowRanges(x) (including ‘seqnames’, ‘start’,
‘end’, ‘width’, ‘strand’, and
names(mcols(x))) and / or select referring to
column names of colData(x).
Additional subsetting accessors provide convenient access to
colData columns:
x$name, x$name <- value: Access or replace
column name in x.
x[[i, ...]], x[[i, ...]] <- value: Access or
replace column i in x.
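The subsetting operations above might be used as in the following sketch. It assumes the SummarizedExperiment package is installed; the data, tissue labels, and the batch column are invented for the example.

```r
# Assumption: the Bioconductor package SummarizedExperiment is installed.
library(SummarizedExperiment)

m <- matrix(1:12, nrow = 4,
            dimnames = list(paste0("r", 1:4), c("a", "b", "c")))
gr <- GRanges(c("chr1", "chr1", "chr2", "chr2"),
              IRanges(start = c(1, 101, 1, 101), width = 50))
names(gr) <- rownames(m)
se <- SummarizedExperiment(assays = list(counts = m), rowRanges = gr,
                           colData = DataFrame(tissue = c("liver", "brain", "liver"),
                                               row.names = colnames(m)))

first_two <- se[1:2, ]                   # rows by position
only_b    <- se[, "b"]                   # columns by name
downstream <- se[start(se) > 50, ]       # rows by a property of the ranges

# x$name and x[[i]] read and write colData columns.
liver <- se[, se$tissue == "liver"]
se$batch <- c(1, 1, 2)                   # hypothetical new sample attribute
se[["tissue"]]

# subset() evaluates 'select' in the context of colData(se).
liver2 <- subset(se, select = tissue == "liver")
```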
In the code snippets below, ... are SummarizedExperiment
instances to be combined.
cbind(...), rbind(...):
cbind combines objects with identical ranges (rowRanges)
but different samples (columns in assays). The colnames in
colData must match or an error is thrown. Duplicate columns
of mcols(rowRanges(SummarizedExperiment)) must contain the same
data. Data in assays are combined by name matching; if all names
are NULL matching is by position. A mixture of names and NULL throws an
error.
rbind combines objects with different ranges (rowRanges)
and the same subjects (columns in assays). Duplicate columns
of colData must contain the same data.
exptData from all objects are combined into a
SimpleList with no name checking.
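A minimal sketch of both operations, assuming the SummarizedExperiment package is installed. make_se is a hypothetical helper written only for this example, and the ranges and sample names are invented.

```r
# Assumption: the Bioconductor package SummarizedExperiment is installed.
library(SummarizedExperiment)

# Hypothetical helper: build a small object for the given ranges and samples.
make_se <- function(ranges, samples) {
  m <- matrix(seq_len(length(ranges) * length(samples)),
              nrow = length(ranges), dimnames = list(NULL, samples))
  SummarizedExperiment(assays = list(counts = m), rowRanges = ranges,
                       colData = DataFrame(row.names = samples))
}

gr1 <- GRanges("chr1", IRanges(start = c(1, 101), width = 50))
gr2 <- GRanges("chr1", IRanges(start = c(201, 301), width = 50))

se_ab <- make_se(gr1, c("a", "b"))   # ranges gr1, samples a and b
se_cd <- make_se(gr1, c("c", "d"))   # same ranges, different samples
se_lo <- make_se(gr2, c("a", "b"))   # different ranges, same samples

wide <- cbind(se_ab, se_cd)          # same ranges, more samples
tall <- rbind(se_ab, se_lo)          # more ranges, same samples
dim(wide)
dim(tall)
```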
This section contains advanced material meant for package developers.
SummarizedExperiment is implemented as an S4 class, and can be
extended in the usual way, using
contains="SummarizedExperiment" in the new class definition.
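Extension in the usual S4 way might look like the sketch below. It is written against the SummarizedExperiment class from the new SummarizedExperiment package (assumed installed), not the deprecated class documented here; CountExperiment and its protocol slot are hypothetical names chosen for illustration.

```r
# Assumption: the Bioconductor package SummarizedExperiment is installed.
library(SummarizedExperiment)

# Hypothetical derived class that adds one slot to the container.
.CountExperiment <- setClass("CountExperiment",
                             contains = "SummarizedExperiment",
                             slots = c(protocol = "character"))

# Hypothetical constructor: build the parent object, then wrap it.
CountExperiment <- function(counts, protocol = NA_character_, ...) {
  se <- SummarizedExperiment(assays = list(counts = counts), ...)
  .CountExperiment(se, protocol = protocol)
}

ce <- CountExperiment(matrix(1:6, nrow = 3), protocol = "RNA-seq")
is(ce, "SummarizedExperiment")   # inherits the parent's methods
dim(ce)
```

Because the derived class only adds a slot, the accessors and dimension methods of the parent class apply to it unchanged.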
In addition, the representation of the assays slot of
SummarizedExperiment is as a virtual class Assays. This
allows derived classes (contains="Assays") to easily implement
alternative requirements for the assays, e.g., backed by file-based
storage like NetCDF or the ff package, while re-using the
existing SummarizedExperiment class without modification. The
requirements on Assays are list-like semantics (e.g.,
sapply, [[ subsetting, names) with elements
having matrix- or array-like semantics (e.g., dim,
dimnames). These requirements can be made more precise if
developers express interest.
The current assays slot is implemented as a reference class
that has copy-on-change semantics. This means that modifying non-assay
slots does not copy the (large) assay data, and at the same time the
user is not surprised by reference-based semantics. Updates to
non-assay slots are very fast; updating the assays slot itself can be
5x or more faster than with an S4 instance in the slot. One useful
technique when working with the assay or assays function is
to pass withDimnames=FALSE, which benefits speed
and memory use by not copying dimnames from the row- and colData
elements to each assay.
In a little more detail, a small reference class hierarchy (not
exported from the GenomicRanges name space) defines a reference class
ShallowData with a single field data of type ANY,
and a derived class ShallowSimpleListAssays that specializes
the type of data as SimpleList, and
contains=c("ShallowData", "Assays"). The assays slot contains
an instance of ShallowSimpleListAssays. Invoking
assays() on a SummarizedExperiment re-dispatches from
the assays slot to retrieve the SimpleList from the
field of the reference class. This was achieved by implementing a
generic (not exported) value(x, name, ...), with a method
implemented on SummarizedExperiment that retrieves a slot when
name is a slot containing an S4 object in x, and a field
when name is a slot containing a ShallowData instance in
x. Copy-on-change semantics is maintained by implementing the
clone method (clone methods are supposed to do a deep
copy, update methods a shallow copy; the clone generic
is introduced, and not exported, in the GenomicRanges package). The
‘getter’ and ‘setter’ code for methods implemented on
SummarizedExperiment use value for slot access, and
clone for replacement. This makes it easy to implement
ShallowData instances for other slots if the need arises.
Martin Morgan, mtmorgan@fhcrc.org
RangedSummarizedExperiment in the new SummarizedExperiment package for the replacement of the SummarizedExperiment class.
## WARNING: The SummarizedExperiment class is deprecated and being
## replaced with the RangedSummarizedExperiment class defined in the
## new SummarizedExperiment package. See ?RangedSummarizedExperiment
## in the SummarizedExperiment package for examples of how to create
## and manipulate RangedSummarizedExperiment objects.