RaggedExperiment objects
Source: R/RaggedExperiment-class.R, R/RaggedExperiment-subset-methods.R
RaggedExperiment-class.Rd
The RaggedExperiment class is a container for storing range-based data, including but not limited to copy number data and mutation data. It can store a collection of GRanges objects, as it is derived from GenomicRangesList.
Usage
RaggedExperiment(..., colData = DataFrame(), metadata = list())
# S4 method for class 'RaggedExperiment'
seqinfo(x)
# S4 method for class 'RaggedExperiment'
seqinfo(x, new2old = NULL, pruning.mode = c("error", "coarse", "fine", "tidy")) <- value
# S4 method for class 'RaggedExperiment'
rowRanges(x, ...)
# S4 method for class 'RaggedExperiment,GRanges'
rowRanges(x, ...) <- value
# S4 method for class 'RaggedExperiment'
mcols(x, use.names = FALSE, ...)
# S4 method for class 'RaggedExperiment'
mcols(x, ...) <- value
# S4 method for class 'RaggedExperiment'
rowData(x, use.names = TRUE, ...)
# S4 method for class 'RaggedExperiment'
rowData(x, ...) <- value
# S4 method for class 'RaggedExperiment'
dim(x)
# S4 method for class 'RaggedExperiment'
dimnames(x)
# S4 method for class 'RaggedExperiment,list'
dimnames(x) <- value
# S4 method for class 'RaggedExperiment,ANY'
dimnames(x) <- value
# S4 method for class 'RaggedExperiment'
length(x)
# S4 method for class 'RaggedExperiment'
colData(x, ...)
# S4 method for class 'RaggedExperiment,DataFrame'
colData(x) <- value
# S4 method for class 'RaggedExperiment,missing'
assay(x, i, withDimnames = TRUE, ...)
# S4 method for class 'RaggedExperiment,ANY'
assay(x, i, withDimnames = TRUE, ...)
# S4 method for class 'RaggedExperiment'
assays(x, withDimnames = TRUE, ...)
# S4 method for class 'RaggedExperiment'
assayNames(x, ...)
# S4 method for class 'RaggedExperiment'
show(object)
# S4 method for class 'RaggedExperiment'
as.list(x, ...)
# S4 method for class 'RaggedExperiment'
as.data.frame(x, row.names = NULL, optional = FALSE, ...)
# S4 method for class 'RaggedExperiment'
x$name
# S4 method for class 'RaggedExperiment,ANY,ANY,ANY'
x[i, j, ..., drop = TRUE]
# S4 method for class 'RaggedExperiment,Vector'
overlapsAny(
  query,
  subject,
  maxgap = 0L,
  minoverlap = 1L,
  type = c("any", "start", "end", "within", "equal"),
  ...
)
# S4 method for class 'RaggedExperiment,Vector'
subsetByOverlaps(
  x,
  ranges,
  maxgap = -1L,
  minoverlap = 0L,
  type = c("any", "start", "end", "within", "equal"),
  invert = FALSE,
  ...
)
# S4 method for class 'RaggedExperiment'
subset(x, subset, select, ...)
Arguments
- ...
Constructor: GRanges, list of GRanges, or GRangesList. assay: Additional arguments for assay. See details for more information.
- colData
A DataFrame describing samples. Length of rowRanges must equal the number of rows in colData.
- metadata
A list to include in the metadata. Any metadata included in the input objects is lost.
- x
A RaggedExperiment object.
- new2old
The new2old argument allows the user to rename, drop, add and/or reorder the "sequence levels" in x.
new2old can be NULL or an integer vector with one element per entry in Seqinfo object value (i.e. new2old and value must have the same length) describing how the "new" sequence levels should be mapped to the "old" sequence levels, that is, how the entries in value should be mapped to the entries in seqinfo(x). The values in new2old must be >= 1 and <= length(seqinfo(x)). NAs are allowed and indicate sequence levels that are being added. Old sequence levels that are not represented in new2old will be dropped, but this will fail if those levels are in use (e.g. if x is a GRanges object with ranges defined on those sequence levels) unless a pruning mode is specified via the pruning.mode argument (see below).
If new2old=NULL, then sequence levels can only be added to the existing ones, that is, value must have at least as many entries as seqinfo(x) (i.e. length(value) >= length(seqinfo(x))) and also seqlevels(value)[seq_len(length(seqlevels(x)))] must be identical to seqlevels(x).
Note that most of the time it's easier to proceed in 2 steps:
First align the seqlevels on the left (seqlevels(x)) with the seqlevels on the right.
Then call seqinfo(x) <- value. Because seqlevels(x) and seqlevels(value) now are identical, there's no need to specify new2old.
This 2-step approach will typically look like this:
seqlevels(x) <- seqlevels(value)  # align seqlevels
seqinfo(x) <- seqinfo(value)  # guaranteed to work
Or, if x has seqlevels not in value, it will look like this:
seqlevels(x, pruning.mode="coarse") <- seqlevels(value)
seqinfo(x) <- seqinfo(value)  # guaranteed to work
The pruning.mode argument will control what happens to x when some of its seqlevels get dropped. See below for more information.
- pruning.mode
When some of the seqlevels to drop from x are in use (i.e. have ranges on them), the ranges on these sequences need to be removed before the seqlevels can be dropped. We call this pruning. The pruning.mode argument controls how to prune x. Four pruning modes are currently defined: "error", "coarse", "fine", and "tidy". "error" is the default. In this mode, no pruning is done and an error is raised. The other pruning modes do the following:
"coarse": Remove the elements in x where the seqlevels to drop are in use. Typically reduces the length of x. Note that if x is a list-like object (e.g. GRangesList, GAlignmentPairs, or GAlignmentsList), then any list element in x where at least one of the sequence levels to drop is in use is fully removed. In other words, when pruning.mode="coarse", the seqlevels setter will keep or remove full list elements and not try to change their content. This guarantees that the exact ranges (and their order) inside the individual list elements are preserved. This can be a desirable property when the list elements represent compound features like exons grouped by transcript (stored in a GRangesList object as returned by exonsBy( , by="tx")), or paired-end or fusion reads, etc.
"fine": Supported on list-like objects only. Removes the ranges that are on the sequences to drop. This removal is done within each list element of the original object x and doesn't affect its length or the order of its list elements. In other words, the pruned object is guaranteed to be parallel to the original object.
"tidy": Like the "fine" pruning above but also removes the list elements that become empty as the result of the pruning. Note that this pruning mode is particularly well suited on a GRangesList object that contains transcripts grouped by gene, as returned by transcriptsBy( , by="gene"). Finally note that, as a convenience, this pruning mode is supported on non list-like objects (e.g. GRanges or GAlignments objects) and, in this case, is equivalent to the "coarse" mode.
See the "B. DROP SEQLEVELS FROM A LIST-LIKE OBJECT" section in the examples of the GenomeInfoDb seqinfo documentation for an extensive illustration of these pruning modes.
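The difference between the pruning modes is easiest to see on a small list-like object. The following is a minimal sketch, not taken from the package: the GRangesList `grl`, its element names, and the helper `drop_chr2()` are made up for illustration, and the seqlevels setter is applied to a plain GRangesList rather than to a RaggedExperiment.

```r
library(GenomicRanges)

# Hypothetical toy data: three list elements spread over chr1 and chr2
grl <- GRangesList(
    tx1 = GRanges(c("chr1:1-10", "chr2:1-10")),
    tx2 = GRanges("chr1:20-30"),
    tx3 = GRanges("chr2:20-30")
)

# Illustrative helper: keep only chr1, pruning with the given mode
drop_chr2 <- function(x, mode) {
    seqlevels(x, pruning.mode = mode) <- "chr1"
    x
}

# "error" (the default) refuses to drop a seqlevel that is in use
failed <- inherits(try(drop_chr2(grl, "error"), silent = TRUE), "try-error")

coarse <- drop_chr2(grl, "coarse")  # removes every element touching chr2
fine   <- drop_chr2(grl, "fine")    # removes chr2 ranges, keeps all elements
tidy   <- drop_chr2(grl, "tidy")    # like "fine", then drops empty elements

names(coarse)       # "tx2"
elementNROWS(fine)  # tx1 = 1, tx2 = 1, tx3 = 0
names(tidy)         # "tx1" "tx2"
```

"fine" keeps the result parallel to the input, which matters when other data are indexed by list element; "coarse" and "tidy" change the length.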
- value
dimnames: A list of dimension names
mcols: A DataFrame representing the assays
- use.names
(logical default FALSE) whether to propagate rownames from the object to rownames of metadata DataFrame
- i
logical(1), integer(1), or character(1) indicating the assay to be reported. For [, i can be any supported Vector object, e.g., GRanges.
- withDimnames
logical (default TRUE) whether to use dimension names in the resulting object
- object
A RaggedExperiment object.
- row.names
NULL or a character vector giving the row names for the data frame. Missing values are not allowed.
- optional
logical. If TRUE, setting row names and converting column names (to syntactic names: see make.names) is optional. Note that all of R's base package as.data.frame() methods use optional only for column names treatment, basically with the meaning of data.frame(*, check.names = !optional). See also the make.names argument of the matrix method.
- name
a literal character string or a name (possibly backtick quoted). For extraction, this is normally (see under ‘Environments’) partially matched to the names of the object.
- j
integer(), character(), or logical() index selecting columns from RaggedExperiment
- drop
logical (default TRUE) whether to drop empty samples
- query
A RaggedExperiment instance.
- subject, ranges
Each of them can be an IntegerRanges (e.g. IRanges, Views) or IntegerRangesList (e.g. IRangesList, ViewsList) derivative. In addition, if subject or ranges is an IntegerRanges object, query or x can be an integer vector to be converted to length-one ranges.
If query (or x) is an IntegerRangesList object, then subject (or ranges) must also be an IntegerRangesList object.
If both arguments are list-like objects with names, each list element from the 2nd argument is paired with the list element from the 1st argument with the matching name, if any. Otherwise, list elements are paired by position. The overlap is then computed between the pairs as described below.
If subject is omitted, query is queried against itself. In this case, and only this case, the drop.self and drop.redundant arguments are allowed. By default, the result will contain hits for each range against itself, and if there is a hit from A to B, there is also a hit for B to A. If drop.self is TRUE, all self matches are dropped. If drop.redundant is TRUE, only one of A->B and B->A is returned.
- maxgap
A single integer >= -1.
If type is set to "any", maxgap is interpreted as the maximum gap that is allowed between 2 ranges for the ranges to be considered as overlapping. The gap between 2 ranges is the number of positions that separate them. The gap between 2 adjacent ranges is 0. By convention when one range has its start or end strictly inside the other (i.e. non-disjoint ranges), the gap is considered to be -1.
If type is set to anything else, maxgap has a special meaning that depends on the particular type. See type below for more information.
- minoverlap
A single non-negative integer.
Only ranges with a minimum of minoverlap overlapping positions are considered to be overlapping.
When type is "any", at least one of maxgap and minoverlap must be set to its default value.
- type
By default, any overlap is accepted. By specifying the type parameter, one can select for specific types of overlap. The types correspond to operations in Allen's Interval Algebra (see references). If type is start or end, the intervals are required to have matching starts or ends, respectively. Specifying equal as the type returns the intersection of the start and end matches. If type is within, the query interval must be wholly contained within the subject interval. Note that all matches must additionally satisfy the minoverlap constraint described above.
The maxgap parameter has special meaning with the special overlap types. For start, end, and equal, it specifies the maximum difference in the starts, ends or both, respectively. For within, it is the maximum amount by which the subject may be wider than the query. If maxgap is set to -1 (the default), it's replaced internally by 0.
- invert
If TRUE, keep only the ranges in x that do not overlap ranges.
- subset
logical expression indicating elements or rows to keep: missing values are taken as false.
- select
If query is an IntegerRanges derivative: When select is "all" (the default), the results are returned as a Hits object. Otherwise the returned value is an integer vector parallel to query (i.e. same length) containing the first, last, or arbitrary overlapping interval in subject, with NA indicating intervals that did not overlap any intervals in subject.
If query is an IntegerRangesList derivative: When select is "all" (the default), the results are returned as a HitsList object. Otherwise the returned value depends on the drop argument. When select != "all" && !drop, an IntegerList is returned, where each element of the result corresponds to a space in query. When select != "all" && drop, an integer vector is returned containing indices that are offset to align with the unlisted query.
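The maxgap, minoverlap, and type rules described above can be checked on plain integer ranges. This is a small sketch using IRanges directly instead of a RaggedExperiment; the variable names are illustrative only.

```r
library(IRanges)

query    <- IRanges(start = 1, end = 5)
adjacent <- IRanges(start = 6, end = 10)  # gap of 0 to query
near     <- IRanges(start = 7, end = 10)  # gap of 1 to query (position 6)

# type = "any": maxgap is the largest gap still counted as an overlap
overlapsAny(query, adjacent)               # FALSE (default maxgap = -1)
overlapsAny(query, adjacent, maxgap = 0L)  # TRUE
overlapsAny(query, near, maxgap = 0L)      # FALSE
overlapsAny(query, near, maxgap = 1L)      # TRUE

# minoverlap: positions 4 and 5 are shared, so the overlap width is 2
shared <- IRanges(start = 4, end = 10)
overlapsAny(query, shared, minoverlap = 2L)  # TRUE
overlapsAny(query, shared, minoverlap = 3L)  # FALSE

# type = "within": the query must lie entirely inside the subject
inner <- IRanges(start = 2, end = 4)
outer <- IRanges(start = 1, end = 10)
overlapsAny(inner, outer, type = "within")  # TRUE
overlapsAny(outer, inner, type = "within")  # FALSE
```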
Value
constructor returns a RaggedExperiment object
'rowRanges' returns a GRanges object summarizing ranges corresponding to assay() rows.
'rowRanges<-' returns a RaggedExperiment object with replaced ranges
'mcols' returns a DataFrame object of the metadata columns
'assays' returns a SimpleList
'overlapsAny' returns a logical vector of length equal to the number of rows in the query; TRUE when the copy number region overlaps the subject.
'subsetByOverlaps' returns a RaggedExperiment containing only copy number regions overlapping subject.
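As a rough illustration of the subsetByOverlaps return value, the sketch below rebuilds the two-sample object used in the Examples section and keeps, or with invert = TRUE discards, the ranges that overlap a region. The names `re`, `region`, `kept`, and `dropped` are illustrative, and the inverted result assumes invert behaves as described under Arguments.

```r
library(RaggedExperiment)

sample1 <- GRanges(
    c(a = "chr1:1-10:-", b = "chr1:11-18:+"),
    score = 1:2)
sample2 <- GRanges(
    c(c = "chr2:1-10:-", d = "chr2:11-18:+"),
    score = 3:4)
re <- RaggedExperiment(sample1 = sample1, sample2 = sample2,
    colData = DataFrame(id = 1:2))

region <- GRanges("chr1:1-5")

# Ranges overlapping the region
kept <- subsetByOverlaps(re, region)
rownames(kept)     # "a"

# Ranges that do NOT overlap the region
dropped <- subsetByOverlaps(re, region, invert = TRUE)
rownames(dropped)
```

Both results are RaggedExperiment objects, so they can be passed on to assay() or subset again.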
Methods (by generic)
- seqinfo(RaggedExperiment): seqinfo accessor
- seqinfo(RaggedExperiment) <- value: Replace seqinfo metadata of the ranges
- rowRanges(RaggedExperiment): rowRanges accessor
- rowRanges(x = RaggedExperiment) <- value: rowRanges replacement
- mcols(RaggedExperiment): get the metadata columns of the ranges, rectangular representation of the 'assays'
- mcols(RaggedExperiment) <- value: set the metadata columns of the ranges corresponding to the assays
- rowData(RaggedExperiment): get the rowData or metadata for the ranges
- rowData(RaggedExperiment) <- value: set the rowData or metadata for the ranges
- dim(RaggedExperiment): get dimensions (number of sample-specific row ranges by number of samples)
- dimnames(RaggedExperiment): get row (sample-specific) range names and sample names
- dimnames(x = RaggedExperiment) <- value: set row (sample-specific) range names and sample names
- dimnames(x = RaggedExperiment) <- value: set row range names and sample names to NULL
- length(RaggedExperiment): get the length of row vectors in the object, similar to SummarizedExperiment
- colData(RaggedExperiment): get column data
- colData(x = RaggedExperiment) <- value: change the colData
- assay(x = RaggedExperiment, i = missing): assay missing method uses first metadata column
- assay(x = RaggedExperiment, i = ANY): assay numeric method.
- assays(RaggedExperiment): assays
- assayNames(RaggedExperiment): names in each assay
- show(RaggedExperiment): show method
- as.list(RaggedExperiment): Allow extraction of metadata columns as a plain list
- as.data.frame(RaggedExperiment): Allow conversion to plain data.frame
- $: Easily access the colData columns with the dollar sign operator
- x[i: Subset a RaggedExperiment object
- overlapsAny(query = RaggedExperiment, subject = Vector): Determine whether copy number ranges defined by query overlap ranges of subject.
- subsetByOverlaps(x = RaggedExperiment, ranges = Vector): Subset the RaggedExperiment to contain only copy number ranges overlapping ranges of subject.
- subset(RaggedExperiment): subset helper function for dividing by rowData and / or colData values
Constructors
RaggedExperiment(..., colData=DataFrame()): Creates a RaggedExperiment object using multiple GRanges objects or a list of GRanges objects. Additional column data may be provided as a DataFrame object.
Accessors
In the following, 'x' represents a RaggedExperiment object:
rowRanges(x): Get the ranged data. Value is a GenomicRanges object.
assays(x): Get the assays. Value is a SimpleList.
assay(x, i): An alternative to assays(x)[[i]] to get the ith (default first) assay element.
mcols(x), mcols(x) <- value: Get or set the metadata columns. For RaggedExperiment, the columns correspond to the assay ith elements.
rowData(x), rowData(x) <- value: Get or set the row data. Value is a DataFrame object. Also corresponds to the mcols data.
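The accessors listed above can be tried on a small object. This is a minimal sketch that reuses the two-sample data from the Examples section; the variable names `re` and `score` are illustrative.

```r
library(RaggedExperiment)

sample1 <- GRanges(
    c(a = "chr1:1-10:-", b = "chr1:11-18:+"),
    score = 1:2)
sample2 <- GRanges(
    c(c = "chr2:1-10:-", d = "chr2:11-18:+"),
    score = 3:4)
re <- RaggedExperiment(sample1 = sample1, sample2 = sample2,
    colData = DataFrame(id = 1:2))

dim(re)         # 4 sample-specific ranges by 2 samples
assayNames(re)  # "score"

# Each metadata column becomes one ranges-by-samples matrix,
# with NA where a sample has no value for a range
score <- assay(re, "score")
score["a", ]    # sample1 = 1, sample2 = NA

rowRanges(re)   # the ranges behind the assay rows
mcols(re)       # metadata columns backing the assays
re$id           # colData column via the dollar sign operator
```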
Note for advanced users and developers. Both mcols and rowData setters may reduce the size of the internal RaggedExperiment data representation. Particularly after subsetting, the internal row index is modified and such setter operations will use the index to subset the data and reduce the "rows" of the internal data representation.
Subsetting
x[i, j]: Get ranges or elements (i and j, respectively) with optional metadata columns where i or j can be missing, an NA-free logical, numeric, or character vector.
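The index types accepted by x[i, j] might be exercised as in the sketch below. It assumes the same toy data as the Examples section; the object names are illustrative, and only row names and column names are inspected.

```r
library(RaggedExperiment)

sample1 <- GRanges(
    c(a = "chr1:1-10:-", b = "chr1:11-18:+"),
    score = 1:2)
sample2 <- GRanges(
    c(c = "chr2:1-10:-", d = "chr2:11-18:+"),
    score = 3:4)
re <- RaggedExperiment(sample1 = sample1, sample2 = sample2,
    colData = DataFrame(id = 1:2))

# Select ranges (rows) by position, by logical mask, or by name
by_index   <- re[c(1, 3), ]
by_logical <- re[c(TRUE, FALSE, TRUE, FALSE), ]
by_name    <- re[c("a", "c"), ]
rownames(by_index)    # "a" "c"

# Select samples (columns) by name
one_sample <- re[, "sample1"]
colnames(one_sample)  # "sample1"
```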
Coercion
In the following, 'object' represents a RaggedExperiment object:
as(object, "GRangesList"): Creates a GRangesList object from a RaggedExperiment.
as(from, "RaggedExperiment"): Creates a RaggedExperiment object from a GRangesList, or GRanges object.
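A round trip through the two coercions could look like the following sketch. The data mirror the Examples section, and the names `grl`, `re`, and `back` are illustrative.

```r
library(RaggedExperiment)

sample1 <- GRanges(
    c(a = "chr1:1-10:-", b = "chr1:11-18:+"),
    score = 1:2)
sample2 <- GRanges(
    c(c = "chr2:1-10:-", d = "chr2:11-18:+"),
    score = 3:4)
grl <- GRangesList(sample1 = sample1, sample2 = sample2)

# GRangesList -> RaggedExperiment: one column per list element
re <- as(grl, "RaggedExperiment")
dim(re)       # 4 ranges by 2 samples

# RaggedExperiment -> GRangesList: one list element per sample
back <- as(re, "GRangesList")
length(back)  # 2
```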
Examples
library(RaggedExperiment)
## Create an empty RaggedExperiment instance
re0 <- RaggedExperiment()
re0
#> class: RaggedExperiment
#> dim: 0 0
#> assays(0):
#> rownames: NULL
#> colnames: NULL
#> colData names(0):
## Create a couple of GRanges objects with row ranges names
sample1 <- GRanges(
    c(a = "chr1:1-10:-", b = "chr1:11-18:+"),
    score = 1:2)
sample2 <- GRanges(
    c(c = "chr2:1-10:-", d = "chr2:11-18:+"),
    score = 3:4)
## Include column data
colDat <- DataFrame(id = 1:2)
## Create a RaggedExperiment object from a couple of GRanges
re1 <- RaggedExperiment(sample1=sample1, sample2=sample2, colData = colDat)
re1
#> class: RaggedExperiment
#> dim: 4 2
#> assays(1): score
#> rownames(4): a b c d
#> colnames(2): sample1 sample2
#> colData names(1): id
## With list of GRanges
lgr <- list(sample1 = sample1, sample2 = sample2)
## Create a RaggedExperiment from a list of GRanges
re2 <- RaggedExperiment(lgr, colData = colDat)
grl <- GRangesList(sample1 = sample1, sample2 = sample2)
## Create a RaggedExperiment from a GRangesList
re3 <- RaggedExperiment(grl, colData = colDat)
## Subset a RaggedExperiment
assay(re3[c(1, 3),])
#>   sample1 sample2
#> a       1      NA
#> c      NA       3
subsetByOverlaps(re3, GRanges("chr1:1-5")) # by ranges
#> class: RaggedExperiment
#> dim: 1 2
#> assays(1): score
#> rownames(1): a
#> colnames(2): sample1 sample2
#> colData names(1): id