Create simplified representation of ragged assay data.
Source:R/assay-functions.R
assay-functions.RdThese methods transform assay() from the
default (i.e., sparseAssay()) representation to various
forms of more dense representation. compactAssay()
collapses identical ranges across samples into a single
row. disjoinAssay() creates disjoint (non-overlapping)
regions, simplifies values within each sample in a
user-specified manner, and returns a matrix of disjoint regions
x samples.
This method transforms assay() from the default
(i.e., sparseAssay()) representation to a reduced
representation summarizing each original range overlapping
ranges in query. Reduction in each cell can be tailored
to indivdual needs using the simplifyReduce functional argument.
Usage
sparseAssay(
x,
i = 1,
withDimnames = TRUE,
background = NA_integer_,
sparse = FALSE
)
compactAssay(
x,
i = 1,
withDimnames = TRUE,
background = NA_integer_,
sparse = FALSE
)
disjoinAssay(
x,
simplifyDisjoin,
i = 1,
withDimnames = TRUE,
background = NA_integer_
)
qreduceAssay(
x,
query,
simplifyReduce,
i = 1,
withDimnames = TRUE,
background = NA_integer_
)Arguments
- x
A
RaggedExperimentobject- i
integer(1) or character(1) name of assay to be transformed.
- withDimnames
logical(1) include dimnames on the returned matrix. When there are no explict rownames, these are manufactured with
as.character(rowRanges(x)); rownames are always manufactured forcompactAssay()anddisjoinAssay().- background
A value (default NA) for the returned matrix after
*Assayoperations- sparse
logical(1) whether to return a
sparseMatrixrepresentation- simplifyDisjoin
A
function/ functional operating on a*List, where the elements of the list are all within-sample assay values from ranges overlapping each disjoint range. For instance, to use thesimplifyDisjoin=meanof overlapping ranges, where ranges are characterized by integer-valued scores, the entries are calculated asa original: |-----------| b |----------| a a, b b disjoint: |----|------|---| values <- IntegerList(a, c(a, b), b) simplifyDisjoin(values)- query
GRangesproviding regions over which reduction is to occur.- simplifyReduce
A
function/ functional accepting argumentsscore,range, andqrange:scoreA*List, where each list element corresponds to a cell in the matrix to be returned byqreduceAssay. Vector elements correspond to ranges overlapping query. The*Listobjects support many vectorized mathematical operations, sosimplifyReducecan be implemented efficiently.rangeAGRangesListinstance, 'parallel' toscore. Each element of the list corresponds to a cell in the matrix to be returned byqreduceAssay. Each range in the element corresponds to the range for which thescoreelement applies.qrangeAGRangesinstance with the same length asunlist(score), providing the query range window to which the corresponding scores apply.
Value
sparseAssay(): A matrix() with dimensions
dim(x). Elements contain the assay value for the ith
range and jth sample. Use 'sparse=TRUE' to obtain
a sparseMatrix assay representation.
compactAssay(): Samples with identical range are placed
in the same row. Non-disjoint ranges are NOT collapsed. Use
'sparse=TRUE' to obtain a sparseMatrix assay
representation.
disjoinAssay(): A matrix with number of rows equal
to number of disjoint ranges across all samples. Elements of
the matrix are summarized by applying simplifyDisjoin() to
assay values of overlapping ranges
qreduceAssay(): A matrix() with dimensions
length(query) x ncol(x). Elements contain assay
values for the ith query range and jth sample, summarized
according to the function simplifyReduce.
Examples
re4 <- RaggedExperiment(GRangesList(
GRanges(c(A = "chr1:1-10:-", B = "chr1:8-14:-", C = "chr2:15-18:+"),
score = 3:5),
GRanges(c(D = "chr1:1-10:-", E = "chr2:11-18:+"), score = 1:2)
), colData = DataFrame(id = 1:2))
query <- GRanges(c("chr1:1-14:-", "chr2:11-18:+"))
weightedmean <- function(scores, ranges, qranges)
{
## weighted average score per query range
## the weight corresponds to the size of the overlap of each
## overlapping subject range with the corresponding query range
isects <- pintersect(ranges, qranges)
sum(scores * width(isects)) / sum(width(isects))
}
qreduceAssay(re4, query, weightedmean)
#> [,1] [,2]
#> chr1:1-14:- 3.411765 1
#> chr2:11-18:+ 5.000000 2
if (FALSE) { # \dontrun{
## Extended example: non-silent mutations, summarized by genic
## region
suppressPackageStartupMessages({
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(org.Hs.eg.db)
library(GenomeInfoDb)
library(MultiAssayExperiment)
library(curatedTCGAData)
library(TCGAutils)
})
## TCGA MultiAssayExperiment with RaggedExperiment data
mae <- curatedTCGAData("ACC", c("RNASeq2GeneNorm", "CNASNP", "Mutation"),
version = "1.1.38", dry.run = FALSE)
## genomic coordinates
gn <- genes(TxDb.Hsapiens.UCSC.hg19.knownGene)
gn <- keepStandardChromosomes(granges(gn), pruning.mode="coarse")
seqlevelsStyle(gn) <- "NCBI"
genome(gn)
gn <- unstrand(gn)
## reduce mutations, marking any genomic range with non-silent
## mutation as FALSE
nonsilent <- function(scores, ranges, qranges)
any(scores != "Silent")
mre <- mae[["ACC_Mutation-20160128"]]
seqlevelsStyle(rowRanges(mre)) <- "NCBI"
## hack to make genomes match
genome(mre) <- paste0(correctBuild(unique(genome(mre)), "NCBI"), ".p13")
mutations <- qreduceAssay(mre, gn, nonsilent, "Variant_Classification")
genome(mre) <- correctBuild(unique(genome(mre)), "NCBI")
## reduce copy number
re <- mae[["ACC_CNASNP-20160128"]]
class(re)
## [1] "RaggedExperiment"
seqlevelsStyle(re) <- "NCBI"
genome(re) <- "GRCh37.p13"
cn <- qreduceAssay(re, gn, weightedmean, "Segment_Mean")
genome(re) <- "GRCh37"
## ALTERNATIVE
##
## TCGAutils helper function to convert RaggedExperiment objects to
## RangedSummarizedExperiment based on annotated gene ranges
mae2 <- mae
mae2[[1L]] <- re
mae2[[2L]] <- mre
qreduceTCGA(mae2)
} # }