GenomicFiles objects
GenomicFiles-class.Rd
The GenomicFiles class is a matrix-like container where rows represent ranges of interest and columns represent files. The class is designed for byFile or byRange queries.
Details
GenomicFiles inherits from the RangedSummarizedExperiment class in the SummarizedExperiment package. Currently, no use is made of the elementMetadata and assays slots. This may change in the future.
Accessors
In the code below, x is a GenomicFiles object.
- rowRanges, rowRanges(x) <- value
  Get or set the rowRanges on x. value can be a GRanges or GRangesList representing ranges or indices defined on the spaces (position) of the files.
- files(x), files(x) <- value
  Get or set the files on x. value can be a character() of file paths or a List of file objects such as BamFile, BigWigFile, FaFile, etc.
- colData, colData(x) <- value
  Get or set the colData on x. value must be a DataFrame instance describing the files. The number of rows must match the number of files. Row names, if present, become the column names of the GenomicFiles.
- metadata, metadata(x) <- value
  Get or set the metadata on x. value must be a SimpleList of arbitrary content describing the overall experiment.
- dimnames, dimnames(x) <- value
  Get or set the row and column names on x.
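As a minimal sketch of the getters listed above, assuming the RNAseqData.HNRNPC.bam.chr14 data package (the one used in the Examples) is installed. The two ranges are illustrative only and were chosen for this sketch:

```r
library(GenomicFiles)
library(RNAseqData.HNRNPC.bam.chr14)

## Character vector of BAM file paths shipped with the data package
fl <- RNAseqData.HNRNPC.bam.chr14_BAMFILES

## Illustrative ranges on chr14 (an assumption made for this sketch)
rd <- GRanges("chr14", IRanges(c(62262735, 63121531), width = 1000))

x <- GenomicFiles(rowRanges = rd, files = fl)

rowRanges(x)   # the ranges, one per row
files(x)       # the files, one per column
colData(x)     # DataFrame with one row per file
metadata(x)    # experiment-level metadata
dimnames(x)    # row and column names
dim(x)         # number of ranges by number of files
```

Each setter takes the corresponding `<- value` form shown in the list above.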
Methods
In the code below, x is a GenomicFiles object.
- [
  Subset the object by fileRange or fileSample.
- show
  Compactly display the object.
- reduceByFile
  Extract, manipulate and combine data defined in rowRanges within the files specified in files. See ?reduceByFile for details.
- reduceByRange
  Extract, manipulate and combine data defined in rowRanges across the files specified in files. See ?reduceByRange for details.
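A minimal sketch of reduceByFile on a GenomicFiles object, assuming the RNAseqData.HNRNPC.bam.chr14 data package is installed. The MAP and REDUCE functions are illustrative choices made for this sketch, not part of the package: MAP counts the alignments overlapping one range in one file with Rsamtools::countBam, and REDUCE sums those counts within each file. iterate=FALSE is set so that REDUCE receives all MAP results for a file at once.

```r
library(GenomicFiles)
library(RNAseqData.HNRNPC.bam.chr14)

## Two of the BAM files from the data package, and two illustrative ranges
fl <- RNAseqData.HNRNPC.bam.chr14_BAMFILES[1:2]
rd <- GRanges("chr14", IRanges(c(62262735, 63121531), width = 214700))
gf <- GenomicFiles(rowRanges = rd, files = fl)

## MAP is called once per (range, file) pair
MAP <- function(range, file, ...) {
    param <- Rsamtools::ScanBamParam(which = range)
    Rsamtools::countBam(file, param = param)$records
}

## REDUCE combines the MAP results obtained within one file
REDUCE <- function(mapped, ...) sum(unlist(mapped))

res <- reduceByFile(gf, MAP = MAP, REDUCE = REDUCE, iterate = FALSE)
length(res)   # one element per file
```

reduceByRange takes the same MAP and REDUCE arguments but groups the MAP results by range rather than by file.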
See also
reduceByFile and reduceByRange methods.
SummarizedExperiment objects in the SummarizedExperiment package.
Examples
## -----------------------------------------------------------------------
## Basic Use
## -----------------------------------------------------------------------
if (require(RNAseqData.HNRNPC.bam.chr14)) {
    fl <- RNAseqData.HNRNPC.bam.chr14_BAMFILES
    rd <- GRanges("chr14",
                  IRanges(c(62262735, 63121531, 63980327), width=214700))
    cd <- DataFrame(method=rep("RNASeq", length(fl)),
                    format=rep("bam", length(fl)))

    ## Construct an instance of the class:
    gf <- GenomicFiles(files = fl, rowRanges = rd, colData = cd)
    gf

    ## Subset on ranges or files for different experimental runs.
    dim(gf)
    gf_sub <- gf[2, 3:4]
    dim(gf_sub)

    ## When summarize = TRUE and no REDUCE is provided the reduceBy*
    ## functions output a SummarizedExperiment object.
    MAP <- function(range, file, ...) {
        requireNamespace("GenomicFiles", quietly=TRUE) ## for coverage()
        requireNamespace("Rsamtools", quietly=TRUE)    ## for ScanBamParam()
        param = Rsamtools::ScanBamParam(which=range)
        GenomicFiles::coverage(file, param=param)[range]
    }
    se <- reduceByRange(gf, MAP=MAP, summarize=TRUE)
    se

    ## Data from the rowRanges, colData and metadata slots in the
    ## GenomicFiles are transferred to the SummarizedExperiment.
    colData(se)

    ## Results are in the assays slot.
    assays(se)
}
#> Loading required package: RNAseqData.HNRNPC.bam.chr14
#> List of length 1
#> names(1): data
## -----------------------------------------------------------------------
## Managing cached or remote files with GenomicFiles
## -----------------------------------------------------------------------
## The GenomicFiles class can manage cached or remote files and their
## associated ranges.
if (FALSE) { # \dontrun{
    ## Files from AnnotationHub can be downloaded and cached locally.
    library(AnnotationHub)
    hub = AnnotationHub()
    hublet = query(hub, c("files I'm", "interested in"))
    # cache (if needed) and return local path to files
    fls = cache(hublet)

    ## An alternative to the local file paths is to use urls to a remote file.
    ## This approach could be used with something like rtracklayer::BigWigFile
    ## which supports remote file queries.
    urls = hublet$sourceurl

    ## Define ranges of interest and use GenomicFiles to manage.
    rngs = GRanges("chr10", IRanges(c(100000, 200000), width=1))
    gf = GenomicFiles(rngs, fls)

    ## As an example, one could create a matrix from data extracted
    ## across multiple BED files.
    MAP = function(rng, fl) {
        requireNamespace("rtracklayer", quietly=TRUE) ## import, BEDFile
        rtracklayer::import(rtracklayer::BEDFile(fl), which=rng)$name
    }
    REDUCE = unlist
    xx = reduceFiles(gf, MAP=MAP, REDUCE=REDUCE)
    mcols(rngs) = simplify2array(xx)

    ## Data and ranges can be stored in a SummarizedExperiment.
    SummarizedExperiment(list(my=simplify2array(xx)), rowRanges=rngs)
} # }