R / Bioconductor in the AnVIL Cloud
July 17 and July 18, 2020
About
Venue: Bioinformatics Community Conference
Date (US Eastern Time)
- Western hemisphere: Friday July 17, 2020 12:15 - 14:45
- Eastern hemisphere: Saturday July 18, 2020 21:00 - 23:30
Abstract
Bioconductor provides more than 1900 R packages for the analysis and comprehension of high-throughput genomic data. Most users install and run Bioconductor on a personal computer, or perhaps use an academic cluster. Cloud-based solutions are increasingly appealing: they remove the headaches of local installation while providing access to (a) better, scalable computing resources; and (b) large-scale ‘consortium’ and other reference data sets. This session introduces the AnVIL cloud computing environment. We cover use of the cloud as a replacement for desktop-style computing; integrating workflows for ‘upstream’ processing of large data resources with interactive ‘downstream’ analysis and comprehension, using Human Cell Atlas single-cell datasets as an example; and querying cloud-based consortium data for integration with a user’s own data sets.
Incoming expectations
Participants should be comfortable working with R. Some familiarity with Bioconductor is helpful but not required. No prior cloud-based experience is necessary. We will use the AnVIL cloud, which requires a modern web browser.
Learning objectives
The AnVIL cloud
AnVIL workspaces
Jupyter Runtimes
The AnVIL package for cloud access
Cloud-based analysis
Single cell exploratory analysis
(Formal workflows for large-scale analysis)