About

Venue: Bioinformatics Community Conference

Date (US Eastern Time)

  • Western hemisphere: Friday July 17, 2020 12:15 - 14:45
  • Eastern hemisphere: Saturday July 18, 2020 21:00 - 23:30

Abstract

Bioconductor provides more than 1900 R packages for the analysis and comprehension of high-throughput genomic data. Most users install and run Bioconductor on a personal computer or perhaps use an academic cluster. Cloud-based solutions are increasingly appealing, removing the headaches of local installation while providing access to (a) better, scalable computing resources; and (b) large-scale ‘consortium’ and other reference data sets. This session introduces the AnVIL cloud computing environment. We cover use of the cloud as a replacement for desktop-style computing; integrating workflows for ‘upstream’ processing of large data resources with interactive ‘downstream’ analysis and comprehension, using Human Cell Atlas single-cell datasets as an example; and querying cloud-based consortium data for integration with a user’s own data sets.

Incoming expectations

Participants should be comfortable working with R. Some familiarity with Bioconductor is helpful but not required. No prior cloud-based experience is necessary. We will use the AnVIL cloud, which requires a modern web browser.

Learning objectives

The AnVIL cloud

  1. AnVIL workspaces

  2. Jupyter Runtimes

  3. The AnVIL package for cloud access

Cloud-based analysis

  1. Single cell exploratory analysis

  2. (Formal workflows for large-scale analysis)