
Download one or more files from GDC. Files are downloaded using the UUID and renamed to the file name on the remote system. By default, neither the UUID nor the remote file name may already exist at the download destination.

Usage

gdcdata(
  uuids,
  use_cached = TRUE,
  progress = interactive(),
  token = NULL,
  access_method = "api",
  transfer_args = character(),
  ...
)

Arguments

uuids

character() of GDC file UUIDs.

use_cached

logical(1) default TRUE indicating that, if found in the cache, the file will not be downloaded again. If FALSE, all supplied uuids will be re-downloaded.

progress

logical(1), default TRUE in interactive sessions and FALSE otherwise, indicating whether a progress bar should be produced for each file download.

token

(optional) character(1) security token allowing access to restricted data. See https://gdc-docs.nci.nih.gov/API/Users_Guide/Authentication_and_Authorization/.

access_method

character(1), either 'api' or 'client'. See details.

transfer_args

character(1), additional arguments to pass to the gdc-client command line. See gdc_client and transfer_help for details.

...

further arguments passed to files, particularly useful when requesting legacy=TRUE
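As a hedged sketch of how the token argument might be supplied: the wrapper below, `download_restricted()`, is a hypothetical helper and not part of the package. It assumes the package's `gdc_token()` function is used to locate a token and that `gdc_token()` signals an error when none is found.

```r
library(GenomicDataCommons)

# Hypothetical helper (not part of the package): download controlled-access
# files when a token is available, otherwise return NULL with a message.
download_restricted <- function(uuids) {
    # gdc_token() is assumed to raise an error when no token can be located,
    # so the call is guarded.
    token <- tryCatch(gdc_token(), error = function(e) NULL)
    if (is.null(token)) {
        message("No GDC token found; skipping the download.")
        return(NULL)
    }
    gdcdata(uuids, token = token)
}
```

A call would look like `download_restricted(uuids)`, where `uuids` holds UUIDs of files the token is authorized to access.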

Value

a named vector with file UUIDs as the names and local file paths as the values
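The shape of the return value can be illustrated without touching the network. The sketch below uses a stand-in vector rather than a real gdcdata() result; the UUID and file name are invented for illustration.

```r
# Stand-in for a gdcdata() result: names are file UUIDs, values are local paths.
# Both the UUID and the file name here are made up.
paths <- c(
    "00000000-0000-0000-0000-000000000001" =
        file.path(tempdir(), "example.maf.gz")
)

# Reshape into a data frame with one row per file.
downloaded <- data.frame(
    file_id = names(paths),
    path = unname(paths),
    exists = unname(file.exists(paths)),
    stringsAsFactors = FALSE
)

# Look up the local path for a single UUID by name.
one_path <- paths[["00000000-0000-0000-0000-000000000001"]]
```

Keeping the UUID next to each path makes it straightforward to join the downloaded files back to metadata retrieved with files().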

Details

This function is appropriate for one or several files; for large downloads, use manifest to create a manifest for use with the GDC Data Transfer Tool.

When access_method is "api", the GDC "data" endpoint is used as the transfer mechanism. The alternative access_method, "client", uses the gdc-client transfer tool, which must be downloaded separately and be available on the system. See gdc_client for details on specifying the location of the gdc-client executable.
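A minimal sketch of choosing between the two access methods follows. It assumes that gdc_client() signals an error when the executable cannot be located and that the `gdc_client` option is one of the places it looks; the path in the comment is a placeholder. Running it requires network access to the GDC.

```r
library(GenomicDataCommons)

# Placeholder location of the gdc-client executable; adjust for your system.
# options(gdc_client = "/path/to/gdc-client")

# Fall back to the API transfer when gdc-client cannot be found
# (gdc_client() is assumed to raise an error in that case).
have_client <- !inherits(try(gdc_client(), silent = TRUE), "try-error")
method <- if (have_client) "client" else "api"

# Pick one small open-access file, as in the examples on this page.
uuids <- files() %>%
    filter(~ access == 'open' & file_size < 100000) %>%
    results(size = 1) %>%
    ids()

paths <- gdcdata(uuids, access_method = method)
```

Either method is expected to return the same kind of named vector; only the transfer mechanism differs.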

See also

manifest for downloading large data.

Examples

# get some example file uuids
uuids <- files() %>%
    filter(~ access == 'open' & file_size < 100000) %>%
    results(size = 3) %>%
    ids()

# and get the data, placing it into the gdc_cache() directory
gdcdata(uuids, use_cached=TRUE)
#>                                                                                                            b2b1f936-c929-4750-a72c-67199a17ff00 
#>      "~/.cache/GenomicDataCommons/b2b1f936-c929-4750-a72c-67199a17ff00/b0e23808-a9ae-4c80-95cc-7bd75733bfa4.wxs.aliquot_ensemble_masked.maf.gz" 
#>                                                                                                            944ff9e5-be73-475d-9990-209c4a81e79b 
#>      "~/.cache/GenomicDataCommons/944ff9e5-be73-475d-9990-209c4a81e79b/24186ca6-dd8b-4632-850d-f4e7e9f73399.wxs.aliquot_ensemble_masked.maf.gz" 
#>                                                                                                            223e5356-ea2b-428f-ab78-e8abd27529fa 
#> "~/.cache/GenomicDataCommons/223e5356-ea2b-428f-ab78-e8abd27529fa/bce47c3d-a414-49c9-bd27-9e941144860c.wgs.ASCAT.copy_number_variation.seg.txt" 

# legacy data
exon <- files(legacy = TRUE) %>%
    filter( ~ cases.project.project_id == "TCGA-COAD" &
        data_category == "Gene expression" &
        data_type == "Exon quantification") %>%
    results(size = 1) %>% ids()

gdcdata(exon, legacy = TRUE)
#>                                                                                                               13f62ad4-b55d-4091-9cea-a4dce58f9943 
#> "~/.cache/GenomicDataCommons/13f62ad4-b55d-4091-9cea-a4dce58f9943/unc.edu.85eb9996-655c-44f8-abbf-fbc332cf75cf.1756113.bt.exon_quantification.txt"