Read a single htseq-counts result file. — readHTSeqFile • GenomicDataCommons

The HTSeq package is widely used to count reads that overlap genomic features (see http://www-huber.embl.de/HTSeq/doc/counting.html). The output of htseq-count is a two-column, tab-delimited table with feature identifiers in column 1 and read counts in column 2. This function reads the data from one such file and assigns column names.
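
As a rough illustration of the behaviour described above, the sketch below reads a small two-column, tab-delimited file and names its columns. It is not the package's implementation: `read_counts_sketch` is a hypothetical helper, it uses base R's `read.delim` rather than `read_tsv`, and the three rows written to the temporary file are copied from the example output on this page.

```r
# Hypothetical helper (an assumption for illustration, not part of
# GenomicDataCommons): read a two-column htseq-count style file with base R.
read_counts_sketch <- function(fname, samplename = "sample") {
  # fname must be a single path to an existing file
  stopifnot(is.character(fname), length(fname) == 1, file.exists(fname))
  # htseq-count output has no header row; columns are tab-separated
  dat <- read.delim(fname, header = FALSE, stringsAsFactors = FALSE)
  # exactly two columns are expected: feature identifier and count
  stopifnot(ncol(dat) == 2)
  colnames(dat) <- c("feature", samplename)
  dat
}

# Write a tiny stand-in file so the sketch is self-contained
tmp <- tempfile(fileext = ".counts")
writeLines(c("ENSG00000000003.13\t3039",
             "ENSG00000000005.5\t0",
             "ENSG00000000419.11\t1625"), tmp)

dat_sketch <- read_counts_sketch(tmp, samplename = "sampleA")
print(dat_sketch)
```

Naming the count column after the sample is the point of the `samplename` argument: the column keeps its identity once several files are combined.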

Usage

readHTSeqFile(fname, samplename = "sample", ...)

Arguments

fname: character(1), the path of the htseq-count file.
samplename: character(1), the name of the sample. This becomes the name of the second column of the resulting data frame, which makes it easier to merge results from several samples.
...: additional arguments passed to read_tsv.

Value

A two-column data frame (a tibble). The first column is named feature and the second is named after samplename.

Examples

fname = system.file(package='GenomicDataCommons',
                    'extdata/example.htseq.counts.gz')
dat = readHTSeqFile(fname)
#> Rows: 50 Columns: 2
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: "\t"
#> chr (1): X1
#> dbl (1): X2
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(dat)
#> # A tibble: 6 × 2
#>   feature            sample
#>   <chr>               <dbl>
#> 1 ENSG00000000003.13   3039
#> 2 ENSG00000000005.5       0
#> 3 ENSG00000000419.11   1625
#> 4 ENSG00000000457.12    960
#> 5 ENSG00000000460.15    154
#> 6 ENSG00000000938.11    610
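
The `samplename` argument is described as making merging easier. A minimal sketch of that use, assuming GenomicDataCommons is installed: the bundled example file is read twice under two made-up sample names ("sampleA" and "sampleB", standing in for two different files), and the results are joined on the `feature` column with base R's `merge`.

```r
library(GenomicDataCommons)

fname <- system.file(package = "GenomicDataCommons",
                     "extdata/example.htseq.counts.gz")

# The same file stands in for two samples here; in practice each call
# would point at a different htseq-count file.
a <- readHTSeqFile(fname, samplename = "sampleA")
b <- readHTSeqFile(fname, samplename = "sampleB")

# Join on the shared feature column; each sample keeps its own count column
merged <- merge(a, b, by = "feature")
colnames(merged)
```

Because each count column already carries its sample name, no renaming is needed after the join.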