The HTSeq package is widely used to count reads relative to regions (see http://www-huber.embl.de/HTSeq/doc/counting.html). The output of htseq-count is a simple two-column table with features in column 1 and counts in column 2. This function reads the data from one such file and assigns column names.
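
As a rough illustration of what "read the file and assign column names" amounts to, here is a minimal sketch in base R. It is not the package's implementation: the `...` argument is documented as being passed to read_tsv, whereas this sketch uses `read.delim`. The helper name `read_counts_sketch`, the sample name `tumor1`, and the temporary file are hypothetical.

```r
# Hypothetical stand-in for an htseq-count output file:
# two tab-separated columns, no header row.
tmp <- tempfile(fileext = ".counts")
writeLines(c("ENSG00000000003.13\t3039",
             "ENSG00000000005.5\t0",
             "ENSG00000000419.11\t1625"), tmp)

# Base R sketch of the same idea (an assumption, not the package's code):
# read the headerless table, then name the columns.
read_counts_sketch <- function(fname, samplename = "sample") {
  dat <- read.delim(fname, header = FALSE, stringsAsFactors = FALSE)
  colnames(dat) <- c("feature", samplename)
  dat
}

dat <- read_counts_sketch(tmp, samplename = "tumor1")
```

Naming the count column after the sample means several such tables can later be joined on `feature` without column-name clashes.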

Usage

readHTSeqFile(fname, samplename = "sample", ...)

Arguments

fname

character(1), the path of the htseq-count file.

samplename

character(1), the name of the sample. This becomes the name of the second column of the resulting data frame, which makes merging easier if necessary.

...

additional arguments passed to read_tsv

Value

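a two-column data frame: the first column, feature, holds the feature identifiers, and the second column, named after samplename, holds the counts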

Examples

fname = system.file(package='GenomicDataCommons',
                    'extdata/example.htseq.counts.gz')
dat = readHTSeqFile(fname)
#> Rows: 50 Columns: 2
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: "\t"
#> chr (1): X1
#> dbl (1): X2
#> 
#>  Use `spec()` to retrieve the full column specification for this data.
#>  Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(dat)
#> # A tibble: 6 × 2
#>   feature            sample
#>   <chr>               <dbl>
#> 1 ENSG00000000003.13   3039
#> 2 ENSG00000000005.5       0
#> 3 ENSG00000000419.11   1625
#> 4 ENSG00000000457.12    960
#> 5 ENSG00000000460.15    154
#> 6 ENSG00000000938.11    610
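
The samplename argument is described as making merging easier. The sketch below shows one way that could look, using base R `merge` on two small tables shaped like the return value (a `feature` column plus a count column named after the sample). The sample names `sampleA` and `sampleB` and the `sampleB` counts are hypothetical and purely illustrative; in practice each table would come from a separate call to readHTSeqFile with its own samplename.

```r
# Hypothetical per-sample tables in the shape readHTSeqFile() returns:
# a feature column plus a count column named after the sample.
s1 <- data.frame(feature = c("ENSG00000000003.13", "ENSG00000000005.5"),
                 sampleA = c(3039, 0))
s2 <- data.frame(feature = c("ENSG00000000003.13", "ENSG00000000005.5"),
                 sampleB = c(2810, 4))  # illustrative values, not real data

# Join on the shared feature column; each sample keeps its own count column.
counts <- merge(s1, s2, by = "feature")
```

Because each count column already carries its sample name, the merged table needs no renaming afterwards.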