library(MsDataHub)
fls <- c(X20171016_POOL_POS_1_105.134.mzML(),
         X20171016_POOL_POS_3_105.134.mzML())
#> see ?MsDataHub and browseVignettes('MsDataHub') for documentation
#> loading from cache
#> see ?MsDataHub and browseVignettes('MsDataHub') for documentation
#> loading from cache
fls
#> EH7809
#> "/Users/runner/Library/Caches/org.R-project.R/R/ExperimentHub/1a93bd4833e_7859"
#> EH7810
#> "/Users/runner/Library/Caches/org.R-project.R/R/ExperimentHub/1a932cf46183_7860"
How to read mass spectrometry data
2025-02-17
Bioconductor packages used in this document
Let’s first get some example raw mass spectrometry data from the MsDataHub package. We download two Sciex mzML files that contain profile-mode LC-MS data of pooled human serum samples and store their file names in the vector fls.
We can now use the Spectra() constructor function to create an object of class Spectra that contains the raw data. Note that the mzR package is needed to read the content of the mzML files.
library(Spectra)
sp <- Spectra(fls)
sp
#> MSn data (Spectra) with 1862 spectra in a MsBackendMzR backend:
#> msLevel rtime scanIndex
#> <integer> <numeric> <integer>
#> 1 1 0.280 1
#> 2 1 0.559 2
#> 3 1 0.838 3
#> 4 1 1.117 4
#> 5 1 1.396 5
#> ... ... ... ...
#> 1858 1 258.636 927
#> 1859 1 258.915 928
#> 1860 1 259.194 929
#> 1861 1 259.473 930
#> 1862 1 259.752 931
#> ... 34 more variables/columns.
#>
#> file(s):
#> 1a93bd4833e_7859
#> 1a932cf46183_7860
The object contains 1862 spectra in total, 931 from each of the two files.
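To check what was read, the object can be queried with accessor functions from the Spectra package. The following is a minimal sketch rather than part of the original analysis: it rebuilds sp so that the block is self-contained, and the variable names n_total, n_per_file, rt_range and pk are chosen here for illustration only.

```r
## Sketch only: recreate the Spectra object as in the text, then inspect it.
library(MsDataHub)
library(Spectra)

fls <- c(X20171016_POOL_POS_1_105.134.mzML(),
         X20171016_POOL_POS_3_105.134.mzML())
sp <- Spectra(fls)

## Total number of spectra across both files.
n_total <- length(sp)

## Number of spectra per source file. dataOrigin() reports, for each
## spectrum, the file it was read from.
n_per_file <- table(basename(dataOrigin(sp)))
n_per_file

## Retention time range covered by the spectra (in seconds).
rt_range <- range(rtime(sp))
rt_range

## Peak data of the first spectrum: a numeric matrix with the columns
## "mz" and "intensity".
pk <- peaksData(sp[1])[[1]]
head(pk)
```

peaksData() returns one two-column matrix per spectrum, so pk holds only the peaks of the first spectrum; subsetting sp before extracting peaks keeps the amount of data read from the mzML files small.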
Further reading
- The R for Mass Spectrometry book.
- The Large-scale data handling and processing with Spectra vignette from the Spectra package.
Session info
sessionInfo()
#> R Under development (unstable) (2025-02-15 r87725)
#> Platform: aarch64-apple-darwin20
#> Running under: macOS Sonoma 14.7.2
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> time zone: UTC
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] mzR_2.41.4 Rcpp_1.0.14 Spectra_1.17.5
#> [4] BiocParallel_1.41.0 S4Vectors_0.45.4 BiocGenerics_0.53.6
#> [7] generics_0.1.3 MsDataHub_1.7.0 BiocStyle_2.35.0
#>
#> loaded via a namespace (and not attached):
#> [1] KEGGREST_1.47.0 xfun_0.50 Biobase_2.67.0
#> [4] vctrs_0.6.5 tools_4.5.0 curl_6.2.0
#> [7] parallel_4.5.0 tibble_3.2.1 AnnotationDbi_1.69.0
#> [10] RSQLite_2.3.9 cluster_2.1.8 blob_1.2.4
#> [13] pkgconfig_2.0.3 dbplyr_2.5.0 lifecycle_1.0.4
#> [16] GenomeInfoDbData_1.2.13 compiler_4.5.0 Biostrings_2.75.3
#> [19] codetools_0.2-20 ncdf4_1.23 clue_0.3-66
#> [22] GenomeInfoDb_1.43.4 htmltools_0.5.8.1 yaml_2.3.10
#> [25] pillar_1.10.1 crayon_1.5.3 MASS_7.3-64
#> [28] cachem_1.1.0 MetaboCoreUtils_1.15.0 mime_0.12
#> [31] ExperimentHub_2.15.0 AnnotationHub_3.15.0 tidyselect_1.2.1
#> [34] digest_0.6.37 dplyr_1.1.4 purrr_1.0.4
#> [37] BiocVersion_3.21.1 fastmap_1.2.0 cli_3.6.4
#> [40] magrittr_2.0.3 withr_3.0.2 filelock_1.0.3
#> [43] UCSC.utils_1.3.1 rappdirs_0.3.3 bit64_4.6.0-1
#> [46] rmarkdown_2.29 XVector_0.47.2 httr_1.4.7
#> [49] bit_4.5.0.1 png_0.1-8 memoise_2.0.1
#> [52] evaluate_1.0.3 knitr_1.49 IRanges_2.41.3
#> [55] BiocFileCache_2.15.1 rlang_1.1.5 glue_1.8.0
#> [58] DBI_1.2.3 BiocManager_1.30.25 jsonlite_1.8.9
#> [61] R6_2.6.1 fs_1.6.5 ProtGenerics_1.39.2
#> [64] MsCoreUtils_1.19.0