High-Throughput Mass Spectral Library Searching of Small Molecules in R with NIST MSPepSearch

Bibliographic details
Main author: Andrey Samokhin (20282728) (author)
Other authors: Mikhail Khrisanfov (22683809) (author)
Published: 2025
Description
Summary: High-level programming languages such as Python and R are widely used in mass spectrometry data processing, where library searching is a standard step. Despite the availability of numerous library search algorithms, those developed by NIST and implemented in MS Search remain predominant, partly because commercial databases (e.g., NIST, Wiley) are distributed in proprietary formats inaccessible to custom code. MSPepSearch, another NIST tool, provides access to the same algorithms with greater flexibility for automation. However, using it requires calling a command-line interface with multiple flags and parsing output text files to retrieve results, which can be cumbersome. To address this, we developed mspepsearchr, an R package that streamlines the integration of library searches against NIST-format mass spectral databases into complex, multistep workflows. MSPepSearch is a single-threaded tool; therefore, parallelization was achieved externally by running multiple instances from within R. We describe the package, evaluate its performance, and illustrate its utility through the recognition of steroid-like compounds in untargeted gas chromatography-mass spectrometry analysis of biological samples.
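
The external parallelization mentioned in the summary (several instances of a single-threaded command-line tool, each working on its own share of the queries) can be sketched in general terms. The example below is only an illustration in Python and is not the mspepsearchr implementation, which does this from within R. It does not call MSPepSearch or use any of its flags: the "tool" is a stand-in child process that prints one tab-separated line per query, and the names `STAND_IN_TOOL`, `split_into_chunks`, `run_instance`, and `parallel_search` are hypothetical.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a single-threaded command-line search tool (NOT MSPepSearch):
# it reads one query name per line on stdin and prints "name<TAB>length".
STAND_IN_TOOL = [
    sys.executable, "-c",
    "import sys\n"
    "for line in sys.stdin:\n"
    "    name = line.strip()\n"
    "    if name:\n"
    "        print(name + '\\t' + str(len(name)))\n",
]


def split_into_chunks(items, n_chunks):
    """Split items into at most n_chunks contiguous, near-equal parts."""
    n_chunks = max(1, min(n_chunks, len(items)))
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks receive one additional item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def run_instance(chunk):
    """Run one tool instance on a chunk and parse its text output."""
    proc = subprocess.run(
        STAND_IN_TOOL,
        input="\n".join(chunk),
        capture_output=True,
        text=True,
        check=True,  # raise if the child process exits with an error
    )
    results = []
    for line in proc.stdout.splitlines():
        name, score = line.split("\t")
        results.append((name, int(score)))
    return results


def parallel_search(queries, n_instances=4):
    """Run several tool instances concurrently and merge their results."""
    chunks = split_into_chunks(queries, n_instances)
    # Threads are enough here: the actual work happens in child processes.
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        per_chunk = list(pool.map(run_instance, chunks))  # keeps chunk order
    return [hit for chunk_hits in per_chunk for hit in chunk_hits]
```

Calling `parallel_search(["query_1", "query_2", "query_3"], n_instances=2)` starts two child processes and returns the parsed results in the original query order. In a real workflow the stand-in command would be replaced by the actual tool invocation, and the parsing step would follow that tool's documented output format.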