MSFragger is an ultrafast database search tool for peptide identification in mass spectrometry-based proteomics. It has demonstrated excellent performance across a wide range of datasets and applications. MSFragger is suitable for standard shotgun proteomics analyses as well as large datasets (including timsTOF PASEF data), enzyme-unconstrained searches (e.g., peptidome), open database searches (i.e., with the precursor mass tolerance set to hundreds of Daltons) for identification of modified peptides, and glycopeptide identification (N-linked and O-linked).
MSFragger is implemented in the cross-platform Java programming language and can be used in three different ways:
- With the FragPipe user interface
- As a standalone Java executable
- Through Proteome Discoverer
MSFragger writes peptide-spectrum matches in either tabular or pepXML format, making it fully compatible with downstream data analysis pipelines such as the Trans-Proteomic Pipeline, Percolator, and Philosopher. See the complete documentation, including a list of Frequently Asked Questions. Example parameter files can be found here.
Supported file formats
The following spectral file formats can be searched directly with MSFragger. See the FragPipe homepage for compatibility with workflow components downstream of MSFragger.
- mzML/mzXML - data from any instrument in mzML/mzXML format can be used.
- Thermo RAW - Thermo raw files (.raw) can be read directly; conversion to mzML is not required. On Linux, Mono needs to be installed.
- Bruker timsTOF PASEF - MSFragger can read Bruker timsTOF PASEF (DDA) raw files (.d) directly, as well as MGF files converted by the Bruker DataAnalysis program. Please note: on Windows, timsTOF data requires the Visual C++ Redistributable for Visual Studio 2017. If you see an error saying the Bruker native library cannot be found, please try installing the Visual C++ Redistributable.
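As a rough illustration of the list above, the following Python sketch checks whether a file name has one of the listed extensions before it is handed to a search. The function name `classify_spectral_file` and the format labels are hypothetical and not part of MSFragger; the check looks only at the extension and does not inspect file contents.

```python
from pathlib import PurePath

# Extensions taken from the supported-format list above.
# The labels are illustrative only, not MSFragger identifiers.
SUPPORTED_FORMATS = {
    ".mzml": "mzML",
    ".mzxml": "mzXML",
    ".raw": "Thermo RAW",
    ".d": "Bruker timsTOF PASEF",
    ".mgf": "MGF (converted by Bruker DataAnalysis)",
}


def classify_spectral_file(path):
    """Return a format label for a listed extension, or None otherwise."""
    suffix = PurePath(path).suffix.lower()  # case-insensitive match
    return SUPPORTED_FORMATS.get(suffix)


if __name__ == "__main__":
    for name in ["run01.mzML", "sample.d", "notes.txt"]:
        print(name, "->", classify_spectral_file(name))
```

A check like this can catch unsupported inputs early, but it cannot tell whether the platform prerequisites mentioned above (Mono on Linux, the Visual C++ Redistributable on Windows) are installed.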
The entire MSFragger suite of tools (MSFragger-Core, MSFragger-LOS, MSFragger-Glyco, MSFragger-DIA, MSFragger-Labile), collectively known as “MSFragger”, is distributed as a single JAR file. It is freely available for academic research, non-commercial, or educational purposes under an academic license.
Other uses require a commercial license after the initial 60-day evaluation period. A commercial license can be obtained by contacting Drew Bennett (email@example.com) at the University of Michigan Office of Tech Transfer. For questions, please contact Prof. Alexey Nesvizhskii (firstname.lastname@example.org).
Whether you use FragPipe, Proteome Discoverer (PD, Thermo Scientific), or the command line, you will need to download the latest MSFragger JAR file. See instructions for downloading or upgrading MSFragger.
Check here for the full list of MSFragger versions and changes.
On Windows or Linux, the easiest way to run MSFragger is through FragPipe, which has a variety of built-in workflows for complete data analysis.
MSFragger and Philosopher (PeptideProphet) are also available as processing nodes in Proteome Discoverer (PD, Thermo Scientific). Currently, the MSFragger-PD node can be used in PD versions 2.2, 2.3 and 2.4.
See Launching MSFragger on the Wiki page.
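For the command-line route, a minimal Python sketch of assembling a launch command is shown below. It assumes the conventional Java pattern `java -Xmx<N>g -jar <jar> <parameter file> <spectral files>`; the argument order, the heap size, and the file names `MSFragger.jar`, `fragger.params`, and `run01.mzML` are assumptions made for illustration, so check the Launching MSFragger wiki page for the authoritative invocation. The helper `build_msfragger_command` is hypothetical and only builds the argument list; it does not run anything.

```python
import shlex


def build_msfragger_command(jar_path, params_path, spectra, heap_gb=8):
    """Build an argument list for launching the JAR (assumed CLI layout)."""
    if not spectra:
        raise ValueError("at least one spectral file is required")
    # Assumed layout: JVM heap option, -jar, parameter file, then spectra.
    return ["java", f"-Xmx{heap_gb}g", "-jar", jar_path, params_path, *spectra]


if __name__ == "__main__":
    # File names below are placeholders, not shipped with MSFragger.
    cmd = build_msfragger_command("MSFragger.jar", "fragger.params", ["run01.mzML"])
    print(shlex.join(cmd))
    # To actually launch: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids quoting problems when paths contain spaces.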
For technical documentation on MSFragger (hardware requirements, search parameters, etc.), see the MSFragger wiki page.
Questions and Technical Support
See our Frequently Asked Questions (FAQ) page. Please post all questions and bug reports regarding MSFragger itself on the MSFragger GitHub issue page or, if more appropriate, on the FragPipe or Philosopher page.
Requests for Collaboration
If you would like to propose a new collaboration that can take advantage of MSFragger and related tools, please contact us directly.
How to Cite
- Kong, A. T., Leprevost, F. V., Avtonomov, D. M., Mellacheruvu, D., & Nesvizhskii, A. I. (2017). MSFragger: ultrafast and comprehensive peptide identification in mass spectrometry–based proteomics. Nature Methods, 14(5), 513-520.
- Yu, F., Teo, G. C., Kong, A. T., Haynes, S. E., Avtonomov, D. M., Geiszler, D. J., & Nesvizhskii, A. I. (2020). Identification of modified peptides using localization-aware open search. Nature Communications, 11(1), 1-9.
- Polasky, D. A., Yu, F., Teo, G. C., & Nesvizhskii, A. I. (2020). Fast and comprehensive N- and O-glycoproteomics analysis with MSFragger-Glyco. Nature Methods, 17(11), 1125-1132.
- Yu, F., Haynes, S. E., Teo, G. C., Avtonomov, D. M., Polasky, D. A., & Nesvizhskii, A. I. (2020). Fast quantitative analysis of timsTOF PASEF data with MSFragger and IonQuant. Molecular & Cellular Proteomics, 19(9), 1575-1585.
For other tools developed by the Nesvizhskii lab, see our website www.nesvilab.org