Performing a database search with FragPipe
FragPipe can be downloaded here.
Follow the instructions on the same Releases page to launch the program.
When FragPipe launches, the first tab in the window (‘Config’) will be used to configure the program.
- Connect FragPipe to an MSFragger .jar program file. If you already have such a file downloaded, use the ‘Browse’ button to select it or ‘Update’ to upgrade to the latest version. If you have not downloaded MSFragger before, use the ‘Download’ button.
- Connect FragPipe to a Philosopher program file. If you already have it downloaded, select ‘Browse’, otherwise select ‘Download’.
- Python is needed to perform database splitting (necessary when searching very large databases) and spectral library generation. We recommend you install version 3.7 or later here. Also install the following Python packages: numpy, pandas, Cython, and msproteomicstools (the last is only needed for spectral library generation).
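To confirm that the Python prerequisites listed above are in place before launching a search, a small check like the following can help. This is a minimal sketch, not part of FragPipe: the helper names `missing_packages` and `python_ok` are hypothetical, and it assumes each package's import name matches the package name given above.

```python
import importlib.util
import sys

# Packages named in the instructions above. msproteomicstools is only
# needed for spectral library generation, so it is listed separately.
REQUIRED = ["numpy", "pandas", "Cython"]
OPTIONAL = ["msproteomicstools"]


def missing_packages(names):
    """Return the names from `names` that Python cannot locate for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_ok(version_info=sys.version_info, minimum=(3, 7)):
    """Return True if the interpreter is at least the recommended version."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    print("Python 3.7 or later:", python_ok())
    print("Missing required packages:", missing_packages(REQUIRED))
    print("Missing optional packages:", missing_packages(OPTIONAL))
```

Run it with the same Python installation that FragPipe is configured to use, since packages installed for one interpreter are not visible to another.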
Specify input files:
In the next tab, ‘Select LC/MS Files’:
- Drag & drop mzML files into the window or select ‘Add files’ or ‘Add Folder Recursively’ (to add all files in a folder, including those in subfolders).
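To preview which files a recursive folder add would pick up, the folder can be walked outside FragPipe first. The sketch below is an illustration only: `find_mzml_files` is a hypothetical helper, and matching the `.mzML` extension case-insensitively is an assumption, not documented FragPipe behavior.

```python
from pathlib import Path


def find_mzml_files(folder):
    """Return every .mzML file under `folder`, including subfolders, sorted by path."""
    return sorted(
        path
        for path in Path(folder).rglob("*")
        if path.is_file() and path.suffix.lower() == ".mzml"
    )
```

Calling `find_mzml_files` on your data folder gives a list of paths that you can compare against what appears in the ‘Select LC/MS Files’ tab.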
Specify a protein sequence database:
In the ‘Sequence DB’ tab:
- ‘Browse’ for the FASTA sequence database file that you want to use in the search, or select ‘Download’ to fetch one from UniProt. The FASTA file must contain decoy sequences. For help adding decoys and database formatting, see this page.
- Make sure the decoy prefix tag in the sequence database file is correct (necessary for target/decoy validation of identifications). If you select ‘Try Auto-Detect’, about 50% of the entries should contain the decoy tag, because a correctly prepared database holds roughly one decoy for every target sequence.
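The 50% expectation can also be checked directly on the FASTA file. The following is a minimal sketch under stated assumptions: `decoy_fraction` is a hypothetical helper, and the default prefix `rev_` is only an example, so substitute whatever tag your database actually uses.

```python
def decoy_fraction(fasta_lines, decoy_prefix="rev_"):
    """Return the fraction of FASTA header lines whose ID starts with `decoy_prefix`.

    `fasta_lines` is any iterable of lines, for example an open file handle.
    The default prefix is an assumption made for illustration.
    """
    # Header lines start with '>'; drop that character before testing the prefix.
    headers = [line[1:] for line in fasta_lines if line.startswith(">")]
    if not headers:
        return 0.0
    decoys = sum(1 for header in headers if header.startswith(decoy_prefix))
    return decoys / len(headers)
```

For a file on disk, pass an open handle, for example `decoy_fraction(open("database.fasta"))`, where `database.fasta` is a placeholder name. A result far from 0.5 suggests that the prefix is wrong or that decoys have not been added.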
Set MSFragger search parameters:
Select the type of database search you want to perform.
Closed search: To perform a closed search (normal precursor mass tolerance), select ‘Closed Search’. You will be prompted to also update the downstream parameters for closed searching; select ‘Yes’.
Open search: To perform an open search (large precursor mass tolerance, used for finding unspecified post-translational modifications), select ‘Open Search’. You will be prompted to also update the downstream parameters for open searching; select ‘Yes’.
Non-specific search: To perform a closed search where peptides are not required to have any enzymatic terminus, select ‘Non-specific Search’. You will be prompted to also update the downstream parameters for non-specific searching; select ‘Yes’.
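The difference between a closed and an open search comes down to how far the observed precursor mass may deviate from the candidate peptide mass. The sketch below illustrates the idea only and is not how MSFragger is implemented: the window values, the 1500 Da candidate mass, and the `matches` helper are all assumptions chosen for the example. The phosphorylation mass shift (about 79.966 Da) stands in for an unspecified modification.

```python
def matches(observed_mass, candidate_mass, lower_da, upper_da):
    """Return True if the mass difference lies inside the tolerance window (in Da)."""
    delta = observed_mass - candidate_mass
    return lower_da <= delta <= upper_da


# Illustrative windows only; the real values are set in the MSFragger tab.
CLOSED_WINDOW = (-0.02, 0.02)    # narrow: peptide must match almost exactly
OPEN_WINDOW = (-150.0, 500.0)    # wide: leaves room for unspecified modifications

PHOSPHO_SHIFT = 79.96633         # monoisotopic mass added by phosphorylation
candidate = 1500.0               # hypothetical unmodified peptide mass
observed = candidate + PHOSPHO_SHIFT

print("closed search matches:", matches(observed, candidate, *CLOSED_WINDOW))
print("open search matches:", matches(observed, candidate, *OPEN_WINDOW))
```

With these assumed windows, the modified peptide falls outside the closed window but inside the open one, which is why open searching can surface modifications that were not specified in advance.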
- Fill in the amount of memory (in GB) that FragPipe can use. We recommend at least 8-16 GB, but complex closed searches and open searches will require more.
- Specify the search parameters you want to use. For more information on these parameters, see the MSFragger wiki page.
Set downstream processing parameters:
- Select ‘Run PeptideProphet’ to validate your search results. (More information about PeptideProphet can be found here.)
- If you previously updated the downstream parameters when setting MSFragger search parameters, you can skip to the next section. You can also re-load default downstream processing parameters by selecting the appropriate ‘Load defaults’ button.
- Select ‘Run ProteinProphet’ to validate your protein identifications. (More information about ProteinProphet can be found here.)
- If you are performing an open search, select Crystal-C to further improve filtering and interpretability of your results.
Specify filtering criteria and reports:
- Select ‘Create report’ to output tab-delimited tables of the search results.
- Select ‘Label-free Quant’ to perform label-free quantification, if needed.
- Select ‘Generate Spectral Library from search results’ to generate a spectral library, if needed.
- Browse for the folder where you would like the search results to be written.
- Press ‘RUN’ to begin the analysis!
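Once the run finishes, the tab-delimited report tables in the output folder can be loaded with standard tools. The sketch below uses only the Python standard library. `read_report` is a hypothetical helper, and no particular file or column names are assumed, so check the header row of your own reports.

```python
import csv


def read_report(path):
    """Read a tab-delimited report into a list of dicts keyed by column header."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

Each row comes back as a dictionary of strings, so numeric columns need an explicit conversion (for example with `float`) before filtering or sorting.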