The GPM open source proteomics project

X! Series API Documentation Project

X! Tandem, X! P3 and X! Hunter are open source proteomics programs that attempt to find the best sequence model for a given MS/MS spectrum of a peptide. These X! Series search engines share a common set of input parameters, which is larger than the set used by older styles of search software. The goal of this project is to document all of the parameters currently supported by the X! Series. Currently, more than 40 of the API parameters have documentation.

In addition to the text documentation, some Unified Modelling Language (UML) diagrams illustrate the more technical features of how the X! Series works.

The entries below give a short description of each API parameter. X! Series API parameters are represented as ASCII entries in an input file that uses a simplified XML syntax. A general entry is as follows:

	<note type="input" label="GROUP, NAME">VALUE</note>

Each parameter is a BIOML <note> tag with the type attribute "input". The label attribute contains the name of the parameter, broken into two parts by a comma. The first part names a general group of parameters (GROUP) and the second part names the specific parameter (NAME). The contents of the note tag (VALUE) are the value that the parameter will take. The names and values of parameters are case sensitive.
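
As an illustration of the entry format described above, the following Python sketch reads one entry and separates the label into its GROUP and NAME parts. It is not part of the X! Series code. The function name parse_note and the file name test_spectra.mgf are hypothetical, and trimming whitespace around each part is an assumption rather than documented behaviour.

```python
import xml.etree.ElementTree as ET

def parse_note(xml_text):
    """Return (group, name, value) for one input <note> element."""
    note = ET.fromstring(xml_text)
    if note.tag != "note" or note.get("type") != "input":
        raise ValueError("not an input note")
    # Split the label at the first comma into GROUP and NAME.
    group, sep, name = note.get("label", "").partition(",")
    if not sep:
        raise ValueError("label must have the form 'GROUP, NAME'")
    # Assumption: surrounding whitespace is trimmed; case is left untouched.
    return group.strip(), name.strip(), (note.text or "").strip()

# Hypothetical entry; the spectrum file name is made up for the example.
entry = '<note type="input" label="spectrum, path">test_spectra.mgf</note>'
print(parse_note(entry))  # ('spectrum', 'path', 'test_spectra.mgf')
```

Because names and values are case sensitive, the sketch never changes the case of any part of the entry.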

To see an example of a fully filled out X! Series input API file, click here.

Use of all documentation for X! Tandem, X! P3 and X! Hunter is governed by the Artistic License.

GROUP: list path,

  1. default parameters - path to default parameter file.
  2. taxonomy information - path to sequence taxonomy file.
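
The two "list path" entries can be written out programmatically. The Python sketch below builds a small parameter document containing them. The root element name bioml, the helper name build_input and both file names are assumptions made for the example; check them against a real X! Series input file before relying on them.

```python
import xml.etree.ElementTree as ET

def build_input(params):
    """Build a parameter document from {(group, name): value} pairs."""
    root = ET.Element("bioml")  # assumed root element name
    for (group, name), value in params.items():
        # The label joins GROUP and NAME with a comma, as in the general entry.
        note = ET.SubElement(
            root, "note", {"type": "input", "label": f"{group}, {name}"}
        )
        note.text = value
    return root

params = {
    ("list path", "default parameters"): "default_input.xml",  # hypothetical path
    ("list path", "taxonomy information"): "taxonomy.xml",     # hypothetical path
}
doc = build_input(params)
print(ET.tostring(doc, encoding="unicode"))
```

Keeping the group and name as separate keys until the label is written makes it harder to introduce a stray change of case or a missing comma.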

GROUP: output,

  1. histogram column width - width of columns in output file.
  2. histograms - display histograms in output file.
  3. log path - sets logging file location.
  4. maximum valid expectation value - highest value for recorded peptides.
  5. message - sets console output processing message.
  6. one sequence copy - sets the mode for writing protein sequences.
  7. parameters - controls output of input parameters.
  8. path - output file path.
  9. path hashing - hash file name with date and time of record.
  10. performance - controls output of performance parameters.
  11. proteins - controls output of protein sequences.
  12. results - controls the types of results recorded.
  13. sequence path - output the refinement protein sequence list.
  14. sort results by - controls how spectrum results are sorted.
  15. sequences - controls output of sequence information.
  16. spectra - controls output of spectrum information.
  17. xsl path - sets path for the XSLT style sheet used to view the output XML.

GROUP: protein,

  1. cleavage C-terminal mass change - moiety added to peptide C-terminus by cleavage.
  2. cleavage N-terminal mass change - moiety added to peptide N-terminus by cleavage.
  3. cleavage semi - use semi-enzymatic cleavage rules.
  4. cleavage site - specification of specific protein cleavage sites.
  5. C-terminal residue modification mass - moiety added to the C-terminus of protein.
  6. N-terminal residue modification mass - moiety added to the N-terminus of protein.
  7. modified residue mass file - modify the default residue masses for any or all amino acids.
  8. quick acetyl - protein N-terminal modification detection.
  9. quick pyrolidone - peptide N-terminus cyclization detection.
  10. stP bias - interpretation of peptide phosphorylation models.
  11. saps - check for known snAPs.
  12. taxon - specification of the taxonomy keyword.
  13. use annotations - use the annotation file specified in the taxonomy file.

GROUP: refine,

  1. cleavage semi - use semi-enzymatic cleavage rules.
  2. maximum valid expectation value - highest value allowed as a refinement result.
  3. modification mass - alter the list of complete modifications for refinement.
  4. point mutations - test for sAPs.
  5. potential modification mass - potential modifications to test.
  6. potential modification motif - potential modification motifs to test.
  7. potential N-terminus modifications - potential modifications to the N-terminus of a peptide.
  8. refine - controls the use of the refinement modules.
  9. saps - test for known annotated snAPs.
  10. sequence path - input protein sequence list prior to refinement.
  11. spectrum synthesis - controls the use of spectrum synthesis scoring.
  12. tic percent - alter the frequency of output tics during refinement.
  13. unanticipated cleavage - controls the use of cleavage at every residue.
  14. use annotations - use the annotation file specified in the taxonomy file.
  15. use potential modifications for full refinement - controls the use of refinement modifications in all refinement modules.

GROUP: residue,

  1. modification mass - specification of modifications of residues.
  2. potential modification mass - specification of potential modifications of residues.
  3. potential modification motif - specification of potential modification motifs.

GROUP: scoring,

  1. a ions - allows the use of a-ions in scoring.
  2. b ions - allows the use of b-ions in scoring.
  3. c ions - allows the use of c-ions in scoring.
  4. cyclic permutation - compensate for very small sequence list files.
  5. include reverse - automatically perform "reversed database" search.
  6. maximum missed cleavage sites - sets the number of missed cleavage sites.
  7. minimum ion count - sets the minimum number of ions required for a peptide to be scored.
  8. x ions - allows the use of x-ions in scoring.
  9. y ions - allows the use of y-ions in scoring.
  10. z ions - allows the use of z-ions in scoring.
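
Since parameter names are case sensitive, a label can be checked against the list above before a search is run. The Python sketch below does this for the scoring group only. The helper is_known_scoring_label is hypothetical, and tolerating whitespace around the comma is an assumption; what the search engines do with an unrecognised label is not covered here.

```python
# Names copied from the scoring list above.
SCORING_NAMES = {
    "a ions", "b ions", "c ions", "cyclic permutation", "include reverse",
    "maximum missed cleavage sites", "minimum ion count",
    "x ions", "y ions", "z ions",
}

def is_known_scoring_label(label):
    """True if label is 'scoring, NAME' with NAME in the list (exact case)."""
    group, sep, name = label.partition(",")
    # Assumption: whitespace around the comma is not significant.
    return bool(sep) and group.strip() == "scoring" and name.strip() in SCORING_NAMES

print(is_known_scoring_label("scoring, y ions"))  # True
print(is_known_scoring_label("scoring, Y ions"))  # False: NAME differs in case
print(is_known_scoring_label("Scoring, y ions"))  # False: GROUP differs in case
```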

GROUP: spectrum,

  1. contrast angle - sets contrast angle for removing duplicate spectra.
  2. dynamic range - sets the dynamic range for scoring spectra.
  3. fragment mass error - fragment ion mass tolerance (chemical average mass).
  4. fragment mass error units - units for fragment ion mass tolerance (chemical average mass).
  5. fragment mass type - use chemical average or monoisotopic mass for fragment ions.
  6. fragment monoisotopic mass error - fragment ion mass tolerance (monoisotopic mass).
  7. fragment monoisotopic mass error units - units for fragment ion mass tolerance (monoisotopic mass).
  8. minimum fragment mz - sets minimum fragment m/z to be considered.
  9. minimum peaks - sets the minimum number of peaks required for a spectrum to be considered.
  10. minimum parent m+h - sets the minimum parent M+H required for a spectrum to be considered.
  11. neutral loss mass - sets the centre of the window for ignoring neutral molecule losses.
  12. neutral loss window - sets the width of the window for ignoring neutral molecule losses.
  13. parent monoisotopic mass error minus - parent ion M+H mass tolerance lower window.
  14. parent monoisotopic mass error plus - parent ion M+H mass tolerance upper window.
  15. parent monoisotopic mass error units - parent ion M+H mass tolerance window units.
  16. parent monoisotopic mass isotope error - anticipate carbon isotope parent ion assignment errors.
  17. path - path for input spectrum file.
  18. path type - type of input spectrum file.
  19. sequence batch size - alter how protein sequences are retrieved from a FASTA file.
  20. skyline path - insert a path name into the output for use by Skyline.
  21. threads - worker threads to be used for calculation.
  22. total peaks - maximum number of peaks to be used from a spectrum.
  23. use neutral loss window - controls the use of the neutral loss window.
  24. use noise suppression - controls the use of noise suppression routines.
  25. use contrast angle - controls the use of contrast angle duplicate spectrum deletion.

Is there something you would like to see included, or documentation that you would like to contribute? Please let us know at contact@thegpm.org.


