X! Series search engines   The Global Proteome Machine Organization
  www.thegpm.org

  TANDEM project


X! TANDEM Spectrum Modeler

X! Tandem is open source software that can match tandem mass spectra with peptide sequences, in a process that has come to be known as protein identification.

This software has a very simple yet sophisticated application programming interface (API): it takes an XML file of instructions on its command line and writes the results to an XML file whose path is specified in the input XML file. The output format is described here (PDF). This format is used for all of the X! Series search engines, as well as the GPM and GPMDB.
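As a rough illustration of this interface, the sketch below builds a small instruction file and the matching command line. The label strings, the file names and the executable name `tandem.exe` are assumptions modeled on the sample input file distributed with X! Tandem; they are not taken from this page and should be checked against the release in use. The helper functions are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_input_xml(spectrum_path, output_path, taxon,
                    defaults_path="default_input.xml",
                    taxonomy_path="taxonomy.xml"):
    """Return a <bioml> instruction document as a string.

    The label strings are assumptions modeled on the sample input file
    shipped with X! Tandem; verify them against your release.
    """
    root = ET.Element("bioml")
    params = [
        ("list path, default parameters", defaults_path),
        ("list path, taxonomy information", taxonomy_path),
        ("protein, taxon", taxon),
        ("spectrum, path", spectrum_path),
        # The results file is named inside the instruction file itself.
        ("output, path", output_path),
    ]
    for label, value in params:
        note = ET.SubElement(root, "note", {"type": "input", "label": label})
        note.text = value
    return ET.tostring(root, encoding="unicode")


def build_command(input_xml_path, executable="tandem.exe"):
    """The instruction file is the only command-line argument."""
    return [executable, input_xml_path]


xml_text = build_input_xml("spectra.mgf", "output.xml", "yeast")
command = build_command("input.xml")
print(xml_text)
print(command)
```

After saving `xml_text` to `input.xml`, the command list could be passed to `subprocess.run` to start a search. The results would then appear at the path given in the "output, path" note.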

Unlike some earlier-generation search engines, all of the X! Series search engines calculate statistical confidence (expectation values) for all of the individual spectrum-to-sequence assignments. They also reassemble all of the peptide assignments in a data set onto the known protein sequences and assign the statistical confidence that this assembly and alignment is non-random. The formula for this calculation can be found here. Therefore, separate assembly and statistical analysis software, e.g. PeptideProphet and ProteinProphet, does not need to be used.

Latest release: 2008.12.01
This is a maintenance release of X! Tandem TORNADO. It includes a change to the threading mechanism that should improve overall performance when analyzing large datasets on multiprocessor/multicore computers, as well as preliminary support for mzML data files.
System level changes
  1. A preliminary implementation of the mzML file type, compatible with files generated by ReAdW and ProteoWizard, has been added. No changes have been made to the existing data file format implementations.
  2. The way that spectra are divided up between executing threads has been altered to better balance processor use for large LC/MS/MS datasets. In previous versions, spectra were divided up into equally-sized contiguous blocks and distributed to the available threads. For example, if there were 6 spectra and 2 threads:
    1. thread 1 = spectra #1, #2, #3; and
    2. thread 2 = spectra #4, #5, #6.
    This method works, but it can run into load balancing problems for large datasets where the type of spectra in the first part of the dataset differs systematically from that in the last part. This problem often occurs in large LC/MS/MS datasets, in which spectra with larger parent ion masses tend to fall in the latter half of the data. These larger peptides take longer to solve: in the example with 2 threads, the 1st thread would finish before the 2nd thread, leaving one processor idle for some period. To address this problem, the new threading system assigns spectra to threads in an alternating pattern:
    1. thread 1 = spectra #1, #3, #5; and
    2. thread 2 = spectra #2, #4, #6.
    This should have the effect of better balancing the complexity of the calculation between all threads.
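
The two assignment schemes described above can be sketched as follows. This is a minimal illustration in Python, not the X! Tandem source code: the function names are hypothetical, and the per-spectrum costs are made-up numbers that grow through the run to stand in for the larger parent ion masses late in an LC/MS/MS dataset.

```python
def contiguous_blocks(spectra, n_threads):
    """Earlier scheme: equally sized contiguous blocks, one per thread."""
    block = -(-len(spectra) // n_threads)  # ceiling division
    return [spectra[i * block:(i + 1) * block] for i in range(n_threads)]


def alternating(spectra, n_threads):
    """New scheme: thread t takes spectra t, t + n_threads, t + 2 * n_threads, ..."""
    return [spectra[t::n_threads] for t in range(n_threads)]


def load(parts, cost):
    """Total (hypothetical) cost of the spectra assigned to each thread."""
    return [sum(cost[s] for s in part) for part in parts]


spectra = [1, 2, 3, 4, 5, 6]
print(contiguous_blocks(spectra, 2))  # [[1, 2, 3], [4, 5, 6]]
print(alternating(spectra, 2))        # [[1, 3, 5], [2, 4, 6]]

# Assumed costs: spectrum #n takes n time units, so later spectra are slower.
cost = {s: s for s in spectra}
print(load(contiguous_blocks(spectra, 2), cost))  # [6, 15]
print(load(alternating(spectra, 2), cost))        # [9, 12]
```

With these assumed costs, the contiguous split leaves thread 1 idle while thread 2 works through more than twice the load, whereas the alternating split brings the two threads much closer together.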

Copyright © 2004, The Global Proteome Machine Organization