list path, taxonomy information

Syntax

The value is an ASCII text string giving the path to a configuration file.

Notes

  1. Relative or absolute path names can be used.
  2. If the configuration file resides on a remote disk, a full UNC path name is usually required.
  3. Any allowed file name can be used, but only one taxonomy configuration file can be specified per input.
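
As a minimal sketch, the parameter could be set in an X! Tandem input file as in the fragment below. The file name taxonomy.xml is a placeholder chosen for illustration, and a complete input file would normally carry further parameters that are omitted here.

```xml
<?xml version="1.0"?>
<bioml>
	<!-- Illustrative path (assumption): location of the taxonomy configuration file -->
	<note type="input" label="list path, taxonomy information">taxonomy.xml</note>
	<!-- The taxon label given here is looked up in the taxonomy file named above -->
	<note type="input" label="protein, taxon">human</note>
</bioml>
```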

Description

This parameter passes the location of the file that translates the "taxonomy" information in the value of protein, taxon into a list of FASTA file names. A simple taxonomy file is illustrated below.

<?xml version="1.0"?>
<bioml label="x! taxon-to-file matching list">
	<taxon label="human">
		<file format="peptide" URL="../fasta/human.fasta.pro" />
		<file format="peptide" URL="../fasta/human_extra.fasta" />
		<file format="spectrum" URL="../lib/human_fasta.hlf" />
		<file format="saps" URL="../fasta/human_saps.xml" />
		<file format="mods" URL="../mods/human_mod.xml" />
	</taxon>
	<taxon label="yeast">
		<file format="peptide" URL="../fasta/scd.fasta.pro" />
		<file format="peptide" URL="../fasta/scd_1.fasta.pro" />
		<file format="peptide" URL="../fasta/extras.fasta.pro" />
	</taxon>
</bioml>

If the protein, taxon value is set to human, then the files examined by TANDEM to find a matching peptide sequence would be as follows, in the order listed:

  1. ../fasta/human.fasta.pro
  2. ../fasta/human_extra.fasta

The additional files listed for human would be used as follows:

  • The file "../fasta/human_saps.xml" would be loaded to provide information about known single amino acid polymorphisms.
  • The file "../mods/human_mod.xml" would be loaded to provide information about known potential modifications.
  • If X! Hunter were being used, the file "../lib/human_fasta.hlf" would be used to provide annotated spectrum library information.

Notes:

  1. If a file is listed more than once, it is only used one time.
  2. Do not use commas in the labels for taxon entries: all other normal characters are allowed. X! Tandem interprets a comma as separating multiple taxon entries. For example, human,yeast would be interpreted as the union of all of the sequence files in both the human and yeast taxon entries.
  3. If a sequence file does not exist, processing continues if at least one sequence file for the taxon does exist. The output file records all of the sequence files actually used: non-existent files are not listed.
  4. Avoid using the same taxon label twice in a taxonomy file. If more than one instance of a taxon label is found, the union of all of the sequence files for the multiple instances of that taxon label is used for the calculation.
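
The notes above can be illustrated with a hypothetical taxonomy file. The arrangement of entries below is invented for illustration only, and the comments restate the behavior described in the notes rather than results from an actual run.

```xml
<?xml version="1.0"?>
<bioml label="x! taxon-to-file matching list">
	<taxon label="yeast">
		<file format="peptide" URL="../fasta/scd.fasta.pro" />
		<!-- Note 1: a file listed more than once is only used one time -->
		<file format="peptide" URL="../fasta/scd.fasta.pro" />
	</taxon>
	<!-- Note 4: a repeated label is best avoided; the sequence files of both
	     "yeast" entries would be combined as a union -->
	<taxon label="yeast">
		<file format="peptide" URL="../fasta/extras.fasta.pro" />
	</taxon>
	<taxon label="human">
		<file format="peptide" URL="../fasta/human.fasta.pro" />
	</taxon>
</bioml>
```

With this file, setting protein, taxon to human,yeast would, following note 2, select the union of the sequence files in both taxon entries: human.fasta.pro, scd.fasta.pro and extras.fasta.pro.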

see also: protein, taxon | FASTA-pro format

X! TANDEM API description project