Proteomics frequently involves the use of large lists of
protein-sequence-related assignments based on the interpretation of tandem mass spectra. This page describes some
general rules of thumb for interpreting this type of information
when you need to apply it to a specific experimental or theoretical case.
- Adjacent residues may be hard to distinguish from the measurements.
It may be very difficult to determine the exact residue associated with any particular assignment. For example,
if a peptide has the sequence "ASTYYLFR", it may be easy to determine from a mass spectrum that this peptide
is phosphorylated. It may be very difficult, however, to determine whether the phosphorylation is on S as
opposed to T based on the fragmentation pattern in that spectrum. Similarly, it may be difficult to distinguish between
phosphorylation at the first Y and at the second Y. In GPMDB, if there are no data in the spectrum that clearly distinguish the two (or more) cases,
all of them will be reported. Therefore, if there are assignments at adjacent or nearly adjacent residues, exercise caution and consult the
original data (using pSYT)
to determine how well (or poorly) these alternative cases are supported by the original experimental observations.
- Splice variants may be difficult to assign unambiguously.
Experiments that pull down peptides with specific modifications, e.g., metal-oxide columns
for phosphopeptides, usually retain only a small number of peptide sequences for a particular protein. Given the
very limited sequence coverage associated with a small number of peptides, it is usually not possible to
specify which alternate splice variant or protein isoform has been detected. GPMDB reports all protein variants that
contain the detected peptide sequence in an individual experiment. If it is important to know which variant has
been modified, it will be necessary to examine the data in detail. In general, it is easier to exclude a variant on the
basis of a missing site assignment than it is to distinguish between alternate sequences, all of which contain the
same site assignment.
- Isobaric interference.
Assignments made by tandem mass spectrometry can only distinguish between things that result in measurable mass
differences. Some modifications are simply too close in mass to be confidently distinguished using the types of
measurements that are commonly used in high-throughput proteomics. One important example is tyrosine phosphorylation vs.
sulfonation. The two modifications are very similar in mass (79.966331 Da vs. 79.956815 Da), and conclusively measuring
one or the other requires high-resolution mass spectrometry. Another example is lysine acetylation vs. trimethylation (42.010565 Da vs.
42.046950 Da), which requires careful measurement to ensure correct assignment. Fortunately, phosphorylation and acetylation are
much more frequent post-translational modifications than sulfonation or trimethylation in general, so simply assigning
the most common modification is often justified. Some proteins, such as histones, in which multiple modifications
can occur, may also require more careful treatment than is possible when generating high-throughput sequence modification information.
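
To get a feel for how close the masses quoted above are, the following minimal Python sketch expresses each mass difference in parts per million (ppm) of the peptide mass and compares it with a measurement tolerance. The function names, the 1500 Da peptide mass, the 10 ppm tolerance and the "difference must exceed twice the tolerance" criterion are all assumptions made for illustration; none of this describes how GPMDB itself evaluates assignments.

```python
# Monoisotopic mass shifts in Da, as quoted in the text above.
PHOSPHO = 79.966331
SULFO = 79.956815
ACETYL = 42.010565
TRIMETHYL = 42.046950


def delta_ppm(shift_a, shift_b, peptide_mass):
    """Difference between two modification masses, in ppm of the peptide mass."""
    return abs(shift_a - shift_b) / peptide_mass * 1e6


def distinguishable(shift_a, shift_b, peptide_mass, tolerance_ppm):
    """Illustrative criterion (an assumption, not a GPMDB rule): two candidates
    are treated as separable only if their mass difference is larger than
    twice the +/- measurement tolerance."""
    return delta_ppm(shift_a, shift_b, peptide_mass) > 2 * tolerance_ppm


if __name__ == "__main__":
    mass = 1500.0      # hypothetical modified peptide mass in Da
    tolerance = 10.0   # hypothetical parent ion mass tolerance in ppm
    pairs = [
        ("phosphorylation vs. sulfonation", PHOSPHO, SULFO),
        ("acetylation vs. trimethylation", ACETYL, TRIMETHYL),
    ]
    for label, a, b in pairs:
        ppm = delta_ppm(a, b, mass)
        ok = distinguishable(a, b, mass, tolerance)
        print(f"{label}: {ppm:.2f} ppm, distinguishable: {ok}")
```

Under these assumed values, the phosphorylation/sulfonation difference is roughly 6.3 ppm and the acetylation/trimethylation difference is roughly 24.3 ppm, so only the second pair would be separable by mass alone. Because the ppm difference shrinks as the peptide mass grows, the same pair can be resolvable for a small peptide and ambiguous for a large one.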