The Global Proteome Machine Organization
   GPM Blog
Data set of the week: (2014/12/20)
Different binding motifs of the celiac disease-associated HLA molecules DQ2.5, DQ2.2, and DQ7.5 revealed by relative quantitative proteomics of endogenous peptide repertoires.
Overall rating: excellent data (leading the field)
This data set consisted of 51 results, analyzed by reversed phase HPLC MS/MS. The data files were made available through ProteomeXchange, PXD001205. It has been published by Bergseng E, Dorum S, Arntzen MO, Nielsen M, Nygard S, Buus S, de Souza GA, and Sollid LM, Immunogenetics. 2014 Dec 12 (PubMed).
The combination of an interesting question, experimental protocol, sample preparation and excellent experimental technique makes these results truly remarkable. This study focusses on the very specific peptides bound to selected types of MHC II-type antigen presentation complexes, demonstrating that it is possible to selectively (and sensitively) observe these biologically and clinically important peptides. The peptide signals themselves are strong and unambiguous, making this data set probably the most interesting collection of endogenous peptides to be made publicly available to date. This data set should appeal to immunologists (who may be puzzled by the proteins represented), computational biologists (who should want to understand why these peptides were chosen), bioinformaticians (who want to understand the observation of non-tryptic peptides) and clinicians (who want to understand the immunological basis of celiac disease).
Data set of the week: (2014/12/10)
Adenovirus composition, proteolysis, and disassembly studied by in-depth qualitative and quantitative proteomics.
Overall rating: very good data (specialist interest)
This data set consisted of 5 results, using three different protease digests analyzed by reversed phase HPLC MS/MS. The data files were made available through ProteomeXchange, PXD000591. It has been published by Benevento M, Di Palma S, Snijder J, Moyer CL, Reddy VS, Nemerow GR, and Heck AJ, J Biol Chem. 2014 289:11421-30 (PubMed).
This study nicely demonstrates the level of detail regarding a virus' proteome that can be obtained in short order using modern methods. With only five experiments, the authors were able to almost fully characterize the proteins present in a type 5 human adenovirus (HAdV) vector. They could then use this information to create SRM assays for each of the proteins in the vector and use the assays to perform quantitative experiments. While not emphasized in the manuscript, the data also make it possible to determine which human proteins co-purify with the viral particles, although they do not contain enough information to determine whether these proteins were incorporated into the virions themselves or simply adhered during purification.
Data set of the week: (2014/12/2)
Site-specific mapping and quantification of protein S-sulphenylation in cells.
Overall rating: very good data (specialist interest)
This data set consisted of 24 results, using a combination of affinity isolation followed by reversed phase HPLC MS/MS. The data files were made available through the CPTAC Portal. It has been published by Yang J, Gupta V, Carroll KS and Liebler DC, Nat Commun. 2014 Sep 1;5:4776 (PubMed).
This study makes use of an interesting click-chemistry reagent as part of an affinity purification scheme to isolate peptides that had the transitory PTM cysteine sulphenylation. The PTM (the oxidation of the sulphydryl side chain of cysteine, SH -> S-OH) had been previously detected at the protein level, but this study is the first to track it back to the specific cysteine acceptor sites that carry the modification. In addition to identifying the acceptor residues, the reagent had a +6 Da "heavy" version that allowed for relative quantitation studies. The experiments were well done and the reagent appears to work very well for the purpose, resulting in LC/MS/MS runs with more than 15% of identified peptides corresponding to the desired modification. The DYn-2-triazohexanoic acid modification produced a significant shift in the chromatographic retention to later in the gradient for the labelled peptides, making validation of the identifications very straightforward.
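As a small illustration of what the +6 Da "heavy" reagent means in a spectrum: a light/heavy peptide pair carrying one label each is separated in m/z by the mass shift divided by the charge state. The sketch below uses the nominal 6 Da value quoted above (the exact isotopic mass difference is not stated here) and assumes one label per peptide; the function name is purely illustrative.

```python
def label_mz_spacing(mass_shift_da: float, charge: int, n_labels: int = 1) -> float:
    """Return the m/z spacing between light and heavy forms of a labelled peptide.

    mass_shift_da: mass difference per label in Da (nominal 6.0 assumed here).
    charge: charge state of the precursor ion (must be >= 1).
    n_labels: number of labelled sites on the peptide (assumed 1 by default).
    """
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    if n_labels < 1:
        raise ValueError("n_labels must be a positive integer")
    return n_labels * mass_shift_da / charge


if __name__ == "__main__":
    # Spacing expected for common precursor charge states, nominal +6 Da label.
    for z in (1, 2, 3):
        print(f"z={z}: light/heavy spacing = {label_mz_spacing(6.0, z):.2f} m/z")
```

A doubly charged pair would therefore appear about 3 m/z apart, and a triply charged pair about 2 m/z apart.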
GPMDB REST API, version 2 (2014/11/27)
We have begun to roll out the version 2 features of the GPMDB REST API. The first set of the new version 2 methods (listed here) was designed to make it easy to determine which bases in the human genome are associated with specific post-translational modifications (PTMs) that have been observed in the proteome and recorded in GPMDB. The PTM acceptor sites have been curated to the ENSEMBL v. 70 human proteome and the GRCh37 version of the human genome. The methods can be used for interpreting the results of any genome or transcriptome study that discovered missense nucleotide variants in terms of the effect of those variants on the PTM status of the associated protein splice variants in ENSEMBL v. 70.
Version 2 is a stand-alone set of new methods. The methods associated with version 1 (listed here) will remain the same and will be accessible from the same URLs as before. No changes to the version 1 interface are contemplated at this time.
Data set of the week: (2014/11/25)
Characterization of native protein complexes and protein isoform variation using size-fractionation-based quantitative proteomics.
Overall rating: excellent data (worth study)
This data set consisted of 120 results, using a combination of native size-exclusion chromatographic (SEC) separation followed by reversed phase HPLC MS/MS of the SEC fractions. The data files were made available through ProteomeXchange, PXD001220. It has been published by Kirkwood KJ, Ahmad Y, Larance M, and Lamond AI, Mol Cell Proteomics. 2013 12:3851-73 (PubMed).
This rather innovative study examines the use of native size exclusion chromatography to isolate protein complexes and conventional LC MS/MS to assess their protein composition. The SEC method the authors employed worked very well and proved to be an excellent way to produce functionally-related protein fractions. The work was technically first rate and hopefully it will popularize the use of modern SEC methods in proteomics sample preparation.
Data set of the week: (2014/11/17)
Rapid and Deep Proteomes by Faster Sequencing on a Benchtop Quadrupole Ultra-High-Field Orbitrap Mass Spectrometer.
Overall rating: very good data (specialist interest)
This data set consisted of 36 results, using reversed phase HPLC MS/MS. The data files were made available through ProteomeXchange, PXD001305. It has been published by Kelstrup CD, Jersie-Christensen RR, Batth TS, Arrey TN, Kuehn A, Kellmann M and Olsen JV, J Proteome Res. 2014 Nov 10 (PubMed).
The goals and results of this study are remarkably similar to those in last week's featured data set (Scheltema, et al.). Many of the details are different, caused by significant differences in the chromatographic methods used in the two studies. This study utilized a gradient that rapidly rose to ~20% organic and then slowly increased to ~45%, followed by an isocratic hold for ~8000 scans. The Scheltema results used a linear gradient from ~5% to ~40% with a short additional gradient to ~55% organic at the end. The method used in the Kelstrup study resulted in a nearly constant peptide identification rate of ~50% throughout the main gradient, falling off sharply during the isocratic portion. The Scheltema study showed a more variable identification rate, starting at ~20% and rising to as much as 70% by the end. Overall the two studies show about the same "depth" in terms of identifications, although the Scheltema study had significantly better identifications for the human Alphapapillomavirus 7 proteins in their HeLa cell line. The overall efficiency of peptide identification was slightly better in last week's study: 4.9 unique residues per spectrum (Scheltema) compared to 4.3 unique residues per spectrum (Kelstrup).
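The comparison of identification rates along the gradient can be reproduced in outline by binning MS/MS scans and dividing the number of identified scans by the number acquired in each bin. The following is a minimal sketch, assuming scan numbers are available as plain integers and that a list of scans with accepted identifications has already been produced by a search engine; the function name and bin width are illustrative and not part of any GPM tool.

```python
from collections import Counter
from typing import Dict, Iterable


def binned_identification_rate(
    all_scans: Iterable[int],
    identified_scans: Iterable[int],
    bin_width: int,
) -> Dict[int, float]:
    """Return {bin start scan: fraction of MS/MS scans identified in that bin}.

    all_scans: scan numbers of every MS/MS spectrum acquired.
    identified_scans: scan numbers with an accepted peptide assignment
        (assumed to be a subset of all_scans).
    bin_width: number of consecutive scan numbers grouped into one bin.
    """
    if bin_width < 1:
        raise ValueError("bin_width must be a positive integer")
    # Deduplicate so a scan is never counted twice in either tally.
    acquired = Counter(scan // bin_width for scan in set(all_scans))
    identified = Counter(scan // bin_width for scan in set(identified_scans))
    return {
        b * bin_width: identified[b] / acquired[b]
        for b in sorted(acquired)
    }


if __name__ == "__main__":
    # Toy data: 20 scans, half of the first ten identified, one of the last ten.
    rates = binned_identification_rate(range(20), [0, 1, 2, 3, 4, 10], bin_width=10)
    for start, rate in rates.items():
        print(f"scans {start}-{start + 9}: {rate:.0%} identified")
```

Plotted against scan number, a profile of this kind makes a flat rate through the main gradient, or a drop during an isocratic hold, easy to see.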
Data set of the week: (2014/11/11)
The Q Exactive HF, a Benchtop Mass Spectrometer with a Pre-filter, High Performance Quadrupole and an Ultra-High Field Orbitrap Analyzer.
Overall rating: very good data (specialist interest)
This data set consisted of 99 results, using reversed phase HPLC MS/MS. The data files were made available through ProteomeXchange, PXD001203. It has been published by Scheltema RA, Hauschild JP, Lange O, Hornburg D, Denisov E, Damoc E, Kuehn A, Makarov A and Mann M, Mol Cell Proteomics. 2014 Oct 30 (PubMed).
This study provides a set of rather straightforward analyses of HeLa cell lysates by LC/MS/MS on the new Q Exactive HF instrument. The results should be of interest to anyone considering obtaining or using this MS/MS platform. The LC/MS/MS system performed very well, generating reproducible results with good sensitivity. The results show good identifications of the E7:p protein (human Alphapapillomavirus 7) and EIF5A:pm.K50+hypusine, which are reliable indicators of "depth" in HeLa cell studies. This instrument configuration generates significantly more identifiable peptide fragment ions than the previous generation instruments, making the assignment of PTMs more reliable.
October 2014 Editions of the Mouse and Human Proteome Guides (2014/10/6)
The latest editions (v. 16) of the Guide to the Human Proteome and the Guide to the Mouse Proteome have been released and are available for download and use. Both are available in HTML, CSV (comma-separated value) and XLS (Excel spreadsheet) formats. This release will be the last one to use ENSEMBL 70 for the human proteome and ENSEMBL 69 for the mouse proteome: the January 2015 release will use ENSEMBL 76 for both human and mouse sequences.
Search engine basics 101 #4, by Ron Beavis (2014/09/25)
Chromatographic influences on peptide identification rate calculations.
It is very common to analyze the set of spectra generated by an HPLC MS/MS experiment as a group, thereby obtaining an ordered set of peptide-to-spectrum matches (PSMs). The matches are then examined and statistical QA/QC measures applied, resulting in a reduced set of PSMs assumed to be true positive assignments. The efficiency of this process is often characterized by calculating the ratio of the number of true positive PSMs to the total number of MS/MS spectra acquired. This ratio (R) is also frequently used to characterize the performance of one identification algorithm versus others more ...
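The ratio R described above is simple to compute once the filtered PSM list is in hand. A minimal sketch, assuming the number of PSMs accepted as true positives and the total number of MS/MS spectra acquired are already known (the function name is illustrative):

```python
def identification_ratio(n_true_positive_psms: int, n_msms_spectra: int) -> float:
    """Return R = (accepted true positive PSMs) / (MS/MS spectra acquired).

    Assumes at most one accepted PSM per spectrum, so R lies between 0 and 1.
    """
    if n_msms_spectra <= 0:
        raise ValueError("n_msms_spectra must be positive")
    if not 0 <= n_true_positive_psms <= n_msms_spectra:
        raise ValueError("PSM count must be between 0 and the number of spectra")
    return n_true_positive_psms / n_msms_spectra


if __name__ == "__main__":
    # Hypothetical run: 12,500 accepted PSMs out of 25,000 MS/MS spectra.
    r = identification_ratio(12_500, 25_000)
    print(f"R = {r:.2f}")
```

Because R is computed over the whole run, it averages over every part of the chromatogram, including stretches where few peptides elute.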
Copyright © 2013, The Global Proteome Machine Organization.