Proteomics crowdsourced "Big Data". The GPM is an
experimental project to create knowledge from proteomics data and reuse it to solve biomedical research problems.
Data set of the week: (2013/5/16)
Discovery and mass spectrometric analysis of novel splice-junction peptides using RNA-Seq. Overall rating: excellent data (leading the field)
This data set consisted of 28 results that comprised a single
multidimensional chromatography experiment.
The data files were made available through PASSEL (PASS00215).
It was published by
Sheynkman GM, Shortreed MR, Frey BL and Smith LM in
Mol Cell Proteomics 2013 Apr 29 (PubMed).
This data set defines the state-of-the-art with respect to "deep" proteomics
of a human cell line (Jurkat cells). The combination of a first dimension using high pH HPLC followed by
low pH HPLC produced a very well separated collection of peptides. The use of HCD coupled with high resolution
fragment ion measurements using an Orbitrap led to very high confidence peptide assignments. Anyone interested in
detecting relatively rare post-translational modifications or determining splice variants would be well served
by performing their analysis on this data set first.
In addition to the human phosphorylation annotation released on Sunday, we have also prepared annotation in the
same format for a set of model species commonly used in proteomics experiments. Annotation for the following
species is now available:
C. elegans,
D. melanogaster,
M. musculus and
S. cerevisiae.
As part of our contribution to the Human Proteome Project, we have compiled a comprehensive list of all human protein
phosphorylation sites represented by good quality data in GPMDB. This list has been subdivided on a chromosome-by-chromosome
basis, using ENSEMBL v. 70 as the source of the protein and gene sequences. All of the splice variants listed
by ENSEMBL have been annotated.
The files associated with the annotation for each chromosome (and a merged list of all chromosomes) are now available
by FTP. A description of the format of these files (README.txt) is in the same directory. A short summary of the
number of phospho-proteins, genes and sites is given here.
For unique protein sequences in the proteome, the overall totals are as follows:
Data set of the week: (2013/5/10)
Proteogenomic Analysis of Human Colon Carcinoma Cell Lines LIM1215, LIM1899, and LIM2405. Overall rating: very good data (specialist interest)
This data set consisted of 136 results composed of individual SDS-PAGE gel slices
and experiment summaries.
The data files were made available through ProteomeXchange (PXD000120).
It was published by
Fanayan S, Smith JT, Lee LY, Yan F, Snyder M, Hancock WS and Nice E in
J Proteome Res. 2013 Mar 13 (PubMed).
The data reported here was a good example of what can be done with whole cell lysates
analyzed using SDS-PAGE protein separations and low resolution (LTQ) mass spectrometry. The experiments elucidate
an interesting biological issue: "How different were the protein concentrations in three related cell lines and how
were those changes generated by differences in RNA concentration?" This data would be useful for anyone interested
in practical difficulties associated with combining protein molecular mass information with peptide identifications
when using SDS-PAGE gels for protein separations.
As some readers may have noticed, the logo for GPM and GPMDB has changed recently (thanks to noted electronic artist KD Thornton).
This change is part of a general redesign of the site to conform to more modern web page coding trends, simplify page navigation and improve
the usefulness of the overall site on smaller screens and mobile platforms. If you have any suggestions regarding things you would
like to see in a new design (or things that really bug you about the current one), please let us know at contact@thegpm.org.
The definition of the GPMDB REST interface has been expanded to include a new method, allowing the rapid
calculation of peptide ω frequencies for any set of peptides and protein accession numbers
stored in GPMDB. These frequencies are useful when comparing observed peptides to those previously observed:
a technical definition is given here.
The description of this method and an example have been added to the GPMDB Wiki page
for the REST interface.
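As a rough illustration of the kind of quantity involved, the sketch below computes a relative observation frequency for each peptide of a protein. It assumes, purely for illustration, that a peptide's frequency is the number of times that peptide was observed divided by the total number of peptide observations for the same protein accession; the technical definition used by GPMDB is the one linked above and may differ. The function name and the input layout are hypothetical, and the code does not call the GPMDB REST interface.

```python
from collections import Counter


def relative_peptide_frequencies(observations):
    """Return {(accession, peptide): frequency} for a list of observations.

    ASSUMPTION (not the GPMDB definition): frequency is the count of a
    peptide divided by the count of all peptide observations recorded
    for the same protein accession.
    """
    observations = list(observations)  # allow any iterable, iterate twice
    peptide_counts = Counter(observations)
    protein_totals = Counter(accession for accession, _ in observations)
    return {
        (accession, peptide): count / protein_totals[accession]
        for (accession, peptide), count in peptide_counts.items()
    }


if __name__ == "__main__":
    # Hypothetical accessions and peptide sequences, for illustration only.
    example = [
        ("PROT_A", "AAK"),
        ("PROT_A", "AAK"),
        ("PROT_A", "CCR"),
        ("PROT_B", "DDK"),
    ]
    for key, value in sorted(relative_peptide_frequencies(example).items()):
        print(key, round(value, 3))
```

Under this assumption the frequencies for one protein sum to 1, so a peptide with an unusually low value is one that has rarely been seen relative to the other peptides of that protein.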
Copyright © 2013, The Global Proteome Machine Organization.