News Archive
Data set of the week: (2012/12/23)
Intermembrane space proteome of yeast mitochondria. Overall rating: ![]() ![]() This data provided an unusually detailed look at the proteins associated with
mitochondrial metabolism in baker's yeast (see this GO protein enrichment diagram
for an example of the level of enrichment obtained). The combination of good sample preparation, protein chemistry, separations
and mass spectrometry allowed the investigators to accurately distinguish between background levels of protein
flux and that specifically generated by the human sequence BAX:p treatment used in the experiments.
Data set of the week: (2012/12/16)
Tandem metal oxide affinity chromatography identifies novel in vivo MAP kinase substrates in Arabidopsis thaliana. Overall rating: ![]() ![]() The data obtained in this study was an excellent example of combining protein and peptide
separation methods to obtain samples that were highly enriched in relatively rare materials. The results obtained
were very high quality, allowing the unambiguous identification of numerous biologically relevant phospho-domains
in MAPK signalling related proteins.
Data set of the week: (2012/12/9)
The quantitative proteomes of human-induced pluripotent stem cells and embryonic stem cells. Overall rating: ![]() ![]() These experiments show what can be done using quantitative mass spectrometry methods
and several commonly available Orbitrap-based mass spectrometry technologies. The experiments were well executed
in a consistent manner and they should be quite reproducible. If you are interested in following the concentration
of any specific set of proteins in human embryonic stem cells, human-induced pluripotent stem cells or
the associated precursor fibroblast cell lines, it would be a good idea to consult this data set and use it
to select the appropriate technology for your experiments. While the quantitative method used in the study
(lysine/N-terminus derivatization with isotope-labelled dimethyl groups) may not be as popular as some
other protocols, all of the examples that we have seen have been well done, with a minimal number of side
reactions and artifacts.
Data set of the week: (2012/12/3)
Core proteome of the minimal cell: comparative proteomics of three mollicute species. Overall rating: ![]() ![]() This data was interesting as it belongs to what has become a relatively rare class of
results: it contains the only identification information available for many proteins from a relatively common
bacterium: Acholeplasma laidlawii. A. laidlawii is a very small mycoplasma (a Mollicute genus with no
cell wall), which can pass through sterilization filters with 0.2 µm pores. It also has a small genome (~1.5 Mbp),
with only 1380 genes. This single study found 819 translated proteins, a remarkable 59% of all possible translation
products, including > 100 proteins currently labeled as "hypothetical".
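The coverage figure quoted above follows directly from the two counts given in the text. A minimal check, using only those numbers (the function name is illustrative, not part of any GPMDB tool):

```python
def proteome_coverage(observed_proteins: int, predicted_genes: int) -> float:
    """Return the percentage of predicted gene products observed at least once."""
    if predicted_genes <= 0:
        raise ValueError("predicted_genes must be positive")
    return 100.0 * observed_proteins / predicted_genes


# Counts quoted in the text for Acholeplasma laidlawii.
coverage = proteome_coverage(819, 1380)
print(f"{coverage:.0f}% of predicted translation products observed")
```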
Data set of the week: (2012/11/26)
Combination of chemical genetics and phosphoproteomics for kinase signaling analysis enables confident identification of cellular downstream targets. Overall rating: ![]() ![]() This data was an excellent example of how good phosphoproteomics measurements
have become using CID and an Orbitrap-LTQ. The level of phosphopeptide enrichment was very high (> 80%) and
multiply-phosphorylated peptides were very cleanly identified. The large neutral loss peaks that were so prominent
in the first generation of phosphopeptide CID spectra have been suppressed, making the identifications straightforward
without additional MS/MS/MS measurements. The sample preparation workflow used has generated phosphopeptides from
a significant number of proteins with poorly understood functions, such as NDEL1:p, TPD52L2:p, EML3:p and RAI1:p, that
have not been well sampled in previous large-scale phosphoproteomics experiments.
Data set of the week: (2012/11/18)
Identification of Proteins Associated with the Pseudomonas aeruginosa Biofilm Extracellular Matrix. Overall rating: ![]() ![]() Pseudomonas aeruginosa is a common bacterium that thrives
in many man-made environments. It is a human pathogen causing sepsis and generalized infections, particularly in
individuals with weakened immune systems. This well done study provides excellent insight into the proteins produced
by P. aeruginosa to form colony biofilm matrix material. The data is first rate and it is recommended for use as
a reference data set for examining the challenges associated with prokaryote proteomics for both protein and peptide sequence
assignment using spectra generated by CID in hybrid Orbitrap-LTQ instruments.
Data set of the week: (2012/11/11)
Comparative phosphoproteomic analysis of neonatal and adult murine brain. Overall rating: ![]() ![]() The data from this study showed a very good group of phosphopeptide identifications
from murine brains, many of which were comparatively rare. The data also contained a significant subset of phosphopeptides
that were multiply phosphorylated, making it interesting from the view point of the mechanics of identifying this type of peptide.
The serine:threonine phosphorylation ratio for the identified peptides was ~5:1,
which is a common feature of mammalian S/T-phosphorylation studies.
Data set of the week: (2012/11/04)
Salivary basic proline-rich proteins are elevated in HIV-exposed seronegative men who have sex with men. Overall rating: ![]() ![]() Unfortunately, only a limited number of the data files made available by the
researchers were retrievable from TRANCHE, but the two replicates that could be downloaded were very
good quality. The proteins and peptides found give an excellent guide to what can be sampled
using iTRAQ quantitation of clinical samples of human saliva. Saliva is a notoriously difficult
fluid to sample cleanly, but this study does an admirable job of obtaining good quality samples
and analyzing them thoroughly.
Data set of the week: (2012/10/28)
Global detection of protein kinase D-dependent phosphorylation events in nocodazole-treated human cells. Overall rating: ![]() ![]() The data from this study were very good quality MS/MS spectra, representing what can be
expected from any collection of well done SILAC quantitation experiments. The results support the conclusions; however,
our reanalysis of the data revealed a significant level of amide carbamylation. In addition to carbamylation, the
paper's analysis omitted deamidation, dioxidation and N-terminal cyclization, leading
to a false negative rate of >15% in the results reported in the paper. While these assignments do
not affect the biological conclusions in any major way, they do have an effect on the decoy-target calculation used to
estimate the peptide sequence assignment error rate. Any group interested in how false negative assignments alter the outcomes of
the statistical analysis of proteomics data should examine these results carefully.
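As a rough illustration of why missed assignments matter for the error estimate, the sketch below uses the simple decoy/target ratio as the estimator. This is an assumption: the text does not say which formula was used, and the counts are hypothetical, chosen only to show the direction of the effect when 15% of the true target assignments are lost.

```python
def decoy_target_ratio(target_hits: int, decoy_hits: int) -> float:
    """Simple decoy/target estimate of the false assignment rate (assumed estimator)."""
    if target_hits <= 0:
        raise ValueError("target_hits must be positive")
    return decoy_hits / target_hits


# Hypothetical counts, not taken from the data set.
decoy_hits = 50
targets_full_search = 10_000                            # modifications included in the search
targets_restricted = targets_full_search * 85 // 100    # 15% of true assignments missed

full = decoy_target_ratio(targets_full_search, decoy_hits)
restricted = decoy_target_ratio(targets_restricted, decoy_hits)
print(f"full search: {full:.4f}  restricted search: {restricted:.4f}")
```

With the decoy count held fixed, dropping true target assignments raises the estimated rate even though no additional wrong assignments were made. In a real reanalysis the decoy count may also shift, so this only indicates the general tendency.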
Data set of the week: (2012/10/21)
TSLP signaling network revealed by SILAC-based phosphoproteomics. Overall rating: ![]() ![]() This data was obtained from a well-planned study of the protein phosphorylation
dynamics of the thymic stromal lymphopoietin signalling system. The study used SILAC quantitative proteomics
and affinity purification to examine the changes in protein post-translational modification involved in this
particular system, which has been implicated in human disease. The SILAC method used (K6/R6) has become
increasingly popular recently, challenging the dominant K8/R10 method popularized by the Mann group.
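The practical difference between the two labelling schemes is the mass offset between light and heavy peptides. The sketch below assumes the usual isotopic compositions behind the shorthand (K6 and R6 as 13C6 lysine and arginine, K8 as 13C6 15N2 lysine, R10 as 13C6 15N4 arginine); the text itself does not spell these out, and the function and dictionary names are illustrative.

```python
# Monoisotopic mass differences between heavy and light isotopes (Da).
C13_MINUS_C12 = 1.0033548
N15_MINUS_N14 = 0.9970349

# Assumed label compositions for each scheme (see lead-in).
LABEL_SHIFTS = {
    "K6/R6": {
        "K": 6 * C13_MINUS_C12,
        "R": 6 * C13_MINUS_C12,
    },
    "K8/R10": {
        "K": 6 * C13_MINUS_C12 + 2 * N15_MINUS_N14,
        "R": 6 * C13_MINUS_C12 + 4 * N15_MINUS_N14,
    },
}


def silac_mass_shift(peptide: str, scheme: str) -> float:
    """Return the heavy-minus-light mass difference (Da) for a fully labelled peptide."""
    shifts = LABEL_SHIFTS[scheme]
    return sum(shifts.get(residue, 0.0) for residue in peptide.upper())


# A tryptic peptide ending in lysine carries a single label in either scheme.
for scheme in LABEL_SHIFTS:
    print(scheme, round(silac_mass_shift("PEPTIDEK", scheme), 4))
```

Under these assumptions a K6/R6 pair is separated by about 6 Da regardless of whether the peptide ends in lysine or arginine, while K8/R10 gives two different offsets.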
Data set of the week: (2012/10/14)
The first comprehensive and quantitative analysis of human platelet protein composition allows the comparative analysis of structural and functional pathways. Overall rating: ![]() ![]() This data set is a good example of the depth of proteomics analysis available
for simple cells. The proteome of platelets is simplified by the absence of nuclear proteins as well as
proteins involved in translation and folding. Therefore, they provide an insight into the minimum set
of proteins necessary to sustain cell metabolism and to perform the primary function of the platelet: the
formation of blood clots. The data is high quality and the results really do provide an excellent resource
for understanding the thrombocyte proteome.
Data set of the week: (2012/10/7)
Integral Quantification Accuracy Estimation for Reporter Ion-based Quantitative Proteomics (iQuARI). Overall rating: ![]() ![]() This data demonstrates the use of a large set of standard peptides mixed in with
a sample for the purposes of quantitation. The standard peptides in this case were a whole cell digest
of the proteome of Pyrococcus furiosus, an Archaea hyperthermophile. This set of peptides provided
comparators present at a wide range of concentrations, with very little peptide sequence overlap with
the human sample being analyzed. Even though this data was generated mainly for the purposes of a bioinformatics
study, it was state-of-the-art in terms of chromatography and mass spectrometry. It was ideal for the purpose
of the paper and this set of spectra should be considered as a standard for use when testing algorithms involved
in proteomics data analysis and associated bioinformatics and computational biology studies.
Data set of the week: (2012/10/1)
Analysis of protein palmitoylation reveals a pervasive role in Plasmodium development and pathogenesis. Overall rating: ![]() ![]() This ambitious study attempts to purify palmitoylated proteins from Plasmodium falciparum schizonts
obtained from Homo sapiens erythrocytes. The results show that they have generated fractions highly enriched in proteins with known
palmitoylation sites from both the malaria parasite and from human red blood cell membranes. The data is unusually high quality and
the methods used generated a rather complex problem in terms of peptide sequence assignments, protein identifications, computational biology and
bioinformatics.
Data set of the week: (2012/09/23)
Extracellular polysaccharide-degrading proteome of Butyrivibrio proteoclasticus. Overall rating: ![]() ![]() This well done study represents the first publicly available data that details the
polysaccharide-degradation proteome of one of the primary bacterial components of the ruminant digestion process,
Butyrivibrio proteoclasticus. Ruminant mammals (e.g., cattle) use an elaborate series of bacterial fermentation
reactions to digest plant-sourced polysaccharides into small molecules that can be used by normal mammalian
digestive metabolism. The proteins identified in this study provide the best list currently available of the
enzymes and transport molecules used by the microorganism to cope with the environment of the rumen.
Data set of the week: (2012/09/16)
Application of systems biology principles to protein biomarker discovery: urinary exosomal proteome in renal transplantation. Overall rating: ![]() ![]() This set of measurements nicely characterizes the proteins present in
clinically isolated urinary exosomes (the membrane-bound particles shed by kidney nephrons). The proteins
detected show that the exosomes contain significant amounts of molecules originating from cellular plasma
membranes as well as those originating from blood plasma. The data was excellent, easy to interpret and
there was no indication of significant experimental bias or artifacts in the peptides identified.
These evidence codes do not refer to the
quality of an individual protein identification in a data set: they are a property of
all of the information in GPMDB about a particular protein. The evidence code for any particular protein
accession number can be obtained using the GPMDB REST interface
and the meaning of the codes can be found here.
These codes are assigned automatically by an algorithm that considers all of the evidence in GPMDB, so the
particular value of an evidence code is subject to change as the evidence for a given protein
changes and as the algorithm is improved.
Data set of the week: (2012/09/09)
Streptococcus pyogenes in Human Plasma ADAPTIVE MECHANISMS ANALYZED BY MASS SPECTROMETRY-BASED PROTEOMICS. Overall rating: ![]() ![]() Streptococcus pyogenes is an important human pathogen, responsible for the diseases
generally classified as being caused by Group A Streptococcal (GAS) infection such as "strep throat", impetigo,
necrotizing fasciitis, scarlet fever and streptococcal toxic shock syndrome. This study examined the proteome changes caused by the presence
of human plasma in the cells' environment, in an attempt to understand how the organism adapts when it moves from
its normal environment into human blood. The data quality is very good and
the identified sequences provide good examples of the peptides available for MS-based proteomics, in the HPLC
retention range of 20–40% acetonitrile.
Data set of the week: (2012/09/02)
Functional Interplay between Caspase Cleavage and Phosphorylation Sculpts the Apoptotic Proteome. Overall rating: ![]() ![]() The data from this study has the potential to provide some interesting insights into
the use and reproducibility of proteomics techniques when applied to biological experiments. The work
does not highlight any specific technological innovation, but it does use existing techniques well and in a routine
manner. The sample preparation and handling appear to have been unusually good, with low levels of experimental artifact
modifications, making the data suitable for more in-depth study for the detection of rarer post-translational modifications.
There are detectable levels of a few adventitious proteins (bovine serum albumin, bovine casein and latex proteins), but no detectable
viral proteins. There is significant sensitivity drop-off for peptides that elute prior to 20% or later than 40% acetonitrile, but
this effect is consistent throughout the data.
Data set of the week: (2012/08/26)
The miR-17-92 microRNA cluster regulates multiple components of the TGF-β pathway in neuroblastoma. Overall rating: ![]() ![]() This study provides an interesting insight into how COFRADIC can be used to reduce the complexity of
the peptides in protein identification experiments. The peptides found are significantly enriched in methionine, with almost 90% of
the identifications containing at least one Met residue. In combination with a simple SILAC method, protein quantitation was
obtained for a large number of peptides and identifications for more than 4,500 unique proteins. The use of proteomics
methods in conjunction with numerous biochemical methods to study microRNA
effects provided significant insight into pathway regulation in neuroblastoma cells.
Data set of the week: (2012/08/21)
Phosphoproteome dynamics upon changes in plant water status reveal early events associated with rapid growth adjustment in maize leaves. Overall rating: ![]() ![]() This interesting study contains a very large number of phosphopeptide identifications derived from
the leaves of the plant Zea mays (maize). The identifications are split between conventional CID MS/MS spectra and MS/MS/MS spectra
generated from the peaks corresponding to a neutral loss of -80 or -98 Da, caused by the loss of phosphate in the initial CID reaction. The study uses
chemical derivatization (light and heavy dimethylation) for quantitative analysis. These careful experiments provide some
interesting insights into the reaction of the plant to changes in water availability. They also are some of the best
proteomics observations made to date of this commercially important species.
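The neutral-loss peaks selected for MS/MS/MS can be located from the precursor m/z and charge. The sketch below assumes the nominal 80 and 98 Da losses correspond to HPO3 and H3PO4, with their standard monoisotopic masses; the function name and example precursor are illustrative.

```python
# Monoisotopic masses (Da) of the neutral species assumed for the 80 and 98 Da losses.
NEUTRAL_LOSS_DA = {
    "HPO3": 79.966331,
    "H3PO4": 97.976896,
}


def neutral_loss_mz(precursor_mz: float, charge: int, loss: str = "H3PO4") -> float:
    """Return the m/z of the fragment formed when the precursor loses a neutral species."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    # The lost species carries no charge, so the m/z shift is its mass divided by the charge.
    return precursor_mz - NEUTRAL_LOSS_DA[loss] / charge


# Hypothetical doubly charged phosphopeptide precursor at m/z 800.0.
for species in NEUTRAL_LOSS_DA:
    print(species, round(neutral_loss_mz(800.0, 2, species), 4))
```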
Data set of the week: (2012/08/12)
Analysis of seminal plasma from patients with non-obstructive azoospermia and identification of candidate biomarkers of male infertility. Overall rating: ![]() ![]() The data contains some of the best identifications currently available for many proteins specific to the
prostate and testis, such as
PATE1,
STEAP2, and
TGM4. It provides a very nice set of
examples of the proteins that can be reproducibly detected in seminal plasma using multidimensional
chromatography methods and they can be used to develop assays for specific proteins in this clinical sample. The
use of large, composite MGF files to report this type of data limits its utility for computational and
quantitative biology applications, because it is impossible to determine why the detected peptides are
biased against early eluting (< 20% ACN) sequences.
Data set of the week: (2012/08/05)
Plastid proteome assembly without Toc159: photosynthetic protein import and accumulation of N-acetylated plastid precursor proteins. Overall rating: ![]() ![]() This manuscript provides one of the largest, best sets of proteomics data from Arabidopsis thaliana cytosol
ever obtained using gel electrophoresis methods. The data is almost tailor made for bioinformatics investigations and
the development of peptide identification algorithms (much better than some of the truly low quality data proposed for this purpose).
For such a large experiment, the data quality is consistently high and the levels of experimental artifacts are
remarkably low.
Data set of the week: (2012/07/29)
The Evolutionary Imprint of Domestication on Genome Variation and Function of the Filamentous Fungus Aspergillus oryzae. Overall rating: ![]() ![]() This data provides a remarkable insight into the changes caused by domestication in an industrially important fungus,
Aspergillus oryzae. It is used to malt rice and other starch sources, a necessary step in the creation of
a number of wines, spirits and sauces common in Asia. Its nearest wild relative,
Aspergillus flavus, is also economically significant; however,
it is considered a source of spoilage in food and a common infectious agent in aspergillosis. The results presented here characterize the
differences in the enzymes exported from the fungus into the environment, which the organism uses to generate small molecules for
import back into its filaments. Simple inspection of the lists of proteins tells the story of how selection has been used to craft the
suite of digestive enzymes secreted by the fungus, from primarily cellulose and protein digestion (A. flavus) to starch and protein
(A. oryzae).
One feature of the data that was not mentioned in the article was the very high degree of non-tryptic proteolysis. Because the organisms both secrete
non-specific proteases, the resulting mixture of proteins was most
likely partially-digested prior to sampling and continued to have proteolytic activity during the trypsin digestion used for proteomics.
This multi-step proteolysis leads to an unusual set of peptides, with 40–70% of the peptides having at least one non-tryptic cleavage and an
unusual bias towards peptides with pI < 5.
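Whether a peptide has a non-tryptic cleavage can be judged from its position in the parent protein. The sketch below uses a simplified rule (cleavage C-terminal to K or R, ignoring the proline exception and N-terminal methionine removal), which is an assumption rather than the rule used in the reanalysis; the protein sequence is made up for illustration.

```python
def count_nontryptic_termini(protein: str, start: int, end: int) -> int:
    """Count the termini of protein[start:end] that trypsin cleavage cannot explain.

    Simplified rule (assumption): trypsin cleaves on the C-terminal side of K or R.
    Termini that coincide with the ends of the protein are treated as explained.
    """
    if not 0 <= start < end <= len(protein):
        raise ValueError("peptide coordinates fall outside the protein")
    nontryptic = 0
    # N-terminal side: the residue before the peptide should be K or R.
    if start != 0 and protein[start - 1] not in "KR":
        nontryptic += 1
    # C-terminal side: the last residue of the peptide should be K or R.
    if end != len(protein) and protein[end - 1] not in "KR":
        nontryptic += 1
    return nontryptic


protein = "MKTAYIAKQRQISFVK"  # hypothetical sequence
print(count_nontryptic_termini(protein, 2, 8))  # TAYIAK: both ends follow K/R
print(count_nontryptic_termini(protein, 3, 8))  # AYIAK: N-terminus follows T
print(count_nontryptic_termini(protein, 4, 7))  # YIA: neither end is tryptic
```

The fraction of identified peptides for which this count is at least one corresponds to the 40–70% figure described above.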
Data set of the week: (2012/07/22)
Proteomics profiling of Madin-Darby canine kidney plasma membranes reveals Wnt-5a involvement during oncogenic H-Ras/TGF-beta-mediated epithelial-mesenchymal transition. Overall rating: ![]() ![]() The data in this study is a good example of using one-dimensional SDS-PAGE to deal with membrane proteins.
The analysis of the data is straightforward and the group have done a good job of minimizing gel band contamination
with the common environmental proteins human KRT1, KRT2, KRT9, and KRT10, which can be an overwhelming presence in 1D gels. The
choice of Canis familiaris as the model species for the study gives an insight into the membrane proteins of a species that
has not been widely used for proteomics experiments, even though its complete genome has been known for many years. The lists of
proteins contain many prominent examples of proteins that are clearly present at significant levels in the organism
but which remain uncharacterized
(e.g., ENSCAFP00000021781,
ENSCAFP00000009106,
and ENSCAFP00000010256).
Data set of the week: (2012/07/08)
Proteomic analysis of extracellular matrix from the hepatic stellate cell line LX-2 identifies CYR61 and Wnt-5a as novel constituents of fibrotic liver. Overall rating: ![]() ![]() This data provides a very nice insight into the extracellular matrix proteins being produced by
hepatic fibroblasts. These important proteins are most often mixed together with cellular proteins in clinical tissue samples
or discarded in cell culture experiments. These proteins are crucial to the formation and maintenance of tissues, but they
cannot be effectively studied using the RNA-based techniques commonly used for intracellular proteins. The data supports the
conclusions in the associated manuscript, i.e., the differential presence of the relatively rare proteins WNT5A and CYR61.
Data set of the week: (2012/06/25)
Isolation and proteomic characterization of the mouse sperm acrosomal matrix. Overall rating: ![]() ![]() This data distinguishes itself by sampling a rarely examined portion of the
proteome, the acrosomal matrix. This structure on sperm is responsible for attachment to
the egg in the first stage of the fertilization process. The associated proteins are not
commonly found in other tissues, so the samples examined here provide some of the best measurements
of these molecules, such as Akap3, Akap4, Odf2, Ropn1 and all of the acrosomal dynein subunits.
Data set of the week: (2012/06/17)
Proteomic analysis of the secretory response of Aspergillus niger to D-maltose and D-xylose. Overall rating: ![]() ![]() These results comprise a large fraction of the publicly available data about
the Aspergillus niger proteome. While the organism is very common in the environment, it is not
one of the human pathogenic Aspergillus species, such as A. fumigatus or A. flavus.
A. niger is a very important industrial fungus, used mainly as a source of enzymes for food production.
This study does a nice job of creating an inventory of the secreted proteins normally expressed by the organism
under two common growth conditions, providing insights into the metabolic changes that are necessary
for growth when the environment changes. Secreted proteins are very important for fungi as
they are responsible for digesting nearby carbohydrates and proteins into a form that the fungus can use
as food.
The following are a few examples of these services.
1. Find the number of times a peptide sequence has been seen:
   GET /1/peptide/count/seq=SPSSVEPVADMLMGLFFR
2. Find the number of times a protein sequence has been seen:
   GET /1/protein/count/acc=ENSMUSP00000026459
3. Find the phosphorylation sites for a protein & how often each was observed:
   GET /1/protein/modifications/acc=YKL112W&mod=80&res=STY&maxe=-2.0
The source code for the preliminary services and a demonstration client application
have been made available at the GPMDB FTP site.
This source code will be kept up-to-date with changes in the draft specification document.
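The three example requests listed above can be assembled programmatically. The sketch below only builds the request paths shown in the examples; it is not the demonstration client mentioned in the text. The host name is not given here, so `base_url` is a placeholder supplied by the caller, and the helper names are hypothetical. The query keys (`seq`, `acc`, `mod`, `res`, `maxe`) are copied from the examples without interpreting them further.

```python
from urllib.parse import quote


def peptide_count_path(sequence: str) -> str:
    """Path for the number of times a peptide sequence has been seen."""
    return f"/1/peptide/count/seq={quote(sequence)}"


def protein_count_path(accession: str) -> str:
    """Path for the number of times a protein has been seen."""
    return f"/1/protein/count/acc={quote(accession)}"


def protein_modifications_path(accession: str, mod: int, res: str, maxe: float) -> str:
    """Path for modification sites on a protein, using the keys from the example."""
    return (
        f"/1/protein/modifications/acc={quote(accession)}"
        f"&mod={mod}&res={quote(res)}&maxe={maxe}"
    )


def full_url(base_url: str, path: str) -> str:
    """Join a caller-supplied host (placeholder, not from the text) with a request path."""
    return base_url.rstrip("/") + path


# Reproduce the three documented requests; the host below is a stand-in.
base_url = "http://rest.example.org"
print(full_url(base_url, peptide_count_path("SPSSVEPVADMLMGLFFR")))
print(full_url(base_url, protein_count_path("ENSMUSP00000026459")))
print(full_url(base_url, protein_modifications_path("YKL112W", 80, "STY", -2.0)))
```

Keeping path construction separate from the HTTP call makes it easy to follow changes in the draft specification without touching the networking code.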
Data set of the week: (2012/06/10)
Comprehensive proteomic analysis of influenza virus polymerase complex reveals a novel association with mitochondrial proteins and RNA polymerase accessory factors. Overall rating: ![]() ![]() The results nicely demonstrate previously unknown associations between
the influenza polymerase complex and host cell proteins. The experimental strategy was
well thought out and an appropriate number of replicates with and without infection were performed
to confirm that the findings of the study were valid. The experiments provide some of the
best observations to date of the influenza A virus RNA polymerase subunits PA, PB1 and PB2. These
observations should
be useful to anyone investigating the use of SRM/MRM techniques to detect these molecules in vivo.
The peptides observed for the polymerase subunits of the strain used in this study
(H5N1 Vietnam/1203/04 isolate) provide an interesting case
study when compared with those observed for other strains of the
influenza virus.
Data set of the week: (2012/06/05)
Proteomic Analysis of S-Acylated Proteins in Human B Cells Reveals Palmitoylation of the Immune Regulators CD20 and CD23. Overall rating: ![]() ![]() After spending the last few weeks dealing with the complexity of large collections
of mediocre data, it was a delight to find this gem. The authors have made excellent choices of the spectra
to include as evidence and they have retained enough common SDS-PAGE artifact proteins
so that the selected data retains the character of the original raw data. While some may be critical of this
process, it does provide good insight into the quality of the experiments and the type of data
used to support the conclusions in the paper. Note: CD20 and CD23 are annotated using their more
modern gene names, MS4A1 and FCER2, respectively. See the
HUGO Gene Nomenclature committee for CD molecules site for more information on the current status of
specific "CD" genes.
Data set of the week: (2012/05/27)
Identification of targets of c-Src tyrosine kinase by chemical complementation and phosphoproteomics. Overall rating: ![]() ![]() This work nicely summarizes current trends in proteomics survey studies: early release
of data; high resolution parent
and fragment ion measurements; affinity methods to reduce sample complexity; and simple-to-interpret methods
for relative quantitation. This data set was released six months prior to publication, so any issues
relating to its quality or reproducibility could have been settled well before the conclusions were
published. The use of an Orbitrap in "high-high" mode made the identifications easy to
analyze and kept the false positive rate consistent and low (0.07-0.1%). The phospho-tyrosine peptide enrichment
method used worked well and resulted in high quality phospho-domain assignments. Finally, the appropriate
use of SILAC allowed the interpretation of the results to move beyond simply "yes" or "no" into a more nuanced
interpretation of the effects of changing c-Src tyrosine kinase activity.
Data set of the week: (2012/05/20)
Correct interpretation of comprehensive phosphorylation dynamics requires normalization by protein expression changes. Overall rating: ![]() ![]() The data and experiments reported in this paper are part of a general
shift in attitude towards the detection of phosphorylated domains in proteins. Most of the work in
the previous decade has placed considerable emphasis on the technical aspects of identifying phosphopeptides
and the qualitative reporting of their observation. This work (and that of others) is now focused
on how to interpret the observation of phosphorylated protein domains in the context of a cell's
biological function. The experiments performed here were well done, resulting in a nice set of protein
and peptide identifications of the phosphoproteins involved in yeast metabolism.
Data set of the week: (2012/05/13)
Metabolic switches and adaptations deduced from the proteomes of Streptomyces coelicolor wild type and phoP mutant grown in batch culture. Overall rating: ![]() ![]() These experiments give a good view into changes to the relative concentrations of many metabolic enzymes
in the environmental bacterium S. coelicolor in response to changes in phosphate-containing nutrient levels.
On the whole the experiments were well done, although there was significant, reproducible suppression of
early eluting peptides in all of the LC/MS/MS runs. This suppression may have made the experiments insensitive to
some particular enzymes. However, for enzymes containing observable peptides with gradient elutions > 20% acetonitrile,
the relative protein regulatory responses could be inferred with reasonable accuracy from this data set.
Data set of the week: (2012/05/07)
Cells lacking β-actin are genetically reprogrammed and maintain conditional migratory capacity. Overall rating: ![]() ![]() In this study, the authors use an unusual combination of SILAC relative quantitation and
combined fractional diagonal chromatography (COFRADIC) to study what happens to mouse embryonic fibroblast cells
when they lack an important cytoskeletal protein. Rather than the typical SILAC experiment in which heavy lysine and arginine
residues are used, this experimental design uses heavy methionine and COFRADIC to produce fractions enriched in peptides
containing oxidized methionine residues. While the use of an affinity technique has the potential to complicate
quantitative experiments, these experiments seem to have worked out quite well and generated some valuable
insights into the metabolic creativity shown by the fibroblasts in the face of what might seem to be an
insurmountable challenge.
Data set of the week: (2012/04/29)
Kinome analysis of receptor-induced phosphorylation in human natural killer cells. Overall rating: ![]() ![]() The results presented in this study make very good use of high accuracy mass measurements of both
parent and fragment ions for their biological application: determining phosphorylation changes in
natural killer (NK) cells caused by changes in receptor stimulation. These cytotoxic leucocytes are known to
have kinome changes associated with such stimulation, but the phosphorylation domain changes associated with
specific stimulations have not been fully explored. This paper makes a start in this type of interesting, cell-specific
investigation that makes use of clinically-derived cells for kinome study.
Data set of the week: (2012/04/22)
Quantification of mRNA and protein and integration with protein turnover in a bacterium. Overall rating: ![]() ![]()
The data in these experiments give a good example of a straightforward analysis of the relationship between
protein and mRNA concentrations in a clinically important model organism, Mycoplasma pneumoniae. The results also
provide the best insights into the proteome of this prokaryote currently available, which has not been thoroughly studied even though
it has a comparatively simple genome and it is one of the primary causes of atypical bacterial pneumonia. The reproducibility
of this data was somewhat compromised by the consistent bias against early eluting peptides in the HPLC runs: very few peptides
that would be expected to elute at < 15% acetonitrile were observed.
Data set of the week: (2012/04/15)
Proteomic and phosphoproteomic comparison of human ES and iPS cells. Overall rating: ![]() ![]()
The results here were a good representation of the proteins and phosphorylated domains that could be readily sampled
in human embryonic stem cells and induced pluripotent stem cells. The techniques used were well described and
the measurements were in general very good. The studies were performed using a dual-cell quadrupole linear ion
trap-orbitrap hybrid mass spectrometer (dcQLT-Orbitrap), which produced high resolution, high accuracy parent and fragment ion measurements.
The data was made available through the authors' lab database site, the
Stem Cell-Omics Repository (SCOR).
HPP Executive Committee
HPP Senior Scientific Advisory Board
Data set of the week: (2012/04/8)
Comparison of proteomic and transcriptomic profiles in the bronchial airway epithelium of current and never smokers. Overall rating: ![]() ![]() This excellent study contrasted the proteomes of non- and current-smokers in
a very relevant tissue, bronchial airway epithelium. The results remain the definitive proteome
in this clinical tissue and contain some of the best observations for a number of rarely observed
proteins, such as TPPP3 (tubulin polymerization-promoting protein family member 3), SPATA18 (spermatogenesis associated 18 homolog),
ODF3B (outer dense fiber of sperm tails 3B), SPA17 (sperm autoantigenic protein 17) and ENSP00000387851 (member of the ciliary
rootlet coiled-coil family).
Data set of the week: (2012/04/1)
The matrisome: in silico definition and in vivo characterization by proteomics of normal and tumor extracellular matrices. Overall rating: ![]() ![]() The idea behind collecting this data set was to define which proteins compose the
extracellular matrix and to discover which proteins would be contributed to the extracellular matrix by
the host in a xenograft experiment. The results do a good job of determining the protein complement of
this material in human tissue. The xenograft experiment — growing human-source tumours in live mice —
clearly shows that both the tumour cells and mouse host tissue contribute to the proteins in the tumour-associated
matrix. The value of the data was somewhat reduced by the relatively large number of detectable chemical artifacts,
particularly the carbamylation and carbamidomethylation of peptide N-termini and lysine sidechains.
The RFA for a new FTP site for use by the chromosome-based Human Proteome Project has
been adopted. The new site designed to satisfy the RFA's requirements (ftp.proteomecentral.org)
is open and available for use. Any
c-HPP group interested in using the site for data storage should simply email Ron Beavis to
get their user name and password. The site is open to everyone for retrieving information — please read
the terms of use and
license for a better understanding
of how the site is meant to be used.
The protein sequences for the Brassica rapa (turnip) ENSEMBL proteome have been
added to the main search sites. This species is part of a large genus of plants that have been
broadly exploited as food, but the turnip is the first genome of the genus that has been fully
sequenced and interpreted.
Links to the Human Protein Reference Database (HPRD) have been removed from protein
evidence display pages because of licensing problems with that site. Links to the Human Metabolome Database
have also been removed from those pages, because an internal change at that site changed its behavior when
searching on gene names. If anyone has any suggestions for good replacements for these resources please
let us know.
Data set of the week: (2012/03/26)
Investigating the macropinocytic proteome of Dictyostelium amoebae by high-resolution mass spectrometry. Overall rating: ![]() ![]() Dictyostelium discoideum is one of the more peculiar organisms used in research. It is
a free-living "slime mold", commonly found in leaf litter on any temperate forest floor. In this study the
authors have characterized the proteins involved in the unusual method that the amoeboid form of this organism
uses to take in nutrients from the environment: macropinocytosis. The experimental methods used were very well done and the
results significantly extend what is known about both this process and the organism itself.
Data set of the week: (2012/03/18)
Proteogenomic analysis of Candida glabrata using high resolution mass spectrometry. Overall rating: ![]() ![]() Candida glabrata is a haploid yeast (a.k.a., Torulopsis glabrata). It was long
thought to be a human commensal organism, but it has been shown to cause pathogenic infections
in immune-compromised individuals. This study of the organism's proteome, performed using FTMS with high resolution for
both the parent and fragment ions, provides a nice insight into the observable proteome of this poorly studied
species. It also provides an excellent set of data to compare with an existing (but relatively untested) genome sequence to
discover novel genes, understand the extent of amino acid polymorphisms and compare the post-translational modification
of domains with other, better studied, yeast species.
The Chromosome-Centric Human Proteome Project for cataloging proteins encoded in the genome (2012/03/17)
We will be mirroring relevant sections of the ftp.pride.ebi.ac.uk site through the GPMDB FTP site associated with
the c-HPP project in the folder "proteomexchange" (ftp.proteomecentral.org/proteomexchange). ProteomeXchange
accession numbers will be indexed in GPMDB and can be searched as a normal data set keyword. For example, this first entry can be accessed
using http://gpmdb.thegpm.org/PXD000001 or its
PRIDE ID using http://gpmdb.thegpm.org/data/keyword/PRIDE 22134.
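The two address patterns quoted above can be assembled programmatically. The following is only a minimal sketch: the function names are hypothetical, the URL layout is inferred from the two examples in this announcement rather than from any documented GPMDB interface, and percent-encoding the space in the keyword form is an assumption about how a client would send it.

```python
from urllib.parse import quote

# Base address taken from the two example links in the announcement.
GPMDB_BASE = "http://gpmdb.thegpm.org"


def pxd_url(accession: str) -> str:
    """Build the address for a ProteomeXchange accession (hypothetical helper)."""
    return f"{GPMDB_BASE}/{quote(accession.strip())}"


def keyword_url(keyword: str) -> str:
    """Build the keyword-search address; spaces are percent-encoded (assumption)."""
    return f"{GPMDB_BASE}/data/keyword/{quote(keyword.strip())}"


if __name__ == "__main__":
    print(pxd_url("PXD000001"))
    print(keyword_url("PRIDE 22134"))
```

A browser will normally perform the same encoding of the space when the keyword address is typed in directly.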
Data set of the week: (2012/03/11)
The ethylmalonyl-CoA pathway is used in place of the glyoxylate cycle by Methylobacterium extorquens AM1 during growth on acetate. Overall rating: ![]() ![]() This study effectively defined the observable proteome of Methylobacterium extorquens, a Gram-negative bacterium
that lives on plant leaves (click here
for an amusing short presentation on this organism). Even though the title of the study suggests that
the study may have limited scope, each LC/MS/MS run generated identifications for ~40% of the proteins coded in the
complete genome. The analysis presented in GPMDB used the proteomes from three strains of the organism (AM1, DM4 and PA1)
to be sure that no genes were absent because of errors in the specific genome assembly of an individual
strain. This analysis showed that the AM1 strain assembly was very good, with only a small number of
proteins from the PA1 and DM4 proteomes found without corresponding AM1 orthologs.
Data set of the week: (2012/03/04)
Comparative proteomic analysis of eleven common cell lines reveals ubiquitous but varying expression of most proteins. Overall rating: ![]() ![]() If you ever wanted to know what proteins were readily observable in
A549, GAMG, HEK293, HeLa, HepG2, K562,
MCF7, RKO, U2OS, Jurkat or LnCap cells, this is the data set for you. It is probably the
largest single data set generated for a publication using the current generation of Orbitrap technology. The
experiments were done using HCD fragmentation and consistent chromatographic and sample
preparation methods. The information is a good complement to the earlier DSOTW Initial characterization of the human central proteome
where there is overlapping information generated with conventional CID.
Data set of the week: (2012/02/26)
Systematic phosphorylation analysis of human mitotic protein complexes. Overall rating: ![]() ![]()
These results were good examples of the use of proteomics to target an aspect of a particular cell process, in
this case the role of phosphorylation in mitosis. The experimental protocols do a good job of isolating the
relevant proteins and generating easily interpreted phosphopeptide spectra. The chromatography and
mass spectrometry were very well done and consistent across the data set. An unusual feature of this data set was
the presence of relatively strong signals from the protease domain (picornain 3C) of the human rhinovirus B-14 polyprotein. While
it is known that HeLa cells are susceptible to rhinovirus (common cold) infections, this data may be the first
experimental confirmation of a rhinovirus infection in cell culture based on proteomics methods.
This RFI is directed toward determining how best to accelerate research in disruptive
proteomics technologies.
The Disruptive Proteomics Technologies (DPT) Working Group of the
NIH Common Fund wishes to
identify gaps and opportunities in current technologies and methodologies related to
proteome-wide measurements. For the purposes of this RFI, "disruptive" is defined as very
rapid, very significant gains, similar to the "disruptive" technology development that occurred
in DNA sequencing technology.
The courses are as follows:
This study provides a large set of consistently good quality, journeyman data focussed on creating a catalog of proteins
present in a common cell line. The U2-OS line was derived from a female sarcoma with very few normal chromosomes and hypertriploid chromosome counts.
The cell culture used appears to have been relatively clean, with little if any evidence of the presence of viruses or Mycoplasma. Any group
interested in quantifying unlabelled proteomics data, investigating rare post-translational modifications or developing
quality control metrics should take a look at this data.
The changes were as follows:
This series of multidimensional chromatography runs using high resolution MS and HCD MS/MS did exactly what
the title said: it provides a comprehensive catalogue of the proteins and constituent peptides that
are to be expected when human bile is analyzed. It contains many best-to-date observations of proteins, even
ones that are not normally associated with bile, such as hornerin and dermcidin. The methods used produced
surprisingly good recovery of cysteine-containing peptides, which are often depleted in proteomics measurements.
Data set of the week: (2012/02/05)
Chemoproteomics profiling of HDAC inhibitors reveals selective targeting of HDAC complexes. Overall rating: ![]() ![]()
The results demonstrate that the best way to find and quantitate relatively rare proteins is to utilize a targeted-affinity
purification approach. The protocols described in the paper work very well and the measurements were
well done. The peptide identification work in the paper was rather cursory, but that does not affect the biological conclusions or
the validity of the approach.
Data set of the week: (2012/01/29)
Modularity and hormone sensitivity of the Drosophila melanogaster insulin receptor/target of rapamycin interaction proteome. Overall rating: ![]() ![]()
This study was a good example of the routine use of good quality proteomics technology to elucidate an interesting
aspect of biology. It examined the protein-protein interactions associated with the InR/TOR pathway in the well-established
Kc167 cell line. The measurements were unambiguous, resulting in a significant number of identifications of relatively
rare D. melanogaster proteins involved in this pathway. It also contained a nice survey of the detectable SNAPs present in this
cell line — fruit flies have a surprisingly large number of nsSNPs compared to mammal genomes.
Data set of the week: (2012/01/22)
Characterization of the Asia Oceania Human Proteome Organisation Membrane Proteomics Initiative Standard using SDS-PAGE shotgun proteomics. Overall rating: ![]() ![]()
These experiments provide insight into how straightforward it has become to identify membrane proteins. Using a fairly
simple sample preparation method and LC/MS/MS with an LTQ instrument, the results show that it is possible to easily
identify large numbers of membrane proteins. It is still common for people to suggest that membrane proteins are
"difficult" using proteomics techniques. These results show that they are really no more difficult than
any other class of protein, so long as they can be kept in solution long enough to be digested.
Data set of the week: (2012/01/15)
Deep proteome and transcriptome mapping of a human cancer cell line. Overall rating: ![]() ![]()
This data set is an extensive investigation of how many peptides can be identified from the limited proteome of a
single human cell line using a combination of straightforward LC/MS/MS
methods, multidimensional chromatography and multiple proteases, adding in high resolution MS/MS via HCD, and doing careful,
consistently state-of-the-art lab work. For the large number of groups that use HeLa cells, this work should serve
as a reference for what can be seen and what sort of experiment should be done to see it. For anyone interested in bioinformatics
and algorithm development, the scale (> 200,000 protein identifications) and precision of the work makes it an excellent
example for trying out new ideas. It is also an excellent raw data set to find novel post-translational modifications, splice
variants, viral contaminants and amino acid polymorphisms.
Data set of the week: (2012/01/08)
iPRG-2011: Study Materials for Identification of Electron Transfer Dissociation (ETD) Mass Spectra. Overall rating: ![]() ![]()
This rather oddball dataset provides more insight into the "chilli-cook-off" mentality associated with
evaluating bioinformatics algorithms than it does into the current real-world problems in biomedical research.
Tests of this sort can be useful when their goals are to provide feedback
to algorithm & user interface designers and to inform users of the characteristics of algorithm performance.
It is questionable whether any of these aims were achieved by analyzing this data set.
The data was artificially removed from context (only one of 21 SCX fractions was made available). The
sample preparation methods used generated very high levels of non-enzymatic cleavage (22% of observable peptides),
unusually high levels of asparagine deamidation (48% of N-containing peptides) and peptide N-terminal glutamine
cyclization (88% of peptides with an N-terminal Q). The mass measurements had large parent ion and fragment
ion systematic errors (+5 ppm and -0.25 Da respectively) and standard deviations (4 ppm and 0.3 Da). The proteins
in the sample were heavily skewed towards the cytosolic proteins and the added human sequence standard proteins (Sigma UPS). The
lack of the other 20 fractions made it impossible to draw any conclusions about the relative observability of
the added UPS proteins (and the ribosomal E. coli protein contaminants in the UPS preparation). It was
very unclear why such a complex, poorly controlled sample/measurement combination was used to test
algorithms and so little information about the true character of the sample was provided to the participating
groups. This hidden complexity resulted in more of an examination of the detective abilities of the groups than a
useful test of the algorithms.
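The systematic errors and standard deviations quoted in this entry are the kind of summary that can be computed from matched observed and theoretical masses. The sketch below assumes the usual definition of a parts-per-million mass error; the function names and the m/z pairs are invented for illustration and are not taken from the iPRG-2011 data.

```python
from statistics import mean, stdev


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def summarize(errors):
    """Return (systematic error, standard deviation) for a list of errors."""
    return mean(errors), stdev(errors)


# Hypothetical (observed, theoretical) parent ion m/z pairs, not real data.
pairs = [(500.0030, 500.0000), (750.0030, 750.0000), (1000.0060, 1000.0000)]
errors = [ppm_error(obs, theo) for obs, theo in pairs]
bias, spread = summarize(errors)

if __name__ == "__main__":
    print(f"systematic error: {bias:+.2f} ppm, standard deviation: {spread:.2f} ppm")
```

Fragment ion errors reported in Da would use the plain difference between observed and theoretical values instead of the ppm ratio, with the same mean and standard deviation summary.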
Data set of the week: (2012/01/01)
Proteomic Analysis of a Pleistocene Mammoth Femur Reveals More than One Hundred Ancient Bone Proteins. Overall rating: ![]() ![]() This data was a truly amazing example of what can be obtained using samples that have
simply sat around outside for 43,000 years. The preservation of the detectable peptides was unexpectedly good.
The experiments were state-of-the-art at all levels and the data should be examined extensively by any
group interested in detecting amino acid polymorphisms associated with evolutionary change. The
analysis in the original paper was correct at the top level (the proteins detected) but was less well done at the level of
amino acid polymorphisms and side chain modifications. There are several more publications' worth of information
in this extraordinary data.
Copyright © 2012, The Global Proteome Machine Organization
|