The GPM wiki site opening (2007/11/07)
As an experiment in how to most effectively annotate proteomics data,
the GPM now has a dedicated wiki system integrated into its user interface. This wiki can also be accessed through
wiki.thegpm.org. Currently, the GPM interface
is linked to the wiki on the level of GPM accession number, protein accession number
and individual peptide sequences.
Sequence updates (2007/11/03)
The sequences used for human and mouse have been updated to ENSEMBL v.47. Mouse
now uses NCBI m37, the most recent version of the mouse genome. The single nucleotide
induced amino acid polymorphism listings have also been updated to reflect the
changes in these new sequence collections. The sequences used for rice have also
been updated, to use OSA1r5 from the J. Craig Venter Institute (JCVI, the new name for TIGR).
Changes in the way that JCVI refers to sequences have led to a change in the style of
accession numbers being used for rice: rather than the feature index, the locus accession
is now being used.
Temporary service interruption (2007/10/26)
Between approximately 11:00 and 13:00 PDT on Oct. 26th, a number of the
GPM search servers will be unavailable. This interruption is necessary to perform
some much-needed system maintenance.
The "Global" in Global Proteome Machine (2007/10/17)
In order to better understand how the GPM system is being used, we have begun to use
Google Analytics to generate statistics on
the use of GPM-generated searches and GPMDB database information retrieval. We will make this
information available on a monthly basis. The first month's data is available as a
PDF file. The current report shows what locations
in the world are using GPM and approximately how many pages are being downloaded per user visit.
New libraries for X! Hunter (2007/10/11)
The annotated spectrum libraries for X! Hunter have been updated, with
a significant expansion of sequence coverage for most species (see the new
statistics here). Libraries for P. troglodytes (chimp)
and Felis catus (house cat) have been added to the eukaryote species collections.
New species added to X! Hunter (2007/09/02)
The X! Hunter Annotated Spectrum Libraries have been updated to include
a number of prokaryote species, based on new data submitted to GPMDB. The following
species are now available for high-speed searching:
- Deinococcus radiodurans
- Escherichia coli
- Halobacterium sp.
- Mycobacterium smegmatis
- Mycobacterium tuberculosis
- Salmonella enterica
- Salmonella typhi
- Salmonella typhimurium
- Shewanella oneidensis
- Streptococcus pyogenes
Opening of the GPMDB MS/MS repository (2007/09/01)
The GPM Database has become the largest source of publicly accessible data through
the donation of data from laboratories from around the world. In an effort to make that
service more comprehensive, we have added a new feature to the public GPM sites that
can create a highly compressed version of all of the original MS/MS data files submitted for analysis.
If the results will be made available in GPMDB, the compressed MS/MS data file will
now be archived and made available using the CMN 1.0 data format.
The total contents of the archive will be available at ftp://ftp.thegpm.org/data/msms.
These files will be named in the same manner as GPM data models, for example the data
model accession number "GPM00300001111" will have a model file named "GPM00300001111.xml"
and an archived data file named "GPM00300001111.cmn".
This archive is organized into separate folders, corresponding to the first three numbers in the GPM accession number.
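The naming scheme above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the folder name is assumed to be the three digits that follow the "GPM" prefix (for example "003"), which the announcement does not spell out.

```python
import re

# Archive location given in the announcement.
ARCHIVE_ROOT = "ftp://ftp.thegpm.org/data/msms"


def archive_names(accession):
    """Return (folder, model_file, data_file) for a GPM accession number."""
    match = re.fullmatch(r"GPM(\d{3})(\d+)", accession)
    if match is None:
        raise ValueError(f"not a GPM accession number: {accession!r}")
    # Assumption: the folder is named after the first three digits.
    folder = match.group(1)
    return folder, f"{accession}.xml", f"{accession}.cmn"


def archive_url(accession):
    """Build the assumed location of the archived CMN data file."""
    folder, _, data_file = archive_names(accession)
    return f"{ARCHIVE_ROOT}/{folder}/{data_file}"
```

Under these assumptions, `archive_url("GPM00300001111")` points at the `003` folder and the file `GPM00300001111.cmn`.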
New release of X! series search engines (2007/07/01)
The X! series search engines (Tandem, P3 and Hunter) have been updated to include
compatibility with some variants of mzXML and mzData spectrum input files, which use
64-bit floating point numbers for fragment ion mass and intensity information. X! Hunter has
also been updated to include a new format (see the definition)
for the input annotated spectrum libraries that
is a suggested standard format for the exchange of this type of information.
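For readers unfamiliar with the 64-bit variants, the following Python sketch shows how such peak data can be decoded. It is not the X! series code. It assumes the mzXML-style layout: an uncompressed, base64-encoded block of interleaved (m/z, intensity) values in network (big-endian) byte order, with the precision given as 32 or 64. mzData stores its arrays separately with their own endian setting, so it is not covered here. The function names are hypothetical.

```python
import base64
import struct


def decode_peaks(encoded, precision=64):
    """Decode base64 peak data into a list of (m/z, intensity) pairs."""
    if precision not in (32, 64):
        raise ValueError("precision must be 32 or 64")
    raw = base64.b64decode(encoded)
    width = precision // 8  # bytes per value
    if len(raw) % (2 * width) != 0:
        raise ValueError("data is not a whole number of (m/z, intensity) pairs")
    count = len(raw) // width
    code = "d" if precision == 64 else "f"
    # ">" selects big-endian (network) byte order.
    values = struct.unpack(f">{count}{code}", raw)
    return list(zip(values[0::2], values[1::2]))


def encode_peaks(peaks, precision=64):
    """Inverse of decode_peaks, useful for building test data."""
    code = "d" if precision == 64 else "f"
    flat = [value for pair in peaks for value in pair]
    packed = struct.pack(f">{len(flat)}{code}", *flat)
    return base64.b64encode(packed).decode("ascii")
```

The only difference between the 32-bit and 64-bit cases is the width of each value, so a reader that honours the declared precision can handle both.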
GPM Adopts Cell and Tissue Ontologies (2007/06/28)
In an effort to increase the utility of the GPM and GPMDB, the public sites have
been updated to include an interface allowing researchers to include more information
about their experiments. This information is organized around current "ontology"
projects, which supply standard lists of relevant biological terms linked to accession numbers.
The ontologies were chosen to provide as much consistency as possible between GPMDB and PRIDE.
- Gene Ontology (GO):
the GO Slim list of terms associated with cellular localization;
- Cell Type Ontology (CELL):
a fairly comprehensive collection of eukaryote cell types; and
- BRENDA Tissue Ontology (BRENDA):
the BRENDA tissue list has been broken down into cell lines and tissues normally found in an organism.
New X! Hunter ASLs released (2007/05/20)
The 2007.05.15 version of the GPM Annotated Spectrum Libraries for X! Hunter
is now available for download from the GPM FTP site.
The new library was compiled using a new curation process that was designed to reduce
the number of potential false positive entries in the library. The list of allowed
sequence modifications was expanded to include:
- ICAT (both classic and cleavable);
- S/T/Y phosphorylation; and
- Q/N deamidation.
The new libraries also include HLF X! Hunter files, MGF spectrum files and FASTA peptide files
for use in bioinformatics research.
Milestone reached (2007/05/09)
GPMDB added its 25,000,000th peptide identification over
the weekend. We would like to thank all of the individual data contributors, as well as
the team at the PeptideAtlas repository, for making this possible.
System outage (2007/04/30)
GPMDB will be unavailable for several hours on the afternoon of April 30, 2007 for
System updates (2007/04/15)
A number of updates/upgrades have been performed on the overall GPM system.
- The human, mouse and rat proteomes have been updated to the latest version from ENSEMBL (v. 43).
- The 2007.04.01 versions of X! Tandem and P3 have been deployed. This release adds the capability
of checking for known single amino acid polymorphisms (SAPs). The known annotations are
based on the dbSNP and ENSEMBL SNP databases for coding, non-synonymous SNPs. The annotation
files are available from the GPM FTP site. This capability
has been made the default behavior for searching human, mouse and rat ENSEMBL proteomes.
- The frog and fish
boutique sites have been moved to 8-core computer platforms.
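As a rough illustration of what checking for known SAPs involves, the Python sketch below takes a peptide and a list of annotated substitutions and produces the variant peptides that a search could also consider. It is not how X! Tandem or P3 implement the feature. The function name, the tuple layout of the annotations and the example positions are all hypothetical.

```python
def apply_saps(peptide, start, saps):
    """Return (variant_peptide, label) for each known SAP inside the peptide.

    peptide: peptide sequence in one-letter code
    start:   1-based protein position of the peptide's first residue
    saps:    iterable of (protein_position, reference_residue, variant_residue)
    """
    variants = []
    for position, ref, alt in saps:
        index = position - start
        # Skip annotations outside the peptide or whose reference
        # residue does not match the sequence being searched.
        if 0 <= index < len(peptide) and peptide[index] == ref:
            variant = peptide[:index] + alt + peptide[index + 1:]
            variants.append((variant, f"{ref}{position}{alt}"))
    return variants


# Hypothetical annotations for a peptide starting at protein position 66.
known = [(68, "N", "D"), (70, "L", "P"), (10, "A", "G")]
candidates = apply_saps("LVNELTEFAK", 66, known)
```

Restricting the variants to annotated, non-synonymous substitutions keeps the number of extra candidate sequences small compared with trying every possible substitution at every position.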
New equipment for boutique proteomes (2007/03/10)
The servers being used for the cow,
boutique sequence sites have been upgraded to the same type
of dual quad-core processor based computers as the new human site. The new servers
are a generous gift of the Biomedical Research Centre at the University of British
Columbia. We'd like to thank John Wilkins' group at the
University of Manitoba, who donated the equipment to host these sites for
the last two years.
Human Invitational proteome updated (2007/03/08)
The Human Invitational Database is a collection of highly curated RNA sequences
meant to track the existence of splice variants and unanticipated translations of human
genes. We have always made this sequence collection available through the human
boutique search server and have updated this set of sequences to version
Cat and Guinea pig proteomes added (2007/03/08)
The predicted sequences of the cat (Felis catus) and Guinea pig (Cavia porcellus) proteomes
have been added to the main servers of the GPM. These sequences were
obtained from the ENSEMBL CAT build 43.1 and
ENSEMBL cavPor2 build 43.1.
These are low coverage 2X assemblies, so the underlying gene models are
expected to change with time. For comparison, these proteomes contain approximately 13,000 more protein
sequences for each species than are available in NCBI's nr.
Rabbit proteome added (2007/03/07)
The predicted sequences of the rabbit (Oryctolagus cuniculus) proteome
have been added to the main servers of the GPM. These sequences were
obtained from the ENSEMBL RABBIT build 43.1b.
This is a low coverage 2X assembly, so the underlying gene models are
expected to change with time. For comparison, this proteome contains approximately 10,000 more rabbit protein
sequences than are available in NCBI's nr.
Equipment upgrade (2007/02/28)
On Saturday (2007/02/24) we upgraded the search servers human,
h066, and h112 to dual processor (Intel XEON E5345, 2.2 GHz),
quad-core computers, making searches about three times faster than on the next-fastest
computers in the system. Several more of these relatively high-speed computers have
been ordered and they should be installed within a few weeks.
An error in one of the configuration files on the new "human" server has caused any searches
performed on that server using the IPI, SWISS-PROT, UNIGENE or HIT sequence sets to be incomplete: X! Tandem
was unable to access the appropriate sequence files. The problem has been corrected, but any searches performed
on these sequence sets since Saturday should be repeated. The same problem affected all searches performed using
X! Hunter (the "Feeling lucky" button).
New versions of X! series search engines released (2007/01/31)
New versions of X! Tandem, P3 and Hunter are now available at the GPM FTP site.
This release fixes a few small issues associated with operating system compatibility,
incorporates some new information generated from the data in GPMDB, and adds some new information
to the output data files that can be used for quality control purposes. It also includes compiled
versions for Mac OS 10.4 on Intel-based Macs.
Copyright © 2007, The Global Proteome Machine Organization