The GPM wiki site opening (2007/11/07)

As an experiment in how to most effectively annotate proteomics data, the GPM now has a dedicated wiki system integrated into its user interface. The wiki can also be accessed directly. Currently, the GPM interface is linked to the wiki at the level of GPM accession number, protein accession number and individual peptide sequence.

Sequence updates (2007/11/03)

The sequences used for human and mouse have been updated to ENSEMBL v.47. Mouse now uses NCBI m37, the most recent version of the mouse genome. The listings of single-nucleotide-induced amino acid polymorphisms have also been updated to reflect the changes in these new sequence collections. The sequences used for rice have also been updated to use OSA1r5 from the J. Craig Venter Institute (JCVI, the new name for TIGR). Changes in the way that JCVI refers to sequences have led to a change in the style of accession numbers being used for rice: rather than the feature index, the locus accession is now being used.

Temporary service interruption (2007/10/26)

Between approximately 11:00 and 13:00 PDT on Oct. 26th, a number of the GPM search servers will be unavailable. This interruption is necessary to perform some much-needed systems maintenance.

The "Global" in Global Proteome Machine (2007/10/17)

In order to better understand how the GPM system is being used, we have begun to use Google Analytics to generate statistics on the use of GPM-generated searches and GPMDB database information retrieval. We will make this information available on a monthly basis. The first month's data is available as a PDF file. The current report shows which locations in the world are using GPM and approximately how many pages are being downloaded per user visit.

New libraries for X! Hunter (2007/10/11)

The annotated spectrum libraries for X! Hunter have been updated, with a significant expansion of sequence coverage for most species (see the new statistics here). Libraries for Pan troglodytes (chimp) and Felis catus (house cat) have been added to the eukaryote species collections.

New species added to X! Hunter (2007/09/02)

The X! Hunter Annotated Spectrum Libraries have been updated to include a number of prokaryote species, based on new data submitted to GPMDB. The following species are now available for high-speed searching:

  1. Deinococcus radiodurans
  2. Escherichia coli
  3. Halobacterium sp.
  4. Mycobacterium smegmatis
  5. Mycobacterium tuberculosis
  6. Salmonella enterica
  7. Salmonella typhi
  8. Salmonella typhimurium
  9. Shewanella oneidensis
  10. Streptococcus pyogenes

Opening of the GPMDB MS/MS repository (2007/09/01)

The GPM Database has become the largest source of publicly accessible data through the donation of data from laboratories around the world. In an effort to make that service more comprehensive, we have added a new feature to the public GPM sites that can create a highly compressed version of all of the original MS/MS data files submitted for analysis. If the results will be made available in GPMDB, the compressed MS/MS data file will now be archived and made available using the CMN 1.0 data format. The total contents of the archive will be available at

These files will be named in the same manner as GPM data models. For example, the data model accession number "GPM00300001111" will have a model file named "GPM00300001111.xml" and an archived data file named "GPM00300001111.cmn". This archive is organized into separate folders, corresponding to the first three numbers in the GPM accession number.
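
The naming scheme above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: that "the first three numbers" means the three digits immediately after the "GPM" prefix, and that the folders sit directly under an archive root. The root name "archive" and the function name are hypothetical, since the archive location is not given here.

```python
import re
from pathlib import PurePosixPath

# Assumption: an accession is "GPM" followed by digits, and the folder
# name is the first three of those digits.
ACCESSION_RE = re.compile(r"^GPM(\d{3})\d+$")


def archive_paths(accession, root="archive"):
    """Return (model_file, data_file) paths for a GPM accession number.

    `root` is a hypothetical placeholder for the archive location.
    """
    match = ACCESSION_RE.match(accession)
    if match is None:
        raise ValueError(f"not a GPM accession number: {accession!r}")
    folder = PurePosixPath(root) / match.group(1)
    return folder / f"{accession}.xml", folder / f"{accession}.cmn"


model_file, data_file = archive_paths("GPM00300001111")
print(model_file)
print(data_file)
```

Under these assumptions, the model and its archived spectra for "GPM00300001111" would both fall in a folder named "003".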

New release of X! series search engines (2007/07/01)

The X! series search engines (Tandem, P3 and Hunter) have been updated to include compatibility with some variants of mzXML and mzData spectrum input files, which use 64-bit floating point numbers for fragment ion mass and intensity information. X! Hunter has also been updated to include a new format (see the definition) for the input annotated spectrum libraries that is a suggested standard format for the exchange of this type of information.
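
To illustrate what 64-bit peak data involves, here is a minimal Python sketch of decoding a base64-encoded peak list. It is not the code used in the X! series engines. It assumes an mzXML-style layout, with interleaved m/z and intensity values in big-endian byte order. mzData files store the two arrays separately and declare their own byte order, so the `byte_order` argument would need to follow the file. The function names are hypothetical.

```python
import base64
import struct

# struct format codes for the two precisions found in these files
_FORMAT_CODES = {32: "f", 64: "d"}


def decode_peaks(encoded, precision=64, byte_order=">"):
    """Decode base64 text into a list of (m/z, intensity) pairs."""
    raw = base64.b64decode(encoded)
    width = precision // 8
    if len(raw) % (2 * width) != 0:
        raise ValueError("byte count does not match whole m/z-intensity pairs")
    count = len(raw) // width
    values = struct.unpack(f"{byte_order}{count}{_FORMAT_CODES[precision]}", raw)
    # Even positions hold m/z values, odd positions hold intensities.
    return list(zip(values[0::2], values[1::2]))


def encode_peaks(peaks, precision=64, byte_order=">"):
    """Inverse of decode_peaks, used here only to build test input."""
    flat = [value for pair in peaks for value in pair]
    raw = struct.pack(f"{byte_order}{len(flat)}{_FORMAT_CODES[precision]}", *flat)
    return base64.b64encode(raw).decode("ascii")


peaks = [(500.25, 1200.0), (501.5, 80.5)]
assert decode_peaks(encode_peaks(peaks, precision=64), precision=64) == peaks
```

Reading a 64-bit peak list with `precision=32` would not raise an error when the byte count happens to fit, but it would produce meaningless values, which is why a reader has to honor the precision declared in the file.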

GPM Adopts Cell and Tissue Ontologies (2007/06/28)

In an effort to increase the utility of the GPM and GPMDB, the public sites have been updated to include an interface allowing researchers to include more information about their experiments. This information is organized around current "ontology" projects, which supply standard lists of relevant biological terms linked to accession numbers. The ontologies were chosen to provide as much consistency as possible between GPMDB and PRIDE.

  1. Gene Ontology (GO): the GO Slim list of terms associated with cellular localization;
  2. Cell Type Ontology (CELL): a fairly comprehensive collection of eukaryote cell types; and
  3. BRENDA Tissue Ontology (BRENDA): the BRENDA tissue list has been broken down into cell lines and tissues normally found in an organism.

New X! Hunter ASLs released (2007/05/20)

The 2007.05.15 version of the GPM Annotated Spectrum Libraries for X! Hunter is now available for download from the GPM FTP site. The new library was compiled using a new curation process that was designed to reduce the number of potential false positive entries in the library. The list of allowed sequence modifications was expanded to include

  1. ICAT (both classic and cleavable);
  2. iTRAQ;
  3. S/T/Y phosphorylation; and
  4. Q/N deamidation

The new libraries also include HLF X! Hunter files, MGF spectrum files and FASTA peptide files for use in bioinformatics research.

Milestone reached (2007/05/09)

GPMDB added its 25,000,000th peptide identification over the weekend. We would like to thank all of the individual data contributors, as well as the team at the PeptideAtlas repository, for making this possible.

System outage (2007/04/30)

GPMDB will be unavailable for several hours on the afternoon of April 30, 2007 for system maintenance.

System updates (2007/04/15)

A number of updates/upgrades have been performed on the overall GPM system.

  1. The human, mouse and rat proteomes have been updated to the latest version from ENSEMBL (v. 43).
  2. The 2007.04.01 versions of X! Tandem and P3 have been deployed. This release adds the capability of checking for known single amino acid polymorphisms (SAPs). The known annotations are based on the dbSNP and ENSEMBL SNP databases for coding, non-synonymous SNPs. The annotation files are available from the GPM FTP site. This capability has been made the default behavior for searching human, mouse and rat ENSEMBL proteomes.
  3. The frog and fish boutique sites have been moved to 8-core computer platforms.
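
The SAP checking mentioned in item 2 can be pictured with a small Python sketch. It is not how X! Tandem or P3 implement the feature. It only shows the general idea of expanding a peptide into the variant sequences implied by a table of known substitutions. The function name, the table layout (protein position mapped to alternative residues) and the example values are all hypothetical.

```python
def sap_variants(peptide, peptide_start, known_saps):
    """List single-substitution variants of a peptide.

    peptide_start: 1-based position of the peptide's first residue
                   in the parent protein.
    known_saps:    dict mapping 1-based protein positions to a set of
                   alternative residues (hypothetical annotation layout).
    Returns (position, reference, alternative, variant_sequence) tuples.
    """
    variants = []
    for offset, residue in enumerate(peptide):
        position = peptide_start + offset
        for alternative in sorted(known_saps.get(position, ())):
            if alternative == residue:
                continue  # same as the reference residue, nothing to test
            variant = peptide[:offset] + alternative + peptide[offset + 1:]
            variants.append((position, residue, alternative, variant))
    return variants


# Made-up annotations for a made-up protein
saps = {12: {"L"}, 15: {"D", "N"}}
for entry in sap_variants("PEPTIDEK", 10, saps):
    print(entry)
```

Each variant sequence could then be scored against a spectrum in the same way as the reference peptide. Restricting the substitutions to annotated positions keeps the number of extra candidates small.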

New equipment for boutique proteomes (2007/03/10)

The servers being used for the cow, mouse, rat, plant, and prokaryote boutique sequence sites have been upgraded to the same type of dual quad-core processor computers as the new human site. The new servers are a generous gift of the Biomedical Research Centre at the University of British Columbia. We'd like to thank John Wilkins' group at the University of Manitoba, who donated the equipment to host these sites for the last two years.

Human Invitational proteome updated (2007/03/08)

The Human Invitational Database is a collection of highly curated RNA sequences meant to track the existence of splice variants and unanticipated translations of human genes. We have always made this sequence collection available through the human boutique search server and have updated this set of sequences to version H-InvDB_3.8.

Cat and Guinea pig proteomes added (2007/03/08)

The predicted sequences of the cat (Felis catus) and Guinea pig (Cavia porcellus) proteomes have been added to the main servers of the GPM. These sequences were obtained from the ENSEMBL CAT build 43.1 and ENSEMBL cavPor2 build 43.1. These are low-coverage 2X assemblies, so the underlying gene models are expected to change with time. For comparison, these proteomes contain approximately 13,000 more protein sequences for each species than are available in NCBI's nr.

Rabbit proteome added (2007/03/07)

The predicted sequences of the rabbit (Oryctolagus cuniculus) proteome have been added to the main servers of the GPM. These sequences were obtained from the ENSEMBL RABBIT build 43.1b. This is a low-coverage 2X assembly, so the underlying gene models are expected to change with time. For comparison, this proteome contains approximately 10,000 more rabbit protein sequences than are available in NCBI's nr.

Equipment upgrade (2007/02/28)

On Saturday (2007/02/24) we upgraded the search servers human, h066, and h112 to dual-processor, quad-core computers (Intel XEON E5345, 2.2 GHz), making searches about three times faster than on the fastest of the other computers we have in the system. Several more of these relatively high-speed computers have been ordered and they should be installed within a few weeks.

An error in one of the configuration files on the new "human" server has caused any searches performed on that server using the IPI, SWISS-PROT, UNIGENE or HIT sequence sets to be incomplete: X! Tandem was unable to access the appropriate sequence files. The problem has been corrected, but any searches performed on these sequence sets since Saturday should be repeated. The same problem affected all searches performed using X! Hunter (the "Feeling lucky" button).

New versions of X! series search engines released (2007/01/31)

New versions of X! Tandem, P3 and Hunter are now available at the GPM FTP site. This release fixes a few small issues associated with operating system compatibility, incorporates some new information generated from the data in GPMDB, and adds some new information to the output data files that can be used for quality control purposes. It also includes compiled versions for Mac OS X 10.4 on Intel-based Macs.

