Nucleic Acids Res. OUP

Gene Ontology Annotation (GOA)

http://www.ebi.ac.uk/GOA

Camon, E.B., Barrell, D., Binns, D., Fleischmann, W., Kersey, P., Magrane, M., Maslen, J., Mulder, N., Harte, N., Apweiler, R.

European Bioinformatics Institute (EBI) Wellcome Trust Genome Campus Hinxton, Cambridge, CB10 1SD, UK

Contact   camon@ebi.ac.uk


Database Description

The Gene Ontology Annotation (GOA) database is maintained by the European Bioinformatics Institute (EBI) and aims to provide assignments of gene products to the Gene Ontology (GO) resource, a dynamic controlled vocabulary that encourages the consistent description of a protein's function, process and location of action. In the GOA project, this vocabulary is being applied to a non-redundant set of proteins described in the EBI's core genome and proteome databases (SWISS-PROT, TrEMBL, InterPro and Ensembl), which collectively provide complete and incomplete proteomes for human (GOA-Human) and other organisms (GOA-SPTR). GOA is produced by electronic and manual techniques and is updated monthly, in accordance with the latest data released by its core databases. The production of GOA-Human is in keeping with our GO Consortium agreement to fast-track the functional annotation of the human proteome and with our Human Proteomics Initiative. The recent release of all our annotation (for >45,000 species) via GOA-SPTR has made the EBI one of the largest resources of GO annotation, providing over 2.4 million GO associations across 522,854 SWISS-PROT and TrEMBL entries. The GOA project also maintains and shares mappings between GO terms and SWISS-PROT keywords and InterPro entries. These mappings provide a useful resource of functional information for the analysis of microarray and mass spectrometry data. In addition, the project ensures that at least some of the existing corpus of scientific knowledge in our core databases is converted into a computationally accessible form. By annotating all characterized proteins with GO terms and facilitating the transfer of this knowledge to similar uncharacterized proteins, we hope to contribute to a better understanding of all proteomes. For further information about access to GOA data, please refer to our home page, http://www.ebi.ac.uk/GOA, or contact us by e-mail at goa@ebi.ac.uk.
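
As a rough illustration of how keyword-to-GO mappings of the kind described above can be used for electronic annotation, the following Python sketch assigns GO terms to protein entries from their keywords and reports the fraction of entries that receive at least one term. It is a minimal sketch under stated assumptions, not the GOA production pipeline: the small mapping table is hard-coded for illustration, the entry accessions and keyword lists are hypothetical, and the function names `annotate` and `coverage` are invented here.

```python
from collections import defaultdict

# Illustrative keyword-to-GO table (an assumption for this sketch); a real
# analysis would load the mapping file distributed by the GOA project.
KEYWORD_TO_GO = {
    "ATP-binding": ["GO:0005524"],  # ATP binding
    "Kinase": ["GO:0016301"],       # kinase activity
    "DNA-binding": ["GO:0003677"],  # DNA binding
}


def annotate(entries, keyword_to_go):
    """Return {accession: sorted GO ids} for entries with a mapped keyword."""
    associations = defaultdict(set)
    for accession, keywords in entries.items():
        for keyword in keywords:
            # Keywords without a mapping contribute no GO terms.
            for go_id in keyword_to_go.get(keyword, []):
                associations[accession].add(go_id)
    return {acc: sorted(ids) for acc, ids in associations.items()}


def coverage(entries, associations):
    """Fraction of entries that received at least one GO term."""
    return len(associations) / len(entries) if entries else 0.0


# Hypothetical entries: accessions and keyword lists are made up.
ENTRIES = {
    "ENTRY_A": ["ATP-binding", "Kinase", "Transferase"],
    "ENTRY_B": ["DNA-binding"],
    "ENTRY_C": ["Hypothetical protein"],
}

ASSOCIATIONS = annotate(ENTRIES, KEYWORD_TO_GO)
print(ASSOCIATIONS)
print(f"GO coverage: {coverage(ENTRIES, ASSOCIATIONS):.0%}")
```

The same pattern would apply to InterPro-based mappings by swapping the keyword lists for InterPro entry identifiers; entries left without any GO term are the ones that lower the coverage figure.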

Recent Developments

At the outset of the GOA project, only SWISS-PROT and InterPro curators at the EBI participated. With the recent announcement of the UniProt Consortium, the existing collaboration between SWISS-PROT at the EBI and the Swiss Institute of Bioinformatics (SIB), Geneva, has been extended to encompass the Protein Information Resource (PIR), USA. With more curators contributing GO annotation, we aim to increase our GO coverage of SWISS-PROT and TrEMBL from 56% to 70% by 2004. Plans to integrate manual GO assignments made by other GO Consortium groups, e.g. FlyBase, will further enhance the GOA dataset. In early 2003 we plan to include our GO annotation directly in an XML version of the TrEMBL database and to provide a cross-reference to GOA from EMBL-Bank (which contains the nucleotide sequences of EMBL/DDBJ/GenBank). GOA project data are accessible from numerous databases, including Ensembl, which shows the chromosome location of human genes mapped to a particular GO term.

Acknowledgements

We are very grateful to the GO Consortium and, in particular, to the SWISS-PROT and GO curators and programmers at the EBI. This work was supported by the European Commission and the US National Institutes of Health.

Category   Varied Biomedical Content

 
