Nucleic Acids Res. OUP

HGVbase

http://hgvbase.cgb.ki.se

Fredman, D. (1), Lehvaslaiho, H. (2), Rios, D. (1), Munns, G. (1), Siegfried, M. (1), Yuan, Y.P. (3), Bork, P. (3), Brookes, A.J. (1)

(1) Center for Genomics and Bioinformatics, Karolinska Institute, Sweden
(2) European Bioinformatics Institute, United Kingdom
(3) European Molecular Biology Laboratory, Germany

Contact   david.fredman@cgb.ki.se


Database Description

HGVbase (http://hgvbase.cgb.ki.se) was set up in 1998 to improve the analysis of common human genetic polymorphism by maintaining a high-quality database of all publicly available genomic variation data (primarily SNPs). Currently maintained by an academic European consortium, it contains close to 2 million non-redundant entries. Data are harvested from three main sources: (i) collaboration with other variation databases, (ii) extraction from the scientific literature and (iii) submissions from various research groups. Bi-directional data exchange with dbSNP (the SNP repository at the NCBI) was established in 2000. Collaboration with the Human Genome Variation Society (HGVS) has led to HGVbase becoming a central repository for clinical mutations and associated disease phenotypes of interest. Automated and manual data checking ensures internal consistency and addresses errors often present in the original source information. All markers are assigned a position in a reference EMBL or GenBank sequence and have been mapped to the human draft genome. These positions are verified and updated in every new build of HGVbase. Each variant is presented in the context of detailed genome annotation (coding regions and splice sites), as well as repeat regions and genome duplication information. Genotypes, haplotypes and allele frequencies are included as they become available. Work performed to increase the utility of SNP information includes the provision of genotyping assays for every SNP and functional predictions for non-synonymous SNPs. A prototype system for handling phenotypes has recently been created; it is based on a flexible framework that relies on user-provided information rather than a phenotype ontology. Online search tools allow the data to be queried by keyword, sequence and chromosomal location. Core HGVbase data are also represented within the Ensembl project (www.ensembl.org) as sequence features, with links to the HGVbase database for full information.
Downloads of HGVbase are available in a variety of formats (XML, FASTA, SRS, MySQL dumps and tagged-text files). Current developments include graphical tools for data interrogation and further genome annotation through the representation of promoter and synteny regions. A full database description is provided online, and accompanying web pages summarize SNP-related meetings, genotyping methods, SNP databases and analysis software.
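
As an illustration of working with one of the download formats listed above, the following is a minimal Python sketch of reading FASTA-formatted text. It relies only on the general FASTA convention (a header line starting with ">" followed by one or more sequence lines). The layout of the header lines in the actual HGVbase FASTA files is not described here, so the sketch treats each header as opaque text. The function name `read_fasta` and the two example records are hypothetical and are not taken from HGVbase.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text.

    The header is returned without the leading '>' and is not parsed
    further, because the HGVbase header layout is an unknown here.
    """
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; collect and join them.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Made-up records for illustration only (not real HGVbase entries).
example = io.StringIO(
    ">record_A hypothetical header\n"
    "ACGTAC\n"
    "GTTA\n"
    ">record_B\n"
    "TTGACA\n"
)
records = list(read_fasta(example))
for header, sequence in records:
    print(header, len(sequence))
```

For a downloaded file, the same function could be applied to an open file handle instead of the in-memory example.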

Recent Developments

· Inclusion of genotype and haplotype information
· Publicly available tools for processing of haplotypes
· Prototype system for handling phenotypes
· Functional predictions for nsSNPs
· Improved genome annotation (genome duplication, genes, etc.)

Acknowledgements

We thank Interactiva GmbH (Germany) for support during early development of the database and for transferring the project to the public domain.


Category   Mutation Databases

Go to the abstract in the NAR 2002 Database Issue.

 
