Nucleic Acids Res. OUP

Gene Resource Locator

http://grl.gi.k.u-tokyo.ac.jp

Honkura, T.1, Ogasawara, J.2, Yamada, T.1, Morishita, S.1

1Department of Complexity Science and Engineering, Faculty of Frontier Science, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, JAPAN
2Department of Computer Science, Faculty of Information Science and Technology, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, JAPAN

Contact   honkura@gi.k.u-tokyo.ac.jp


Database Description

We analyzed the entire draft human genome sequence by mapping millions of ESTs, and developed a new type of annotation that integrates aligned ESTs from various sources, such as Unigene, BodyMap, dbSNP and full-length cDNA, and presents them through an efficient GUI. We first aligned all the available sequences in the Unigene dataset against the draft human genome and the genomes of several model species such as mouse, and subsequently integrated the alignments to determine the positional clustering of genes and global tissue specificities. To compute the alignments efficiently, we developed optimization techniques for accelerating dynamic programming algorithms. Millions of EST alignments were placed in groups by repeatedly merging pairs of alignments that share common exons, until no more members could be added to any group.

Since the GRL server offers millions of alignments, it is important to use visualization tools that work efficiently, even on low-bandwidth networks connecting WWW data servers and users' browsers. We chose Macromedia Flash because of its widespread acceptance and its capability to design and deliver low-bandwidth dynamic representations. The viewer is implemented in ActionScript (Flash) and PHP. GRL uses a freely available relational database, MySQL, as a back-end. A collection of PHP scripts executes pre-defined SQL queries on the database, manipulates the data and presents it in a variety of ways. The Flash-based viewer delivers dynamic representations in which the locations and structures of millions of ESTs, along with a wide variety of genomic sequence annotations, can be displayed interactively. The Flash-based browser is available through the WWW on the Gene Resource Locator homepage (http://grl.gi.k.u-tokyo.ac.jp).
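The grouping step described above, merging alignments that share common exons until no group can grow further, can be illustrated with a short sketch. This is not the GRL implementation: the data layout (each alignment as a set of exon tuples), the function name `cluster_alignments`, and the rule that two exons are "common" only when their coordinates are identical are all assumptions made for illustration. The sketch uses a union-find structure, which yields the same final groups as repeated pairwise merging.

```python
from collections import defaultdict


def cluster_alignments(alignments):
    """Group alignments that are linked, directly or transitively, by a shared exon.

    `alignments` maps a (hypothetical) alignment id to a set of exons, where each
    exon is a (chromosome, strand, start, end) tuple. Two exons count as shared
    only if the tuples are identical, which is an assumption of this sketch.
    """
    parent = {aid: aid for aid in alignments}

    def find(x):
        # Follow parent links to the group representative, halving paths as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Remember the first alignment seen for each exon; any later alignment
    # containing the same exon is merged into that alignment's group.
    first_seen = {}
    for aid, exons in alignments.items():
        for exon in exons:
            if exon in first_seen:
                union(first_seen[exon], aid)
            else:
                first_seen[exon] = aid

    groups = defaultdict(set)
    for aid in alignments:
        groups[find(aid)].add(aid)
    return sorted(sorted(members) for members in groups.values())


example = {
    "est1": {("chr1", "+", 100, 200), ("chr1", "+", 300, 400)},
    "est2": {("chr1", "+", 300, 400), ("chr1", "+", 500, 600)},
    "est3": {("chr1", "+", 500, 600)},
    "est4": {("chr2", "-", 100, 200)},
}
clusters = cluster_alignments(example)
```

In this made-up example, est1 and est3 share no exon directly but end up in the same group because each shares one with est2; est4 stays in a group of its own.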

Recent Developments

Annotation of the mouse genome is now available. To find the genome location of a sequence, GRL accepts online mapping queries. If you have a genomic, mRNA, or EST sequence but do not know its location in the genome, the online mapping service will rapidly locate it by alignment. A successful alignment returns a list of one or more genome locations that match the input sequence. If more than one alignment exists for the submitted sequence, the system displays the location, the strand, the matching ratio and the number of exons for each alignment. To view one of the alignments in the GRL viewer, click the viewer link. The graphical representation of your sequence will be shown in the "Your sequence" track. Only DNA sequences of more than 40 bases and fewer than 100000 bases will be processed.
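Before submitting a query, it may help to check a sequence against the stated length limits locally. The following is a minimal sketch under assumptions: the function names, the acceptance of an optional FASTA header line, and the allowed alphabet (A, C, G, T, N) are illustrative choices and are not taken from the GRL server. Only the length bounds (more than 40 and fewer than 100000 bases) come from the description above.

```python
MIN_BASES = 40       # queries must be longer than this (exclusive bound)
MAX_BASES = 100000   # queries must be shorter than this (exclusive bound)
ALLOWED = set("ACGTN")  # assumed alphabet; the server's actual rules may differ


def parse_query(text):
    """Drop an optional FASTA header and whitespace; return the upper-case sequence."""
    lines = [line.strip() for line in text.strip().splitlines()]
    return "".join(line for line in lines if line and not line.startswith(">")).upper()


def is_acceptable(text):
    """Return True if the query looks like DNA and satisfies the stated length limits."""
    seq = parse_query(text)
    if not seq or set(seq) - ALLOWED:
        return False
    return MIN_BASES < len(seq) < MAX_BASES


short_ok = is_acceptable("ACGT" * 10)              # exactly 40 bases
long_enough = is_acceptable(">query\n" + "ACGT" * 11)  # 44 bases with a header
```

Here a query of exactly 40 bases is rejected, because the stated limit is "more than 40", while the 44-base query is accepted.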

Acknowledgements

Special thanks go to Jun Sese for helping us build the GRL server. We are also grateful to Prof. Kousaku Okubo at Osaka Univ., Prof. Sumio Sugano at IMS, Univ. Tokyo, and James Kent at UCSC for their stimulating input. The Gene Resource Locator is supported by a Grant-in-Aid for Scientific Research on Priority Areas (Grant #12208003) from the Ministry of Education, Science, and Culture, Japan.


Category   Gene Identification and Structure

Go to the abstract in the NAR 2002 Database Issue.

 
