NIH News Release
National Cancer Institute

Tuesday, August 10, 1999

Contact: NCI Press Office
(301) 496-6641

NCI's Tumor Gene Index Reaches Its Two-Year Mark

Two years ago this month, the National Cancer Institute (NCI) and Vice President Al Gore publicly launched a historic initiative to compile on the Internet the first comprehensive record, or index, of genes involved in human cancer. The project promised to lay the groundwork for scientists to define tumors based on their unique molecular features, information that could revolutionize the diagnosis and treatment of cancer.

Today, the NCI announced that the initiative, known as the Tumor Gene Index (TGI), is living up to its promise. According to Robert Strausberg, Ph.D., an NCI scientist who oversees the project, TGI already has discovered nearly 30,000 new human genes, making it the leading source of new gene discovery in the world today. Human DNA contains an estimated 100,000 genes, of which over 73,000 have been discovered.

Strausberg said TGI has catalogued over 66,400 genes in its first two years, including both new and previously identified genes. In total, over 40,500 of them are active, directly or indirectly, in one or more cancers. The project has studied 44 tissues to date.

"The Tumor Gene Index is still far from complete," said Richard Klausner, M.D., NCI director. "But already, it is difficult to think of another project that in such a short period of time has generated so much useful, publicly available data to benefit cancer research and ultimately people with cancer."

TGI is the first in a series of initiatives under NCI's Cancer Genome Anatomy Project (CGAP), a program to develop publicly available databases and technologies that assist scientists in deciphering the molecular anatomy of the cancer cell. "Just as anatomists have defined the human body, CGAP seeks to define for the first time the molecules that are present in cancer cells and make them accessible to scientists," said Strausberg.

The project builds on the understanding that genes encode the instructions to produce proteins, which comprise the bulk of a cell's molecular anatomy. By measuring the expression patterns of genes in tumor cells, scientists can generate a roll call of which proteins are produced and at which levels.

But TGI added another important element. For each tissue that the project planned to study, it would record gene expression levels in normal, precancerous, and cancer cells. Strausberg said that by monitoring various distinct cell types in any given tissue, the project would have multiple points of comparison to track the changes in gene expression over time that drive tumor development.

To create the index, Strausberg said the project initially turned to the tried-and-true strategy of creating cDNA libraries. cDNAs are snippets of DNA that scientists synthesize from gene transcripts, or copies of expressed genes, that are present en masse within the cell nucleus. By creating cDNAs from the transcripts and arranging them into ordered clone libraries, scientists can track which of the genome's estimated 100,000 genes are active in a given cell.

As reported today, TGI already has produced a total of 142 cDNA libraries. Of these libraries, 38 were created from normal cells, 11 from precancerous cells, and 91 from cancer cells.

TGI also has submitted over 650,000 gene transcripts to EST databases, online storehouses of known gene transcripts, over the last two years. This makes the project the leading contributor to EST databases in the world today, accounting for just over half of all recorded gene transcripts. Strausberg said TGI would likely top the million mark in the next year.

Strausberg also stressed that TGI has made a major push to develop strong informatic tools on the CGAP Web site to assist scientists with their studies. These include its Tumor Suppressor and Oncogene directory, a variety of gene expression tools, such as "Differential Digital Display," an online tool to compare computed gene expression, and links to other biology Web sites.

"The TGI Web site has been designed not only to list gene names, but to display these names in the context of tumor biology," said Strausberg. "These added, valuable informatic tools provide an integrated package for accessing data and performing cancer experiments."

Strausberg said TGI will continue to build its gene expression index, including the generation of cDNA libraries from strains of mice commonly used in cancer research. In the meantime, NCI has begun soliciting applications from scientists to develop viable strategies and technologies to apply information in the TGI database toward a molecular classification of tumors. "This initiative would mark a giant step forward in our understanding of the molecular causes of cancer," said Klausner. "But, more importantly, it will lead to improved strategies for cancer prevention, early detection, diagnosis, and ultimately treatment."

TGI is a partnership among NCI, academic centers, and private companies. These partners include the National Institute of Dental and Craniofacial Research, the National Institute of Neurological Disorders and Stroke, and the National Institute of Allergy and Infectious Diseases, all at the National Institutes of Health; the National Center for Biotechnology Information; Lawrence Livermore National Laboratory; Washington University Genome Sequencing Center; the University of Iowa Hospitals and Clinics Department of Pediatrics; Bristol-Myers Squibb; Genentech; Glaxo Wellcome; and Merck & Co.

The CGAP Web site can be accessed at

For more information about cancer, visit NCI's Web site for patients, the public, and the mass media at