"These new goals are ambitious, even audacious. But the Human Genome Project has never settled for making safe projections, and that has served us well," said Francis S. Collins, director of the National Human Genome Research Institute (NHGRI) and chair of the advisory council. "When we have looked at the facts, considered the opportunities, and tried to project forward five years, we have always done better than we thought we could. If there was ever a time to spark the imagination of the scientific community and the public, it is now."
The new plan will be presented to the advisory council by Collins, Department of Energy (DOE) genome program director Ari Patrinos, and planning committee chairs Aravinda Chakravarti and Leroy Walters. Besides goals for human genome sequencing, it contains goals for studies of human genetic variation, genomic function, genomic analysis of model organisms, and the ethical, legal, and social implications (ELSI) of genome research, as well as for the development of new technologies. But its boldest language pertains to spelling out the sequence of the 3 billion DNA bases that make up the human genome. Although delivering a complete human genome sequence has always been the project's ultimate and most challenging goal, actual production sequencing of human DNA has begun only recently as technology improvements and cost reductions have made it feasible. Analysis of sequencing strategies for human DNA over the past year's planning process led to the new proposal to deliver a complete, highly accurate human genome sequence by the end of 2003.
"That analysis concluded we could finish one-third of the human genome sequence and have the rest pretty far along in 2001," Collins said. In sequencing terms, the word "finish" refers to stretches of DNA bases that contain no missing letters and that are spelled accurately. The new plan aims to sequence one-third of the human genome in this "finished" form in 2001, focusing on regions known to contain genes or other aspects important to biologists.
Meanwhile, sequencers will also be generating a "working draft" that, together with finished sequence, will cover at least 90% of the genome in 2001. The working draft will be immediately valuable to researchers and form the basis for high-quality, finished genome sequence.
"At full scale, that will put the complete, finished, high-quality human genome sequence within reach in 2003. No one else is doing this," Collins said.
According to Patrinos, DOE associate director for Biological and Environmental Research, "We have as our primary goal the finished 'Book of Life' by the end of 2003. However, we also want the working draft to be as useful as possible."
The federal genome project strategy will start with mapped DNA pieces of known location in the genome, so the assembled sequence will reflect the accurate orientation of DNA in the genome. Another key aspect of the new sequencing plan concerns access to the data. The new plan reiterates the Human Genome Project's position on release of DNA sequence into public databases for free access within 24 hours. The high quality and wide availability of the sequence will be of maximum benefit to researchers studying the molecular basis of human health and disease.
The Human Genome Project officially began in 1990 as a 15-year program to characterize in detail the complete set of genetic instructions of the human and some important laboratory organisms. NHGRI, at the National Institutes of Health, and the Department of Energy carry out the effort. Most of the work is done in university research centers and in national laboratories. From the beginning the project has operated from a set of carefully established but aggressive research goals. The first plan covered fiscal years 1991-1995 and included mainly goals for genetic and physical mapping, computer management of research data, and research on ELSI issues. Rapid progress and technology advances required that a second five-year plan be developed in 1993 to cover research through 1998. To date, all of the proposed goals have been met or exceeded. DNA sequencing goals in the 1993-1998 plan, for example, called for completion of 80 million bases of DNA, mostly from non-human organisms. To date, public databases contain more than three times that amount: 180 million bases of human DNA; 80 million bases of DNA from the roundworm; 14 million from the fruit fly; 12 million from yeast; and 5 million from the bacterium E. coli.
The 1998-2003 plan contains goals for new areas of study, including how variations in human DNA sequence among different populations relate to the development of, or protection from, disease; new technologies and strategies for studying genetic function on a whole-genome scale; and new areas of ELSI research, such as identifying and addressing issues that link genetics to personal identity and racial or ethnic background, and their implications for philosophical and religious traditions.
The year 2003 is also the 50th anniversary of the discovery by James Watson and Francis Crick of the double helix structure of DNA. "There could hardly be a more fitting tribute to this momentous event in biology than the completion of the first human genome sequence in this anniversary year," the plan says. Watson was also the first director of the Human Genome Project at the National Institutes of Health from 1989 to 1992.