March 23, 2009

DNA Terrain Affects Function in Human Genome

[Image: colorized image of a DNA molecule resembling an irregular chain of mountain peaks.] A scanning probe image of a single DNA molecule shows its surface topography. Image by M. Antognozzi and M. Szczelkun. All rights reserved by Wellcome Images.

Researchers have developed a novel method for detecting functional regions of the human genome: examining its three-dimensional (3D) structure. The new approach will help researchers use genomic information to improve human health.

In 2003, the Human Genome Project mapped the linear order of the 3 billion human DNA base pairs, a sequence told in an alphabet of only 4 DNA bases. This information has helped researchers identify countless important genes, which are the parts of the genome that code for proteins. The human genome contains an estimated 20,000 to 25,000 protein-coding genes, far fewer than initially predicted. In fact, the protein-coding genes make up only a small fraction of the human genome, about 1.5-2%. Little is known about the remaining 98%.
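To put those percentages in terms of base pairs, the arithmetic can be sketched as follows. This is only a back-of-the-envelope illustration using the rounded figures quoted above (3 billion base pairs, 1.5-2% protein-coding). The names `GENOME_BP` and `coding_bp` are made up for this example, and integer arithmetic is used so the results are exact.

```python
# Rounded genome size quoted in the text (an approximation, not a measured value).
GENOME_BP = 3_000_000_000


def coding_bp(genome_bp, percent_tenths):
    """Base pairs covered by a share given in tenths of a percent (15 means 1.5%)."""
    return genome_bp * percent_tenths // 1000


low = coding_bp(GENOME_BP, 15)   # 1.5% of the genome
high = coding_bp(GENOME_BP, 20)  # 2.0% of the genome

print(f"{low:,} to {high:,} protein-coding base pairs")
print(f"{GENOME_BP - high:,} to {GENOME_BP - low:,} remaining base pairs")
```

Under these rounded figures, roughly 45 to 60 million base pairs code for proteins, leaving more than 2.9 billion base pairs whose function is largely unknown.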

A team led by Dr. Elliott Margulies of NIH's National Human Genome Research Institute (NHGRI) and Dr. Thomas Tullius of Boston University hypothesized that the 3D structure of DNA could contain important clues to the function of these non-coding regions. However, predicting the shape of a DNA molecule based on its sequence is tricky. Similar DNA sequences can have different 3D structures, while different sequences can have comparable structures.

In a paper published in the online edition of Science on March 12, 2009, the scientists described an innovative approach for detecting the topography of DNA (the grooves and turns that make up its 3D structure) based on chemical and computer analyses. The researchers compared DNA structural information from the human genome with that from 36 mammalian species, including the mouse, chimpanzee, elephant and rabbit. The genomic information was acquired from the Encyclopedia of DNA Elements, or ENCODE, project. ENCODE is a public research consortium spearheaded and funded by NHGRI to identify functional elements in the human genome.

The team discovered that the structure of about 12% of non-coding DNA in the human genome is similar across multiple species, double the amount of similarity detected by comparing sequences. Topographical features that are preserved across species, like sequence similarities, are likely to play important roles in development, health and disease.

The researchers next explored whether variations in the DNA sequence of non-coding regions are likely to cause structural changes that lead to disease. They gathered 734 non-coding single-nucleotide changes associated with disease and compared the structural changes they caused with those caused by harmless base pair variations. The disease-related variations tended to produce larger changes in DNA structure than those not linked to disease.
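The kind of comparison described above can be sketched in a few lines. This is a minimal illustration, not the researchers' actual method: it assumes each DNA variant has already been turned into a numeric per-position "shape profile" by some structure-prediction step that is not shown here. The function names and the toy numbers are invented purely to exercise the code and are not real measurements.

```python
from statistics import median


def structural_change(ref_profile, var_profile):
    """Sum of absolute per-position differences between two shape profiles."""
    if len(ref_profile) != len(var_profile):
        raise ValueError("profiles must have the same length")
    return sum(abs(r - v) for r, v in zip(ref_profile, var_profile))


def median_change(pairs):
    """Median structural change over (reference, variant) profile pairs."""
    return median(structural_change(ref, var) for ref, var in pairs)


# Placeholder numbers, invented only to run the functions; NOT real data.
toy_group_a = [([1.0, 2.0, 3.0], [1.0, 4.0, 3.0]), ([2.0, 2.0], [0.0, 2.0])]
toy_group_b = [([1.0, 2.0, 3.0], [1.0, 2.5, 3.0]), ([2.0, 2.0], [2.0, 2.0])]

print(median_change(toy_group_a), median_change(toy_group_b))
```

With real profiles, one group would hold the disease-associated changes and the other the harmless variations. A summary statistic such as the median is one simple way to ask whether one group tends to shift the predicted shape more than the other.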

"We often think of DNA as a string of letters on a computer screen and forget that this string of letters is a three-dimensional molecule. But shape really matters," Margulies said. "Proteins that influence biological function by binding to DNA recognize more than just the sequence of bases. These binding proteins also see the surface of the DNA molecule and are looking for a shape that allows a lock-and-key fit."

Coupled with continued innovations in DNA sequencing, this new approach will speed researchers' efforts to identify functional elements in the human genome and understand how they affect human health.
