September 24, 2012

Finding Treasure in “Junk” DNA


A vast consortium of researchers has created a view of the human genome that extends well beyond our genes. The findings suggest at least some function for more than 80% of the genome. In a related study, a systematic analysis linked regulatory regions to disease.

Over 98% of our DNA doesn't code for proteins. While a small portion of this non-coding sequence was previously associated with function, researchers have debated whether the rest of the genome lacked a meaningful role and was thus simply “junk” DNA. A growing body of evidence suggests that these regions can have important biological functions.

NIH's National Human Genome Research Institute (NHGRI) launched the ENCyclopedia Of DNA Elements (ENCODE) consortium in 2003 to identify all functional elements in the human genome sequence. Hundreds of researchers across the world performed more than 1,600 sets of experiments on 147 types of tissue with technologies standardized across the consortium. In a series of new reports in Nature, Science, Cell and other journals, the scientists catalogued numerous aspects of gene regulation that can affect function. These include epigenetic modifications (chemical changes that affect how genes are expressed); proteins and 3-dimensional DNA structures that can affect transcription; and DNA sequences that regulate gene expression.

In related work supported by ENCODE and the NIH Common Fund, a team led by Dr. John Stamatoyannopoulos at the University of Washington focused on disease-associated variants in regulatory regions of DNA. Sites that actively regulate gene expression are known to be sensitive to cleavage by a protein called DNaseI. The researchers created an extensive map of these sites, called DNaseI hypersensitivity sites (DHSs), using several hundred different cell and tissue samples. They then compared these to thousands of variants that have been tied in genome-wide association studies (GWAS) to various diseases and traits. Their report was published on September 7, 2012, in Science.

The team found that over 76% of non-coding GWAS variants were in or very near DHSs. This suggests that the vast majority of non-coding GWAS variants are involved in regulating genes. Most of the variants targeted distant genes rather than the gene closest to the variant. The researchers also found that GWAS variants from similar diseases often disrupted interconnected networks of proteins. These results underscore how complicated it will be to fully understand how gene regulation affects disease.
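As a rough illustration of the kind of overlap tally described above, the following Python sketch counts what fraction of variant positions fall inside, or within a fixed distance of, a set of genomic intervals. The coordinates, the 20-base window, and the function names (`build_index`, `is_near_site`, `fraction_near`) are illustrative assumptions; they are not the study's data, distance threshold, or software.

```python
from bisect import bisect_right

def merge_intervals(intervals):
    """Sort and merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def build_index(sites, window):
    """Pad each site by `window` bases, merge per chromosome, and store
    parallel lists of starts and ends for binary search."""
    by_chrom = {}
    for chrom, start, end in sites:
        padded = (max(0, start - window), end + window)
        by_chrom.setdefault(chrom, []).append(padded)
    index = {}
    for chrom, intervals in by_chrom.items():
        merged = merge_intervals(intervals)
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index

def is_near_site(index, chrom, pos):
    """True if `pos` falls inside any padded interval on `chrom`."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    i = bisect_right(starts, pos) - 1  # last interval starting at or before pos
    return i >= 0 and pos < ends[i]

def fraction_near(variants, index):
    """Fraction of (chrom, pos) variants that fall in or near a site."""
    if not variants:
        return 0.0
    hits = sum(is_near_site(index, chrom, pos) for chrom, pos in variants)
    return hits / len(variants)

# Toy data (hypothetical coordinates, not taken from the study).
sites = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
variants = [("chr1", 150), ("chr1", 210), ("chr1", 400), ("chr2", 85), ("chr3", 10)]

print(fraction_near(variants, build_index(sites, window=20)))  # 0.6
```

Padding the intervals once and then using a binary search keeps each lookup fast even for many thousands of variants. Setting `window=0` counts only variants that lie strictly inside a site.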

Over 88% of the variants in these regulatory regions are active in fetal development, including variants associated with adult-onset disease. This finding supports the idea that, for some adult-onset diseases, regulatory regions that function during early development can influence risk.

The researchers explored the cell- and tissue-specific patterns of variants in these regions to identify which cell types may be playing a role in various diseases. They identified cell types associated with Crohn's disease and multiple sclerosis that were only recently discovered to play a role in these diseases. The results suggest that this approach can help identify cell types that were previously unknown to contribute to disease.

“These exciting results show how a broad, systematic approach to deciphering regulatory DNA—essentially the genome's operating system—can have major implications for our understanding of the genetic basis of many common diseases and traits,” Stamatoyannopoulos says.

References: Science. 2012 Sep 7;337(6099):1190-5. Epub 2012 Sep 5. PMID: 22955828.
Nature. 2012 Sep 6;489(7414):57-74. doi: 10.1038/nature11247. PMID: 22955616.