News Release

Monday, October 1, 2007

NIH Launches Extensive Open-Access Dataset of Genetic and Clinical Data

Landmark Framingham Heart Study Forms Foundation.

The National Institutes of Health (NIH) — the nation's medical research agency — is launching one of the most extensive collections of genetic and clinical data ever made freely available to researchers worldwide. Called SHARe (SNP Health Association Resource), the Web-based dataset enables qualified researchers to access a wealth of data from large population-based studies, starting with the landmark Framingham Heart Study. Funded by the NIH's National Heart, Lung, and Blood Institute (NHLBI), SHARe will accelerate discoveries linking genes and health, thereby advancing scientists' understanding of the causes and prevention of cardiovascular disease and other disorders.

Framingham SHARe includes data on more than 9,300 participants spanning three generations, including over 900 families, who had their DNA tested for 550,000 genetic variations (single nucleotide polymorphisms, or SNPs). In addition, the participants' clinical data gathered during the study, such as test results or weight, are included. SHARe will enable researchers to relate study participants' genetic variations to their clinical and laboratory test results. The Framingham Heart Study is funded by NHLBI in collaboration with Boston University School of Medicine (BUSM) and Boston University School of Public Health.

"The widespread availability of Framingham Heart Study data provides unprecedented opportunities to investigate the connections between genes and disease," said Health and Human Services (HHS) Secretary Mike Leavitt. "SHARe represents a major milestone in moving toward an era of personalized health care — a future in which the ways we prevent, diagnose, and treat health problems are tailored to an individual's genetic makeup."

Last month, Leavitt released the first HHS report on personalized health care. The report, "Personalized Health Care: Opportunities, Pathways, Resources," includes a review of departmental activities to advance genomic knowledge and incorporate gene-based advances into clinical care for patients.

"Sharing information while also safeguarding the privacy and confidentiality of our valued research participants is our best route toward an increased understanding of the genetic role in health and disease," said NIH Director Elias Zerhouni, M.D. "This is an exciting convergence of advanced information technology with what we've learned from the Human Genome Project and major clinical research endeavors, which will boost our research capacity."

NHLBI Director Elizabeth G. Nabel, M.D., said, "As one of the most comprehensive studies ever undertaken, the Framingham Heart Study will play a vital role in laying the foundation for this vast dataset to help researchers link genes and disease." She noted that data from ongoing Framingham Heart Study research will continue to be added. NHLBI will also incorporate data from other large studies. "NHLBI is firmly committed to maximizing this important new resource."

Karen Antman, M.D., BUSM dean and provost of the Boston University Medical Campus, noted that the university is pleased to be a part of this important endeavor. "The ongoing collaboration among the many Framingham Heart Study researchers has advanced our knowledge about health and disease, helping to improve the well-being of millions of individuals. In addition, the study participants have made invaluable contributions to science, and we are indebted to them."

Philip A. Wolf, M.D., principal investigator of this study and BUSM professor of neurology and research professor of medicine (epidemiology and preventive medicine), added, "It is the hope of all those who have contributed to the Framingham Heart Study that this free flow of information will accelerate the discovery of pathways to human disease."

SHARe is accessed through dbGaP, the database of Genotypes and Phenotypes, a Web-based resource for archiving and distributing data from genome-wide association studies (GWAS). GWAS explore the associations between genes (genotype information) and observable traits (phenotypes), such as weight, cholesterol levels, or the presence or absence of a disease. Launched in December 2006, dbGaP was developed and is operated by the National Center for Biotechnology Information (NCBI), a division of NIH's National Library of Medicine (NLM).

dbGaP also provides, for the first time, a central repository where study documentation, such as protocols and questionnaires, is linked to summary data of measured variables. For example, in Framingham SHARe, researchers can search for summary data on the average blood pressure value at a visit and easily find the associated protocol for measuring blood pressure.

"The SHARe data offer an unparalleled level of study detail, providing a wealth of opportunities for researchers, students and others to learn about study design from some of the brightest minds working in the field," said NLM Director Donald A.B. Lindberg, M.D. SHARe and other dbGaP studies are linked to related publications and relevant NLM genomic resources to aid researchers in the discovery process, according to Jim Ostell, PhD, chief of NCBI's Information Engineering Branch.

To protect the confidentiality of study participants, the SHARe data in dbGaP include only de-identified data (stripped of names, Social Security numbers, etc.) from participants who have consented to genetic research and to allowing their data to be shared. Genotyping information, including data from a 500K mapping array, was generated for Framingham SHARe by Affymetrix, Inc., through a contract with NHLBI. Although summary data and analyses are open access (available to any researcher), individual-level data can be used only by authorized investigators who meet requirements for access outlined in the NIH GWAS policy. Researchers are prohibited from redistributing data or trying to determine the identity of participants.

"Analyzing individual-level data with computer programs, researchers will be able to search for new connections between genetic variations and phenotypes such as high cholesterol," explained Christopher O’Donnell, M.D., associate director of the Framingham Heart Study and scientific director of Framingham SHARe. "The thousands of Framingham participants — some of whom have been monitored for almost 60 years — have already contributed greatly to our understanding of the role of risk factors for heart disease and other conditions, and now they will contribute a wealth of new and detailed information about the inherited basis of these conditions."

The Framingham Heart Study is a prospective, community-based, family study that began in 1948 among residents of Framingham, Mass. The original group of participants included 5,209 adults between the ages of 30 and 62 at enrollment who visited every two years for medical histories, physical exams, and laboratory tests. In 1971, 5,124 of the original group's adult children and their spouses were added. A third generation group — 4,095 grandchildren of the original group — was enrolled in 2002.

Researchers interested in applying for access to individual-level Framingham SHARe data should follow the directions at


Contact the NHLBI Communications Office at 301 496-4236 to interview NHLBI staff, including Drs. Nabel or O'Donnell; Cashell E. Jaquish, Ph.D., SHARe project officer; or Daniel Levy, M.D., Framingham Heart Study director. To interview Dr. Ostell at NLM, contact Kathy Cravedi at 301 496-6308. To interview Boston University Dean Antman, contact Gina DiGravio at 617 638-8491.

Part of the National Institutes of Health, the National Heart, Lung, and Blood Institute (NHLBI) plans, conducts, and supports research related to the causes, prevention, diagnosis, and treatment of heart, blood vessel, lung, and blood diseases, and sleep disorders. The Institute also administers national health education campaigns on women and heart disease, healthy weight for children, and other topics. NHLBI press releases and other materials are available online at

The National Center for Biotechnology Information (NCBI) was established in 1988 as a national resource for molecular biology information. NCBI creates public databases, conducts research in computational biology, develops software tools for analyzing molecular and genomic data, and disseminates biomedical information, all for the better understanding of processes affecting human health and disease. NCBI is a division of the National Library of Medicine, the world's largest library of the health sciences. For more information, visit

About the National Institutes of Health (NIH): NIH, the nation's medical research agency, includes 27 Institutes and Centers and is a component of the U.S. Department of Health and Human Services. NIH is the primary federal agency conducting and supporting basic, clinical, and translational medical research, and is investigating the causes, treatments, and cures for both common and rare diseases. For more information about NIH and its programs, visit

NIH…Turning Discovery Into Health®