Media Advisory

Friday, July 20, 2018

NIH Clinical Center releases dataset of 32,000 CT images

The National Institutes of Health’s Clinical Center has made a large-scale dataset of CT images publicly available to help the scientific community improve the accuracy of lesion detection. While most publicly available medical image datasets have fewer than a thousand lesions, this dataset, named DeepLesion, has over 32,000 annotated lesions identified on CT images.

The images, which have been thoroughly anonymized, represent 4,400 unique patients, who are partners in research at the NIH. 

Once a patient steps out of a CT scanner, the corresponding images are sent to a radiologist to interpret. Radiologists at the Clinical Center then measure and mark clinically meaningful findings with an electronic bookmark tool. Much as a reader uses a physical bookmark, radiologists save their place and mark significant findings so they can return to them later. These bookmarks are complex: they provide arrows, lines, diameters, and text that give the exact location and size of a lesion, so experts can identify growth or new disease.

The bookmarks, rich in retrospective medical data, are what scientists used to develop the DeepLesion dataset. DeepLesion is unlike most lesion medical image datasets currently available, which typically cover only one type of lesion. The dataset has great diversity: it contains all kinds of critical radiology findings from across the body, such as lung nodules, liver tumors, and enlarged lymph nodes.

Conventional methods for collecting image labels, such as those a search engine uses, cannot be applied in the medical image domain, because medical image annotations require extensive clinical experience. But that could change. The released dataset is large enough to train a deep neural network, and it could enable the scientific community to create a large-scale universal lesion detector with one unified framework.

With the release of the dataset, researchers hope others will be able to:

  • Develop a universal lesion detector that will help radiologists find all types of lesions. Such a detector could serve as an initial screening tool and send its detection results to other specialist systems trained on specific types of lesions.
  • Mine and study the relationships between different types of lesions. In DeepLesion, multiple findings are often marked in one CT exam image, so researchers can analyze how they relate to one another and make new discoveries.
  • Measure the sizes of all of a patient's lesions more accurately and automatically, enabling whole-body assessment of cancer burden.

In 2017, the research hospital released anonymized chest x-ray images and their corresponding data.

In the future, the NIH Clinical Center hopes to keep improving the DeepLesion dataset by collecting more data, which should in turn improve the accuracy of lesion detection. Universal lesion detection will become more reliable once researchers are able to leverage 3-D and lesion type information. It may also be possible to extend DeepLesion to other image modalities, such as MRI, and to combine data from multiple hospitals.


Ronald M. Summers, M.D., Ph.D., Senior Investigator of the Clinical Image Processing Service in the Imaging Biomarkers and Computer-Aided Diagnosis Laboratory of the NIH Clinical Center Radiology and Imaging Sciences Department, is available for interviews.


Ke Yan, Xiaosong Wang, Le Lu, Ronald M. Summers. DeepLesion: automated mining of large-scale lesion annotations and universal lesion detection with deep learning. Journal of Medical Imaging (2018).

Images are available via Box:

About the NIH Clinical Center: The NIH Clinical Center is the clinical research hospital for the National Institutes of Health. Through clinical research, clinician-investigators translate laboratory discoveries into better treatments, therapies and interventions to improve the nation's health. More information:

About the National Institutes of Health (NIH): NIH, the nation's medical research agency, includes 27 Institutes and Centers and is a component of the U.S. Department of Health and Human Services. NIH is the primary federal agency conducting and supporting basic, clinical, and translational medical research, and is investigating the causes, treatments, and cures for both common and rare diseases. For more information about NIH and its programs, visit

NIH…Turning Discovery Into Health®