Clinical Focus


  • Pediatric Radiology

Professional Education


  • Fellowship: Stanford University Radiology Fellowships (2007) CA
  • Fellowship: Vanderbilt University Medical Center (2007) TN
  • Residency: Drexel University College of Medicine Radiology Residency (2004) PA
  • Internship: Jefferson Frankford Hospital (2002) PA
  • Medical Education: University of New England College of Osteopathic Medicine (2001) ME
  • Board Certification: American Board of Radiology, Diagnostic Radiology (2006)
  • Residency: Medical University of South Carolina (2006) SC
  • Board Certification: American Board of Radiology, Pediatric Radiology (2008)

All Publications


  • Characterizing continuous positive airway pressure (CPAP) Belly Syndrome in preterm infants in the neonatal intensive care unit (NICU). Journal of perinatology : official journal of the California Perinatal Association Gu, H., Seekins, J., Ritter, V., Halamek, L. P., Wall, J. K., Fuerch, J. H. 2024

    Abstract

    OBJECTIVE: Reproducibly define CPAP Belly Syndrome (CBS) in preterm infants and describe associated demographics, mechanical factors, and outcomes. STUDY DESIGN: A retrospective case-control study was conducted in infants <32 weeks gestation in the Stanford Children's NICU from January 1, 2020 to December 31, 2021. CBS was radiographically defined by a pediatric radiologist. Data analysis included descriptive statistics and comparator tests. RESULTS: Analysis included 41 infants with CBS and 69 infants without. CBS was associated with younger gestational age (median 27.7 vs 30 weeks, p<0.001) and lower birthweight (median 1.00 vs 1.31 kg, p<0.001). Infants with CBS were more likely to receive bilevel respiratory support and higher positive end expiratory pressure. Infants with CBS took longer to advance enteral feeds (median 10 vs 7 days, p=0.003) and were exposed to more abdominal radiographs. CONCLUSIONS: Future CBS therapies should target small infants, prevent air entry from above, and aim to reduce time to full enteral feeds and radiographic exposure.

    View details for DOI 10.1038/s41372-024-01918-2

    View details for PubMedID 38448640

  • The influence of extracurricular activities on radiology resident selection decisions. Journal of the American College of Radiology : JACR Maxfield, C. M., Montano-Campos, J. F., Gould, J., Koontz, N. A., Milburn, J., Omofoye, T., Peterson, R., Seekins, J., Grimm, L. 2023

    Abstract

    OBJECTIVE: Extracurricular activities (EAs) listed on radiology residency applications can signal traits and characteristics desired in holistic reviews. We sought an objective analysis to determine the influence of EAs on resident selection decisions. METHODS: A discrete choice experiment was designed to model radiology resident selection and determine the relative weights of EAs among academic and demographic application factors. Faculty involved in resident selection at 30 US radiology programs chose between hypothetical pairs of applicant profiles between October 2021 and February 2022. Each applicant profile included one of 22 EAs chosen for study. A conditional logistic regression model assessed the relative weights of the attributes and odds ratios were calculated. RESULTS: 244 participants completed the exercise. Community service EAs were ranked most highly by participants. LGBTQ Pride Alliance (OR: 1.56, 95% CI: 1.14-2.15, p=0.006) and Young Republicans (OR: 0.60; 95% CI: 0.43-0.82, p=0.001) significantly influenced decisions. The highest ranked EAs were significantly preferred over the lowest ranked EAs (OR: 1.916; 95% CI: 1.671-2.197, p<0.001). Participants preferred EAs that reflected active over passive engagement (OR: 1.154, 95% CI: 1.022-1.304, p=0.021), and progressive over conservative ideology (OR: 1.280, 95% CI: 1.133-1.447, p<0.001). Participants who ranked progressive EAs more highly preferred applicants with progressive EAs (all p-values<0.05). DISCUSSION: The influence of EAs on resident selection decisions is significant and likely to gain importance in resident selection as medical student performance metrics are further eliminated. Applicants and selection committees should consider this influence and the bias that EAs can bring to resident selection decisions.

    View details for DOI 10.1016/j.jacr.2023.09.013

    View details for PubMedID 37922965

  • Brain Tumor Radiogenomic Classification of O6-Methylguanine-DNA Methyltransferase Promoter Methylation in Malignant Gliomas-Based Transfer Learning. Cancer control : journal of the Moffitt Cancer Center Sakly, H., Said, M., Seekins, J., Guetari, R., Kraiem, N., Marzougui, M. 2023; 30: 10732748231169149

    Abstract

    Artificial Intelligence (AI) is the subject of a challenge and attention in the field of oncology and raises many promises for preventive diagnosis, but also fears, some of which are based on highly speculative visions for the classification and detection of tumors. A brain tumor that is malignant is a life-threatening disorder. Glioblastoma is the most prevalent kind of adult brain cancer and the one with the poorest prognosis, with a median survival time of less than a year. The presence of O6-methylguanine-DNA methyltransferase (MGMT) promoter methylation, a particular genetic sequence seen in tumors, has been proven to be a positive prognostic indicator and a significant predictor of recurrence. This strong revival of interest in AI is linked in particular to major technological advances which have significantly increased the performance of the predicted model for medical decision support. Establishing reliable forecasts remains a significant challenge for electronic health records (EHRs). By enhancing clinical practice, precision medicine promises to improve healthcare delivery. The goal is to produce improved prognosis, diagnosis, and therapy through evidence-based substratification of patients, transforming established clinical pathways to optimize care for each patient's individual requirements. The abundance of today's healthcare data, dubbed "big data," provides great resources for new knowledge discovery, potentially advancing precision treatment. The latter necessitates multidisciplinary initiatives that will use the knowledge, skills, and medical data of newly established organizations with diverse backgrounds and expertise. The aim of this paper is to use magnetic resonance imaging (MRI) images to train and evaluate a model to detect the presence of MGMT promoter methylation in this competition to predict the genetic subtype of glioblastoma based on transfer learning. Our objective is to emphasize the basic problems in the developing disciplines of radiomics and radiogenomics, as well as to illustrate the computational challenges from the perspective of big data analytics.

    View details for DOI 10.1177/10732748231169149

    View details for PubMedID 37078100

  • Improved Detection of Bone Metastases in Children and Young Adults with Ferumoxytol-enhanced MRI. Radiology. Imaging cancer Rashidi, A., Baratto, L., Theruvath, A. J., Greene, E. B., Jayapal, P., Hawk, K. E., Lu, R., Seekins, J., Spunt, S. L., Pribnow, A., Daldrup-Link, H. E. 2023; 5 (2): e220080

    Abstract

    Purpose To evaluate if ferumoxytol can improve the detection of bone marrow metastases at diffusion-weighted (DW) MRI in pediatric and young adult patients with cancer. Materials and Methods In this secondary analysis of a prospective institutional review board-approved study (ClinicalTrials.gov identifier NCT01542879), 26 children and young adults (age range: 2-25 years; 18 males) underwent unenhanced or ferumoxytol-enhanced whole-body DW MRI between 2015 and 2020. Two reviewers determined the presence of bone marrow metastases using a Likert scale. One additional reviewer measured signal-to-noise ratios (SNRs) and tumor-to-bone marrow contrast. Fluorine 18 (18F) fluorodeoxyglucose (FDG) PET and follow-up chest CT, abdominal and pelvic CT, and standard (non-ferumoxytol enhanced) MRI served as the reference standard. Results of different experimental groups were compared using generalized estimation equations, Wilcoxon rank sum test, and Wilcoxon signed rank test. Results The SNR of normal bone marrow was significantly lower at ferumoxytol-enhanced MRI compared with unenhanced MRI at baseline (21.380 ± 19.878 vs 102.621 ± 94.346, respectively; P = .03) and after chemotherapy (20.026 ± 7.664 vs 54.110 ± 48.022, respectively; P = .006). This led to an increased tumor-to-marrow contrast on ferumoxytol-enhanced MRI scans compared with unenhanced MRI scans at baseline (1397.474 ± 938.576 vs 665.364 ± 440.576, respectively; P = .07) and after chemotherapy (1099.205 ± 864.604 vs 500.758 ± 439.975, respectively; P = .007). Accordingly, the sensitivity and diagnostic accuracy for detecting bone marrow metastases were 96% (94 of 98) and 99% (293 of 297), respectively, with the use of ferumoxytol-enhanced MRI compared with 83% (106 of 127) and 95% (369 of 390) with the use of unenhanced MRI. Conclusion Use of ferumoxytol helped improve the detection of bone marrow metastases in children and young adults with cancer. 
    Keywords: Pediatrics, Molecular Imaging-Cancer, Molecular Imaging-Nanoparticles, MR-Diffusion Weighted Imaging, MR Imaging, Skeletal-Appendicular, Skeletal-Axial, Bone Marrow, Comparative Studies, Cancer Imaging, Ferumoxytol, USPIO. © RSNA, 2023. ClinicalTrials.gov registration no. NCT01542879. See also the commentary by Holter-Chakrabarty and Glover in this issue.

    View details for DOI 10.1148/rycan.220080

    View details for PubMedID 36999999

  • MODIFIED ASSESSMENT OF COMPETENCY IN THORACIC SONOGRAPHY (ACTS) SCALE IN THE NICU AND PICU Dasani, R., Bhargava, V., Chan, B., Hamilton, C., Seekins, J., Halabi, S., Chen, D., Reddy, C., Hilgenberg, S., Haileselassie, B., Bhombal, S. LIPPINCOTT WILLIAMS & WILKINS. 2023: 302
  • Benchmarking saliency methods for chest X-ray interpretation NATURE MACHINE INTELLIGENCE Saporta, A., Gui, X., Agrawal, A., Pareek, A., Truong, S. H., Nguyen, C. T., Ngo, V., Seekins, J., Blankenberg, F. G., Ng, A. Y., Lungren, M. P., Rajpurkar, P. 2022
  • Imaging of pediatric testicular tumors: A COG Diagnostic Imaging Committee/SPR Oncology Committee White Paper. Pediatric blood & cancer Behr, G. G., Morani, A. C., Artunduaga, M., Desoky, S. M., Epelman, M., Friedman, J., Lala, S. V., Seekins, J., Towbin, A. J., Back, S. J. 2022: e29988

    Abstract

    Primary intratesticular tumors are uncommon in children, but incidence and risk of malignancy both sharply increase during adolescence. Ultrasound is the mainstay for imaging the primary lesion, and cross-sectional modalities are often required for evaluation of regional or distant disease. However, variations to this approach are dictated by additional clinical and imaging nuances. This paper offers consensus recommendations for imaging of pediatric patients with a known or suspected primary testicular malignancy at diagnosis and during follow-up.

    View details for DOI 10.1002/pbc.29988

    View details for PubMedID 36184829

  • Imaging of pediatric ovarian tumors: A COG Diagnostic Imaging Committee/SPR Oncology Committee White Paper. Pediatric blood & cancer Behr, G. G., Morani, A. C., Artunduaga, M., Desoky, S. M., Epelman, M., Friedman, J., Lala, S. V., Seekins, J., Towbin, A. J., Back, S. J. 2022: e29995

    Abstract

    Ovarian tumors in children are uncommon. Like those arising in the adult population, they may be broadly divided into germ cell, sex cord, and surface epithelium subtypes; however, germ cell tumors comprise the majority of lesions in children, whereas tumors of surface epithelial origin predominate in adults. Diagnostic workup, including the use of imaging, requires an approach that often differs from that required in an adult. This paper offers consensus recommendations for imaging of pediatric patients with a known or suspected primary ovarian malignancy at diagnosis and during follow-up.

    View details for DOI 10.1002/pbc.29995

    View details for PubMedID 36184758

  • Machine Learning Approach to Differentiation of Peripheral Schwannomas and Neurofibromas: A Multi-Center Study. Neuro-oncology Zhang, M., Tong, E., Wong, S., Hamrick, F., Mohammadzadeh, M., Rao, V., Pendleton, C., Smith, B. W., Hug, N. F., Biswal, S., Seekins, J., Napel, S., Spinner, R. J., Mahan, M. A., Yeom, K. W., Wilson, T. J. 2021

    Abstract

    BACKGROUND: Non-invasive differentiation between schwannomas and neurofibromas is important for appropriate management, preoperative counseling, and surgical planning, but has proven difficult using conventional imaging. The objective of this study was to develop and evaluate machine learning approaches for differentiating peripheral schwannomas from neurofibromas. METHODS: We assembled a cohort of schwannomas and neurofibromas from 3 independent institutions and extracted high-dimensional radiomic features from gadolinium-enhanced, T1-weighted MRI using the PyRadiomics package on Quantitative Imaging Feature Pipeline. Age, sex, neurogenetic syndrome, spontaneous pain, and motor deficit were recorded. We evaluated the performance of 6 radiomics-based classifier models with and without clinical features and compared model performance against human expert evaluators. RESULTS: 107 schwannomas and 59 neurofibromas were included. The primary models included both clinical and imaging data. The accuracy of the human evaluators (0.765) did not significantly exceed the no-information rate (NIR), whereas the Support Vector Machine (0.929), Logistic Regression (0.929), and Random Forest (0.905) classifiers exceeded the NIR. Using the method of DeLong, the AUC for the Logistic Regression (AUC=0.923) and K Nearest Neighbor (AUC=0.923) classifiers was significantly greater than the human evaluators (AUC=0.766; p = 0.041). CONCLUSIONS: The radiomics-based classifiers developed here proved to be more accurate and had a higher AUC on the ROC curve than expert human evaluators. This demonstrates that radiomics using routine MRI sequences and clinical features can aid in differentiation of peripheral schwannomas and neurofibromas.

    View details for DOI 10.1093/neuonc/noab211

    View details for PubMedID 34487172

  • Machine-learning Approach to Differentiation of Benign and Malignant Peripheral Nerve Sheath Tumors: A Multicenter Study Zhang, M., Tong, E., Hamrick, F., Pendleton, C., Smith, B., Hug, N., Mattonen, S., Napel, S., Spinner, R., Yeom, K., Wilson, T., Mahan, M. AMER ASSOC NEUROLOGICAL SURGEONS. 2021
  • Machine-Learning Approach to Differentiation of Benign and Malignant Peripheral Nerve Sheath Tumors: A Multicenter Study. Neurosurgery Zhang, M., Tong, E., Hamrick, F., Lee, E. H., Tam, L. T., Pendleton, C., Smith, B. W., Hug, N. F., Biswal, S., Seekins, J., Mattonen, S. A., Napel, S., Campen, C. J., Spinner, R. J., Yeom, K. W., Wilson, T. J., Mahan, M. A. 2021

    Abstract

    BACKGROUND: Clinicoradiologic differentiation between benign and malignant peripheral nerve sheath tumors (PNSTs) has important management implications. OBJECTIVE: To develop and evaluate machine-learning approaches to differentiate benign from malignant PNSTs. METHODS: We identified PNSTs treated at 3 institutions and extracted high-dimensional radiomics features from gadolinium-enhanced, T1-weighted magnetic resonance imaging (MRI) sequences. Training and test sets were selected randomly in a 70:30 ratio. A total of 900 image features were automatically extracted using the PyRadiomics package from Quantitative Imaging Feature Pipeline. Clinical data including age, sex, neurogenetic syndrome presence, spontaneous pain, and motor deficit were also incorporated. Features were selected using sparse regression analysis and retained features were further refined by gradient boost modeling to optimize the area under the curve (AUC) for diagnosis. We evaluated the performance of radiomics-based classifiers with and without clinical features and compared performance against human readers. RESULTS: A total of 95 malignant and 171 benign PNSTs were included. The final classifier model included 21 imaging and clinical features. Sensitivity, specificity, and AUC of 0.676, 0.882, and 0.845, respectively, were achieved on the test set. Using imaging and clinical features, human experts collectively achieved sensitivity, specificity, and AUC of 0.786, 0.431, and 0.624, respectively. The AUC of the classifier was statistically better than expert humans (P=.002). Expert humans were not statistically better than the no-information rate, whereas the classifier was (P=.001). CONCLUSION: Radiomics-based machine learning using routine MRI sequences and clinical features can aid in evaluation of PNSTs. Further improvement may be achieved by incorporating additional imaging sequences and clinical variables into future models.

    View details for DOI 10.1093/neuros/nyab212

    View details for PubMedID 34131749

  • Artificial Intelligence Algorithm Improves Radiologist Performance in Skeletal Age Assessment: A Prospective Multicenter Randomized Controlled Trial. Radiology Eng, D. K., Khandwala, N. B., Long, J., Fefferman, N. R., Lala, S. V., Strubel, N. A., Milla, S. S., Filice, R. W., Sharp, S. E., Towbin, A. J., Francavilla, M. L., Kaplan, S. L., Ecklund, K., Prabhu, S. P., Dillon, B. J., Everist, B. M., Anton, C. G., Bittman, M. E., Dennis, R., Larson, D. B., Seekins, J. M., Silva, C. T., Zandieh, A. R., Langlotz, C. P., Lungren, M. P., Halabi, S. S. 2021: 204021

    Abstract

    Background Previous studies suggest that use of artificial intelligence (AI) algorithms as diagnostic aids may improve the quality of skeletal age assessment, though these studies lack evidence from clinical practice. Purpose To compare the accuracy and interpretation time of skeletal age assessment on hand radiograph examinations with and without the use of an AI algorithm as a diagnostic aid. Materials and Methods In this prospective randomized controlled trial, the accuracy of skeletal age assessment on hand radiograph examinations was performed with (n = 792) and without (n = 739) the AI algorithm as a diagnostic aid. For examinations with the AI algorithm, the radiologist was shown the AI interpretation as part of their routine clinical work and was permitted to accept or modify it. Hand radiographs were interpreted by 93 radiologists from six centers. The primary efficacy outcome was the mean absolute difference between the skeletal age dictated into the radiologists' signed report and the average interpretation of a panel of four radiologists not using a diagnostic aid. The secondary outcome was the interpretation time. A linear mixed-effects regression model with random center- and radiologist-level effects was used to compare the two experimental groups. Results Overall mean absolute difference was lower when radiologists used the AI algorithm compared with when they did not (5.36 months vs 5.95 months; P = .04). The proportions at which the absolute difference exceeded 12 months (9.3% vs 13.0%, P = .02) and 24 months (0.5% vs 1.8%, P = .02) were lower with the AI algorithm than without it. Median radiologist interpretation time was lower with the AI algorithm than without it (102 seconds vs 142 seconds, P = .001). Conclusion Use of an artificial intelligence algorithm improved skeletal age assessment accuracy and reduced interpretation times for radiologists, although differences were observed between centers. Clinical trial registration no. NCT03530098. © RSNA, 2021. Online supplemental material is available for this article. See also the editorial by Rubin in this issue.

    View details for DOI 10.1148/radiol.2021204021

    View details for PubMedID 34581608

  • Deep COVID DeteCT: an international experience on COVID-19 lung detection and prognosis using chest CT. NPJ digital medicine Lee, E. H., Zheng, J., Colak, E., Mohammadzadeh, M., Houshmand, G., Bevins, N., Kitamura, F., Altinmakas, E., Reis, E. P., Kim, J. K., Klochko, C., Han, M., Moradian, S., Mohammadzadeh, A., Sharifian, H., Hashemi, H., Firouznia, K., Ghanaati, H., Gity, M., Doğan, H., Salehinejad, H., Alves, H., Seekins, J., Abdala, N., Atasoy, Ç., Pouraliakbar, H., Maleki, M., Wong, S. S., Yeom, K. W. 2021; 4 (1): 11

    Abstract

    The Coronavirus disease 2019 (COVID-19) presents open questions in how we clinically diagnose and assess disease course. Recently, chest computed tomography (CT) has shown utility for COVID-19 diagnosis. In this study, we developed Deep COVID DeteCT (DCD), a deep learning convolutional neural network (CNN) that uses the entire chest CT volume to automatically predict COVID-19 (COVID+) from non-COVID-19 (COVID-) pneumonia and normal controls. We discuss training strategies and differences in performance across 13 international institutions and 8 countries. The inclusion of non-China sites in training significantly improved classification performance with area under the curve (AUCs) and accuracies above 0.8 on most test sites. Furthermore, using available follow-up scans, we investigate methods to track patient disease course and predict prognosis.

    View details for DOI 10.1038/s41746-020-00369-1

    View details for PubMedID 33514852

  • Differentiation of benign and malignant lymph nodes in pediatric patients on ferumoxytol-enhanced PET/MRI THERANOSTICS Muehe, A., Siedek, F., Theruvath, A., Seekins, J., Spunt, S. L., Pribnow, A., Hazard, F., Liang, T., Daldrup-Link, H. 2020; 10 (8): 3612–21

    Abstract

    The composition of lymph nodes in pediatric patients is different from that in adults. Most notably, normal lymph nodes in children contain fewer macrophages. Therefore, previously described biodistributions of iron oxide nanoparticles in benign and malignant lymph nodes of adult patients may not apply to children. The purpose of our study was to evaluate if the iron supplement ferumoxytol improves the differentiation of benign and malignant lymph nodes in pediatric cancer patients on 18F-FDG PET/MRI. Methods: We conducted a prospective clinical trial from May 2015 to December 2018 to investigate the value of ferumoxytol nanoparticles for staging of children with cancer with 18F-FDG PET/MRI. Ferumoxytol is an FDA-approved iron supplement for the treatment of anemia and has been used "off-label" as an MRI contrast agent in this study. Forty-two children (7-18 years, 29 male, 13 female) received an 18F-FDG PET/MRI at 2 (n=20) or 24 hours (h) (n=22) after intravenous injection of ferumoxytol (dose 5 mg Fe/kg). The morphology of benign and malignant lymph nodes on ferumoxytol-enhanced T2-FSE sequences at 2 and 24 h were compared using a linear regression analysis. In addition, ADCmean-values, SUV-ratio (SUVmax lesion/SUVmean liver) and R2*-relaxation rate of benign and malignant lymph nodes were compared with a Mann-Whitney-U test. The accuracy of different criteria was assessed with a receiver operating characteristics (ROC) curve. Follow-up imaging for at least 6 months served as the standard of reference. Results: We examined a total of 613 lymph nodes, of which 464 (75.7%) were benign and 149 (24.3%) were malignant. On ferumoxytol-enhanced T2-FSE images, benign lymph nodes showed a hypointense hilum and hyperintense parenchyma, while malignant lymph nodes showed no discernible hilum. This pattern was not significantly different at 2 h and 24 h postcontrast (p=0.82). Benign and malignant lymph nodes showed significantly different ferumoxytol enhancement patterns, ADCmean values of 1578 and 852 × 10⁻⁶ mm²/s, mean SUV-ratios of 0.5 and 2.8, and mean R2*-relaxation rate of 127.8 and 84.4 Hertz (Hz), respectively (all p<0.001). The accuracy of ADCmean, SUV-ratio and pattern (area under the curve (AUC): 0.99; 0.98; 0.97, respectively) was not significantly different (p=0.07). Compared to these three parameters, the accuracy of R2* was significantly lower (AUC: 0.93; p=0.001). Conclusion: Lymph nodes in children show different ferumoxytol-enhancement patterns on MRI than previously reported for adult patients. We found high accuracy (>90%) of ADCmean, SUV-ratio, pattern, and R2* measurements for the characterization of benign and malignant lymph nodes in children. Ferumoxytol nanoparticle accumulation at the hilum can be used to diagnose a benign lymph node. In the future, the delivery of clinically applicable nanoparticles to the hilum of benign lymph nodes could be harnessed to deliver theranostic drugs for immune cell priming.

    View details for DOI 10.7150/thno.40606

    View details for Web of Science ID 000518768400016

    View details for PubMedID 32206111

    View details for PubMedCentralID PMC7069081

  • Deep learning to automate Brasfield chest radiographic scoring for cystic fibrosis JOURNAL OF CYSTIC FIBROSIS Zucker, E. J., Barnes, Z. A., Lungren, M. P., Shpanskaya, Y., Seekins, J. M., Halabi, S. S., Larson, D. B. 2020; 19 (1): 131–38
  • Deep learning to automate Brasfield chest radiographic scoring for cystic fibrosis. Journal of cystic fibrosis : official journal of the European Cystic Fibrosis Society Zucker, E. J., Barnes, Z. A., Lungren, M. P., Shpanskaya, Y., Seekins, J. M., Halabi, S. S., Larson, D. B. 2019

    Abstract

    BACKGROUND: The aim of this study was to evaluate the hypothesis that a deep convolutional neural network (DCNN) model could facilitate automated Brasfield scoring of chest radiographs (CXRs) for patients with cystic fibrosis (CF), performing similarly to a pediatric radiologist. METHODS: All frontal/lateral chest radiographs (2058 exams) performed in CF patients at a single institution from January 2008-2018 were retrospectively identified, and ground-truth Brasfield scoring performed by a board-certified pediatric radiologist. 1858 exams (90.3%) were used to train and validate the DCNN model, while 200 exams (9.7%) were reserved for a test set. Five board-certified pediatric radiologists independently scored the test set according to the Brasfield method. DCNN model vs. radiologist performance was compared using Spearman correlation (rho) as well as mean difference (MD), mean absolute difference (MAD), and root mean squared error (RMSE) estimation. RESULTS: For the total Brasfield score, rho for the model-derived results computed pairwise with each radiologist's scores ranged from 0.79-0.83, compared to 0.85-0.90 for radiologist vs. radiologist scores. The MD between model estimates of the total Brasfield score and the average score of radiologists was -0.09. Based on MD, MAD, and RMSE, the model matched or exceeded radiologist performance for all subfeatures except air-trapping and large lesions. CONCLUSIONS: A DCNN model is promising for predicting CF Brasfield scores with accuracy similar to that of a pediatric radiologist.

    View details for PubMedID 31056440

  • Human-machine partnership with artificial intelligence for chest radiograph diagnosis. NPJ digital medicine Patel, B. N., Rosenberg, L., Willcox, G., Baltaxe, D., Lyons, M., Irvin, J., Rajpurkar, P., Amrhein, T., Gupta, R., Halabi, S., Langlotz, C., Lo, E., Mammarappallil, J., Mariano, A. J., Riley, G., Seekins, J., Shen, L., Zucker, E., Lungren, M. 2019; 2: 111

    Abstract

    Human-in-the-loop (HITL) AI may enable an ideal symbiosis of human experts and AI models, harnessing the advantages of both while at the same time overcoming their respective limitations. The purpose of this study was to investigate a novel collective intelligence technology designed to amplify the diagnostic accuracy of networked human groups by forming real-time systems modeled on biological swarms. Using small groups of radiologists, the swarm-based technology was applied to the diagnosis of pneumonia on chest radiographs and compared against human experts alone, as well as two state-of-the-art deep learning AI models. Our work demonstrates that both the swarm-based technology and deep-learning technology achieved higher diagnostic accuracy than the human experts alone. Our work further demonstrates that when used in combination, the swarm-based technology and deep-learning technology outperformed either method alone. The superior diagnostic accuracy of the combined HITL AI solution compared to radiologists and AI alone has broad implications for the surging clinical AI deployment and implementation strategies in future practice.

    View details for DOI 10.1038/s41746-019-0189-7

    View details for PubMedID 31754637

    View details for PubMedCentralID PMC6861262

  • Erratum: Author Correction: Human-machine partnership with artificial intelligence for chest radiograph diagnosis. NPJ digital medicine Patel, B. N., Rosenberg, L., Willcox, G., Baltaxe, D., Lyons, M., Irvin, J., Rajpurkar, P., Amrhein, T., Gupta, R., Halabi, S., Langlotz, C., Lo, E., Mammarappallil, J., Mariano, A. J., Riley, G., Seekins, J., Shen, L., Zucker, E., Lungren, M. P. 2019; 2: 129

    Abstract

    [This corrects the article DOI: 10.1038/s41746-019-0189-7.].

    View details for DOI 10.1038/s41746-019-0198-6

    View details for PubMedID 31840097

    View details for PubMedCentralID PMC6904441

  • CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison Irvin, J., Rajpurkar, P., Ko, M., Yu, Y., Ciurea-Ilcus, S., Chute, C., Marklund, H., Haghgoo, B., Ball, R., Shpanskaya, K., Seekins, J., Mong, D. A., Halabi, S. S., Sandberg, J. K., Jones, R., Larson, D. B., Langlotz, C. P., Patel, B. N., Lungren, M. P., Ng, A. Y., AAAI ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. 2019: 590–97
  • Author Correction: Human-machine partnership with artificial intelligence for chest radiograph diagnosis. NPJ digital medicine Patel, B. N., Rosenberg, L., Willcox, G., Baltaxe, D., Lyons, M., Irvin, J., Rajpurkar, P., Amrhein, T., Gupta, R., Halabi, S., Langlotz, C., Lo, E., Mammarappallil, J., Mariano, A. J., Riley, G., Seekins, J., Shen, L., Zucker, E., Lungren, M. P. 2019; 2 (1): 129

    Abstract

    An amendment to this paper has been published and can be accessed via a link at the top of the paper.

    View details for DOI 10.1038/s41746-019-0198-6

    View details for PubMedID 33293693

  • Deep learning for chest radiograph diagnosis: A retrospective comparison of the CheXNeXt algorithm to practicing radiologists. PLoS medicine Rajpurkar, P., Irvin, J., Ball, R. L., Zhu, K., Yang, B., Mehta, H., Duan, T., Ding, D., Bagul, A., Langlotz, C. P., Patel, B. N., Yeom, K. W., Shpanskaya, K., Blankenberg, F. G., Seekins, J., Amrhein, T. J., Mong, D. A., Halabi, S. S., Zucker, E. J., Ng, A. Y., Lungren, M. P. 2018; 15 (11): e1002686

    Abstract

    BACKGROUND: Chest radiograph interpretation is critical for the detection of thoracic diseases, including tuberculosis and lung cancer, which affect millions of people worldwide each year. This time-consuming task typically requires expert radiologists to read the images, leading to fatigue-based diagnostic error and lack of diagnostic expertise in areas of the world where radiologists are not available. Recently, deep learning approaches have been able to achieve expert-level performance in medical image interpretation tasks, powered by large network architectures and fueled by the emergence of large labeled datasets. The purpose of this study is to investigate the performance of a deep learning algorithm on the detection of pathologies in chest radiographs compared with practicing radiologists. METHODS AND FINDINGS: We developed CheXNeXt, a convolutional neural network to concurrently detect the presence of 14 different pathologies, including pneumonia, pleural effusion, pulmonary masses, and nodules in frontal-view chest radiographs. CheXNeXt was trained and internally validated on the ChestX-ray8 dataset, with a held-out validation set consisting of 420 images, sampled to contain at least 50 cases of each of the original pathology labels. On this validation set, the majority vote of a panel of 3 board-certified cardiothoracic specialist radiologists served as reference standard. We compared CheXNeXt's discriminative performance on the validation set to the performance of 9 radiologists using the area under the receiver operating characteristic curve (AUC). The radiologists included 6 board-certified radiologists (average experience 12 years, range 4-28 years) and 3 senior radiology residents, from 3 academic institutions. We found that CheXNeXt achieved radiologist-level performance on 11 pathologies and did not achieve radiologist-level performance on 3 pathologies. The radiologists achieved statistically significantly higher AUC performance on cardiomegaly, emphysema, and hiatal hernia, with AUCs of 0.888 (95% confidence interval [CI] 0.863-0.910), 0.911 (95% CI 0.866-0.947), and 0.985 (95% CI 0.974-0.991), respectively, whereas CheXNeXt's AUCs were 0.831 (95% CI 0.790-0.870), 0.704 (95% CI 0.567-0.833), and 0.851 (95% CI 0.785-0.909), respectively. CheXNeXt performed better than radiologists in detecting atelectasis, with an AUC of 0.862 (95% CI 0.825-0.895), statistically significantly higher than radiologists' AUC of 0.808 (95% CI 0.777-0.838); there were no statistically significant differences in AUCs for the other 10 pathologies. The average time to interpret the 420 images in the validation set was substantially longer for the radiologists (240 minutes) than for CheXNeXt (1.5 minutes). The main limitations of our study are that neither CheXNeXt nor the radiologists were permitted to use patient history or review prior examinations and that evaluation was limited to a dataset from a single institution. CONCLUSIONS: In this study, we developed and validated a deep learning algorithm that classified clinically important abnormalities in chest radiographs at a performance level comparable to practicing radiologists. Once tested prospectively in clinical settings, the algorithm could have the potential to expand patient access to chest radiograph diagnostics.

    View details for PubMedID 30457988