Dr. Safwan Halabi is a Clinical Associate Professor of Radiology at the Stanford University School of Medicine and serves as the Medical Director for Radiology Informatics at Stanford Children's Health. He is board-certified in Radiology with a Certificate of Added Qualification in Pediatric Radiology. He is also board-certified in Clinical Informatics. He clinically practices obstetric and pediatric imaging at Lucile Packard Children's Hospital. Dr. Halabi’s clinical and administrative leadership roles are directed at improving quality of care, efficiency, and patient safety. He has also led strategic efforts to improve the enterprise imaging platforms at Stanford Children’s Health. He is a strong advocate of patient-centric care and has helped guide policies for radiology report and image release to patients. He has published in peer-reviewed journals on various clinical and informatics topics. His current academic and research interests include imaging informatics, deep/machine learning in imaging, artificial intelligence in medicine, clinical decision support, and patient-centric health care delivery. He is currently the Chair of the RSNA Informatics Data Science Committee and serves as a Board Member for the Society for Imaging Informatics in Medicine.
- Medical Informatics
- Fetal Imaging
- Pediatric Imaging
- Pediatric Radiology
Clinical Associate Professor, Radiology - Pediatric Radiology
Fellowship: Cincinnati Children's Hospital and Medical Center Radiology Fellowships (2007) OH
Board Certification: Clinical Informatics, American Board of Preventive Medicine (2014)
Board Certification: Pediatric Radiology, American Board of Radiology (2009)
Board Certification: Radiology, American Board of Radiology (2006)
Residency: Henry Ford Health System (2006) MI
Internship: Henry Ford Health System (2002) MI
Medical Education: University of Toledo College of Medicine (2001) OH
Validation of an Artificial Intelligence-based Algorithm for Skeletal Age Assessment
The purpose of this study is to understand the effects of using an Artificial Intelligence algorithm for skeletal age estimation as a computer-aided diagnosis (CADx) system. In this prospective real-time study, the investigators will send de-identified hand radiographs to the Artificial Intelligence algorithm and surface the output of this algorithm to the radiologist, who will incorporate this information into their normal workflow to make a diagnosis of the patient's bone age. All radiologists involved in the study will be trained to recognize the surfaced prediction as the output of the Artificial Intelligence algorithm. The radiologists' diagnosis will be final and considered independent of the output of the algorithm.
- Obstetric and neonatal outcomes in pregnancies complicated by fetal lung masses: does final histology matter? MOSBY-ELSEVIER. 2019: S151
The RSNA Pediatric Bone Age Machine Learning Challenge.
Purpose: The Radiological Society of North America (RSNA) Pediatric Bone Age Machine Learning Challenge was created to show an application of machine learning (ML) and artificial intelligence (AI) in medical imaging, promote collaboration to catalyze AI model creation, and identify innovators in medical imaging. Materials and Methods: The goal of this challenge was to solicit individuals and teams to create an algorithm or model using ML techniques that would accurately determine skeletal age in a curated data set of pediatric hand radiographs. The primary evaluation measure was the mean absolute distance (MAD) in months, which was calculated as the mean of the absolute values of the difference between the model estimates and those of the reference standard, bone age. Results: A data set consisting of 14 236 hand radiographs (12 611 training set, 1425 validation set, 200 test set) was made available to registered challenge participants. A total of 260 individuals or teams registered on the Challenge website. A total of 105 submissions were uploaded from 48 unique users during the training, validation, and test phases. Almost all methods used deep neural network techniques based on one or more convolutional neural networks (CNNs). The best five results based on MAD were 4.2, 4.4, 4.4, 4.5, and 4.5 months, respectively. Conclusion: The RSNA Pediatric Bone Age Machine Learning Challenge showed how a coordinated approach to solving a medical imaging problem can be successfully conducted. Future ML challenges will catalyze collaboration and development of ML tools and methods that can potentially improve diagnostic accuracy and patient care. © RSNA, 2018. Online supplemental material is available for this article.
View details for DOI 10.1148/radiol.2018180736
View details for PubMedID 30480490
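As an illustration of the challenge's primary evaluation measure described in the abstract above, the mean absolute distance (MAD) can be sketched in a few lines of Python. This is a minimal sketch, not the challenge's actual scoring code; the function name `mean_absolute_distance` and the sample values are hypothetical and are not taken from the challenge data.

```python
from statistics import mean


def mean_absolute_distance(estimates_months, reference_months):
    """Mean of |estimate - reference| over paired bone ages, in months."""
    if len(estimates_months) != len(reference_months):
        raise ValueError("estimates and references must be paired")
    if not estimates_months:
        raise ValueError("at least one pair is required")
    return mean(abs(e - r) for e, r in zip(estimates_months, reference_months))


# Hypothetical bone ages in months (illustrative only, not challenge data).
model_estimates = [120.0, 96.0, 150.0, 60.0]
reference_ages = [124.0, 90.0, 151.0, 65.0]

mad = mean_absolute_distance(model_estimates, reference_ages)
print(f"MAD = {mad:.1f} months")  # absolute differences 4, 6, 1, 5 -> MAD = 4.0 months
```

A lower MAD means the model estimates sit closer to the reference standard on average; the measure ignores the sign of each error, so it does not reveal systematic over- or underestimation.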
Deep-learning-assisted diagnosis for knee magnetic resonance imaging: Development and retrospective validation of MRNet.
2018; 15 (11): e1002699
BACKGROUND: Magnetic resonance imaging (MRI) of the knee is the preferred method for diagnosing knee injuries. However, interpretation of knee MRI is time-intensive and subject to diagnostic error and variability. An automated system for interpreting knee MRI could prioritize high-risk patients and assist clinicians in making diagnoses. Deep learning methods, in being able to automatically learn layers of features, are well suited for modeling the complex relationships between medical images and their interpretations. In this study we developed a deep learning model for detecting general abnormalities and specific diagnoses (anterior cruciate ligament [ACL] tears and meniscal tears) on knee MRI exams. We then measured the effect of providing the model's predictions to clinical experts during interpretation. METHODS AND FINDINGS: Our dataset consisted of 1,370 knee MRI exams performed at Stanford University Medical Center between January 1, 2001, and December 31, 2012 (mean age 38.0 years; 569 [41.5%] female patients). The majority vote of 3 musculoskeletal radiologists established reference standard labels on an internal validation set of 120 exams. We developed MRNet, a convolutional neural network for classifying MRI series and combined predictions from 3 series per exam using logistic regression. In detecting abnormalities, ACL tears, and meniscal tears, this model achieved area under the receiver operating characteristic curve (AUC) values of 0.937 (95% CI 0.895, 0.980), 0.965 (95% CI 0.938, 0.993), and 0.847 (95% CI 0.780, 0.914), respectively, on the internal validation set. We also obtained a public dataset of 917 exams with sagittal T1-weighted series and labels for ACL injury from Clinical Hospital Centre Rijeka, Croatia. On the external validation set of 183 exams, the MRNet trained on Stanford sagittal T2-weighted series achieved an AUC of 0.824 (95% CI 0.757, 0.892) in the detection of ACL injuries with no additional training, while an MRNet trained on the rest of the external data achieved an AUC of 0.911 (95% CI 0.864, 0.958). We additionally measured the specificity, sensitivity, and accuracy of 9 clinical experts (7 board-certified general radiologists and 2 orthopedic surgeons) on the internal validation set both with and without model assistance. Using a 2-sided Pearson's chi-squared test with adjustment for multiple comparisons, we found no significant differences between the performance of the model and that of unassisted general radiologists in detecting abnormalities. General radiologists achieved significantly higher sensitivity in detecting ACL tears (p-value = 0.002; q-value = 0.019) and significantly higher specificity in detecting meniscal tears (p-value = 0.003; q-value = 0.019). Using a 1-tailed t test on the change in performance metrics, we found that providing model predictions significantly increased clinical experts' specificity in identifying ACL tears (p-value < 0.001; q-value = 0.006). The primary limitations of our study include lack of surgical ground truth and the small size of the panel of clinical experts. CONCLUSIONS: Our deep learning model can rapidly generate accurate clinical pathology classifications of knee MRI exams from both internal and external datasets. Moreover, our results support the assertion that deep learning models can improve the performance of clinical experts during medical imaging interpretation. Further research is needed to validate the model prospectively and to determine its utility in the clinical setting.
View details for DOI 10.1371/journal.pmed.1002699
View details for PubMedID 30481176
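The MRNet abstract above relies on two ideas that a small example may make concrete: a reference standard formed by the majority vote of three readers, and the AUC as a summary of how well model scores separate positive from negative exams. The sketch below is illustrative only and is not the study's code. The AUC is computed with the standard pairwise (Mann-Whitney) formulation, which is an assumption about presentation rather than a description of the authors' method; confidence intervals are not computed, and all names and values are hypothetical.

```python
def majority_vote(labels):
    """Return 1 if more than half of the binary reader labels are 1, else 0."""
    return 1 if sum(labels) * 2 > len(labels) else 0


def auc(scores, truths):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, t in zip(scores, truths) if t == 1]
    neg = [s for s, t in zip(scores, truths) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical binary reads from three readers for five exams (not study data).
reader_labels = [
    (1, 1, 0),
    (0, 0, 0),
    (1, 0, 1),
    (0, 1, 0),
    (1, 1, 1),
]
# Hypothetical model probabilities for the same five exams.
model_scores = [0.9, 0.2, 0.4, 0.6, 0.8]

reference = [majority_vote(reads) for reads in reader_labels]
auc_value = auc(model_scores, reference)
print(reference)            # [1, 0, 1, 0, 1]
print(round(auc_value, 3))  # 5 of 6 pairs ranked correctly -> 0.833
```

With an odd number of readers and binary labels, the majority vote never ties, which is one practical reason for using a three-reader panel.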
Deep learning for chest radiograph diagnosis: A retrospective comparison of the CheXNeXt algorithm to practicing radiologists.
2018; 15 (11): e1002686
BACKGROUND: Chest radiograph interpretation is critical for the detection of thoracic diseases, including tuberculosis and lung cancer, which affect millions of people worldwide each year. This time-consuming task typically requires expert radiologists to read the images, leading to fatigue-based diagnostic error and lack of diagnostic expertise in areas of the world where radiologists are not available. Recently, deep learning approaches have been able to achieve expert-level performance in medical image interpretation tasks, powered by large network architectures and fueled by the emergence of large labeled datasets. The purpose of this study is to investigate the performance of a deep learning algorithm on the detection of pathologies in chest radiographs compared with practicing radiologists. METHODS AND FINDINGS: We developed CheXNeXt, a convolutional neural network to concurrently detect the presence of 14 different pathologies, including pneumonia, pleural effusion, pulmonary masses, and nodules in frontal-view chest radiographs. CheXNeXt was trained and internally validated on the ChestX-ray8 dataset, with a held-out validation set consisting of 420 images, sampled to contain at least 50 cases of each of the original pathology labels. On this validation set, the majority vote of a panel of 3 board-certified cardiothoracic specialist radiologists served as reference standard. We compared CheXNeXt's discriminative performance on the validation set to the performance of 9 radiologists using the area under the receiver operating characteristic curve (AUC). The radiologists included 6 board-certified radiologists (average experience 12 years, range 4-28 years) and 3 senior radiology residents, from 3 academic institutions. We found that CheXNeXt achieved radiologist-level performance on 11 pathologies and did not achieve radiologist-level performance on 3 pathologies. The radiologists achieved statistically significantly higher AUC performance on cardiomegaly, emphysema, and hiatal hernia, with AUCs of 0.888 (95% confidence interval [CI] 0.863-0.910), 0.911 (95% CI 0.866-0.947), and 0.985 (95% CI 0.974-0.991), respectively, whereas CheXNeXt's AUCs were 0.831 (95% CI 0.790-0.870), 0.704 (95% CI 0.567-0.833), and 0.851 (95% CI 0.785-0.909), respectively. CheXNeXt performed better than radiologists in detecting atelectasis, with an AUC of 0.862 (95% CI 0.825-0.895), statistically significantly higher than radiologists' AUC of 0.808 (95% CI 0.777-0.838); there were no statistically significant differences in AUCs for the other 10 pathologies. The average time to interpret the 420 images in the validation set was substantially longer for the radiologists (240 minutes) than for CheXNeXt (1.5 minutes). The main limitations of our study are that neither CheXNeXt nor the radiologists were permitted to use patient history or review prior examinations and that evaluation was limited to a dataset from a single institution. CONCLUSIONS: In this study, we developed and validated a deep learning algorithm that classified clinically important abnormalities in chest radiographs at a performance level comparable to practicing radiologists. Once tested prospectively in clinical settings, the algorithm could have the potential to expand patient access to chest radiograph diagnostics.
View details for DOI 10.1371/journal.pmed.1002686
View details for PubMedID 30457988
Migrating to the Modern PACS: Challenges and Opportunities.
Radiographics : a review publication of the Radiological Society of North America, Inc
2018; 38 (6): 1761–72
With progressive advancements in picture archiving and communication system (PACS) technology, radiology practices frequently look toward system upgrades and replacements to further improve efficiency and capabilities. The transition between PACS has the potential to derail the operations of a radiology department. Careful planning and attention to detail from radiology informatics leaders are imperative to ensure a smooth transition. This article is a review of the architecture of a modern PACS, highlighting areas of recent innovation. Key considerations for planning a PACS migration and important issues to consider in data migration, change management, and business continuity are discussed. Beyond the technical aspects of a PACS migration, the human factors to consider when managing the cultural change that accompanies a new informatics tool and the keys to success when managing technical failures are explored. Online supplemental material is available for this article. ©RSNA, 2018.
View details for DOI 10.1148/rg.2018180161
View details for PubMedID 30303805
Performance of a Deep-Learning Neural Network Model in Assessing Skeletal Maturity on Pediatric Hand Radiographs
2018; 287 (1): 313–22
Purpose: To compare the performance of a deep-learning bone age assessment model based on hand radiographs with that of expert radiologists and that of existing automated models. Materials and Methods: The institutional review board approved the study. A total of 14 036 clinical hand radiographs and corresponding reports were obtained from two children's hospitals to train and validate the model. For the first test set, composed of 200 examinations, the mean of bone age estimates from the clinical report and three additional human reviewers was used as the reference standard. Overall model performance was assessed by comparing the root mean square (RMS) and mean absolute difference (MAD) between the model estimates and the reference standard bone ages. Ninety-five percent limits of agreement were calculated in a pairwise fashion for all reviewers and the model. The RMS of a second test set composed of 913 examinations from the publicly available Digital Hand Atlas was compared with published reports of an existing automated model. Results: The mean difference between bone age estimates of the model and of the reviewers was 0 years, with a mean RMS and MAD of 0.63 and 0.50 years, respectively. The estimates of the model, the clinical report, and the three reviewers were within the 95% limits of agreement. RMS for the Digital Hand Atlas data set was 0.73 years, compared with 0.61 years of a previously reported model. Conclusion: A deep-learning convolutional neural network model can estimate skeletal maturity with accuracy similar to that of an expert radiologist and to that of existing automated models. © RSNA, 2017. An earlier incorrect version of this article appeared online. This article was corrected on January 19, 2018.
View details for DOI 10.1148/radiol.2017170236
View details for Web of Science ID 000427992600038
View details for PubMedID 29095675
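The agreement statistics named in the skeletal maturity abstract above, the root mean square (RMS) difference and the 95% limits of agreement, can be illustrated with a short Python sketch. The abstract does not state how the limits were computed; the sketch assumes the common Bland-Altman form (mean difference ± 1.96 standard deviations of the differences), and the function names and sample values are hypothetical rather than study data.

```python
from math import sqrt
from statistics import mean, stdev


def rms_difference(estimates, reference):
    """Root mean square of the paired differences (same unit as the inputs)."""
    return sqrt(mean((e - r) ** 2 for e, r in zip(estimates, reference)))


def limits_of_agreement(estimates, reference):
    """Assumed Bland-Altman 95% limits: mean difference +/- 1.96 * sample SD."""
    diffs = [e - r for e, r in zip(estimates, reference)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias - spread, bias + spread


# Hypothetical bone ages in years (illustrative only, not study data).
model_years = [10.0, 8.5, 12.0, 5.0]
reference_years = [10.5, 8.0, 12.5, 5.5]

rms = rms_difference(model_years, reference_years)
low, high = limits_of_agreement(model_years, reference_years)
print(f"RMS = {rms:.2f} years")                              # RMS = 0.50 years
print(f"95% limits of agreement: {low:.2f} to {high:.2f}")   # -1.23 to 0.73
```

RMS penalizes large errors more heavily than the mean absolute difference does, which is why an abstract reporting both will generally show an RMS at least as large as the MAD.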
Translational Radiomics: Defining the Strategy Pipeline and Considerations for Application-Part 2: From Clinical Implementation to Enterprise
JOURNAL OF THE AMERICAN COLLEGE OF RADIOLOGY
2018; 15 (3): 543–49
Enterprise imaging has channeled various technological innovations to the field of clinical radiology, ranging from advanced imaging equipment and postacquisition iterative reconstruction tools to image analysis and computer-aided detection tools. More recently, the advancement in the field of quantitative image analysis coupled with machine learning-based data analytics, classification, and integration has ushered in the era of radiomics, a paradigm shift that holds tremendous potential in clinical decision support as well as drug discovery. However, there are important issues to consider to incorporate radiomics into a clinically applicable system and a commercially viable solution. In this two-part series, we offer insights into the development of the translational pipeline for radiomics from methodology to clinical implementation (Part 1) and from that point to enterprise development (Part 2). In Part 2 of this two-part series, we study the components of the strategy pipeline, from clinical implementation to building enterprise solutions.
View details for DOI 10.1016/j.jacr.2017.12.006
View details for Web of Science ID 000427667000011
View details for PubMedID 29366598
- Data Science: Big Data, Machine Learning, and Artificial Intelligence JOURNAL OF THE AMERICAN COLLEGE OF RADIOLOGY 2018; 15 (3): 497–98
Translational Radiomics: Defining the Strategy Pipeline and Considerations for Application-Part 1: From Methodology to Clinical Implementation
JOURNAL OF THE AMERICAN COLLEGE OF RADIOLOGY
2018; 15 (3): 538–42
Enterprise imaging has channeled various technological innovations to the field of clinical radiology, ranging from advanced imaging equipment and postacquisition iterative reconstruction tools to image analysis and computer-aided detection tools. More recently, the advancements in the field of quantitative image analysis coupled with machine learning-based data analytics, classification, and integration have ushered us into the era of radiomics, which has tremendous potential in clinical decision support as well as drug discovery. There are important issues to consider to incorporate radiomics as a clinically applicable system and a commercially viable solution. In this two-part series, we offer insights into the development of the translational pipeline for radiomics from methodology to clinical implementation (Part 1) and from that to enterprise development (Part 2).
View details for DOI 10.1016/j.jacr.2017.12.008
View details for Web of Science ID 000427667000010
View details for PubMedID 29366600
Imaging before 24 weeks gestation can predict neonatal respiratory morbidity in pregnancies complicated by fetal lung masses
MOSBY-ELSEVIER. 2018: S287–S288
View details for Web of Science ID 000422946900478
Evaluating the Effect of Unstructured Clinical Information on Clinical Decision Support Appropriateness Ratings.
Journal of the American College of Radiology
2017; 14 (6): 737-743
To determine the appropriateness rating (AR) of advanced inpatient imaging requests that were not rated by prospective, point-of-care clinical decision support (CDS) using computerized provider order entry. During 30-day baseline and intervention periods, CDS generated an AR for advanced inpatient imaging requests (nuclear medicine, CT, and MRI) using provider-selected structured indications from pull-down menus in the computerized provider order entry portal. The AR was only displayed during the intervention, and providers were required to acknowledge the AR to finalize the request. Subsequently, the unstructured free text information accompanying all requests was reviewed, and the AR was revised when possible. The percentage of unrated requests and the overall AR, before and after radiologist review, were compared between periods and by provider type. CDS software prospectively generated an AR for only 25.4% and 28.4% of baseline and intervention imaging requests, respectively; however, radiologist review generated an AR for 82.4% and 93.6% of the same requests. During the respective periods, the percentage of baseline and intervention imaging requests considered appropriate was 18.7% and 22.9% by prospective CDS software rating and increased to 82.4% and 88.7% with radiologist review. Despite limited effective use of CDS software, the percentage of requests containing additional, relevant clinical information increased, and the majority of requests had overall high appropriateness when reviewed by a radiologist. Additional work is needed to improve the amount and quality of clinical information available to CDS software and to facilitate the entry of this information by appropriate end users.
View details for DOI 10.1016/j.jacr.2017.02.003
View details for PubMedID 28434848
Concierge and Second-Opinion Radiology: Review of Current Practices.
Current problems in diagnostic radiology
2016; 45 (2): 111-114
Radiology's core assets include the production, interpretation, and distribution of quality imaging studies. Second-opinion services and concierge practices in radiology aim to augment traditional services by providing patient-centered and physician-centered care, respectively. Patient centeredness enhances patients' understanding and comfort with their radiology tests and procedures and allows them to make better decisions about their health care. As the fee-for-service paradigm shifts to value-based care models, radiology practices have begun to diversify imaging service delivery and communication to coincide with the American College of Radiology Imaging 3.0 campaign. Physician-centered consultation allows for communication of evidence-based guidelines to assist referring physicians and other providers in making the most appropriate imaging or treatment decision for a specific clinical condition. There are disparate practice models and payment schema for the various second-opinion and concierge practices. This review article explores the current state and payment models of second-opinion and concierge practices in radiology. This review also includes a discussion on the benefits, roadblocks, and ethical issues that surround these novel types of practices.
View details for DOI 10.1067/j.cpradiol.2015.07.011
View details for PubMedID 26305521
Systematic Literature Review of Imaging Features of Spinal Degeneration in Asymptomatic Populations
AMERICAN JOURNAL OF NEURORADIOLOGY
2015; 36 (4): 811-816
Degenerative changes are commonly found in spine imaging but often occur in pain-free individuals as well as those with back pain. We sought to estimate the prevalence, by age, of common degenerative spine conditions by performing a systematic review studying the prevalence of spine degeneration on imaging in asymptomatic individuals. We performed a systematic review of articles reporting the prevalence of imaging findings (CT or MR imaging) in asymptomatic individuals from published English literature through April 2014. Two reviewers evaluated each manuscript. We selected age groupings by decade (20, 30, 40, 50, 60, 70, 80 years), determining age-specific prevalence estimates. For each imaging finding, we fit a generalized linear mixed-effects model for the age-specific prevalence estimate clustering in the study, adjusting for the midpoint of the reported age interval. Thirty-three articles reporting imaging findings for 3110 asymptomatic individuals met our study inclusion criteria. The prevalence of disk degeneration in asymptomatic individuals increased from 37% of 20-year-old individuals to 96% of 80-year-old individuals. Disk bulge prevalence increased from 30% of those 20 years of age to 84% of those 80 years of age. Disk protrusion prevalence increased from 29% of those 20 years of age to 43% of those 80 years of age. The prevalence of annular fissure increased from 19% of those 20 years of age to 29% of those 80 years of age. Imaging findings of spine degeneration are present in high proportions of asymptomatic individuals, increasing with age. Many imaging-based degenerative features are likely part of normal aging and unassociated with pain. These imaging findings must be interpreted in the context of the patient's clinical condition.
View details for DOI 10.3174/ajnr.A4173
View details for Web of Science ID 000352512400038
View details for PubMedID 25430861
The Effect of Clinical Decision Support for Advanced Inpatient Imaging
JOURNAL OF THE AMERICAN COLLEGE OF RADIOLOGY
2015; 12 (4): 358-363
To examine the effect of integrating point-of-care clinical decision support (CDS) using the ACR Appropriateness Criteria (AC) into an inpatient computerized provider order entry (CPOE) system for advanced imaging requests. Over 12 months, inpatient CPOE requests for nuclear medicine, CT, and MRI were processed by CDS to generate an AC score using provider-selected data from pull-down menus. During the second 6-month period, AC scores were displayed to ordering providers, and acknowledgement was required to finalize a request. Request AC scores and percentages of requests not scored by CDS were compared among primary care providers (PCPs) and specialists, and by years in practice of the responsible physician of record. CDS prospectively generated a score for 26.0% and 30.3% of baseline and intervention requests, respectively. The average AC score increased slightly for all requests (7.2 ± 1.6 versus 7.4 ± 1.5; P < .001), for PCPs (6.9 ± 1.9 versus 7.4 ± 1.6; P < .001), and minimally for specialists (7.3 ± 1.6 versus 7.4 ± 1.5; P < .001). The percentage of requests lacking sufficient structured clinical information to generate an AC score decreased for all requests (73.1% versus 68.9%; P < .001), for PCPs (78.0% versus 71.7%; P < .001), and for specialists (72.9% versus 69.1%; P < .001). Integrating CDS into inpatient CPOE slightly increased the overall AC score of advanced imaging requests as well as the provision of sufficient structured data to automatically generate AC scores. Both effects were more pronounced in PCPs compared with specialists.
View details for DOI 10.1016/j.jacr.2014.11.013
View details for Web of Science ID 000352181000011
View details for PubMedID 25622766
Improving the Application of Imaging Clinical Decision Support Tools: Making the Complex Simple
JOURNAL OF THE AMERICAN COLLEGE OF RADIOLOGY
2014; 11 (3): 257-261
With the promotion and incentivization of electronic health records and computerized order entry by CMS, there is a unique opportunity to catalyze the use of evidence-based guidelines with the inclusion of clinical decision support (CDS) tools. Imaging CDS tools have evolved from static paper algorithms, checklists, and scores to interactive systems that provide feedback and recommendations with the intent of directing health care providers to deliver best practices. Some of the major limitations of first generation imaging CDS tools include a lack of comprehensive evidence-based guidelines, limited ability to input detailed patient conditions and symptoms, and time-intensive user interfaces. Next-generation imaging CDS tools will attempt to close the information and interface gaps to provide more meaningful guidance to health care providers and improve the delivery of best practices to patients.
View details for DOI 10.1016/j.jacr.2013.10.007
View details for Web of Science ID 000332354800015
View details for PubMedID 24589400