Curtis Langlotz

Senior Associate Vice Provost for Research, Professor of Radiology (Integrative Biomedical Imaging Informatics), of Medicine (BMIR), of Biomedical Data Science and Senior Fellow at the Stanford Institute for Human-Centered AI

Practices at Stanford Health Care

Bio

Dr. Langlotz is Professor of Radiology, Medicine, and Biomedical Data Science and Senior Associate Vice Provost for Research at Stanford University. His laboratory investigates the use of deep neural networks and other machine learning technologies to detect disease and eliminate diagnostic errors through analysis of medical images and clinical notes. He is a Senior Fellow at Stanford’s Institute for Human-Centered Artificial Intelligence and Director of the Center for Artificial Intelligence in Medicine and Imaging (AIMI Center), which supports over 150 Stanford faculty conducting interdisciplinary artificial intelligence research that optimizes how clinical data are used to promote health.

He has published over 200 scholarly articles and is author of the book “The Radiology Report: A Guide to Thoughtful Communication for Radiologists and Other Medical Professionals”. He has led many national and international efforts to improve the quality of radiology communication, including the RadLex™ terminology standard, the RadLex™ Playbook of radiology exam codes, the RSNA report template library, and a technical standard for communication of radiology templates.

Raised in St. Paul, Minnesota, Dr. Langlotz received his undergraduate degree in Human Biology, Master’s in Computer Science, MD in Medicine, and PhD in Medical Information Science, all from Stanford University. He is a founder and past president of the Radiology Alliance for Health Services Research (RAHSR) and has served as president of the Society for Imaging Informatics in Medicine (SIIM) and the College of SIIM Fellows. He is a former board member of the Association of University Radiologists (AUR), the American Medical Informatics Association (AMIA), and the Society for Medical Decision Making (SMDM). He currently serves as President of the Radiological Society of North America (RSNA).

Dr. Langlotz is a recipient of the Lee B. Lusted Research Prize from the Society for Medical Decision Making and the Career Achievement Award from the Radiology Alliance for Health Services Research. He and his trainees have received numerous scientific awards, including seven best paper awards and five research career development grants. He has founded several healthcare information technology companies, including Montage Healthcare Solutions, which was acquired by Nuance Communications in 2016.

Clinical Focus

Diagnostic Radiology

Academic Appointments

Professor - University Medical Line, Radiology
Professor - University Medical Line, Medicine - Biomedical Informatics Research
Professor - University Medical Line, Department of Biomedical Data Science
Senior Fellow, Institute for Human-Centered Artificial Intelligence (HAI)
Member, Bio-X

Administrative Appointments

Director, Center for Artificial Intelligence in Medicine and Imaging (2018 - Present)
Associate Director, Institute for Human-Centered Artificial Intelligence (2023 - Present)
Physician Lead, Imaging AI, Stanford Health Care (2024 - Present)
Associate Chair for Information Systems, Department of Radiology (2014 - 2024)
Medical Informatics Director for Radiology, Stanford Health Care (2014 - 2024)

Honors & Awards

Lee B. Lusted Research Prize, Society for Medical Decision Making (1986)
GERRAF Career Development Award, Association of University Radiologists (1993)
Best Information Technology Company, New Jersey Technology Council (2001)
Fellow, American College of Medical Informatics (2008)
Fellow, Society for Imaging Informatics in Medicine (2010)
Lifetime Achievement Award, Radiology Alliance for Health Services Research (2017)

Boards, Advisory Committees, Professional Organizations

President, Radiological Society of North America (2024 - 2025)
Chair of the Board, Radiological Society of North America (2022 - 2023)
Board of Directors, Radiological Society of North America (2016 - 2023)
President, College of Imaging Informatics Fellows (2015 - 2018)
President, Society for Imaging Informatics in Medicine (2006 - 2008)
Board of Directors, Association of University Radiologists (2003 - 2010)
Founder and President, Radiology Alliance for Health Services Research (2002 - 2004)
Board of Trustees, Society for Medical Decision Making (1996 - 1998)

Professional Education

Medical Education: Stanford University School of Medicine (1989) CA
Residency: Hospital of the University of Pennsylvania, Department of Radiology (1994) PA
Internship: Hospital of the University of Pennsylvania Primary Care (1990) PA
AB, Stanford University, Human Biology (1981)
MS, Stanford University, Computer Science: Artificial Intelligence (1983)
MD, Stanford University, Medicine (1989)
PhD, Stanford University, Medical Information Sciences (1989)
Board Certification: American Board of Radiology, Diagnostic Radiology (1994)

Patents

Curtis Langlotz. "United States Patent 6,366,683 Apparatus and Method for Recording Image Analysis Information", Radiological Society of North America, Apr 2, 2002

Contact

Academic
langlotz@stanford.edu
University - Faculty Department: Rad/Integrative Biomedical Imaging Informatics at Stanford Position: Professor-Univ Med Line
- H1330D, Stanford Hospital and Clinics
- 300 Pasteur Drive
- Stanford, California 94305

Administrative Contact Jacqueline Thomas thomajr@stanford.edu
- 650-497-5880 (office)

Clinical (Primary) Stanford Hospital and Clinics 300 Pasteur Dr Rm H1330D Stanford, CA 94305
- (650) 498-4797 (office)
- (650) 723-6717 (fax)

Additional Clinical Info

Stanford Health Care

Additional Info

Mail Code: 5621
Other Names: Curt Langlotz
ORCID: https://orcid.org/0000-0002-8972-8051

Current Research and Scholarly Interests

My laboratory develops machine learning methods to help physicians detect disease and eliminate diagnostic errors. We are developing neural network systems that detect and classify disease on medical images. We also develop natural language processing methods that use the narrative radiology report for contrastive learning and other multi-modal methods that improve the accuracy and capability of machine learning systems. We are committed to the clinical evaluation and use of ideas conceived in the laboratory. When our results show potential, we disseminate them as open source or commercial software.

Clinical Trials

Prospective Evaluation of Machine Learning Models for Radiology (Not Recruiting)

The purpose of this study is to understand the effects of using an Artificial Intelligence algorithm for skeletal age estimation as a computer-aided diagnosis (CADx) system. In this prospective real-time study, the investigators will send de-identified hand radiographs to the Artificial Intelligence algorithm and surface the output of this algorithm to the radiologist, who will incorporate this information with their normal workflows to make an estimation of the bone age. All radiologists involved in the study will be trained to recognize the surfaced prediction as the output of the Artificial Intelligence algorithm. The radiologists' diagnosis will be final and considered independent of the output of the algorithm.

Stanford is currently not accepting patients for this trial. For more information, please contact Safwan Halabi, M.D., (650) 721-2850.

2025-26 Courses

Independent Studies (7)
- Directed Reading
  BMDS 299 (Aut, Win, Spr)
- Directed Reading in Radiology
  RAD 299 (Aut, Win, Spr, Sum)
- Early Clinical Experience in Radiology
  RAD 280 (Aut, Win, Spr, Sum)
- Graduate Research
  RAD 399 (Aut, Win, Spr, Sum)
- Medical Scholars Research
  RAD 370 (Aut, Win, Spr, Sum)
- Readings in Radiology Research
  RAD 101 (Aut, Win, Spr, Sum)
- Undergraduate Research
  RAD 199 (Aut, Win, Spr, Sum)

Stanford Advisees

Postdoctoral Faculty Sponsor
Yunhe Gao, Chong Wang
Doctoral Dissertation Advisor (NonAC)
Maya Varma
Doctoral (Program)
Eva Prakash, Maya Varma

Graduate and Fellowship Programs

Biomedical Data Science (Masters Program)
Biomedical Data Science (PhD Program)
Clinical Informatics (Fellowship Program)

All Publications

Large language models for simplifying radiology reports: a systematic review and meta-analysis of patient, public, and clinician evaluations. The Lancet. Digital health Alabed, S., Anderson, A., Maiter, A., Hughes, A., McAnenly, N., Salehi, M., Sharkey, M., Dwivedi, K., Hokmabadi, A., Alahdab, F., Stevenson, M., Ma, N., Gaizauskas, R., Chico, T. J., Swift, A. J., Li, J. J., Kleesiek, J., Langlotz, C. 2026: 100960

Abstract

Radiology reports are typically written in language that is difficult for patients to understand. Large language models (LLMs) excel at simplifying text. We aimed to evaluate the ability of LLMs to improve the understanding of radiology reports. In this systematic review and meta-analysis, we searched CENTRAL, MEDLINE, and Embase from inception to Nov 11, 2025, without restrictions on language. Full-text articles and preprints were considered for inclusion. Eligible studies applied LLMs to simplify radiology reports and had these reports assessed by members of the public or medical professionals. We excluded studies that focused solely on dialogues with interactive chatbots, preimaging leaflets, educational materials, appointment letters, or summarising findings without simplifying them for patients. Search results were screened independently by two authors and full-text review and data extraction were done by three authors; disagreements were resolved by consensus. The main outcomes were patient, public, and clinician evaluations (Likert scores) and text readability metrics. We assessed study quality with the MAIC-10 tool. This study was registered with PROSPERO (CRD420251027489). We identified 2385 records, of which 38 studies were eligible. These 38 studies generated 12 922 simplified reports, assessed by 508 evaluators (387 lay people and 121 clinicians). 35 (92%) of 38 studies used OpenAI GPT models and 29 (76%) produced simplified reports in English. Patients perceived LLM-rewritten reports as significantly more understandable than radiologist reports (mean Likert score 4·04 [SD 1·20] for simplified reports vs 2·16 [SD 0·94] for original reports; mean difference 2·00 [95% CI 1·54-2·46]). Clinicians rated LLM-rewritten reports highly for accuracy (mean 4·45 [95% CI 4·27-4·63]; 27 studies) and completeness (mean 4·53 [95% CI 4·30-4·76]; 14 studies). Readability was improved across imaging modalities, with lower Flesch-Kincaid Grade Level for LLM-rewritten reports, including a mean difference of -6·20 (95% CI -6·91 to -5·48) for CT, -5·07 (-5·99 to -4·15) for x-ray, and -5·0 (-6·0 to -4·0) for MRI. The error rate in LLM-rewritten reports was 7·2% (95% CI 5·1%-10·0%; 13 studies) and 0·9% (95% CI 0·6-1·5%; 2 studies) for clinically significant errors. LLM-simplified radiology reports improved patient-perceived understanding and readability and were rated by clinicians as largely accurate and complete, although a small proportion contained clinically significant errors. LLM-based simplification shows promise for making radiology communication more patient-centred, but further evaluation of its effect on patient outcomes and clinical workflows is required. Funding: National Institute for Health and Care Research Sheffield Biomedical Research Centre.
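As a rough illustration of the readability metric named in this abstract, the sketch below computes the standard Flesch-Kincaid Grade Level formula, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. It is not the tooling used in the review: the syllable counter is a naive vowel-group heuristic chosen for brevity, and the two sample sentences are invented for this example.

```python
import re


def count_syllables(word: str) -> int:
    """Naive heuristic (an assumption): one syllable per run of vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level using the standard published coefficients."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


# Invented sample sentences, not taken from any study.
original = "There is a consolidative opacity in the right lower lobe."
simplified = "There is a cloudy area in the lower right lung."

grade_original = flesch_kincaid_grade(original)
grade_simplified = flesch_kincaid_grade(simplified)
print(f"original: {grade_original:.1f}, simplified: {grade_simplified:.1f}")
```

A lower grade level indicates text that should be readable with fewer years of schooling, which is why the mean differences reported in the abstract are negative.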

View details for DOI 10.1016/j.landig.2025.100960

View details for PubMedID 41698858

The Effect of AI on the Radiologist Workforce: A Task-Based Analysis. medRxiv : the preprint server for health sciences Langlotz, C. P. 2025

Abstract

The effect of AI algorithms on the radiology workforce has been a subject of commentary and controversy. There is now sufficient published evidence to support a quantitative task-based analysis to predict these effects. To construct a quantitative, task-based model to predict the effect of AI on the radiology workforce using the best available evidence. We reviewed the literature to establish the tasks on which radiologists spend their time. We then developed categories of AI applications that could affect these tasks. We used published evidence to estimate the effect of each AI application on each radiology task using a 5-year time horizon. When published evidence was unavailable, we used our own judgment. The model projects a 33% reduction in hours worked by radiologists in 5 years, with a range of 14% to 49%. The main effects are due to radiology report drafting for all modalities and study delegation for radiography and mammography. AI applications likely will cause a significant decrease in radiologist hours worked. Given the relatively static radiology workforce and the continued growth in imaging volumes, radiologist job loss is unlikely for the foreseeable future.
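The arithmetic behind a task-based projection of this kind can be sketched as a weighted sum: each task's share of working time is multiplied by the fraction of that time an AI application is assumed to save. The sketch below is only a guess at the general shape of such a model. The task names, time shares, and savings fractions are hypothetical placeholders and are not the estimates used in the preprint.

```python
def projected_reduction(time_share: dict[str, float],
                        savings: dict[str, float]) -> float:
    """Overall fractional reduction in hours: sum of share x savings per task."""
    total = sum(time_share.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("time shares must sum to 1")
    # Tasks with no entry in `savings` are assumed to be unaffected.
    return sum(share * savings.get(task, 0.0)
               for task, share in time_share.items())


# Hypothetical inputs for illustration only.
time_share = {"report_drafting": 0.5, "image_review": 0.3, "consultation": 0.2}
savings = {"report_drafting": 0.4, "image_review": 0.1}

overall = projected_reduction(time_share, savings)
print(f"Projected reduction in hours worked: {overall:.0%}")
```

Running the same function with pessimistic and optimistic savings fractions would give a low and high bound, which is one plausible way to arrive at a range like the one reported in the abstract.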

View details for DOI 10.64898/2025.12.20.25342714

View details for PubMedID 41480022

View details for PubMedCentralID PMC12755265

Generative artificial intelligence in medicine. Nature medicine Teo, Z. L., Thirunavukarasu, A. J., Elangovan, K., Cheng, H., Moova, P., Soetikno, B., Nielsen, C., Pollreisz, A., Ting, D. S., Morris, R. J., Shah, N. H., Langlotz, C. P., Ting, D. S. 2025

Abstract

Generative artificial intelligence (GAI) can automate a growing number of biomedical tasks, ranging from clinical decision support to design and analysis of research studies. GAI uses machine learning and transformer model architectures to generate useful text, images and sound data in response to user queries. While previous biomedical deep-learning applications have used general-purpose datasets and enormous volumes of labeled data for training, evidence now suggests that GAI models may perform better while requiring less training data-for example, using smaller, domain-specific datasets. Moreover, AI techniques have progressed from fully supervised training to less label-intensive approaches, such as weakly supervised or unsupervised fine-tuning and reinforcement learning. Recent iterations of GAI, such as agents, mixture-of-expert models and reasoning models, have further extended their capabilities to assist with complex and multistage tasks. Here, we provide an overview of recent technical advancements in GAI. We explore the potential of the latest generation of models to improve healthcare for clinicians and patients, and discuss validation approaches using specific examples to illustrate challenges and opportunities for further work.

View details for DOI 10.1038/s41591-025-03983-2

View details for PubMedID 41053447

View details for PubMedCentralID 12082058

RadGPT: A system based on a large language model that generates sets of patient-centered materials to explain radiology report information. Journal of the American College of Radiology : JACR Herwald, S. E., Shah, P., Johnston, A., Olsen, C., Delbrouck, J., Langlotz, C. P. 2025

Abstract

OBJECTIVE: The Cures Act Final Rule requires that patients have real-time access to their radiology reports, which contain technical language. Our objective was to use a novel system called RadGPT, which integrates concept extraction and a large language model (LLM), to help patients understand their radiology reports. METHODS: RadGPT generated 150 concept explanations and 390 question-and-answer pairs from 30 radiology report impressions from between 2012 and 2020. The extracted concepts were used to create concept-based explanations, as well as concept-based question-and-answer pairs where questions were generated using either a fixed template or an LLM. Additionally, report-based question-and-answer pairs were generated directly from the impression using an LLM without concept extraction. One board-certified radiologist and 4 radiology residents rated the material quality using a standardized rubric. RESULTS: Concept-based LLM-generated questions were significantly higher quality than concept-based template-generated questions (p < 0.001). Excluding those template-based question-and-answer pairs from further analysis, nearly all (> 95%) of RadGPT-generated materials were rated highly, with at least 50% receiving the highest possible ranking from all 5 raters. No answers or explanations were rated as likely to affect the safety or effectiveness of patient care. Report-level LLM-based questions and answers were rated particularly highly, with 92% of report-level LLM-based questions and 61% of the corresponding report-level answers receiving the highest rating from all raters. DISCUSSION: The educational tool RadGPT generated high-quality explanations and question-and-answer pairs that were personalized for each radiology report, unlikely to produce harmful explanations and likely to enhance patient understanding of radiology information.

View details for DOI 10.1016/j.jacr.2025.06.013

View details for PubMedID 40505763

Best Practices for Large Language Models in Radiology. Radiology Bluethgen, C., Van Veen, D., Zakka, C., Link, K. E., Fanous, A. H., Daneshjou, R., Frauenfelder, T., Langlotz, C. P., Gatidis, S., Chaudhari, A. 2025; 315 (1): e240528

Abstract

Radiologists must integrate complex imaging data with clinical information to produce actionable insights. This task requires a nuanced application of language across many activities, including managing clinical requests, analyzing imaging findings in the context of clinical data, interpreting these through the radiologist's lens, and effectively documenting and communicating the outcomes. Radiology practices must ensure reliable communication among numerous systems and stakeholders critical for medical decision-making. Large language models (LLMs) offer an opportunity to improve the management and interpretation of the vast amounts of text data in radiology. Despite being developed as general-purpose tools, these advanced computational models demonstrate impressive capabilities in specialized tasks, even without specific training. Unlocking the potential of LLMs for radiology requires an understanding of their foundations and a strategic approach to navigate their idiosyncrasies. This review, drawing from practical radiology and machine learning expertise, provides general and technically adept radiologists insight into the potential of LLMs in radiology. It also equips those interested in implementing applicable best practices that have so far stood the test of time in the rapidly evolving landscape of LLMs. The review provides practical advice for optimizing LLM characteristics for radiology practices, including advice on limitations, effective prompting, and fine-tuning strategies.

View details for DOI 10.1148/radiol.240528

View details for PubMedID 40298602

Developing a Research Center for Artificial Intelligence in Medicine. Mayo Clinic proceedings. Digital health Langlotz, C. P., Kim, J., Shah, N., Lungren, M. P., Larson, D. B., Datta, S., Li, F. F., O'Hara, R., Montine, T. J., Harrington, R. A., Gold, G. E. 2024; 2 (4): 677-686

Abstract

Artificial intelligence (AI) and machine learning (ML) are driving innovation in biosciences and are already affecting key elements of medical scholarship and clinical care. Many schools of medicine are capitalizing on the promise of these new technologies by establishing academic units to catalyze and grow research and innovation in AI/ML. At Stanford University, we have developed a successful model for an AI/ML research center with support from academic leaders, clinical departments, extramural grants, and industry partners. The Center for Artificial Intelligence in Medicine and Imaging uses the following 4 key tactics to support AI/ML research: project-based learning opportunities that build interdisciplinary collaboration; internal grant programs that catalyze extramural funding; infrastructure that facilitates the rapid creation of large multimodal AI-ready clinical data sets; and educational and open data programs that engage the broader research community. The center is based on the premise that foundational and applied research are not in tension but instead are complementary. Solving important biomedical problems with AI/ML requires high-quality foundational team science that incorporates the knowledge and expertise of clinicians, clinician scientists, computer scientists, and data scientists. As AI/ML becomes an essential component of research and clinical care, multidisciplinary centers of excellence in AI/ML will become a key part of the scholarly portfolio of academic medical centers and will provide a foundation for the responsible, ethical, and fair implementation of AI/ML systems.

View details for DOI 10.1016/j.mcpdig.2024.07.005

View details for PubMedID 39802660

View details for PubMedCentralID PMC11720458

A vision-language foundation model for the generation of realistic chest X-ray images. Nature biomedical engineering Bluethgen, C., Chambon, P., Delbrouck, J. B., van der Sluijs, R., Połacin, M., Zambrano Chaves, J. M., Abraham, T. M., Purohit, S., Langlotz, C. P., Chaudhari, A. S. 2024

Abstract

The paucity of high-quality medical imaging datasets could be mitigated by machine learning models that generate compositionally diverse images that faithfully represent medical concepts and pathologies. However, large vision-language models are trained on natural images, and the diversity distribution of the generated images substantially differs from that of medical images. Moreover, medical language involves specific and semantically rich vocabulary. Here we describe a domain-adaptation strategy for large vision-language models that overcomes distributional shifts. Specifically, by leveraging publicly available datasets of chest X-ray images and the corresponding radiology reports, we adapted a latent diffusion model pre-trained on pairs of natural images and text descriptors to generate diverse and visually plausible synthetic chest X-ray images (as confirmed by board-certified radiologists) whose appearance can be controlled with free-form medical text prompts. The domain-adaptation strategy for the text-conditioned synthesis of medical images can be used to augment training datasets and is a viable alternative to the sharing of real medical images for model training and fine-tuning.

View details for DOI 10.1038/s41551-024-01246-y

View details for PubMedID 39187663

View details for PubMedCentralID 10131505

Adapted large language models can outperform medical experts in clinical text summarization. Nature medicine Van Veen, D., Van Uden, C., Blankemeier, L., Delbrouck, J. B., Aali, A., Bluethgen, C., Pareek, A., Polacin, M., Reis, E. P., Seehofnerová, A., Rohatgi, N., Hosamani, P., Collins, W., Ahuja, N., Langlotz, C. P., Hom, J., Gatidis, S., Pauly, J., Chaudhari, A. S. 2024

Abstract

Analyzing vast textual data and summarizing key information from electronic health records imposes a substantial burden on how clinicians allocate their time. Although large language models (LLMs) have shown promise in natural language processing (NLP) tasks, their effectiveness on a diverse range of clinical summarization tasks remains unproven. Here we applied adaptation methods to eight LLMs, spanning four distinct clinical summarization tasks: radiology reports, patient questions, progress notes and doctor-patient dialogue. Quantitative assessments with syntactic, semantic and conceptual NLP metrics reveal trade-offs between models and adaptation methods. A clinical reader study with 10 physicians evaluated summary completeness, correctness and conciseness; in most cases, summaries from our best-adapted LLMs were deemed either equivalent (45%) or superior (36%) compared with summaries from medical experts. The ensuing safety analysis highlights challenges faced by both LLMs and medical experts, as we connect errors to potential medical harm and categorize types of fabricated information. Our research provides evidence of LLMs outperforming medical experts in clinical text summarization across multiple tasks. This suggests that integrating LLMs into clinical workflows could alleviate documentation burden, allowing clinicians to focus more on patient care.

View details for DOI 10.1038/s41591-024-02855-5

View details for PubMedID 38413730

View details for PubMedCentralID 5593724

RadGraph-XL: A Large-Scale Expert-Annotated Dataset for Entity and Relation Extraction from Radiology Reports Delbrouck, J., Chambon, P., Chen, Z., Varma, M., Johnston, A., Blankemeier, L., Van Veen, D., Bui, T., Steven Truong, Langlotz, C. P. edited by Martins, A., Srikumar, Ku, L. W. ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2024: 12902-12915

View details for Web of Science ID 001391786804035

Organizational Factors in Clinical Data Sharing for Artificial Intelligence in Health Care. JAMA network open Youssef, A., Ng, M. Y., Long, J., Hernandez-Boussard, T., Shah, N., Miner, A., Larson, D., Langlotz, C. P. 2023; 6 (12): e2348422

Abstract

Limited sharing of data sets that accurately represent disease and patient diversity limits the generalizability of artificial intelligence (AI) algorithms in health care. To explore the factors associated with organizational motivation to share health data for AI development. This qualitative study investigated organizational readiness for sharing health data across the academic, governmental, nonprofit, and private sectors. Using a multiple case studies approach, 27 semistructured interviews were conducted with leaders in data-sharing roles from August 29, 2022, to January 9, 2023. The interviews were conducted in the English language using a video conferencing platform. Using a purposive and nonprobabilistic sampling strategy, 78 individuals across 52 unique organizations were identified. Of these, 35 participants were enrolled. Participant recruitment concluded after 27 interviews, as theoretical saturation was reached and no additional themes emerged. Concepts defining organizational readiness for data sharing and the association between data-sharing factors and organizational behavior were mapped through iterative qualitative analysis to establish a framework defining organizational readiness for sharing clinical data for AI development. Interviews included 27 leaders from 18 organizations (academia: 10, government: 7, nonprofit: 8, and private: 2). Organizational readiness for data sharing centered around 2 main constructs: motivation and capabilities. Motivation related to the alignment of an organization's values with data-sharing priorities and was associated with its engagement in data-sharing efforts. However, organizational motivation could be modulated by extrinsic incentives for financial or reputational gains. Organizational capabilities comprised infrastructure, people, expertise, and access to data. Cross-sector collaboration was a key strategy to mitigate barriers to access health data. This qualitative study identified sector-specific factors that may affect the data-sharing behaviors of health organizations. External incentives may bolster cross-sector collaborations by helping overcome barriers to accessing health data for AI development. The findings suggest that tailored incentives may boost organizational motivation and facilitate sustainable flow of health data for AI development.

View details for DOI 10.1001/jamanetworkopen.2023.48422

View details for PubMedID 38113040

The Future of AI and Informatics in Radiology: 10 Predictions. Radiology Langlotz, C. P. 2023; 309 (1): e231114

View details for DOI 10.1148/radiol.231114

View details for PubMedID 37874234

Automated deidentification of radiology reports combining transformer and "hide in plain sight" rule-based methods. Journal of the American Medical Informatics Association : JAMIA Chambon, P. J., Wu, C., Steinkamp, J. M., Adleberg, J., Cook, T. S., Langlotz, C. P. 2022

Abstract

OBJECTIVE: To develop an automated deidentification pipeline for radiology reports that detects protected health information (PHI) entities and replaces them with realistic surrogates "hiding in plain sight." MATERIALS AND METHODS: In this retrospective study, 999 chest X-ray and CT reports collected between November 2019 and November 2020 were annotated for PHI at the token level and combined with 3001 X-rays and 2193 medical notes previously labeled, forming a large multi-institutional and cross-domain dataset of 6193 documents. Two radiology test sets, from a known and a new institution, as well as i2b2 2006 and 2014 test sets, served as an evaluation set to estimate model performance and to compare it with previously released deidentification tools. Several PHI detection models were developed based on different training datasets, fine-tuning approaches and data augmentation techniques, and a synthetic PHI generation algorithm. These models were compared using metrics such as precision, recall and F1 score, as well as paired samples Wilcoxon tests. RESULTS: Our best PHI detection model achieves 97.9 F1 score on radiology reports from a known institution, 99.6 from a new institution, 99.5 on i2b2 2006, and 98.9 on i2b2 2014. On reports from a known institution, it achieves 99.1 recall of detecting the core of each PHI span. DISCUSSION: Our model outperforms all deidentifiers it was compared to on all test sets as well as human labelers on i2b2 2014 data. It enables accurate and automatic deidentification of radiology reports. CONCLUSIONS: A transformer-based deidentification pipeline can achieve state-of-the-art performance for deidentifying radiology reports and other medical documents.
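For readers unfamiliar with the metrics quoted in this abstract, the sketch below shows how precision, recall, and F1 are commonly computed for token-level PHI detection. It is a generic illustration and not the paper's evaluation code. Representing annotations as sets of token positions is an assumption made for simplicity, and the example indices are invented.

```python
def precision_recall_f1(gold: set[int],
                        predicted: set[int]) -> tuple[float, float, float]:
    """Token-level scores; `gold` and `predicted` hold PHI token positions."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented example: tokens 2, 3, 7, 8 are PHI; the model flags 2, 3, 7, 9.
gold_tokens = {2, 3, 7, 8}
predicted_tokens = {2, 3, 7, 9}

precision, recall, f1 = precision_recall_f1(gold_tokens, predicted_tokens)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In deidentification, recall usually matters most, because every missed PHI token is a potential privacy leak, whereas a false positive only removes a harmless word.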

View details for DOI 10.1093/jamia/ocac219

View details for PubMedID 36416419

Implementation of Clinical Artificial Intelligence in Radiology: Who Decides and How? Radiology Daye, D., Wiggins, W. F., Lungren, M. P., Alkasab, T., Kottler, N., Allen, B., Roth, C. J., Bizzo, B. C., Durniak, K., Brink, J. A., Larson, D. B., Dreyer, K. J., Langlotz, C. P. 2022: 212151

Abstract

As the role of artificial intelligence (AI) in clinical practice evolves, governance structures oversee the implementation, maintenance, and monitoring of clinical AI algorithms to enhance quality, manage resources, and ensure patient safety. This article establishes a framework for the infrastructure required for clinical AI implementation and presents a road map for governance. The road map answers four key questions: Who decides which tools to implement? What factors should be considered when assessing an application for implementation? How should applications be implemented in clinical practice? Finally, how should tools be monitored and maintained after clinical implementation? Among the many challenges for the implementation of AI in clinical practice, devising flexible governance structures that can quickly adapt to a changing environment will be essential to ensure quality patient care and practice improvement objectives.

View details for DOI 10.1148/radiol.212151

View details for PubMedID 35916673

Moving Toward Seamless Interinstitutional Electronic Image Transfer. Journal of the American College of Radiology : JACR Larson, D. B., Krishnaraj, A., Mendelson, D. S., Langlotz, C. P., Wald, C. 1800

Abstract

The fact that medical images are still predominately exchanged between institutions via physical media is unacceptable in the era of value-driven health care. Although better solutions are technically possible, problems of coordination and market dynamics may be inhibiting progress more than technical factors. We provide a macrosystem analysis of the problem of interinstitutional medical image exchange and propose a strategy for nudging the market toward a patient-friendly solution. The system can be viewed as a network, with autonomous nodes interconnected by links through which information is exchanged. A variety of potential network configurations include those that depend on individual carriers, peer-to-peer links, one or multiple hubs, or a hybrid of models. We find the linked multihub model, in which individual institutions are connected to other institutions via image exchange companies, to be the configuration most likely to create a patient-friendly electronic image exchange system. To achieve this configuration, image exchange companies, which operate in a competitive marketplace, must exchange images with each other. We call on these vendors to immediately commit to coordinating in this manner. We call on all other stakeholders, including medical societies, payers, and regulators, to actively encourage and facilitate this behavior. Specifically, we call on institutions to create appropriate market incentives by only contracting with image exchange vendors who are committed to begin vendor-to-vendor image exchange by no later than 2024.

View details for DOI 10.1016/j.jacr.2021.11.017

View details for PubMedID 35114138
Automated coronary calcium scoring using deep learning with multicenter external validation. NPJ digital medicine Eng, D., Chute, C., Khandwala, N., Rajpurkar, P., Long, J., Shleifer, S., Khalaf, M. H., Sandhu, A. T., Rodriguez, F., Maron, D. J., Seyyedi, S., Marin, D., Golub, I., Budoff, M., Kitamura, F., Takahashi, M. S., Filice, R. W., Shah, R., Mongan, J., Kallianos, K., Langlotz, C. P., Lungren, M. P., Ng, A. Y., Patel, B. N. 2021; 4 (1): 88

Abstract

Coronary artery disease (CAD), the most common manifestation of cardiovascular disease, remains the most common cause of mortality in the United States. Risk assessment is key for primary prevention of coronary events and coronary artery calcium (CAC) scoring using computed tomography (CT) is one such non-invasive tool. Despite the proven clinical value of CAC, the current clinical practice implementation for CAC has limitations such as the lack of insurance coverage for the test, need for capital-intensive CT machines, specialized imaging protocols, and accredited 3D imaging labs for analysis (including personnel and software). Perhaps the greatest gap is the millions of patients who undergo routine chest CT exams and demonstrate coronary artery calcification, but their presence is not often reported or quantitation is not feasible. We present two deep learning models that automate CAC scoring demonstrating advantages in automated scoring for both dedicated gated coronary CT exams and routine non-gated chest CTs performed for other reasons to allow opportunistic screening. First, we trained a gated coronary CT model for CAC scoring that showed near perfect agreement (mean difference in scores=-2.86; Cohen's Kappa=0.89, P<0.0001) with current conventional manual scoring on a retrospective dataset of 79 patients and was found to perform the task faster (average time for automated CAC scoring using a graphics processing unit (GPU) was 3.5±2.1s vs. 261s for manual scoring) in a prospective trial of 55 patients with little difference in scores compared to three technologists (mean difference in scores=3.24, 5.12, and 5.48, respectively). 
Then, using CAC scores from paired gated coronary CT as a reference standard, we trained a deep learning model on our internal data and a cohort from the Multi-Ethnic Study of Atherosclerosis (MESA) study (total training n=341, Stanford test n=42, MESA test n=46) to perform CAC scoring on routine non-gated chest CT exams with validation on external datasets (total n=303) obtained from four geographically disparate health systems. On identifying patients with any CAC (i.e., CAC≥1), sensitivity and PPV were high across all datasets (ranges: 80-100% and 87-100%, respectively). For CAC≥100 on routine non-gated chest CTs, which is the latest recommended threshold to initiate statin therapy, our model showed sensitivities of 71-94% and positive predictive values in the range of 88-100% across all the sites. Adoption of this model could allow more patients to be screened with CAC scoring, potentially allowing opportunistic early preventive interventions.

View details for DOI 10.1038/s41746-021-00460-1

View details for PubMedID 34075194
Artificial Intelligence Algorithm Improves Radiologist Performance in Skeletal Age Assessment: A Prospective Multicenter Randomized Controlled Trial. Radiology Eng, D. K., Khandwala, N. B., Long, J., Fefferman, N. R., Lala, S. V., Strubel, N. A., Milla, S. S., Filice, R. W., Sharp, S. E., Towbin, A. J., Francavilla, M. L., Kaplan, S. L., Ecklund, K., Prabhu, S. P., Dillon, B. J., Everist, B. M., Anton, C. G., Bittman, M. E., Dennis, R., Larson, D. B., Seekins, J. M., Silva, C. T., Zandieh, A. R., Langlotz, C. P., Lungren, M. P., Halabi, S. S. 2021: 204021

Abstract

Background Previous studies suggest that use of artificial intelligence (AI) algorithms as diagnostic aids may improve the quality of skeletal age assessment, though these studies lack evidence from clinical practice. Purpose To compare the accuracy and interpretation time of skeletal age assessment on hand radiograph examinations with and without the use of an AI algorithm as a diagnostic aid. Materials and Methods In this prospective randomized controlled trial, the accuracy of skeletal age assessment on hand radiograph examinations was performed with (n = 792) and without (n = 739) the AI algorithm as a diagnostic aid. For examinations with the AI algorithm, the radiologist was shown the AI interpretation as part of their routine clinical work and was permitted to accept or modify it. Hand radiographs were interpreted by 93 radiologists from six centers. The primary efficacy outcome was the mean absolute difference between the skeletal age dictated into the radiologists' signed report and the average interpretation of a panel of four radiologists not using a diagnostic aid. The secondary outcome was the interpretation time. A linear mixed-effects regression model with random center- and radiologist-level effects was used to compare the two experimental groups. Results Overall mean absolute difference was lower when radiologists used the AI algorithm compared with when they did not (5.36 months vs 5.95 months; P = .04). The proportions at which the absolute difference exceeded 12 months (9.3% vs 13.0%, P = .02) and 24 months (0.5% vs 1.8%, P = .02) were lower with the AI algorithm than without it. Median radiologist interpretation time was lower with the AI algorithm than without it (102 seconds vs 142 seconds, P = .001). Conclusion Use of an artificial intelligence algorithm improved skeletal age assessment accuracy and reduced interpretation times for radiologists, although differences were observed between centers. Clinical trial registration no. NCT03530098 © RSNA, 2021 Online supplemental material is available for this article. See also the editorial by Rubin in this issue.

View details for DOI 10.1148/radiol.2021204021

View details for PubMedID 34581608
Video-based AI for beat-to-beat assessment of cardiac function. Nature Ouyang, D., He, B., Ghorbani, A., Yuan, N., Ebinger, J., Langlotz, C. P., Heidenreich, P. A., Harrington, R. A., Liang, D. H., Ashley, E. A., Zou, J. Y. 2020; 580 (7802): 252-256

Abstract

Accurate assessment of cardiac function is crucial for the diagnosis of cardiovascular disease, screening for cardiotoxicity and decisions regarding the clinical management of patients with a critical illness. However, human assessment of cardiac function focuses on a limited sampling of cardiac cycles and has considerable inter-observer variability despite years of training. Here, to overcome this challenge, we present a video-based deep learning algorithm, EchoNet-Dynamic, that surpasses the performance of human experts in the critical tasks of segmenting the left ventricle, estimating ejection fraction and assessing cardiomyopathy. Trained on echocardiogram videos, our model accurately segments the left ventricle with a Dice similarity coefficient of 0.92, predicts ejection fraction with a mean absolute error of 4.1% and reliably classifies heart failure with reduced ejection fraction (area under the curve of 0.97). In an external dataset from another healthcare system, EchoNet-Dynamic predicts the ejection fraction with a mean absolute error of 6.0% and classifies heart failure with reduced ejection fraction with an area under the curve of 0.96. Prospective evaluation with repeated human measurements confirms that the model has variance that is comparable to or less than that of human experts. By leveraging information across multiple cardiac cycles, our model can rapidly identify subtle changes in ejection fraction, is more reproducible than human evaluation and lays the foundation for precise diagnosis of cardiovascular disease in real time. As a resource to promote further innovation, we also make publicly available a large dataset of 10,030 annotated echocardiogram videos.
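For readers unfamiliar with the two headline metrics in this abstract, the following is a minimal sketch of how a Dice similarity coefficient and a mean absolute error are conventionally computed. It is not the EchoNet-Dynamic code; the function names, the flat 0/1 mask representation, and the sample values are assumptions chosen purely for illustration.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice similarity between two binary masks given as flat sequences of 0/1."""
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same length")
    # Pixels labeled positive in both masks.
    overlap = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    # Total positive pixels across the two masks.
    total = sum(1 for p in pred_mask if p) + sum(1 for t in true_mask if t)
    # Convention assumed here: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


def mean_absolute_error(predicted, reference):
    """Average absolute difference between paired measurements."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Hypothetical toy data, not drawn from the study.
toy_pred_mask = [0, 1, 1, 1, 0, 0]
toy_true_mask = [0, 1, 1, 0, 0, 1]
toy_pred_ef = [55.0, 40.0, 62.0]   # predicted ejection fraction, percent
toy_true_ef = [60.0, 38.0, 65.0]   # reference ejection fraction, percent

print(dice_coefficient(toy_pred_mask, toy_true_mask))
print(mean_absolute_error(toy_pred_ef, toy_true_ef))
```

A Dice value of 1.0 means the predicted and reference segmentations coincide exactly, while the mean absolute error is expressed in the same units as the measurement (here, ejection fraction percentage points).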

View details for DOI 10.1038/s41586-020-2145-8

View details for PubMedID 32269341
Geographic Distribution of US Cohorts Used to Train Deep Learning Algorithms. JAMA Kaushal, A., Altman, R., Langlotz, C. 2020; 324 (12): 1212–13

View details for DOI 10.1001/jama.2020.12067

View details for PubMedID 32960230
Ethics of Using and Sharing Clinical Imaging Data for Artificial Intelligence: A Proposed Framework. Radiology Larson, D. B., Magnus, D. C., Lungren, M. P., Shah, N. H., Langlotz, C. P. 2020: 192536

Abstract

In this article, the authors propose an ethical framework for using and sharing clinical data for the development of artificial intelligence (AI) applications. The philosophical premise is as follows: when clinical data are used to provide care, the primary purpose for acquiring the data is fulfilled. At that point, clinical data should be treated as a form of public good, to be used for the benefit of future patients. In their 2013 article, Faden et al argued that all who participate in the health care system, including patients, have a moral obligation to contribute to improving that system. The authors extend that framework to questions surrounding the secondary use of clinical data for AI applications. Specifically, the authors propose that all individuals and entities with access to clinical data become data stewards, with fiduciary (or trust) responsibilities to patients to carefully safeguard patient privacy, and to the public to ensure that the data are made widely available for the development of knowledge and tools to benefit future patients. According to this framework, the authors maintain that it is unethical for providers to "sell" clinical data to other parties by granting access to clinical data, especially under exclusive arrangements, in exchange for monetary or in-kind payments that exceed costs. The authors also propose that patient consent is not required before the data are used for secondary purposes when obtaining such consent is prohibitively costly or burdensome, as long as mechanisms are in place to ensure that ethical standards are strictly followed. Rather than debate whether patients or provider organizations "own" the data, the authors propose that clinical data are not owned at all in the traditional sense, but rather that all who interact with or control the data have an obligation to ensure that the data are used for the benefit of future patients and society.

View details for DOI 10.1148/radiol.2020192536

View details for PubMedID 32208097
A Roadmap for Foundational Research on Artificial Intelligence in Medical Imaging: From the 2018 NIH/RSNA/ACR/The Academy Workshop RADIOLOGY Langlotz, C. P., Allen, B., Erickson, B. J., Kalpathy-Cramer, J., Bigelow, K., Cook, T. S., Flanders, A. E., Lungren, M. P., Mendelson, D. S., Rudie, J. D., Wang, G., Kandarpa, K. 2019; 291 (3): 781–91

View details for DOI 10.1148/radiol.2019190613

View details for Web of Science ID 000468618200036
Assessment of Convolutional Neural Networks for Automated Classification of Chest Radiographs RADIOLOGY Dunnmon, J. A., Yi, D., Langlotz, C. P., Re, C., Rubin, D. L., Lungren, M. P. 2019; 290 (2): 537–44

View details for DOI 10.1148/radiol.2018181422

View details for Web of Science ID 000456444200043
CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison Irvin, J., Rajpurkar, P., Ko, M., Yu, Y., Ciurea-Ilcus, S., Chute, C., Marklund, H., Haghgoo, B., Ball, R., Shpanskaya, K., Seekins, J., Mong, D. A., Halabi, S. S., Sandberg, J. K., Jones, R., Larson, D. B., Langlotz, C. P., Patel, B. N., Lungren, M. P., Ng, A. Y., AAAI ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. 2019: 590–97

View details for Web of Science ID 000485292600073
Will Artificial Intelligence Replace Radiologists? Radiology. Artificial intelligence Langlotz, C. P. 2019; 1 (3): e190058

View details for DOI 10.1148/ryai.2019190058

View details for PubMedID 33937794

View details for PubMedCentralID PMC8017417
Deep learning for chest radiograph diagnosis: A retrospective comparison of the CheXNeXt algorithm to practicing radiologists PLOS MEDICINE Rajpurkar, P., Irvin, J., Ball, R. L., Zhu, K., Yang, B., Mehta, H., Duan, T., Ding, D., Bagul, A., Langlotz, C. P., Patel, B. N., Yeom, K. W., Shpanskaya, K., Blankenberg, F. G., Seekins, J., Amrhein, T. J., Mong, D. A., Halabi, S. S., Zucker, E. J., Ng, A. Y., Lungren, M. P. 2018; 15 (11)

View details for DOI 10.1371/journal.pmed.1002686

View details for Web of Science ID 000451827800004
The LOINC RSNA radiology playbook - a unified terminology for radiology procedures JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION Vreeman, D. J., Abhyankar, S., Wang, K. C., Carr, C., Collins, B., Rubin, D. L., Langlotz, C. P. 2018; 25 (7): 885–92

Abstract

This paper describes the unified LOINC/RSNA Radiology Playbook and the process by which it was produced. The Regenstrief Institute and the Radiological Society of North America (RSNA) developed a unification plan consisting of six objectives: 1) develop a unified model for radiology procedure names that represents the attributes with an extensible set of values, 2) transform existing LOINC procedure codes into the unified model representation, 3) create a mapping between all the attribute values used in the unified model as coded in LOINC (ie, LOINC Parts) and their equivalent concepts in RadLex, 4) create a mapping between the existing procedure codes in the RadLex Core Playbook and the corresponding codes in LOINC, 5) develop a single integrated governance process for managing the unified terminology, and 6) publicly distribute the terminology artifacts. We developed a unified model and instantiated it in a new LOINC release artifact that contains the LOINC codes and display name (ie, LONG_COMMON_NAME) for each procedure, mappings between LOINC and the RSNA Playbook at the procedure code level, and connections between procedure terms and their attribute values that are expressed as LOINC Parts and RadLex IDs. We transformed all the existing LOINC content into the new model and publicly distributed it in standard releases. The organizations have also developed a joint governance process for ongoing maintenance of the terminology. The LOINC/RSNA Radiology Playbook provides a universal terminology standard for radiology orders and results.

View details for PubMedID 29850823

View details for PubMedCentralID PMC6016707
Performance of a Deep-Learning Neural Network Model in Assessing Skeletal Maturity on Pediatric Hand Radiographs RADIOLOGY Larson, D. B., Chen, M. C., Lungren, M. P., Halabi, S. S., Stence, N. V., Langlotz, C. P. 2018; 287 (1): 313–22

Abstract

Purpose To compare the performance of a deep-learning bone age assessment model based on hand radiographs with that of expert radiologists and that of existing automated models. Materials and Methods The institutional review board approved the study. A total of 14 036 clinical hand radiographs and corresponding reports were obtained from two children's hospitals to train and validate the model. For the first test set, composed of 200 examinations, the mean of bone age estimates from the clinical report and three additional human reviewers was used as the reference standard. Overall model performance was assessed by comparing the root mean square (RMS) and mean absolute difference (MAD) between the model estimates and the reference standard bone ages. Ninety-five percent limits of agreement were calculated in a pairwise fashion for all reviewers and the model. The RMS of a second test set composed of 913 examinations from the publicly available Digital Hand Atlas was compared with published reports of an existing automated model. Results The mean difference between bone age estimates of the model and of the reviewers was 0 years, with a mean RMS and MAD of 0.63 and 0.50 years, respectively. The estimates of the model, the clinical report, and the three reviewers were within the 95% limits of agreement. RMS for the Digital Hand Atlas data set was 0.73 years, compared with 0.61 years of a previously reported model. Conclusion A deep-learning convolutional neural network model can estimate skeletal maturity with accuracy similar to that of an expert radiologist and to that of existing automated models. © RSNA, 2017 An earlier incorrect version of this article appeared online. This article was corrected on January 19, 2018.
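The agreement statistics named in this abstract (root mean square difference, mean absolute difference, and 95% limits of agreement) can be sketched as below. This is an illustration only, not the study's analysis code: the function name and the sample ages are hypothetical, and the limits of agreement are computed with the common Bland-Altman convention (mean difference ± 1.96 sample standard deviations), which the abstract does not spell out and is assumed here.

```python
import math
import statistics


def agreement_summary(model_ages, reference_ages):
    """Summarize agreement between model estimates and a reference standard."""
    if len(model_ages) != len(reference_ages) or len(model_ages) < 2:
        raise ValueError("need at least two paired estimates of equal length")
    diffs = [m - r for m, r in zip(model_ages, reference_ages)]
    mean_diff = statistics.mean(diffs)
    # Root mean square difference penalizes large errors more heavily.
    rms = math.sqrt(statistics.mean([d * d for d in diffs]))
    # Mean absolute difference weights all errors linearly.
    mad = statistics.mean([abs(d) for d in diffs])
    # Assumed Bland-Altman convention: mean difference +/- 1.96 sample SD.
    sd = statistics.stdev(diffs)
    limits = (mean_diff - 1.96 * sd, mean_diff + 1.96 * sd)
    return {"mean_diff": mean_diff, "rms": rms, "mad": mad, "limits": limits}


# Hypothetical bone age estimates in years, not drawn from the study.
toy_model = [10.0, 12.5, 8.0, 15.0]
toy_reference = [10.5, 12.0, 8.5, 14.5]
print(agreement_summary(toy_model, toy_reference))
```

Because RMS squares each difference before averaging, it is never smaller than MAD on the same data, which is consistent with the reported values of 0.63 and 0.50 years.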

View details for PubMedID 29095675
The Radiology Report: A Guide to Thoughtful Communication for Radiologists and Other Medical Professionals (Book) Langlotz, C. P. Amazon CreateSpace. 2015
ACR BI-RADS for breast imaging communication: a roadmap for the rest of radiology. Journal of the American College of Radiology Langlotz, C. P. 2009; 6 (12): 861-863

View details for DOI 10.1016/j.jacr.2009.09.015

View details for PubMedID 19945041
Fundamental measures of diagnostic examination performance: Usefulness for clinical decision making and research RADIOLOGY Langlotz, C. P. 2003; 228 (1): 3-9

Abstract

Measures of diagnostic accuracy, such as sensitivity, specificity, predictive values, and receiver operating characteristic curves, can often seem like abstract mathematic concepts that have a minimal relationship with clinical decision making or clinical research. The purpose of this article is to provide definitions and examples of these concepts that illustrate their usefulness in specific clinical decision-making tasks. In particular, nine principles are provided to guide the use of these concepts in daily radiology practice, in interpreting clinical literature, and in designing clinical research studies. An understanding of these principles and of the measures of diagnostic accuracy to which they apply is vital to the appropriate evaluation and use of diagnostic imaging examinations.
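As a brief illustration of the measures listed in this abstract, the sketch below derives sensitivity, specificity, and the two predictive values from the four cells of a 2×2 table. It uses the standard textbook definitions rather than anything specific to the article; the function name and the example counts are hypothetical.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Compute basic accuracy measures from 2x2 table counts.

    tp: diseased patients with a positive examination
    fp: disease-free patients with a positive examination
    fn: diseased patients with a negative examination
    tn: disease-free patients with a negative examination
    """
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0 or tn + fn == 0:
        raise ValueError("each row and column of the table must be non-empty")
    return {
        "sensitivity": tp / (tp + fn),  # positives among the diseased
        "specificity": tn / (tn + fp),  # negatives among the disease-free
        "ppv": tp / (tp + fp),          # diseased among the positives
        "npv": tn / (tn + fn),          # disease-free among the negatives
    }


# Hypothetical counts: 100 diseased and 900 disease-free patients.
measures = diagnostic_measures(tp=90, fp=30, fn=10, tn=870)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are properties of the examination itself, whereas the predictive values also depend on how common the disease is in the population examined, which is why the same test can yield a very different positive predictive value in a screening setting.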

View details for DOI 10.1148/radiol.2281011106

View details for Web of Science ID 000183689700001

View details for PubMedID 12832567
A METHODOLOGY FOR GENERATING COMPUTER-BASED EXPLANATIONS OF DECISION-THEORETIC ADVICE MEDICAL DECISION MAKING Langlotz, C. P., Shortliffe, E. H., Fagan, L. M. 1988; 8 (4): 290-303

Abstract

Decision analysis is an appealing methodology with which to provide decision support to the practicing physician. However, its use in the clinical setting is impeded because computer-based explanations of decision-theoretic advice are difficult to generate without resorting to mathematical arguments. Nevertheless, human decision analysts generate useful and intuitive explanations based on decision trees. To facilitate the use of decision theory in a computer-based decision support system, the authors developed a computer program that uses symbolic reasoning techniques to generate nonquantitative explanations of the results of decision analyses. A combined approach has been implemented to explain the differences in expected utility among branches of a decision tree. First, the mathematical relationships inherent in the structure of the tree are used to find any asymmetries in tree structure or inequalities among analogous decision variables that are responsible for a difference in expected utility. Next, an explanation technique is selected and applied to the most significant variables, creating a symbolic expression that justifies the decision. Finally, the symbolic expression is converted to English-language text, thereby generating an explanation that justifies the desirability of the choice with the greater expected utility. The explanation does not refer to mathematical formulas, nor does it include probability or utility values. The results suggest that explanations produced by a combination of decision analysis and symbolic processing techniques may be more persuasive and acceptable to clinicians than those produced by either technique alone.

View details for Web of Science ID A1988Q325300010

View details for PubMedID 3185181
ADAPTING A CONSULTATION SYSTEM TO CRITIQUE USER PLANS INTERNATIONAL JOURNAL OF MAN-MACHINE STUDIES Langlotz, C. P., Shortliffe, E. H. 1983; 19 (5): 479-496

View details for Web of Science ID A1983RU78100006
Shaping the future of myopia: artificial intelligence for vitreoretinal complications of high and pathologic myopia. Graefe's archive for clinical and experimental ophthalmology = Albrecht von Graefes Archiv fur klinische und experimentelle Ophthalmologie Mesfin, Y., Salvi, A., Arnal, L., Langlotz, C., Mahajan, V., Ludwig, C. A. 2026

Abstract

The global impact of myopia extends far beyond individual ocular health, posing significant challenges to healthcare systems worldwide. Artificial intelligence (AI), particularly deep learning (DL) applied to ophthalmic imaging, offers a promising strategy to ease constraints posed by the myopia epidemic by detecting subtle structural changes early. Here we describe the current literature on AI for detecting retinal sequelae of myopia, including retinal detachments (RD), myopic macular degeneration (MMD), and myopic traction maculopathy (MTM), with attention to imaging modality and model task (classification vs. segmentation). A literature search was conducted to identify studies using DL to detect RD, MMD, and MTM across ophthalmic imaging modalities (including OCT and fundus photography, and where available fluorescein angiography and ultrasonography). We reviewed 28 studies that piloted DL models using classification and/or segmentation approaches for RD (10 studies), MMD (12 studies), and MTM (6 studies). Reported performance for RD ranged from area under the curve (AUC) 86-100%, accuracy 79.3-98.9%, sensitivity 77.1-97.6%, and specificity 79.7-100%. For MMD, performance ranged from AUC 86-100%, accuracy 85.3-99.8%, sensitivity 37.1-97.8%, and specificity 91.5-99.9%. For MTM, performance ranged from AUC 93.8-99.7%, accuracy 94.3-99.3%, sensitivity 74.5-98.4%, and specificity 84.8-99.7%. Across studies, there was substantial heterogeneity in case definitions, datasets, and evaluation methods, and external validation was inconsistently reported. Many earlier studies used CNN-based architectures, while more recent work increasingly incorporates transformer-based backbones and pretrained or foundation models. Researchers have demonstrated excellent results for developing DL models that accurately classify and segment retinal pathologies associated with myopia. However, despite strong performance, additional work is needed to translate these models into clinical use, including robust external validation, calibration for clinical decision-making, and prospective evaluation, particularly for longitudinal prognostication of incident complications in pathologic myopia.

View details for DOI 10.1007/s00417-025-07098-9

View details for PubMedID 41636834

View details for PubMedCentralID 11469969
A multimodal retinal aging clock for biological age prediction and systemic health assessment via OCT and fundus imaging. Scientific reports Ludwig, C. A., Salvi, A., Mesfin, Y., Arnal, L., Langlotz, C., Mahajan, V. 2026

Abstract

Herein we developed age clocks that predict biological age from fundus photography and optical coherence tomography. We evaluated our multimodal models' clinical relevance by examining their associations between predicted biological age and the Charlson Comorbidity Index (CCI). Study 1 assessed how models trained on normal eyes generalize to diseased eyes, and Study 2 tested whether incorporating disease labels improves performance and systemic associations. Models were fine-tuned to the imaging dataset to predict biological age. Linear regressors were trained on chronological and biological features to infer CCI. Gradient-weighted regression activation mapping also generated heatmaps to identify the model's region of focus. Prediction performance improved when trained on both normal and diseased eyes. Predicted biological age showed significantly stronger correlations with CCI than chronological age across both studies, supporting our algorithm's association with this validated measure of mortality. Thus, our algorithm may provide insight into systemic health burdens beyond that of traditional risk assessments.

View details for DOI 10.1038/s41598-026-36518-x

View details for PubMedID 41606027
A deep learning-based automated pipeline for colorectal cancer detection in contrast-enhanced CT images. Computerized medical imaging and graphics : the official journal of the Computerized Medical Imaging Society Qiu, C., Miller, S., Subramanian, B., Ryu, A., Zhang, H., Fisher, G. A., Shah, N. H., Mongan, J., Langlotz, C., Poullos, P., Shen, J. 2026; 128: 102717

Abstract

Colorectal cancer (CRC) is the third most commonly diagnosed malignancy worldwide and a leading cause of cancer-related mortality. This study aims to investigate an automatic detection pipeline for identification and localization of the primary CRC in portal venous phase contrast-enhanced CT scans, which is a crucial first step for downstream CRC staging, prognostication, and treatment planning. We propose a deep learning-based automated detection pipeline using YOLOv11 as the baseline architecture. A ResNet50 module was incorporated into the YOLOv11 backbone to enhance image feature extraction. Additionally, a scale-adaptive loss function, which introduces an adaptive coefficient and a scaling factor to adaptively measure the Intersection over Union (IoU) and center point distance for improving box regression performance, was designed to further improve detection performance. The proposed pipeline achieved a recall of 0.8092, precision of 0.8187, and F-1 score of 0.8139 for CRC detection on our in-house dataset at the patient level (inter-patient evaluation) and a recall of 0.9949, precision of 0.9894, and F-1 score of 0.9921 at the slice level (intra-patient evaluation). Validation on an external public dataset demonstrated that our pipeline, when trained on a patient-level in-house dataset, obtained a recall of 0.8283, precision of 0.8414, and F-1 score of 0.8348 and, when trained on a slice-level in-house dataset, achieved a recall of 0.6897, precision of 0.7888, and F-1 score of 0.7358, outperforming existing representative detection methods. The superior CRC detection performance on the in-house CT dataset and state-of-the-art generalization performance on the public dataset (with a 31.97 percentage point improvement in detection sensitivity, or recall, over the next closest state-of-the-art method) highlight the potential translational value of our pipeline for CRC clinical decision support, conditional upon validation in larger cohorts.
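The quantities this abstract builds on can be sketched with their standard definitions: the F-1 score as the harmonic mean of precision and recall, and the Intersection over Union and center point distance of two bounding boxes. The code below is illustrative only and does not implement the paper's scale-adaptive loss, whose adaptive coefficient and scaling factor are not specified in the abstract; the function names, the (x1, y1, x2, y2) box format, and the example boxes are assumptions.

```python
import math


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def center_distance(box_a, box_b):
    """Euclidean distance between the centers of two boxes."""
    acx, acy = (box_a[0] + box_a[2]) / 2, (box_a[1] + box_a[3]) / 2
    bcx, bcy = (box_b[0] + box_b[2]) / 2, (box_b[1] + box_b[3]) / 2
    return math.hypot(acx - bcx, acy - bcy)


# Precision and recall as reported for the patient-level in-house evaluation.
print(round(f1_score(precision=0.8187, recall=0.8092), 4))

# Hypothetical predicted and ground-truth boxes in pixel coordinates.
predicted_box = (0.0, 0.0, 10.0, 10.0)
truth_box = (5.0, 5.0, 15.0, 15.0)
print(box_iou(predicted_box, truth_box), center_distance(predicted_box, truth_box))
```

Plugging the reported patient-level precision and recall into the harmonic mean reproduces the reported F-1 score of 0.8139 to four decimal places.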

View details for DOI 10.1016/j.compmedimag.2026.102717

View details for PubMedID 41633187
Effects of Real-Time Notification of AI-Derived Incidental Coronary Artery Calcium on Statin Initiation: the NOTIFY-PICTURE Trial Dudum, R., Jain, S., Mastrodicasa, D., Ngo, S., Furst, A., Xu, S., Eng, D., Khandwala, N., Langlotz, C., Chaudhari, A., Sandhu, A., Maron, D., Rodriguez, F. LIPPINCOTT WILLIAMS & WILKINS. 2025

View details for Web of Science ID 001655643200064
EchoGraph system for automated quality assessment of echocardiography reports. NPJ digital medicine Chao, C. J., Delbrouck, J. B., Asadi, M., Banerjee, I., Farina, J. M., Galasso, F., Mahmoud, A. K., Abbas, M. T., Wang, Y. C., Arsanjani, R., Kane, G. C., Oh, J. K., Erickson, B. J., Fei-Fei, L., Adeli, E., Langlotz, C. 2025

Abstract

Generative AI needs automatic clinical text accuracy metrics, but none exist for echocardiography. To address this, we developed EchoGraph, a BERT-based model trained on 600 densely annotated echocardiography reports from the Mayo Clinic (2017), split 7:2:1 for training, validation, and testing, using a tailored schema with 48,256 entities and 29,731 relations annotated. Sixty random MIMIC-EchoNote reports were annotated (3672 entities and 2360 relations) for external validation. EchoGraph demonstrated strong performance predicting entities (micro F1 0.85) and relations (micro F1 0.70), maintaining performance on external validation (entity micro F1 0.80, relation micro F1 0.52). EchoGraph F1 score showed superior error sensitivity versus RadGraph F1, with 2.8-fold higher slope magnitude (-0.817 vs -0.291) and better variance explained (R² = 0.803 vs 0.578). EchoGraph offers an effective solution for evaluating language model-based echocardiography applications, supporting more accurate AI-generated reports.

View details for DOI 10.1038/s41746-025-02140-w

View details for PubMedID 41372462
Leveraging large language models to extract smoking history from clinical notes for lung cancer surveillance. NPJ digital medicine Luo, I., Graber-Naidich, A., Zhang, M., Kaushik, R., Nieda, G. M., Chen, T., Gu, B., Choi, E., Ding, V. Y., Gunturkun, F., Satoyoshi, M., Bhat, A., Lee, T. Y., Su, C. C., Ellis-Caleo, T. J., Henry, A. S., Desai, M., Backhus, L. M., Lui, N. S., Leung, A., Neal, J. W., Kurian, A. W., Langlotz, C. P., Wakelee, H. A., Liang, S. Y., Khan, A., Han, S. S. 2025; 8 (1): 731

Abstract

Accurate smoking documentation in electronic health records (EHRs) is crucial for risk assessment and patient monitoring. However, key information is often missing or inaccurately recorded. Large language models (LLMs) present a promising solution for interpreting clinical narratives to extract comprehensive smoking data. We developed a framework utilizing LLMs combined with rule-based longitudinal smoothing techniques to enhance data quality. We compared generative LLMs (Gemini-1.5-Flash, PaLM-2-Text-Bison, GPT-4) against BERT-based models using 1683 manually annotated clinical notes from 518 patients across Stanford and Sutter Health systems. Generative LLMs achieved superior performance (>96% accuracy) across seven smoking variables, with external validation showing robust generalizability (97.5-98.8% accuracy). We deployed Gemini-1.5-Flash to 79,408 notes from 4792 lung cancer patients, demonstrating that risk model-based surveillance incorporating smoking factors outperformed NCCN Guidelines in identifying second malignancies. Our study highlights the potential of generative LLMs to improve smoking history documentation quality, enhancing lung cancer surveillance and broader clinical applications.

View details for DOI 10.1038/s41746-025-02009-y

View details for PubMedID 41315854

View details for PubMedCentralID 11745215
Improving Performance, Robustness, and Fairness of Radiographic AI Models with Finely-Controllable Synthetic Data. Research square Moroianu, S. L., Bluethgen, C., Chambon, P., Cherti, M., Delbrouck, J. B., Paschali, M., Price, B., Gichoya, J., Jitsev, J., Langlotz, C. P., Chaudhari, A. S. 2025

Abstract

Achieving robust performance and fairness across diverse patient populations remains a central challenge in developing clinically deployable deep learning models for diagnostic imaging. Synthetic data generation has emerged as a promising strategy to address current limitations in dataset scale and diversity. In this study, we introduce RoentGen-v2, a state-of-the-art text-to-image diffusion model for chest radiographs that enables fine-grained control over both radiographic findings and patient demographic attributes, including sex, age, and race/ethnicity. RoentGen-v2 is the first model to generate clinically plausible chest radiographs with explicit demographic conditioning, facilitating the creation of a large, demographically balanced synthetic dataset comprising over 565,000 images. We use this large synthetic dataset to evaluate optimal training pipelines for downstream disease classification models. In contrast to prior work that combines real and synthetic data naively, we propose an improved training strategy that leverages synthetic data for supervised pretraining, followed by fine-tuning on real data. Through extensive evaluation on over 137,000 held-out chest radiographs from five institutions, we demonstrate that synthetic pretraining consistently improves model performance, generalization to out-of-distribution settings, and fairness across demographic subgroups defined across varying fairness metrics. Across datasets, synthetic pretraining led to a 6.5% accuracy increase in the performance of downstream classification models, compared to a modest 2.7% increase when naively combining real and synthetic data. We observe this performance improvement simultaneously with the reduction of the underdiagnosis fairness gap by 19.3%, with marked improvements across intersectional subgroups of sex, age, and race/ethnicity. Our proposed data-centric training approach that combines high-fidelity synthetic training data with multi-stage training pipelines is label-efficient, reducing reliance on large quantities of annotated real data. These results highlight the potential of demographically controllable synthetic imaging to advance equitable and generalizable medical deep learning under real-world data constraints. We open source our code, trained models, and synthetic dataset.

View details for DOI 10.21203/rs.3.rs-7687810/v1

View details for PubMedID 41356360

View details for PubMedCentralID PMC12676388
Effects of Real-Time Notification of AI-Detected Incidental Coronary Artery Calcium on Statin Prescription: the NOTIFY-PICTURE Trial. Circulation Dudum, R., Jain, S. S., Mastrodicasa, D., Furst, A., Xu, S., Ngo, S., Eng, D., Khandwala, N., Sousa, D., Chaudhari, A., Langlotz, C., Sandhu, A. T., Maron, D. J., Rodriguez, F. 2025

View details for DOI 10.1161/CIRCULATIONAHA.125.078155

View details for PubMedID 41213130
Deep Learning Algorithm Prognosticating Retinal Tears and Detachments From Optical Coherence Tomography. Translational vision science & technology Salvi, A., Mesfin, Y., Arnal, L., Langlotz, C., Mahajan, V., Ludwig, C. A. 2025; 14 (11): 18

Abstract

Our image classifier prognosticates future retinal tear/retinal detachment (RT/RD) likelihood from optical coherence tomography (OCT) while providing pixel-level explanations of clinical importance. RT/RD status (International Classification of Diseases, Ninth and Tenth Revision codes) and surgical status (Current Procedural Terminology codes) were determined for OCTs collected from the Stanford Research Repository. An image positive for future RT/RD-related surgery was defined as no RT/RD or surgery prior to the acquisition date and the acquisition date 90 days prior to RT/RD diagnosis or surgery. A negative image had no patient overlap with the positive class, had no RT/RD or surgery indication at any time, and was positive for plaquenil use without toxic maculopathy. A convolutional neural network, Inception-v4, was fine-tuned in a class-stratified fivefold fashion on the data set containing 433 negative patients (1027 images) and 343 positive patients (1027 images). Each fold contained a separate patient cohort. Heatmaps indicating a model's region of focus were generated using gradient-weighted class activation mapping to verify that the model's intuition was consistent with clinical knowledge. Performance metrics were collected by averaging across folds. For the test set, the model achieved an area under the receiver operating characteristic curve of 0.87, an average precision score of 0.85, and an accuracy of 0.78. Anatomy highlighted in heatmaps described macular biomarkers for RT/RD, including epiretinal membrane presence, vitreomacular traction, degree of myopic tilt, and choroidal thickness. The binary image classifier accurately identified future RT/RD development from OCTs. Our deep learning algorithm highlights biomarkers for patients at high risk for RT/RDs, providing a window for prophylactic treatment to prevent vision loss.

View details for DOI 10.1167/tvst.14.11.18

View details for PubMedID 41247117
A Multi-Task Deep Learning Model for Pediatric Echocardiography Analysis. medRxiv : the preprint server for health sciences Joseph, C., Mrudang, M., Dhamanpreet, K., Matthew, D., Adil, D., Aravind, K., Matthew, L., Rohan, S., Gonzalez, A. K., Joseph, L., Christa, S., Robyn, F., Abhinav, K., Cyril, Z., Langlotz, C. P., Jolley, M. A., William, H. 2025

Abstract

Congenital heart defects afflict nearly 1% of all births worldwide. While deep learning algorithms have shown significant promise in automating and improving adult echocardiography analysis, similar progress has not been observed in pediatric echocardiography. Specifically, existing pediatric-based models are limited to single tasks and specific echocardiographic views. To address this, we introduce EchoAI-Peds, the first multi-task deep learning model for pediatric echocardiography. Our model was developed using the most comprehensive set of pediatric labels to date and is designed to integrate information from multiple echocardiographic views simultaneously. A video-based vision transformer was trained to simultaneously detect 28 congenital heart defects, structural and functional abnormalities, repairs, and interventions directly from complete pediatric echocardiography studies with multiple videos. During inference, our model integrates information from all available views to produce unified study-level predictions. Our model was developed using over 700,000 videos derived from more than 11,000 studies at Stanford Medicine. Model efficacy was tested on an internal held-out dataset. In addition, model generalizability was tested on a spatially and temporally distinct patient cohort at the Children's Hospital of Philadelphia. Our model achieved macro-averaged AUROC values of 0.91 (95% CI: 0.90-0.92) and 0.89 (95% CI: 0.88-0.90) on the internal and external test sets, respectively. Moreover, our model significantly outperformed adult-based echocardiography foundation models trained on substantially larger datasets (p < 0.001). Finally, our model demonstrated robust performance across patient age, patient sex, and studies with varying number of videos. Our findings demonstrate the remarkable potential for multi-task deep learning models to aid the interpretation of pediatric echocardiograms. In addition, our results underscore the need for models that are specifically tailored to pediatric populations.

View details for DOI 10.1101/2025.10.27.25338912

View details for PubMedID 41282661

View details for PubMedCentralID PMC12636689
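
The EchoAI-Peds abstract above reports macro-averaged AUROC across its detection tasks. As a rough illustration of what that metric means, the sketch below computes a per-task AUROC from the Mann-Whitney statistic and then takes the unweighted mean across tasks. The task names, labels, and scores are hypothetical toy values, and this is not the authors' evaluation code, which is not described in the abstract.

```python
def auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auroc(labels_by_task, scores_by_task):
    """Unweighted mean of per-task AUROC values, so every task counts equally."""
    per_task = [auroc(labels_by_task[t], scores_by_task[t]) for t in labels_by_task]
    return sum(per_task) / len(per_task)


# Hypothetical toy data for two tasks (not taken from the paper).
labels = {"task_a": [1, 0, 1, 0], "task_b": [1, 1, 0, 0]}
scores = {"task_a": [0.9, 0.2, 0.4, 0.6], "task_b": [0.8, 0.7, 0.3, 0.1]}

result = macro_auroc(labels, scores)
print(result)
```

Macro-averaging gives rare and common findings the same weight, which is one reason it is often preferred when label prevalence varies widely across tasks.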
Socioeconomic Inequalities and Lung Cancer Outcomes: Evidence From an Integrated EHR Database and State Cancer Registry Data Lee, T. Y., Su, C. C., Choi, E., Ding, V. Y., Satoyoshi, M., Bhat, A., Chen, T., Luo, I., Liu, Y., Henry, S., Backhus, L. M., Ellis-Caleo, T. J., Gomez, S. L., Lui, N. S., Leung, A., Langlotz, C., Neal, J. W., Kurian, A. W., Wakelee, H. A., Liang, S., Han, S. S. ELSEVIER SCIENCE INC. 2025

View details for Web of Science ID 001616089400014
Large Language Models to Extract Smoking History From Clinical Notes in EHR to Evaluate Lung Cancer Surveillance Strategies Luo, I., Graber-Naidich, A., Kaushik, R., Nieda, G. M., Choi, E., Ding, V. Y., Gunturkun, F., Satoyoshi, M., Bhat, A., Chen, T., Zhang, M., Lee, T., Su, C. C., Ellis-Caleo, T. J., Henry, A. S., Desai, M., Backhus, L. M., Lui, N. S., Leung, A., Neal, J. W., Kurian, A. W., Langlotz, C. P., Wakelee, H. A., Liang, S. Y., Kahn, A., Han, S. S. ELSEVIER SCIENCE INC. 2025

View details for Web of Science ID 001615996000034
Reply. Journal of the American College of Radiology : JACR Herwald, S. E., Shah, P., Johnston, A., Olsen, C., Delbrouck, J., Langlotz, C. P. 2025

View details for DOI 10.1016/j.jacr.2025.08.004

View details for PubMedID 40812729
Automatic Abstraction of Computed Tomography Imaging Indication Using Natural Language Processing for Evaluation of Surveillance Patterns in Long-Term Lung Cancer Survivors. JCO clinical cancer informatics Khan, A., Choi, E., Su, C., Graber-Naidich, A., Henry, S., Satoyoshi, M. L., Bhat, A., Kurian, A. W., Liang, S. Y., Neal, J., Gould, M., Leung, A., Wakelee, H. A., Backhus, L. M., Langlotz, C., Wu, J., Han, S. S. 2025; 9: e2400279

Abstract

Despite its routine use to monitor patients with lung cancer (LC), real-world evaluations of the impact of computed tomography (CT) surveillance on overall survival (OS) have been inconsistent. A major confounder is the absence of imaging indications because patients undergo CT scans for purposes beyond surveillance, like symptom evaluations (eg, cough) linked to poor survival. We propose a novel natural language processing model to predict CT imaging indications (surveillance v others). We used electronic health records of 585 long-term LC survivors (≥5 years) at Stanford, followed for up to 22 years. Their 3,362 post-5-year CT reports (including 1,672 manually annotated) were used for modeling by integrating structured variables (eg, CT intervals) with key-phrase analysis of radiology reports. Naïve analysis compared OS in patients with CT for any indications (including symptoms) versus those without post-5-year CT, as in previous studies. Using model-predicted indications, we conducted exploratory analyses to compare OS between those with post-5-year surveillance CT and those without. The model showed high discrimination (AUC, 0.86), with key predictors including a longer interval (≥6-month) from the previous CT (odds ratio [OR], 5.50; P < .001) and surveillance-related key phrases (OR, 1.37; P = .03). Propensity-adjusted survival analysis indicated better OS for patients with any post-5-year surveillance CT versus those without (adjusted hazard ratio, 0.60; P = .016). By contrast, no significant survival difference was found (P = .53) between patients with any CT versus those without post-5-year CT. Our model abstracted CT indications from real-world data with high discrimination. Exploratory analyses revealed the obscured imaging-OS association when considering indications, highlighting the model's potential for future real-world studies.

View details for DOI 10.1200/CCI-24-00279

View details for PubMedID 40700679
Foundation versus domain-specific models for left ventricular segmentation on cardiac ultrasound. NPJ digital medicine Chao, C. J., Gu, Y. R., Kumar, W., Xiang, T., Appari, L., Wu, J., Farina, J. M., Wraith, R., Jeong, J., Arsanjani, R., Kane, G. C., Oh, J. K., Langlotz, C. P., Banerjee, I., Fei-Fei, L., Adeli, E. 2025; 8 (1): 341

Abstract

The Segment Anything Model (SAM) was fine-tuned on the EchoNet-Dynamic dataset and evaluated on external transthoracic echocardiography (TTE) and Point-of-Care Ultrasound (POCUS) datasets from CAMUS (University Hospital of St Etienne) and Mayo Clinic (99 patients: 58 TTE, 41 POCUS). Fine-tuned SAM was superior or comparable to MedSAM. The fine-tuned SAM also outperformed EchoNet and U-Net models, demonstrating strong generalization, especially on apical 2-chamber (A2C) images (fine-tuned SAM vs. EchoNet: CAMUS-A2C: DSC 0.891 ± 0.040 vs. 0.752 ± 0.196, p < 0.0001) and POCUS (DSC 0.857 ± 0.047 vs. 0.667 ± 0.279, p < 0.0001). Additionally, SAM-enhanced workflow reduced annotation time by 50% (11.6 ± 4.5 sec vs. 5.7 ± 1.7 sec, p < 0.0001) while maintaining segmentation quality. We demonstrated an effective strategy for fine-tuning a vision foundation model for enhancing clinical workflow efficiency and supporting human-AI collaboration.

View details for DOI 10.1038/s41746-025-01730-y

View details for PubMedID 40481190

View details for PubMedCentralID PMC12144204
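
The left ventricular segmentation abstract above compares models by Dice similarity coefficient (DSC). For readers unfamiliar with the metric, here is a minimal sketch of DSC for two binary masks, defined as twice the overlap divided by the total number of foreground pixels in both masks. The masks are hypothetical toy values, and treating two empty masks as a perfect match is an assumed convention, not something stated in the paper.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two binary masks given as nested lists of 0/1."""
    pred_flat = [bool(v) for row in pred for v in row]
    truth_flat = [bool(v) for row in truth for v in row]
    if len(pred_flat) != len(truth_flat):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p and t for p, t in zip(pred_flat, truth_flat))
    total = sum(pred_flat) + sum(truth_flat)
    # Assumed convention: two empty masks are treated as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical 2x3 masks: each has 3 foreground pixels, 2 of which overlap.
pred = [[0, 1, 1],
        [0, 1, 0]]
truth = [[0, 1, 0],
         [0, 1, 1]]

score = dice_coefficient(pred, truth)
print(round(score, 3))
```

A DSC of 1 means the predicted and reference masks coincide exactly, and 0 means they share no foreground pixels.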
Enabling national identification of lung cancer screening eligibility with large language models. Wu, J., Conover, S., Su, C., Corrigan, J., Culnan, J., Liu, Y., Kelley, M. J., Do, N., Arya, S., Harris, A. H. S., Langlotz, C., Wiener, R., Branch-Elliman, W., Han, S., Fillmore, N. LIPPINCOTT WILLIAMS & WILKINS. 2025: e13613

View details for DOI 10.1200/JCO.2025.43.16_suppl.e13613

View details for Web of Science ID 001509490400001
A Dataset for Understanding Radiologist-Artificial Intelligence Collaboration. Scientific data Moehring, A., Kutwal, M., Huang, R., Banerjee, O., Jacobi, A., Eber, C., Mendoza, D., Chung, M., Dayan, E., Gupta, Y., Bui, T. D., Truong, S. Q., Pareek, A., Langlotz, C. P., Lungren, M. P., Agarwal, N., Rajpurkar, P., Salz, T. 2025; 12 (1): 739

Abstract

This dataset, Collab-CXR, provides a unique resource to study human-AI collaboration in chest X-ray interpretation. We present experimentally generated data from 227 professional radiologists who assessed 324 historical cases under varying information conditions: with and without AI assistance, and with and without clinical history. Using a custom-designed interface, we collected probabilistic assessments for 104 thoracic pathologies using a comprehensive hierarchical reporting structure. This dataset is the largest known comparison of human-AI collaborative performance to either AI or humans alone in radiology, offering assessments across an extensive range of pathologies with rich metadata on radiologist characteristics and decision-making processes. Multiple experimental designs enable both within-subject and between-subject analyses. Researchers can leverage this dataset to investigate how radiologists incorporate AI assistance, factors influencing collaborative effectiveness, and impacts on diagnostic accuracy, speed, and confidence across different cases and pathologies. By enabling rigorous study of human-AI integration in clinical workflows, this dataset can inform AI tool development, implementation strategies, and ultimately improve patient care through optimized collaboration in medical imaging.

View details for DOI 10.1038/s41597-025-05054-0

View details for PubMedID 40319039

View details for PubMedCentralID PMC12049457
Evaluating large language models in echocardiography reporting: opportunities and challenges. European heart journal. Digital health Chao, C. J., Banerjee, I., Arsanjani, R., Ayoub, C., Tseng, A., Delbrouck, J. B., Kane, G. C., Lopez-Jimenez, F., Attia, Z., Oh, J. K., Erickson, B., Fei-Fei, L., Adeli, E., Langlotz, C. 2025; 6 (3): 326-339

Abstract

The increasing need for diagnostic echocardiography tests presents challenges in preserving the quality and promptness of reports. While Large Language Models (LLMs) have proven effective in summarizing clinical texts, their application in echo remains underexplored. Adult echocardiography studies, conducted at the Mayo Clinic from 1 January 2017 to 31 December 2017, were categorized into two groups: development (all Mayo locations except Arizona) and Arizona validation sets. We adapted open-source LLMs (Llama-2, MedAlpaca, Zephyr, and Flan-T5) using In-Context Learning and Quantized Low-Rank Adaptation fine-tuning (FT) for echo report summarization from 'Findings' to 'Impressions.' Against cardiologist-generated Impressions, the models' performance was assessed both quantitatively with automatic metrics and qualitatively by cardiologists. The development dataset included 97 506 reports from 71 717 unique patients, predominantly male (55.4%), with an average age of 64.3 ± 15.8 years. EchoGPT, a fine-tuned Llama-2 model, outperformed other models with win rates ranging from 87% to 99% in various automatic metrics, and produced reports comparable to cardiologists in qualitative review (significantly preferred in conciseness (P < 0.001), with no significant preference in completeness, correctness, and clinical utility). Correlations between automatic and human metrics were fair to modest, with the best being RadGraph F1 scores vs. clinical utility (r = 0.42), and automatic metrics showed insensitivity (0-5% drop) to changes in measurement numbers. EchoGPT can generate draft reports for human review and approval, helping to streamline the workflow. However, scalable evaluation approaches dedicated to echo reports remain necessary.

View details for DOI 10.1093/ehjdh/ztae086

View details for PubMedID 40395412

View details for PubMedCentralID PMC12088711
Leveraging Generative Pre-trained Transformer (GPT) Large Language Models (LLMs) For Interstitial Lung Diseases (ILD) Clinical Research Chen, S., Maddali, M., Bluethgen, C., Langlotz, C. P., Raj, R. AMER THORACIC SOC. 2025

View details for Web of Science ID 001488569000013
Framework for Environmentally Sustainable Radiology: Call for Collaborative Action and a Health-Centered Focus. Radiology Hanneman, K., Redenius, I., Dewey, M., Kielar, A., Dobranowski, J., Bellin, M. F., Tasu, J. P., Aida, N., Jinzaki, M., Tomiyama, N., Halliday, K., Harden, S., Reichardt, O., Catalano, C., Nikolaou, K., Kuhl, C., Langlotz, C. P., Mahmood, U., Gandolfo, N., Giovagnoni, A. 2025; 315 (1): e250070

Abstract

It is imperative that the entire medical imaging sector acts collectively and decisively to reduce its own environmental impact and prepare for the current and future effects of the climate crisis. The Radiology R7 meeting convened in Venice, Italy, on October 10-13, 2024 to discuss environmental sustainability and other key issues facing radiology and the patients served by medical imaging. Radiology R7 delegates agree that collaborative action is urgently needed to transform radiology systems to be climate-resilient, equitable, low-carbon, and sustainable. This special report highlights priorities and outlines a framework for environmentally sustainable radiology, centered on eight collaborative action areas. A health-centered response reinforces the role of radiologists as physicians, emphasizes the opportunity for medical imaging to improve health, and will be essential to engage key partners in climate action. Effective leadership and governance are needed to ensure that radiology services are accessible, equitable, affordable, high quality and sustainable. Collaboration and partnership are essential to achieve meaningful change. Health equity should be prioritized to increase global access to high quality radiology services while minimizing the environmental impact. Multiple climate response pathways should be implemented in parallel including mitigation strategies to reduce the use of energy, finite resources and waste and adaptation strategies to build resilience to the effects of climate change. Innovation and research are necessary to develop, validate, and implement sustainable solutions. Finally, knowledge sharing, education, and training are needed to disseminate information on actions toward environmentally sustainable radiology practices. We all have a role to play and must work together to achieve these aims quickly by identifying the problem, setting goals, implementing a plan, measuring impact, sharing results, and celebrating successes.

View details for DOI 10.1148/radiol.250070

View details for PubMedID 40261175
A clinically accessible small multimodal radiology model and evaluation metric for chest X-ray findings. Nature communications Zambrano Chaves, J. M., Huang, S. C., Xu, Y., Xu, H., Usuyama, N., Zhang, S., Wang, F., Xie, Y., Khademi, M., Yang, Z., Awadalla, H., Gong, J., Hu, H., Yang, J., Li, C., Gao, J., Gu, Y., Wong, C., Wei, M., Naumann, T., Chen, M., Lungren, M. P., Chaudhari, A., Yeung-Levy, S., Langlotz, C. P., Wang, S., Poon, H. 2025; 16 (1): 3108

Abstract

Large foundation models show promise in biomedicine but face challenges in clinical use due to performance gaps, accessibility, cost, and lack of scalable evaluation. Here we show that open-source small multimodal models can bridge these gaps in radiology by generating free-text findings from chest X-ray images. Our data-centric approach leverages 697K curated radiology image-text pairs to train a specialized, domain-adapted chest X-ray encoder. We integrate this encoder with pre-trained language models via a lightweight adapter that aligns image and text modalities. To enable robust, clinically relevant evaluation, we develop and validate CheXprompt, a GPT-4-based metric for assessing factual accuracy aligned with radiologists' evaluations. Benchmarked with CheXprompt and other standard factuality metrics, LLaVA-Rad (7B) achieves state-of-the-art performance, outperforming much larger models like GPT-4V and Med-PaLM M (84B). While not immediately ready for real-time clinical deployment, LLaVA-Rad is a scalable, privacy-preserving and cost-effective step towards clinically adaptable multimodal AI for radiology.

View details for DOI 10.1038/s41467-025-58344-x

View details for PubMedID 40169573

View details for PubMedCentralID PMC11962106
Evaluating large language models in echocardiography reporting: opportunities and challenges EUROPEAN HEART JOURNAL - DIGITAL HEALTH Chao, C., Banerjee, I., Arsanjani, R., Ayoub, C., Tseng, A., Delbrouck, J., Kane, G. C., Lopez-Jimenez, F., Attia, Z., Oh, J. K., Erickson, B., Fei-Fei, L., Adeli, E., Langlotz, C. 2025

View details for DOI 10.1093/ehjdh/ztae086

View details for Web of Science ID 001456248300001
Crucial Role of Understanding in Human-Artificial Intelligence Interaction for Successful Clinical Adoption. Korean journal of radiology Park, S. H., Langlotz, C. P. 2025

View details for DOI 10.3348/kjr.2025.0071

View details for PubMedID 40015562
FUTURE-AI: international consensus guideline for trustworthy and deployable artificial intelligence in healthcare. BMJ (Clinical research ed.) Lekadir, K., Frangi, A. F., Porras, A. R., Glocker, B., Cintas, C., Langlotz, C. P., Weicken, E., Asselbergs, F. W., Prior, F., Collins, G. S., Kaissis, G., Tsakou, G., Buvat, I., Kalpathy-Cramer, J., Mongan, J., Schnabel, J. A., Kushibar, K., Riklund, K., Marias, K., Amugongo, L. M., Fromont, L. A., Maier-Hein, L., Cerdá-Alberich, L., Martí-Bonmatí, L., Cardoso, M. J., Bobowicz, M., Shabani, M., Tsiknakis, M., Zuluaga, M. A., Fritzsche, M. C., Camacho, M., Linguraru, M. G., Wenzel, M., De Bruijne, M., Tolsgaard, M. G., Goisauf, M., Cano Abadía, M., Papanikolaou, N., Lazrak, N., Pujol, O., Osuala, R., Napel, S., Colantonio, S., Joshi, S., Klein, S., Aussó, S., Rogers, W. A., Salahuddin, Z., Starmans, M. P. 2025; 388: e081554

Abstract

Despite major advances in artificial intelligence (AI) research for healthcare, the deployment and adoption of AI technologies remain limited in clinical practice. This paper describes the FUTURE-AI framework, which provides guidance for the development and deployment of trustworthy AI tools in healthcare. The FUTURE-AI Consortium was founded in 2021 and comprises 117 interdisciplinary experts from 50 countries representing all continents, including AI scientists, clinical researchers, biomedical ethicists, and social scientists. Over a two-year period, the FUTURE-AI guideline was established through consensus based on six guiding principles: fairness, universality, traceability, usability, robustness, and explainability. To operationalise trustworthy AI in healthcare, a set of 30 best practices was defined, addressing technical, clinical, socioethical, and legal dimensions. The recommendations cover the entire lifecycle of healthcare AI, from design, development, and validation to regulation, deployment, and monitoring.

View details for DOI 10.1136/bmj-2024-081554

View details for PubMedID 39909534

View details for PubMedCentralID PMC11795397
Foundation Models in Radiology: What, How, Why, and Why Not. Radiology Paschali, M., Chen, Z., Blankemeier, L., Varma, M., Youssef, A., Bluethgen, C., Langlotz, C., Gatidis, S., Chaudhari, A. 2025; 314 (2): e240597

Abstract

Recent advances in artificial intelligence have witnessed the emergence of large-scale deep learning models capable of interpreting and generating both textual and imaging data. Such models, typically referred to as foundation models (FMs), are trained on extensive corpora of unlabeled data and demonstrate high performance across various tasks. FMs have recently received extensive attention from academic, industry, and regulatory bodies. Given the potentially transformative impact that FMs can have on the field of radiology, radiologists must be aware of potential pathways to train these radiology-specific FMs, including understanding both the benefits and challenges. Thus, this review aims to explain the fundamental concepts and terms of FMs in radiology, with a specific focus on the requirements of training data, model training paradigms, model capabilities, and evaluation strategies. Overall, the goal of this review is to unify technical advances and clinical needs for safe and responsible training of FMs in radiology to ultimately benefit patients, providers, and radiologists.

View details for DOI 10.1148/radiol.240597

View details for PubMedID 39903075
Open-Source Large Language Models in Radiology: A Review and Tutorial for Practical Research and Clinical Deployment. Radiology Savage, C. H., Kanhere, A., Parekh, V., Langlotz, C. P., Joshi, A., Huang, H., Doo, F. X. 2025; 314 (1): e241073

Abstract

Integrating large language models (LLMs) into health care holds substantial potential to enhance clinical workflows and care delivery. However, LLMs also pose serious risks if integration is not thoughtfully executed, with complex challenges spanning accuracy, accessibility, privacy, and regulation. Proprietary commercial LLMs (eg, GPT-4 [OpenAI], Claude 3 Sonnet and Claude 3 Opus [Anthropic], Gemini [Google]) have received much attention from researchers in the medical domain, including radiology. Interestingly, open-source LLMs (eg, Llama 3 and LLaVA-Med) have received comparatively little attention. Yet, open-source LLMs hold several key advantages over proprietary LLMs for medical institutions, hospitals, and individual researchers. The wider adoption of open-source LLMs has been slower, perhaps in part due to the lack of familiarity, accessible computational infrastructure, and community-built tools to streamline their local implementation and customize them for specific use cases. Thus, this article provides a tutorial for the implementation of open-source LLMs in radiology, including examples of commonly used tools for text generation and techniques for troubleshooting issues with prompt engineering, retrieval-augmented generation, and fine-tuning. Implementation-ready code for each tool is provided at https://github.com/UM2ii/Open-Source-LLM-Tools-for-Radiology. In addition, this article compares the benefits and drawbacks of open-source and proprietary LLMs, discusses the differentiating characteristics of popular open-source LLMs, and highlights recent advancements that may affect their adoption.

View details for DOI 10.1148/radiol.241073

View details for PubMedID 39873598
Medical Data Under Shadow Attacks via Hybrid Model Inversion Azhar, A., Thielen, P., Langlotz, C. edited by Li, Y., Mandt, S., Agrawal, S., Khan, E. JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2025

View details for Web of Science ID 001593416700154
Automated Structured Radiology Report Generation Delbrouck, J., Xu, J., Moll, J., Thomas, A., Chen, Z., Ostmeier, S., Azhar, A., Li, K., Johnston, A., Bluethgen, C., Reis, E., Muneer, M., Varma, M., Langlotz, C. edited by Che, W., Nabende, J., Shutova, E., Pilehvar, M. T. ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2025: 26813-26829

View details for Web of Science ID 001611629600020
CheXalign: Preference fine-tuning in chest X-ray interpretation models without human feedback Hein, D., Chen, Z., Ostmeier, S., Xu, J., Varma, M., Reis, E., Michalson, A., Bluethgen, C., Shin, H., Langlotz, C., Chaudhari, A. S. edited by Che, W., Nabende, J., Shutova, E., Pilehvar, M. T. ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2025: 27679-27702

View details for Web of Science ID 001611629600061
Merlin: A Vision Language Foundation Model for 3D Computed Tomography. Research square Blankemeier, L., Cohen, J. P., Kumar, A., Veen, D. V., Gardezi, S., Paschali, M., Chen, Z., Delbrouck, J. B., Reis, E., Truyts, C., Bluethgen, C., Jensen, M., Ostmeier, S., Varma, M., Valanarasu, J., Fang, Z., Huo, Z., Nabulsi, Z., Ardila, D., Weng, W. H., Junior, E. A., Ahuja, N., Fries, J., Shah, N., Johnston, A., Boutin, R., Wentland, A., Langlotz, C., Hom, J., Gatidis, S., Chaudhari, A. 2024

Abstract

Over 85 million computed tomography (CT) scans are performed annually in the US, of which approximately one quarter focus on the abdomen. Given the current shortage of both general and specialized radiologists, there is a large impetus to use artificial intelligence to alleviate the burden of interpreting these complex imaging studies while simultaneously using the images to extract novel physiological insights. Prior state-of-the-art approaches for automated medical image interpretation leverage vision language models (VLMs) that utilize both the image and the corresponding textual radiology reports. However, current medical VLMs are generally limited to 2D images and short reports. To overcome these shortcomings for abdominal CT interpretation, we introduce Merlin - a 3D VLM that leverages both structured electronic health records (EHR) and unstructured radiology reports for pretraining without requiring additional manual annotations. We train Merlin using a high-quality clinical dataset of paired CT scans (6+ million images from 15,331 CTs), EHR diagnosis codes (1.8+ million codes), and radiology reports (6+ million tokens). We comprehensively evaluate Merlin on 6 task types and 752 individual tasks. The non-adapted (off-the-shelf) tasks include zero-shot findings classification (31 findings), phenotype classification (692 phenotypes), and zero-shot cross-modal retrieval (image to findings and image to impressions), while model-adapted tasks include 5-year chronic disease prediction (6 diseases), radiology report generation, and 3D semantic segmentation (20 organs). We perform internal validation on a test set of 5,137 CTs, and external validation on 7,000 clinical CTs and on two public CT datasets (VerSe, TotalSegmentator). Beyond these clinically relevant evaluations, we assess the efficacy of various network architectures and training strategies to show that Merlin performs favorably compared with existing task-specific baselines. We derive data scaling laws to empirically assess training data needs for requisite downstream task performance. Furthermore, unlike conventional VLMs that require hundreds of GPUs for training, we perform all training on a single GPU. This computationally efficient design can help democratize foundation model training, especially for health systems with compute constraints. We plan to release our trained models, code, and dataset, pending manual removal of all protected health information.

View details for DOI 10.21203/rs.3.rs-4546309/v1

View details for PubMedID 38978576

View details for PubMedCentralID PMC11230513
Checklist for Artificial Intelligence in Medical Imaging (CLAIM): 2024 Update. Radiology. Artificial intelligence Tejani, A. S., Klontzas, M. E., Gatti, A. A., Mongan, J. T., Moy, L., Park, S. H., Kahn, C. E. 2024: e240300

View details for DOI 10.1148/ryai.240300

View details for PubMedID 38809149
Almanac - Retrieval-Augmented Language Models for Clinical Medicine. NEJM AI Zakka, C., Shad, R., Chaurasia, A., Dalal, A. R., Kim, J. L., Moor, M., Fong, R., Phillips, C., Alexander, K., Ashley, E., Boyd, J., Boyd, K., Hirsch, K., Langlotz, C., Lee, R., Melia, J., Nelson, J., Sallam, K., Tullis, S., Vogelsong, M. A., Cunningham, J. P., Hiesinger, W. 2024; 1 (2)

Abstract

Large language models (LLMs) have recently shown impressive zero-shot capabilities, whereby they can use auxiliary data, without the availability of task-specific training examples, to complete a variety of natural language tasks, such as summarization, dialogue generation, and question answering. However, despite many promising applications of LLMs in clinical medicine, adoption of these models has been limited by their tendency to generate incorrect and sometimes even harmful statements. We tasked a panel of eight board-certified clinicians and two health care practitioners with evaluating Almanac, an LLM framework augmented with retrieval capabilities from curated medical resources for medical guideline and treatment recommendations. The panel compared responses from Almanac and standard LLMs (ChatGPT-4, Bing, and Bard) versus a novel data set of 314 clinical questions spanning nine medical specialties. Almanac showed a significant improvement in performance compared with the standard LLMs across axes of factuality, completeness, user preference, and adversarial safety. Our results show the potential for LLMs with access to domain-specific corpora to be effective in clinical decision-making. The findings also underscore the importance of carefully testing LLMs before deployment to mitigate their shortcomings. (Funded by the National Institutes of Health, National Heart, Lung, and Blood Institute.)

View details for DOI 10.1056/aioa2300068

View details for PubMedID 38343631

View details for PubMedCentralID PMC10857783
Ocular Biometry OCR: a machine learning algorithm leveraging optical character recognition to extract intra ocular lens biometry measurements. Frontiers in artificial intelligence Salvi, A., Arnal, L., Ly, K., Ferreira, G., Wang, S. Y., Langlotz, C., Mahajan, V., Ludwig, C. A. 2024; 7: 1428716

Abstract

Given close relationships between ocular structure and ophthalmic disease, ocular biometry measurements (including axial length, lens thickness, anterior chamber depth, and keratometry values) may be leveraged as features in the prediction of eye diseases. However, ocular biometry measurements are often stored as PDFs rather than as structured data in electronic health records. Thus, time-consuming and laborious manual data entry is required for using biometry data as a disease predictor. Herein, we used two separate models, PaddleOCR and Gemini, to extract eye specific biometric measurements from 2,965 Lenstar, 104 IOL Master 500, and 3,616 IOL Master 700 optical biometry reports. For each patient eye, our text extraction pipeline, referred to as Ocular Biometry OCR, involves 1) cropping the report to the biometric data, 2) extracting the text via the optical character recognition model, 3) post-processing the metrics and values into key value pairs, 4) correcting erroneous angles within the pairs, 5) computing the number of errors or missing values, and 6) selecting the window specific results with fewest errors or missing values. To ensure the models' predictions could be put into a machine learning-ready format, artifacts were removed from categorical text data through manual modification where necessary. Performance was evaluated by scoring PaddleOCR and Gemini results. In the absence of ground truth, higher scoring indicated greater inter-model reliability, assuming an equal value between models indicated an accurate result. The detection scores, measuring the number of valid values (i.e., not missing or erroneous), were Lenstar: 0.990, IOLM 500: 1.000, and IOLM 700: 0.998. The similarity scores, measuring the number of equal values, were Lenstar: 0.995, IOLM 500: 0.999, and IOLM 700: 0.999. The agreement scores, combining detection and similarity scores, were Lenstar: 0.985, IOLM 500: 0.999, and IOLM 700: 0.998. 
IOLM 500 was annotated for ground truths; in this case, higher scoring indicated greater model-to-annotator accuracy. PaddleOCR-to-Annotator achieved scores of detection: 1.000, similarity: 0.999, and agreement: 0.999. Gemini-to-Annotator achieved scores of detection: 1.000, similarity: 1.000, and agreement: 1.000. Scores range from 0 to 1. While PaddleOCR and Gemini demonstrated high agreement, PaddleOCR offered slightly better performance upon reviewing quantitative and qualitative results.
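The abstract above names three scores (detection, similarity, agreement) but does not give their formulas. The following is a minimal Python sketch under assumed definitions: detection is the fraction of fields where both outputs hold a valid value, similarity is the fraction of those fields where the two values are equal, and agreement is the fraction of all fields where both outputs are valid and equal. The function names and the sample field values are hypothetical and are not taken from the paper.

```python
from typing import Optional, Sequence

Field = Optional[str]  # None marks a missing or erroneous value (assumption)


def detection_score(a: Sequence[Field], b: Sequence[Field]) -> float:
    """Fraction of fields where both outputs hold a valid value."""
    return sum(x is not None and y is not None for x, y in zip(a, b)) / len(a)


def similarity_score(a: Sequence[Field], b: Sequence[Field]) -> float:
    """Fraction of jointly valid fields where the two values are equal."""
    valid = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not valid:
        return 0.0
    return sum(x == y for x, y in valid) / len(valid)


def agreement_score(a: Sequence[Field], b: Sequence[Field]) -> float:
    """Fraction of all fields where both outputs are valid and equal."""
    return sum(x is not None and x == y for x, y in zip(a, b)) / len(a)


# Hypothetical extracted values for four biometry fields from two models.
model_a = ["23.45", "4.10", None, "43.25"]
model_b = ["23.45", "4.01", "3.20", "43.25"]

detection = detection_score(model_a, model_b)
similarity = similarity_score(model_a, model_b)
agreement = agreement_score(model_a, model_b)
```

Under these assumed definitions, agreement equals detection multiplied by similarity, which is one plausible reading of "combining detection and similarity scores".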

View details for DOI 10.3389/frai.2024.1428716

View details for PubMedID 39834877

View details for PubMedCentralID PMC11743993
Human-AI Symbiosis: A Path Forward to Improve Chest Radiography and the Role of Radiologists in Patient Care. Radiology Gefter, W. B., Prokop, M., Seo, J. B., Raoof, S., Langlotz, C. P., Hatabu, H. 2024; 310 (1): e232778

View details for DOI 10.1148/radiol.232778

View details for PubMedID 38259206
Auto-Generating Weak Labels for Real & Synthetic Data to Improve Label-Scarce Medical Image Segmentation Deshpande, T., Prakash, E., Ross, E., Langlotz, C., Ng, A., Valanarasu, J. edited by Burgos, N., Petitjean, C., Vakalopoulou, M., Christodoulidis, S., Coupe, P., Delingette, H., Lartizien, C., Mateus, D. JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2024: 391-405

View details for Web of Science ID 001482498600025
Perceptions of Data Set Experts on Important Characteristics of Health Data Sets Ready for Machine Learning: A Qualitative Study. JAMA network open Ng, M. Y., Youssef, A., Miner, A. S., Sarellano, D., Long, J., Larson, D. B., Hernandez-Boussard, T., Langlotz, C. P. 2023; 6 (12): e2345892

Abstract

The lack of data quality frameworks to guide the development of artificial intelligence (AI)-ready data sets limits their usefulness for machine learning (ML) research in health care and hinders the diagnostic excellence of developed clinical AI applications for patient care. To discern what constitutes high-quality and useful data sets for health and biomedical ML research purposes according to subject matter experts. This qualitative study interviewed data set experts, particularly those who are creators and ML researchers. Semistructured interviews were conducted in English and remotely through a secure video conferencing platform between August 23, 2022, and January 5, 2023. A total of 93 experts were invited to participate. Twenty experts were enrolled and interviewed. Using purposive sampling, experts were affiliated with a diverse representation of 16 health data sets/databases across organizational sectors. Content analysis was used to evaluate survey information and thematic analysis was used to analyze interview data. Data set experts' perceptions on what makes data sets AI ready. Participants included 20 data set experts (11 [55%] men; mean [SD] age, 42 [11] years), of whom all were health data set creators, and 18 of the 20 were also ML researchers. Themes (3 main and 11 subthemes) were identified and integrated into an AI-readiness framework to show their association within the health data ecosystem. Participants partially determined the AI readiness of data sets using priority appraisal elements of accuracy, completeness, consistency, and fitness. Ethical acquisition and societal impact emerged as appraisal considerations in that participant samples have not been described to date in prior data quality frameworks. Factors that drive creation of high-quality health data sets and mitigate risks associated with data reuse in ML research were also relevant to AI readiness. 
The state of data availability, data quality standards, documentation, team science, and incentivization were associated with elements of AI readiness and the overall perception of data set usefulness. In this qualitative study of data set experts, participants contributed to the development of a grounded framework for AI data set quality. Data set AI readiness required the concerted appraisal of many elements and the balancing of transparency and ethical reflection against pragmatic constraints. The movement toward more reliable, relevant, and ethical AI and ML applications for patient care will inevitably require strategic updates to data set creation practices.

View details for DOI 10.1001/jamanetworkopen.2023.45892

View details for PubMedID 38039004
Clinical Text Summarization: Adapting Large Language Models Can Outperform Human Experts. Research square Veen, D. V., Uden, C. V., Blankemeier, L., Delbrouck, J. B., Aali, A., Bluethgen, C., Pareek, A., Polacin, M., Reis, E. P., Seehofnerova, A., Rohatgi, N., Hosamani, P., Collins, W., Ahuja, N., Langlotz, C., Hom, J., Gatidis, S., Pauly, J., Chaudhari, A. 2023

Abstract

Sifting through vast textual data and summarizing key information from electronic health records (EHR) imposes a substantial burden on how clinicians allocate their time. Although large language models (LLMs) have shown immense promise in natural language processing (NLP) tasks, their efficacy on a diverse range of clinical summarization tasks has not yet been rigorously demonstrated. In this work, we apply domain adaptation methods to eight LLMs, spanning six datasets and four distinct clinical summarization tasks: radiology reports, patient questions, progress notes, and doctor-patient dialogue. Our thorough quantitative assessment reveals trade-offs between models and adaptation methods in addition to instances where recent advances in LLMs may not improve results. Further, in a clinical reader study with ten physicians, we show that summaries from our best-adapted LLMs are preferable to human summaries in terms of completeness and correctness. Our ensuing qualitative analysis highlights challenges faced by both LLMs and human experts. Lastly, we correlate traditional quantitative NLP metrics with reader study scores to enhance our understanding of how these metrics align with physician preferences. Our research marks the first evidence of LLMs outperforming human experts in clinical text summarization across multiple tasks. This implies that integrating LLMs into clinical workflows could alleviate documentation burden, empowering clinicians to focus more on personalized patient care and the inherently human aspects of medicine.

View details for DOI 10.21203/rs.3.rs-3483777/v1

View details for PubMedID 37961377

View details for PubMedCentralID PMC10635391
The Stanford Medicine data science ecosystem for clinical and translational research. JAMIA open Callahan, A., Ashley, E., Datta, S., Desai, P., Ferris, T. A., Fries, J. A., Halaas, M., Langlotz, C. P., Mackey, S., Posada, J. D., Pfeffer, M. A., Shah, N. H. 2023; 6 (3): ooad054

Abstract

To describe the infrastructure, tools, and services developed at Stanford Medicine to maintain its data science ecosystem and research patient data repository for clinical and translational research. The data science ecosystem, dubbed the Stanford Data Science Resources (SDSR), includes infrastructure and tools to create, search, retrieve, and analyze patient data, as well as services for data deidentification, linkage, and processing to extract high-value information from healthcare IT systems. Data are made available via self-service and concierge access, on HIPAA compliant secure computing infrastructure supported by in-depth user training. The Stanford Medicine Research Data Repository (STARR) functions as the SDSR data integration point, and includes electronic medical records, clinical images, text, bedside monitoring data and HL7 messages. SDSR tools include tools for electronic phenotyping, cohort building, and a search engine for patient timelines. The SDSR supports patient data collection, reproducible research, and teaching using healthcare data, and facilitates industry collaborations and large-scale observational studies. Research patient data repositories and their underlying data science infrastructure are essential to realizing a learning health system and advancing the mission of academic medical centers. Challenges to maintaining the SDSR include ensuring sufficient financial support while providing researchers and clinicians with maximal access to data and digital infrastructure, balancing tool development with user training, and supporting the diverse needs of users. Our experience maintaining the SDSR offers a case study for academic medical centers developing data science and research informatics infrastructure.

View details for DOI 10.1093/jamiaopen/ooad054

View details for PubMedID 37545984

View details for PubMedCentralID PMC10397535
External validation, radiological evaluation, and development of deep learning automatic lung segmentation in contrast-enhanced chest CT. European radiology Dwivedi, K., Sharkey, M., Alabed, S., Langlotz, C. P., Swift, A. J., Bluethgen, C. 2023

Abstract

There is a need for CT pulmonary angiography (CTPA) lung segmentation models. Clinical translation requires radiological evaluation of model outputs, understanding of limitations, and identification of failure points. This multicentre study aims to develop an accurate CTPA lung segmentation model, with evaluation of outputs in two diverse patient cohorts with pulmonary hypertension (PH) and interstitial lung disease (ILD). This retrospective study develops an nnU-Net-based segmentation model using data from two specialist centres (UK and USA). Model was trained (n = 37), tested (n = 12), and clinically evaluated (n = 176) on a diverse 'real-world' cohort of 225 PH patients with volumetric CTPAs. Dice score coefficient (DSC) and normalised surface distance (NSD) were used for testing. Clinical evaluation of outputs was performed by two radiologists who assessed clinical significance of errors. External validation was performed on heterogeneous contrast and non-contrast scans from 28 ILD patients. A total of 225 PH and 28 ILD patients with diverse demographic and clinical characteristics were evaluated. Mean accuracy, DSC, and NSD scores were 0.998 (95% CI 0.9976, 0.9989), 0.990 (0.9840, 0.9962), and 0.983 (0.9686, 0.9972) respectively. There were no segmentation failures. On radiological review, 82% and 71% of internal and external cases respectively had no errors. Eighteen percent and 25% respectively had clinically insignificant errors. Peripheral atelectasis and consolidation were common causes for suboptimal segmentation. One external case (0.5%) with patulous oesophagus had a clinically significant error. State-of-the-art CTPA lung segmentation model provides accurate outputs with minimal clinical errors on evaluation across two diverse cohorts with PH and ILD. Clinical translation of artificial intelligence models requires radiological review and understanding of model limitations. 
This study develops an externally validated state-of-the-art model with robust radiological review. Intended clinical use is in techniques such as lung volume or parenchymal disease quantification. • Accurate, externally validated CT pulmonary angiography (CTPA) lung segmentation model tested in two large heterogeneous clinical cohorts (pulmonary hypertension and interstitial lung disease). • No segmentation failures and robust review of model outputs by radiologists found 1 (0.5%) clinically significant segmentation error. • Intended clinical use of this model is a necessary step in techniques such as lung volume, parenchymal disease quantification, or pulmonary vessel analysis.
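The Dice score coefficient reported in the abstract above is a standard overlap measure: twice the intersection of the predicted and reference masks divided by the sum of their sizes. A minimal Python sketch follows, assuming flattened binary masks given as lists of 0/1 values. The function name and toy masks are illustrative only, and returning 1.0 for two empty masks is a convention chosen here, not something the paper states.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2 * |pred AND truth| / (|pred| + |truth|) for binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention (assumption): two empty masks count as perfect overlap.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy flattened masks: 1 marks a lung voxel, 0 marks background.
predicted = [1, 1, 0, 0]
reference = [1, 0, 1, 0]
score = dice_coefficient(predicted, reference)
```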

View details for DOI 10.1007/s00330-023-10235-9

View details for PubMedID 37775589

View details for PubMedCentralID 6646484
Evaluating progress in automatic chest X-ray radiology report generation. Patterns (New York, N.Y.) Yu, F., Endo, M., Krishnan, R., Pan, I., Tsai, A., Reis, E. P., Fonseca, E. K., Lee, H. M., Abad, Z. S., Ng, A. Y., Langlotz, C. P., Venugopal, V. K., Rajpurkar, P. 2023; 4 (9): 100802

Abstract

Artificial intelligence (AI) models for automatic generation of narrative radiology reports from images have the potential to enhance efficiency and reduce the workload of radiologists. However, evaluating the correctness of these reports requires metrics that can capture clinically pertinent differences. In this study, we investigate the alignment between automated metrics and radiologists' scoring of errors in report generation. We address the limitations of existing metrics by proposing new metrics, RadGraph F1 and RadCliQ, which demonstrate stronger correlation with radiologists' evaluations. In addition, we analyze the failure modes of the metrics to understand their limitations and provide guidance for metric selection and interpretation. This study establishes RadGraph F1 and RadCliQ as meaningful metrics for guiding future research in radiology report generation.

View details for DOI 10.1016/j.patter.2023.100802

View details for PubMedID 37720336

View details for PubMedCentralID PMC10499844
Automatic Detection of Perilunate and Lunate Dislocations on Wrist Radiographs Using Deep Learning. Plastic and reconstructive surgery Pridgen, B., von Rabenau, L., Luan, A., Gu, A. J., Wang, D. S., Langlotz, C., Chang, J., Do, B. 2023

Abstract

Delayed or missed diagnosis of perilunate or lunate dislocations can lead to significant morbidity. Advances in computer vision provide an opportunity to improve diagnostic performance. In this study, a deep learning algorithm was utilized for detection of perilunate and lunate dislocations on lateral wrist radiographs. A total of 435 lateral wrist radiographs were labeled as normal or pathologic (perilunate or lunate dislocation). The lunate in each radiograph was segmented with a rectangular bounding box. Images were partitioned into training and test sets. Two neural networks, consisting of an object detector followed by an image classifier, were applied in series. First, the object detection module was used to localize the lunate. Next, the image classifier performed a binary classification for normal or pathologic. The accuracy, sensitivity, and specificity of the overall system were evaluated. A receiver operating characteristic (ROC) curve and the associated area under the curve (AUC) were used to demonstrate the overall performance of the computer vision algorithm. The lunate object detector was 97.0% accurate at identifying the lunate. Accuracy was 98.7% among the sub-group of normal wrist radiographs, and 91.3% among the sub-group of wrist radiographs with perilunate/lunate dislocations. The perilunate/lunate dislocation classifier had a sensitivity (recall) of 93.8%, specificity of 93.3%, and accuracy of 93.4%. The AUC was 0.986. We have developed a proof-of-concept computer vision system for diagnosis of perilunate/lunate dislocations on lateral wrist radiographs. This novel deep learning algorithm has potential to improve clinical sensitivity to ultimately prevent delayed or missed diagnosis of these injuries.
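For readers less familiar with the classifier metrics quoted in the abstract above, sensitivity, specificity, and accuracy all derive from the four cells of a binary confusion matrix. The Python sketch below uses hypothetical counts chosen for easy arithmetic. They are not the study's data, and the function name is illustrative.

```python
from typing import Dict


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        # Of all truly pathologic images, the share flagged as pathologic.
        "sensitivity": tp / (tp + fn),
        # Of all truly normal images, the share classified as normal.
        "specificity": tn / (tn + fp),
        # Share of all images classified correctly.
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical test set: 10 dislocations and 20 normal radiographs.
metrics = binary_metrics(tp=9, fn=1, tn=18, fp=2)
```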

View details for DOI 10.1097/PRS.0000000000010928

View details for PubMedID 37467052
Almanac: Retrieval-Augmented Language Models for Clinical Medicine. Research square Zakka, C., Chaurasia, A., Shad, R., Dalal, A. R., Kim, J. L., Moor, M., Alexander, K., Ashley, E., Boyd, J., Boyd, K., Hirsch, K., Langlotz, C., Nelson, J., Hiesinger, W. 2023

Abstract

Large language models have recently demonstrated impressive zero-shot capabilities in a variety of natural language tasks such as summarization, dialogue generation, and question-answering. Despite many promising applications in clinical medicine, adoption of these models in real-world settings has been largely limited by their tendency to generate incorrect and sometimes even toxic statements. In this study, we develop Almanac, a large language model framework augmented with retrieval capabilities for medical guideline and treatment recommendations. Performance on a novel dataset of clinical scenarios (n = 130) evaluated by a panel of 5 board-certified and resident physicians demonstrates significant increases in factuality (mean of 18% at p-value < 0.05) across all specialties, with improvements in completeness and safety. Our results demonstrate the potential for large language models to be effective tools in the clinical decision-making process, while also emphasizing the importance of careful testing and deployment to mitigate their shortcomings.

View details for DOI 10.21203/rs.3.rs-2883198/v1

View details for PubMedID 37205549

View details for PubMedCentralID PMC10187428
Diagnosis and Treatment of Patients With Suspected Pneumonia in 28 Utah Urgent Care Clinics Dean, N. C., Hart, J. H., Eve, J. R., Butler, A. M., Sakata, T. W., Wallin, A. R., Reid, J. D., Atwood, B. M., Carman, C., Haug, P. J., Kuttler, K. G., Van Uden, C. E., Irvin, J. A., Langlotz, C. P., Stenehjem, E. A. AMER THORACIC SOC. 2023

View details for Web of Science ID 000995814700218
Truth and Transformation: RSNA's Journey Toward Equity. Radiographics : a review publication of the Radiological Society of North America, Inc Langlotz, C. P., Mauro, M. A., Mahmood, U., Klein, J. S., Meltzer, C. C., Bhalla, S., Heller, R. E., Scott, J. A., Flanders, A. E., Pandharipande, P. V. 2023; 43 (4): e239005

View details for DOI 10.1148/rg.239005

View details for PubMedID 36862085
Evaluating semi-supervision methods for medical image segmentation: applications in cardiac magnetic resonance imaging. Journal of medical imaging (Bellingham, Wash.) Hooper, S. M., Wu, S., Davies, R. H., Bhuva, A., Schelbert, E. B., Moon, J. C., Kellman, P., Xue, H., Langlotz, C., Ré, C. 2023; 10 (2): 024007

Abstract

Neural networks have potential to automate medical image segmentation but require expensive labeling efforts. While methods have been proposed to reduce the labeling burden, most have not been thoroughly evaluated on large, clinical datasets or clinical tasks. We propose a method to train segmentation networks with limited labeled data and focus on thorough network evaluation. We propose a semi-supervised method that leverages data augmentation, consistency regularization, and pseudolabeling and train four cardiac magnetic resonance (MR) segmentation networks. We evaluate the models on multiinstitutional, multiscanner, multidisease cardiac MR datasets using five cardiac functional biomarkers, which are compared to an expert's measurements using Lin's concordance correlation coefficient (CCC), the within-subject coefficient of variation (CV), and the Dice coefficient. The semi-supervised networks achieve strong agreement using Lin's CCC (> 0.8), CV similar to an expert, and strong generalization performance. We compare the error modes of the semi-supervised networks against fully supervised networks. We evaluate semi-supervised model performance as a function of labeled training data and with different types of model supervision, showing that a model trained with 100 labeled image slices can achieve a Dice coefficient within 1.10% of a network trained with 16,000+ labeled image slices. We evaluate semi-supervision for medical image segmentation using heterogeneous datasets and clinical metrics. As methods for training models with little labeled data become more common, knowledge about how they perform on clinical tasks, how they fail, and how they perform with different amounts of labeled data is useful to model developers and users.
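Lin's concordance correlation coefficient, used in the abstract above to compare model-derived biomarkers with an expert's measurements, penalizes both poor correlation and systematic offset: CCC = 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²). The Python sketch below is a minimal illustration using population (divide-by-n) moments. The function name and the sample measurements are hypothetical.

```python
from typing import Sequence


def lins_ccc(x: Sequence[float], y: Sequence[float]) -> float:
    """Lin's concordance correlation coefficient with population moments."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical biomarker values: expert measurements versus model
# outputs that are perfectly correlated but shifted by a constant bias.
expert = [1.0, 2.0, 3.0]
model = [2.0, 3.0, 4.0]
ccc = lins_ccc(expert, model)
```

In this toy case the Pearson correlation would be 1, while the CCC is well below 1 because of the constant offset, which is why the CCC suits agreement studies better than plain correlation.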

View details for DOI 10.1117/1.JMI.10.2.024007

View details for PubMedID 37009059

View details for PubMedCentralID PMC10061343
Toward Expanding the Scope of Radiology Report Summarization to Multiple Anatomies and Modalities Chen, Z., Varma, M., Wan, X., Langlotz, C. P., Delbrouck, J. edited by Boyd-Graber, J., Okazaki, N., Rogers, A. ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2023: 469-484

View details for Web of Science ID 001181088800041
Exploring Image Augmentations for Siamese Representation Learning with Chest X-Rays Van der Sluijs, R., Bhaskhar, N., Rubin, D. L., Langlotz, C. P., Chaudhari, A. S. edited by Noble, J., Li, Oguz, Styner, M., Baumgartner, C., Rusu, M., Heinmann, T., Kontos, D., Landman, B., Dawant, B. JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2023: 444-467

View details for Web of Science ID 001221108600027
RaLEs: a Benchmark for Radiology Language Evaluations Chaves, J., Bhaskhar, N., Attias, M., Delbrouck, J., Rubin, D. L., Loening, A., Langlotz, C., Chaudhari, A. S. edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023

View details for Web of Science ID 001228825101034
INSPECT: A Multimodal Dataset for Pulmonary Embolism Diagnosis and Prognosis Huang, S., Huo, Z., Steinberg, E., Chiang, C., Lungren, M. P., Langlotz, C. P., Yeung, S., Shah, N. H., Fries, J. A. edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023

View details for Web of Science ID 001224281507036
ViLLA: Fine-Grained Vision-Language Representation Learning from Real-World Data Varma, M., Delbrouck, J., Hooper, S., Chaudhari, A., Langlotz, C. IEEE COMPUTER SOC. 2023: 22168-22178

View details for DOI 10.1109/ICCV51070.2023.02031

View details for Web of Science ID 001169500506073
A case for reframing automated medical image classification as segmentation Hooper, S. M., Chen, M. F., Saab, K., Bhatia, K., Langlotz, C., Re, C. edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023

View details for Web of Science ID 001202273400019
A hybrid modelling approach for abstracting CT imaging indications by integrating natural language processing from radiology reports with structured data from electronic health records. Khan, A., Wu, J., Choi, E., Graber-Naidich, A., Henry, S., Wakelee, H. A., Kurian, A. W., Liang, S., Leung, A., Langlotz, C., Backhus, L. M., Han, S. S. AMER ASSOC CANCER RESEARCH. 2023

View details for Web of Science ID 001057852300077
Developing medical imaging AI for emerging infectious diseases. Nature communications Huang, S., Chaudhari, A. S., Langlotz, C. P., Shah, N., Yeung, S., Lungren, M. P. 2022; 13 (1): 7060

View details for DOI 10.1038/s41467-022-34234-4

View details for PubMedID 36400764
Improved Fine-Tuning of In-Domain Transformer Model for Inferring COVID-19 Presence in Multi-Institutional Radiology Reports. Journal of digital imaging Chambon, P., Cook, T. S., Langlotz, C. P. 2022

Abstract

Building a document-level classifier for COVID-19 on radiology reports could help assist providers in their daily clinical routine, as well as create large numbers of labels for computer vision models. We have developed such a classifier by fine-tuning a BERT-like model initialized from RadBERT, its continuous pre-training on radiology reports that can be used on all radiology-related tasks. RadBERT outperforms all biomedical pre-trainings on this COVID-19 task (P<0.01) and helps our fine-tuned model achieve an 88.9 macro-averaged F1-score, when evaluated on both X-ray and CT reports. To build this model, we rely on a multi-institutional dataset re-sampled and enriched with concurrent lung diseases, helping the model to resist distribution shifts. In addition, we explore a variety of fine-tuning and hyperparameter optimization techniques that accelerate fine-tuning convergence, stabilize performance, and improve accuracy, especially when data or computational resources are limited. Finally, we provide a set of visualization tools and explainability methods to better understand the performance of the model, and support its practical use in the clinical setting. Our approach offers a ready-to-use COVID-19 classifier and can be applied similarly to other radiology report classification tasks.
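The macro-averaged F1-score cited in the abstract above is the unweighted mean of per-class F1 scores, so rare classes count as much as common ones. The Python sketch below is a minimal illustration. The class labels ("pos", "neg", "unc") and the toy predictions are assumptions for this example and are not the paper's label set or data. Scores here fall in the range 0 to 1, whereas the abstract reports F1 on a 0 to 100 scale.

```python
from typing import Sequence


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 = 2TP / (2TP + FP + FN)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical report-level labels.
truth = ["pos", "pos", "neg", "neg", "unc"]
preds = ["pos", "neg", "neg", "neg", "unc"]
score = macro_f1(truth, preds)
```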

View details for DOI 10.1007/s10278-022-00714-8

View details for PubMedID 36323915
Expert-level detection of pathologies from unannotated chest X-ray images via self-supervised learning. Nature biomedical engineering Tiu, E., Talius, E., Patel, P., Langlotz, C. P., Ng, A. Y., Rajpurkar, P. 2022

Abstract

In tasks involving the interpretation of medical images, suitably trained machine-learning models often exceed the performance of medical experts. Yet such a high level of performance typically requires that the models be trained with relevant datasets that have been painstakingly annotated by experts. Here we show that a self-supervised model trained on chest X-ray images that lack explicit annotations performs pathology-classification tasks with accuracies comparable to those of radiologists. On an external validation dataset of chest X-rays, the self-supervised model outperformed a fully supervised model in the detection of three pathologies (out of eight), and the performance generalized to pathologies that were not explicitly annotated for model training, to multiple image-interpretation tasks and to datasets from multiple institutions.

View details for DOI 10.1038/s41551-022-00936-9

View details for PubMedID 36109605
Optimizing the Breast Imaging Report for Today and Tomorrow. Journal of breast imaging McGrath, A. L., McGinty, G., Berg, W. A., Mendelson, E. B., Drotman, M. B., Ellis, R. L., Langlotz, C. P. 2022; 4 (4): 343-345

View details for DOI 10.1093/jbi/wbac033

View details for PubMedID 38416981
Deep Learning Preoperative Risk Stratification. The Annals of thoracic surgery Ouyang, D., Hiesinger, W., Langlotz, C. 2022

View details for DOI 10.1016/j.athoracsur.2022.05.023

View details for PubMedID 35661716
ViLMedic: a framework for research at the intersection of vision and language in medical AI Delbrouck, J., Saab, K., Varma, M., Eyuboglu, S., Dunnmon, J. A., Chambon, P., Zambrano, J., Chaudhari, A., Langlotz, C. P. ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2022: 23-34

View details for Web of Science ID 000828759800003
Contrastive Learning of Medical Visual Representations from Paired Images and Text Zhang, Y., Jiang, H., Miura, Y., Manning, C. D., Langlotz, C. P. edited by Lipton, Z., Ranganath, R., Sendak, M., Sjoding, M., Yeung, S. JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2022: 2-25

View details for Web of Science ID 001227558500001
Designing clinically translatable artificial intelligence systems for high-dimensional medical imaging NATURE MACHINE INTELLIGENCE Shad, R., Cunningham, J. P., Ashley, E. A., Langlotz, C. P., Hiesinger, W. 2021; 3 (11): 929-935

View details for DOI 10.1038/s42256-021-00399-8

View details for Web of Science ID 000719338000003
Biomedical and clinical English model packages for the Stanza Python NLP library. Journal of the American Medical Informatics Association : JAMIA Zhang, Y., Zhang, Y., Qi, P., Manning, C. D., Langlotz, C. P. 2021

Abstract

OBJECTIVE: The study sought to develop and evaluate neural natural language processing (NLP) packages for the syntactic analysis and named entity recognition of biomedical and clinical English text. MATERIALS AND METHODS: We implement and train biomedical and clinical English NLP pipelines by extending the widely used Stanza library originally designed for general NLP tasks. Our models are trained with a mix of public datasets such as the CRAFT treebank as well as with a private corpus of radiology reports annotated with 5 radiology-domain entities. The resulting pipelines are fully based on neural networks, and are able to perform tokenization, part-of-speech tagging, lemmatization, dependency parsing, and named entity recognition for both biomedical and clinical text. We compare our systems against popular open-source NLP libraries such as CoreNLP and scispaCy, state-of-the-art models such as the BioBERT models, and winning systems from the BioNLP CRAFT shared task. RESULTS: For syntactic analysis, our systems achieve much better performance compared with the released scispaCy models and CoreNLP models retrained on the same treebanks, and are on par with the winning system from the CRAFT shared task. For NER, our systems substantially outperform scispaCy, and are better or on par with the state-of-the-art performance from BioBERT, while being much more computationally efficient. CONCLUSIONS: We introduce biomedical and clinical NLP packages built for the Stanza library. These packages offer performance that is similar to the state of the art, and are also optimized for ease of use. To facilitate research, we make all our models publicly available. We also provide an online demonstration (http://stanza.run/bio).

View details for DOI 10.1093/jamia/ocab090

View details for PubMedID 34157094
Long-term survival in patients with post-LVAD right ventricular failure: multi-state modelling with competing outcomes of heart transplant. The Journal of heart and lung transplantation : the official publication of the International Society for Heart Transplantation Shad, R., Fong, R., Quach, N., Bowles, C., Kasinpila, P., Li, M., Callon, K., Castro, M., Guha, A., Suarez, E. E., Lee, S., Jovinge, S., Boeve, T., Shudo, Y., Langlotz, C. P., Teuteberg, J., Hiesinger, W. 2021

Abstract

BACKGROUND: Multicenter data on long term survival following LVAD implantation that make use of contemporary definitions of RV failure are limited. Furthermore, traditional survival analyses censor patients who receive a bridge to heart transplant. Here we compare the outcomes of LVAD patients who develop post-operative RV failure accounting for the transitional probability of receiving an interim heart transplantation. METHODS: We use a retrospective cohort of LVAD patients sourced from multiple high-volume centers based in the United States. Five- and ten-year survival accounting for transition probabilities of receiving a heart transplant were calculated using a multi-state Aalen Johansen survival model. RESULTS: Of the 897 patients included in the study, 238 (26.5%) developed post-operative RV failure at index hospitalization. At 10 years the probability of death with post-op RV failure was 79.28% vs 61.70% in patients without (HR 2.10; 95% CI 1.72 - 2.57; p < .001). Though not significant, patients with RV failure were less likely to be bridged to a heart transplant (HR 0.87, p=.4). Once transplanted the risk of death between both patient groups remained equivalent; the probability of death after a heart transplant was 3.97% in those with post-operative RV failure shortly after index LVAD implant, as compared to 14.71% in those without. CONCLUSIONS AND RELEVANCE: Long-term durable mechanical circulatory support is associated with significantly higher mortality in patients who develop post-operative RV failure. Improving outcomes may necessitate expeditious bridge to heart transplant wherever appropriate, along with critical reassessment of organ allocation policies.

View details for DOI 10.1016/j.healun.2021.05.002

View details for PubMedID 34167863
Regulatory Frameworks for Development and Evaluation of Artificial Intelligence-Based Diagnostic Imaging Algorithms: Summary and Recommendations JOURNAL OF THE AMERICAN COLLEGE OF RADIOLOGY Larson, D. B., Harvey, H., Rubin, D. L., Irani, N., Tse, J. R., Langlotz, C. P. 2021; 18 (3): 413–24

View details for DOI 10.1016/j.jacr.2020.09.060

View details for Web of Science ID 000631977100012
Predicting post-operative right ventricular failure using video-based deep learning. Nature communications Shad, R., Quach, N., Fong, R., Kasinpila, P., Bowles, C., Castro, M., Guha, A., Suarez, E. E., Jovinge, S., Lee, S., Boeve, T., Amsallem, M., Tang, X., Haddad, F., Shudo, Y., Woo, Y. J., Teuteberg, J., Cunningham, J. P., Langlotz, C. P., Hiesinger, W. 2021; 12 (1): 5192

Abstract

Despite progressive improvements over the decades, the rich temporally resolved data in an echocardiogram remain underutilized. Human assessments reduce the complex patterns of cardiac wall motion to a small list of measurements of heart function. All modern echocardiography artificial intelligence (AI) systems are similarly limited by design - automating measurements of the same reductionist metrics rather than utilizing the embedded wealth of data. This underutilization is most evident where clinical decision making is guided by subjective assessments of disease acuity. Predicting the likelihood of developing post-operative right ventricular failure (RV failure) in the setting of mechanical circulatory support is one such example. Here we describe a video AI system trained to predict post-operative RV failure using the full spatiotemporal density of information in pre-operative echocardiography. We achieve an AUC of 0.729, and show that this ML system significantly outperforms a team of human experts at the same task on independent evaluation.

View details for DOI 10.1038/s41467-021-25503-9

View details for PubMedID 34465780
Improving Factual Completeness and Consistency of Image-to-Text Radiology Report Generation Miura, Y., Zhang, Y., Tsai, E., Langlotz, C. P., Jurafsky, D. ASSOC COMPUTATIONAL LINGUISTICS-ACL. 2021: 5288-5304

View details for Web of Science ID 000895685605031
Beyond the AJR: "Deep Learning Using Chest Radiographs to Identify High-Risk Smokers for Lung Cancer Screening Computed Tomography: Development and Validation of a Prediction Model". AJR. American journal of roentgenology Patel, B. N., Langlotz, C. P. 2020

View details for DOI 10.2214/AJR.20.25334

View details for PubMedID 33355488
The Project Baseline Health Study: a step towards a broader mission to map human health NPJ DIGITAL MEDICINE Arges, K., Assimes, T., Bajaj, V., Balu, S., Bashir, M. R., Beskow, L., Blanco, R., Califf, R., Campbell, P., Carin, L., Christian, V., Cousins, S., Das, M., Dockery, M., Douglas, P. S., Dunham, A., Eckstrand, J., Fleischmann, D., Ford, E., Fraulo, E., French, J., Gambhir, S. S., Ginsburg, G. S., Green, R. C., Haddad, F., Hernandez, A., Hernandez, J., Huang, E. S., Jaffe, G., King, D., Koweek, L. H., Langlotz, C., Liao, Y. J., Mahaffey, K. W., Marcom, K., Marks, W. J., Maron, D., McCabe, R., McCall, S., McCue, R., Mega, J., Miller, D., Muhlbaier, L. H., Munshi, R., Newby, L., Pak-Harvey, E., Patrick-Lake, B., Pencina, M., Peterson, E. D., Rodriguez, F., Shore, S., Shah, S., Shipes, S., Sledge, G., Spielman, S., Spitler, R., Schaack, T., Swamy, G., Willemink, M. J., Wong, C. A. 2020; 3 (1): 84

Abstract

The Project Baseline Health Study (PBHS) was launched to map human health through a comprehensive understanding of both the health of an individual and how it relates to the broader population. The study will contribute to the creation of a biomedical information system that accounts for the highly complex interplay of biological, behavioral, environmental, and social systems. The PBHS is a prospective, multicenter, longitudinal cohort study that aims to enroll thousands of participants with diverse backgrounds who are representative of the entire health spectrum. Enrolled participants will be evaluated serially using clinical, molecular, imaging, sensor, self-reported, behavioral, psychological, environmental, and other health-related measurements. An initial deeply phenotyped cohort will inform the development of a large, expanded virtual cohort. The PBHS will contribute to precision health and medicine by integrating state of the art testing, longitudinal monitoring and participant engagement, and by contributing to the development of an improved platform for data sharing and analysis.

View details for DOI 10.1038/s41746-020-0290-y

View details for Web of Science ID 000538242900001

View details for PubMedID 32550652

View details for PubMedCentralID PMC7275087
Integrating artificial intelligence into the clinical practice of radiology: challenges and recommendations. European radiology Recht, M. P., Dewey, M., Dreyer, K., Langlotz, C., Niessen, W., Prainsack, B., Smith, J. J. 2020

Abstract

Artificial intelligence (AI) has the potential to significantly disrupt the way radiology will be practiced in the near future, but several issues need to be resolved before AI can be widely implemented in daily practice. These include the role of the different stakeholders in the development of AI for imaging, the ethical development and use of AI in healthcare, the appropriate validation of each developed AI algorithm, the development of effective data sharing mechanisms, regulatory hurdles for the clearance of AI algorithms, and the development of AI educational resources for both practicing radiologists and radiology trainees. This paper details these issues and presents possible solutions based on discussions held at the 2019 meeting of the International Society for Strategic Studies in Radiology. KEY POINTS: Radiologists should be aware of the different types of bias commonly encountered in AI studies, and understand their possible effects. Methods for effective data sharing to train, validate, and test AI algorithms need to be developed. It is essential for all radiologists to gain an understanding of the basic principles, potentials, and limits of AI.

View details for DOI 10.1007/s00330-020-06672-5

View details for PubMedID 32064565
AppendiXNet: Deep Learning for Diagnosis of Appendicitis from A Small Dataset of CT Exams Using Video Pretraining. Scientific reports Rajpurkar, P., Park, A., Irvin, J., Chute, C., Bereket, M., Mastrodicasa, D., Langlotz, C. P., Lungren, M. P., Ng, A. Y., Patel, B. N. 2020; 10 (1): 3958

Abstract

The development of deep learning algorithms for complex tasks in digital medicine has relied on the availability of large labeled training datasets, usually containing hundreds of thousands of examples. The purpose of this study was to develop a 3D deep learning model, AppendiXNet, to detect appendicitis, one of the most common life-threatening abdominal emergencies, using a small training dataset of less than 500 training CT exams. We explored whether pretraining the model on a large collection of natural videos would improve the performance of the model over training the model from scratch. AppendiXNet was pretrained on a large collection of YouTube videos called Kinetics, consisting of approximately 500,000 video clips and annotated for one of 600 human action classes, and then fine-tuned on a small dataset of 438 CT scans annotated for appendicitis. We found that pretraining the 3D model on natural videos significantly improved the performance of the model from an AUC of 0.724 (95% CI 0.625, 0.823) to 0.810 (95% CI 0.725, 0.895). The application of deep learning to detect abnormalities on CT examinations using video pretraining could generalize effectively to other challenging cross-sectional medical imaging tasks when training data is limited.

View details for DOI 10.1038/s41598-020-61055-6

View details for PubMedID 32127625
Impact of a deep learning assistant on the histopathologic classification of liver cancer. NPJ digital medicine Kiani, A., Uyumazturk, B., Rajpurkar, P., Wang, A., Gao, R., Jones, E., Yu, Y., Langlotz, C. P., Ball, R. L., Montine, T. J., Martin, B. A., Berry, G. J., Ozawa, M. G., Hazard, F. K., Brown, R. A., Chen, S. B., Wood, M., Allard, L. S., Ylagan, L., Ng, A. Y., Shen, J. 2020; 3 (1): 23

Abstract

Artificial intelligence (AI) algorithms continue to rival human performance on a variety of clinical tasks, while their actual impact on human diagnosticians, when incorporated into clinical workflows, remains relatively unexplored. In this study, we developed a deep learning-based assistant to help pathologists differentiate between two subtypes of primary liver cancer, hepatocellular carcinoma and cholangiocarcinoma, on hematoxylin and eosin-stained whole-slide images (WSI), and evaluated its effect on the diagnostic performance of 11 pathologists with varying levels of expertise. Our model achieved accuracies of 0.885 on a validation set of 26 WSI, and 0.842 on an independent test set of 80 WSI. Although use of the assistant did not change the mean accuracy of the 11 pathologists (p = 0.184, OR = 1.281), it significantly improved the accuracy (p = 0.045, OR = 1.499) of a subset of nine pathologists who fell within well-defined experience levels (GI subspecialists, non-GI subspecialists, and trainees). In the assisted state, model accuracy significantly impacted the diagnostic decisions of all 11 pathologists. As expected, when the model's prediction was correct, assistance significantly improved accuracy (p = 0.000, OR = 4.289), whereas when the model's prediction was incorrect, assistance significantly decreased accuracy (p = 0.000, OR = 0.253), with both effects holding across all pathologist experience levels and case difficulty levels. Our results highlight the challenges of translating AI models into the clinical setting, and emphasize the importance of taking into account potential unintended negative consequences of model assistance when designing and testing medical AI-assistance tools.

View details for DOI 10.1038/s41746-020-0232-8

View details for PubMedID 33594170
Prospective Deployment of Deep Learning in MRI: A Framework for Important Considerations, Challenges, and Recommendations for Best Practices. Journal of magnetic resonance imaging : JMRI Chaudhari, A. S., Sandino, C. M., Cole, E. K., Larson, D. B., Gold, G. E., Vasanawala, S. S., Lungren, M. P., Hargreaves, B. A., Langlotz, C. P. 2020

Abstract

Artificial intelligence algorithms based on principles of deep learning (DL) have made a large impact on the acquisition, reconstruction, and interpretation of MRI data. Despite the large number of retrospective studies using DL, there are fewer applications of DL in the clinic on a routine basis. To address this large translational gap, we review the recent publications to determine three major use cases that DL can have in MRI, namely, that of model-free image synthesis, model-based image reconstruction, and image or pixel-level classification. For each of these three areas, we provide a framework for important considerations that consist of appropriate model training paradigms, evaluation of model robustness, downstream clinical utility, opportunities for future advances, as well as recommendations for best current practices. We draw inspiration for this framework from advances in computer vision in natural imaging as well as additional healthcare fields. We further emphasize the need for reproducibility of research studies through the sharing of datasets and software. LEVEL OF EVIDENCE: 5 TECHNICAL EFFICACY STAGE: 2.

View details for DOI 10.1002/jmri.27331

View details for PubMedID 32830874
Regulatory Frameworks for Development and Evaluation of Artificial Intelligence-Based Diagnostic Imaging Algorithms: Summary and Recommendations. Journal of the American College of Radiology : JACR Larson, D. B., Harvey, H., Rubin, D. L., Irani, N., Tse, J. R., Langlotz, C. P. 2020

Abstract

Although artificial intelligence (AI)-based algorithms for diagnosis hold promise for improving care, their safety and effectiveness must be ensured to facilitate wide adoption. Several recently proposed regulatory frameworks provide a solid foundation but do not address a number of issues that may prevent algorithms from being fully trusted. In this article, we review the major regulatory frameworks for software as a medical device applications, identify major gaps, and propose additional strategies to improve the development and evaluation of diagnostic AI algorithms. We identify the following major shortcomings of the current regulatory frameworks: (1) conflation of the diagnostic task with the diagnostic algorithm, (2) superficial treatment of the diagnostic task definition, (3) no mechanism to directly compare similar algorithms, (4) insufficient characterization of safety and performance elements, (5) lack of resources to assess performance at each installed site, and (6) inherent conflicts of interest. We recommend the following additional measures: (1) separate the diagnostic task from the algorithm, (2) define performance elements beyond accuracy, (3) divide the evaluation process into discrete steps, (4) encourage assessment by a third-party evaluator, (5) incorporate these elements into the manufacturers' development process. Specifically, we recommend four phases of development and evaluation, analogous to those that have been applied to pharmaceuticals and proposed for software applications, to help ensure world-class performance of all algorithms at all installed sites. In the coming years, we anticipate the emergence of a substantial body of research dedicated to ensuring the accuracy, reliability, and safety of the algorithms.

View details for DOI 10.1016/j.jacr.2020.09.060

View details for PubMedID 33096088

View details for PubMedCentralID PMC7574690
Improving Cancer Diagnosis and Care: Patient Access to Oncologic Imaging Expertise JOURNAL OF CLINICAL ONCOLOGY Nass, S. J., Cogle, C. R., Brink, J. A., Langlotz, C. P., Balogh, E. P., Muellner, A., Siegal, D., Schilsky, R. L., Hricak, H. 2019; 37 (20): 1690-+

View details for DOI 10.1200/JCO.18.01970

View details for Web of Science ID 000510833000003
Comparative effectiveness of convolutional neural network (CNN) and recurrent neural network (RNN) architectures for radiology text report classification ARTIFICIAL INTELLIGENCE IN MEDICINE Banerjee, I., Ling, Y., Chen, M. C., Hasan, S. A., Langlotz, C. P., Moradzadeh, N., Chapman, B., Amrhein, T., Mong, D., Rubin, D. L., Farri, O., Lungren, M. P. 2019; 97: 79–88

View details for DOI 10.1016/j.artmed.2018.11.004

View details for Web of Science ID 000474326600009
Cross-type biomedical named entity recognition with deep multi-task learning BIOINFORMATICS Wang, X., Zhang, Y., Ren, X., Zhang, Y., Zitnik, M., Shang, J., Langlotz, C., Han, J. 2019; 35 (10): 1745–52

View details for DOI 10.1093/bioinformatics/bty869

View details for Web of Science ID 000469437800015
Effect of Clinical Decision Support-Generated Report Cards Versus Real-Time Alerts on Primary Care Provider Guideline Adherence for Low Back Pain Outpatient Lumbar Spine MRI Orders AMERICAN JOURNAL OF ROENTGENOLOGY Zafar, H. M., Ip, I. K., Mills, A. M., Raja, A. S., Langlotz, C. P., Khorasani, R. 2019; 212 (2): 386–94

View details for DOI 10.2214/AJR.18.19780

View details for Web of Science ID 000461833400028
Human-machine partnership with artificial intelligence for chest radiograph diagnosis. NPJ digital medicine Patel, B. N., Rosenberg, L., Willcox, G., Baltaxe, D., Lyons, M., Irvin, J., Rajpurkar, P., Amrhein, T., Gupta, R., Halabi, S., Langlotz, C., Lo, E., Mammarappallil, J., Mariano, A. J., Riley, G., Seekins, J., Shen, L., Zucker, E., Lungren, M. 2019; 2: 111

Abstract

Human-in-the-loop (HITL) AI may enable an ideal symbiosis of human experts and AI models, harnessing the advantages of both while at the same time overcoming their respective limitations. The purpose of this study was to investigate a novel collective intelligence technology designed to amplify the diagnostic accuracy of networked human groups by forming real-time systems modeled on biological swarms. Using small groups of radiologists, the swarm-based technology was applied to the diagnosis of pneumonia on chest radiographs and compared against human experts alone, as well as two state-of-the-art deep learning AI models. Our work demonstrates that both the swarm-based technology and deep-learning technology achieved higher diagnostic accuracy than the human experts alone. Our work further demonstrates that when used in combination, the swarm-based technology and deep-learning technology outperformed either method alone. The superior diagnostic accuracy of the combined HITL AI solution compared to radiologists and AI alone has broad implications for the surging clinical AI deployment and implementation strategies in future practice.

View details for DOI 10.1038/s41746-019-0189-7

View details for PubMedID 31754637

View details for PubMedCentralID PMC6861262
Fostering a Healthy AI Ecosystem for Radiology: Conclusions of the 2018 RSNA Summit on AI in Radiology. Radiology. Artificial intelligence Chokshi, F. H., Flanders, A. E., Prevedello, L. M., Langlotz, C. P. 2019; 1 (2): 190021

Abstract

The 2018 RSNA Summit on AI in Radiology brought together a diverse group of stakeholders to identify and prioritize areas of need related to artificial intelligence in radiology. This article presents the proceedings of the summit with emphasis on RSNA's role in leading, organizing, and catalyzing change during this important time in radiology. © RSNA, 2019.

View details for DOI 10.1148/ryai.2019190021

View details for PubMedID 33937789

View details for PubMedCentralID PMC8017423
A Road Map for Translational Research on Artificial Intelligence in Medical Imaging: From the 2018 National Institutes of Health/RSNA/ACR/The Academy Workshop. Journal of the American College of Radiology : JACR Allen, B., Seltzer, S. E., Langlotz, C. P., Dreyer, K. P., Summers, R. M., Petrick, N., Marinac-Dabic, D., Cruz, M., Alkasab, T. K., Hanisch, R. J., Nilsen, W. J., Burleson, J., Lyman, K., Kandarpa, K. 2019

Abstract

Advances in machine learning in medical imaging are occurring at a rapid pace in research laboratories both at academic institutions and in industry. Important artificial intelligence (AI) tools for diagnostic imaging include algorithms for disease detection and classification, image optimization, radiation reduction, and workflow enhancement. Although advances in foundational research are occurring rapidly, translation to routine clinical practice has been slower. In August 2018, the National Institutes of Health assembled multiple relevant stakeholders at a public meeting to discuss the current state of knowledge, infrastructure gaps, and challenges to wider implementation. The conclusions of that meeting are summarized in two publications that identify and prioritize initiatives to accelerate foundational and translational research in AI for medical imaging. This publication summarizes key priorities for translational research developed at the workshop including: (1) creating structured AI use cases, defining and highlighting clinical challenges potentially solvable by AI; (2) establishing methods to encourage data sharing for training and testing AI algorithms to promote generalizability to widespread clinical practice and mitigate unintended bias; (3) establishing tools for validation and performance monitoring of AI algorithms to facilitate regulatory approval; and (4) developing standards and common data elements for seamless integration of AI tools into existing clinical workflows. An important goal of the resulting road map is to grow an ecosystem, facilitated by professional societies, industry, and government agencies, that will allow robust collaborations between practicing clinicians and AI researchers to advance foundational and translational research relevant to medical imaging.

View details for DOI 10.1016/j.jacr.2019.04.014

View details for PubMedID 31151893
Comparison of Natural Language Processing Rules-based and Machine-learning Systems to Identify Lumbar Spine Imaging Findings Related to Low Back Pain ACADEMIC RADIOLOGY Tan, W., Hassanpour, S., Heagerty, P. J., Rundell, S. D., Suri, P., Huhdanpaa, H. T., James, K., Carrell, D. S., Langlotz, C. P., Organ, N. L., Meier, E. N., Sherman, K. J., Kallmes, D. F., Luetmer, P. H., Griffith, B., Nerenz, D. R., Jarvik, J. G. 2018; 25 (11): 1422–32

Abstract

To evaluate a natural language processing (NLP) system built with open-source tools for identification of lumbar spine imaging findings related to low back pain on magnetic resonance and x-ray radiology reports from four health systems. We used a limited data set (de-identified except for dates) sampled from lumbar spine imaging reports of a prospectively assembled cohort of adults. From N = 178,333 reports, we randomly selected N = 871 to form a reference-standard dataset, consisting of N = 413 x-ray reports and N = 458 MR reports. Using standardized criteria, four spine experts annotated the presence of 26 findings, where 71 reports were annotated by all four experts and 800 were each annotated by two experts. We calculated inter-rater agreement and finding prevalence from annotated data. We randomly split the annotated data into development (80%) and testing (20%) sets. We developed an NLP system from both rule-based and machine-learned models. We validated the system using accuracy metrics such as sensitivity, specificity, and area under the receiver operating characteristic curve (AUC). The multirater annotated dataset achieved inter-rater agreement of Cohen's kappa > 0.60 (substantial agreement) for 25 of 26 findings, with finding prevalence ranging from 3% to 89%. In the testing sample, rule-based and machine-learned predictions both had comparable average specificity (0.97 and 0.95, respectively). The machine-learned approach had a higher average sensitivity (0.94, compared to 0.83 for rules-based), and a higher overall AUC (0.98, compared to 0.90 for rules-based). Our NLP system performed well in identifying the 26 lumbar spine findings, as benchmarked by reference-standard annotation by medical experts. Machine-learned models provided substantial gains in model sensitivity with slight loss of specificity, and overall higher AUC.

View details for PubMedID 29605561

View details for PubMedCentralID PMC6162177
Deep-learning-assisted diagnosis for knee magnetic resonance imaging: Development and retrospective validation of MRNet PLOS MEDICINE Bien, N., Rajpurkar, P., Ball, R. L., Irvin, J., Park, A., Jones, E., Bereket, M., Patel, B. N., Yeom, K. W., Shpanskaya, K., Halabi, S., Zucker, E., Fanton, G., Amanatullah, D. F., Beaulieu, C. F., Riley, G. M., Stewart, R. J., Blankenberg, F. G., Larson, D. B., Jones, R. H., Langlotz, C. P., Ng, A. Y., Lungren, M. P. 2018; 15 (11)

View details for DOI 10.1371/journal.pmed.1002699

View details for Web of Science ID 000451827800015
Deep Learning in Neuroradiology AMERICAN JOURNAL OF NEURORADIOLOGY Zaharchuk, G., Gong, E., Wintermark, M., Rubin, D., Langlotz, C. P. 2018; 39 (10): 1776–84

View details for DOI 10.3174/ajnr.A5543

View details for Web of Science ID 000445849400007
Clinical decision support increases diagnostic yield of computed tomography for suspected pulmonary embolism AMERICAN JOURNAL OF EMERGENCY MEDICINE Mills, A. M., Ip, I. K., Langlotz, C. P., Raja, A. S., Zafar, H. M., Khorasani, R. 2018; 36 (4): 540–44

Abstract

Determine effects of evidence-based clinical decision support (CDS) on the use and yield of computed tomographic pulmonary angiography for suspected pulmonary embolism (CTPE) in Emergency Department (ED) patients. This multi-site prospective quality improvement intervention conducted in three urban EDs used a pre/post design. For ED patients aged 18+ years with suspected PE, CTPE use and yield were compared 19 months pre- and 32 months post-implementation of CDS intervention based on the Wells criteria, provided at the time of CTPE order, deployed in April 2012. Primary outcome was the yield (percentage of studies positive for acute PE). Secondary outcome was utilization (number of studies/100 ED visits) of CTPE. Chi-square and statistical process control chart assessed pre- and post-intervention differences. An interrupted time series analysis was also performed. Of 558,795 patients presenting October 2010-December 2014, 7987 (1.4%) underwent CTPE (mean age 52±17.5 years, 66% female, 60.1% black); 34.7% of patients presented pre- and 65.3% post-CDS implementation. Overall CTPE diagnostic yield was 9.8% (779/7987 studies positive for PE). Yield increased a relative 30.8% after CDS implementation (8.1% vs. 10.6%; p=0.0003). There was no statistically significant change in CTPE utilization (1.4% pre- vs. 1.4% post-implementation; p=0.25). A statistical process control chart demonstrated immediate and sustained improvement in CTPE yield post-implementation. Interrupted time series analysis demonstrated the slope of PE findings versus time to be unchanged before and after the intervention (p=0.9). However, there was a trend that the intervention was associated with a 50% increased probability of PE finding (p=0.08), suggesting an immediate rather than gradual change after the intervention. Implementing evidence-based CDS in the ED was associated with an immediate, significant and sustained increase in CTPE yield without a measurable decrease in CTPE utilization. Further studies will be needed to assess whether stronger interventions could further improve appropriate use of CTPE.

View details for DOI 10.1016/j.ajem.2017.09.004

View details for Web of Science ID 000431713500002

View details for PubMedID 28970024

View details for PubMedCentralID PMC5839946
Deep Learning to Classify Radiology Free-Text Reports RADIOLOGY Chen, M. C., Ball, R. L., Yang, L., Moradzadeh, N., Chapman, B. E., Larson, D. B., Langlotz, C. P., Amrhein, T. J., Lungren, M. P. 2018; 286 (3): 845–52

Abstract

Purpose: To evaluate the performance of a deep learning convolutional neural network (CNN) model compared with a traditional natural language processing (NLP) model in extracting pulmonary embolism (PE) findings from thoracic computed tomography (CT) reports from two institutions. Materials and Methods: Contrast material-enhanced CT examinations of the chest performed between January 1, 1998, and January 1, 2016, were selected. Annotations by two human radiologists were made for three categories: the presence, chronicity, and location of PE. Classification of performance of a CNN model with an unsupervised learning algorithm for obtaining vector representations of words was compared with the open-source application PeFinder. Sensitivity, specificity, accuracy, and F1 scores for both the CNN model and PeFinder in the internal and external validation sets were determined. Results: The CNN model demonstrated an accuracy of 99% and an area under the curve value of 0.97. For internal validation report data, the CNN model had a statistically significant larger F1 score (0.938) than did PeFinder (0.867) when classifying findings as either PE positive or PE negative, but no significant difference in sensitivity, specificity, or accuracy was found. For external validation report data, no statistical difference between the performance of the CNN model and PeFinder was found. Conclusion: A deep learning CNN model can classify radiology free-text reports with accuracy equivalent to or beyond that of an existing traditional NLP model. © RSNA, 2017. Online supplemental material is available for this article.

View details for PubMedID 29135365
Using Natural Language Processing of Free-Text Radiology Reports to Identify Type 1 Modic Endplate Changes JOURNAL OF DIGITAL IMAGING Huhdanpaa, H. T., Tan, W., Rundell, S. D., Suri, P., Chokshi, F. H., Comstock, B. A., Heagerty, P. J., James, K. T., Avins, A. L., Nedeljkovic, S. S., Nerenz, D. R., Kallmes, D. F., Luetmer, P. H., Sherman, K. J., Organ, N. L., Griffith, B., Langlotz, C. P., Carrell, D., Hassanpour, S., Jarvik, J. G. 2018; 31 (1): 84–90

Abstract

Electronic medical record (EMR) systems provide easy access to radiology reports and offer great potential to support quality improvement efforts and clinical research. Harnessing the full potential of the EMR requires scalable approaches such as natural language processing (NLP) to convert text into variables used for evaluation or analysis. Our goal was to determine the feasibility of using NLP to identify patients with Type 1 Modic endplate changes using clinical reports of magnetic resonance (MR) imaging examinations of the spine. Identifying patients with Type 1 Modic change who may be eligible for clinical trials is important as these findings may be important targets for intervention. Four annotators identified all reports that contained Type 1 Modic change, using N = 458 randomly selected lumbar spine MR reports. We then implemented a rule-based NLP algorithm in Java using regular expressions. The prevalence of Type 1 Modic change in the annotated dataset was 10%. Results were recall (sensitivity) 35/50 = 0.70 (95% confidence interval (C.I.) 0.52-0.82), specificity 404/408 = 0.99 (0.97-1.0), precision (positive predictive value) 35/39 = 0.90 (0.75-0.97), negative predictive value 404/419 = 0.96 (0.94-0.98), and F1-score 0.79 (0.43-1.0). Our evaluation shows the efficacy of rule-based NLP approach for identifying patients with Type 1 Modic change if the emphasis is on identifying only relevant cases with low concern regarding false negatives. As expected, our results show that specificity is higher than recall. This is due to the inherent difficulty of eliciting all possible keywords given the enormous variability of lumbar spine reporting, which decreases recall, while availability of good negation algorithms improves specificity.

View details for PubMedID 28808792

View details for PubMedCentralID PMC5788819
Expanding a radiology lexicon using contextual patterns in radiology reports. Journal of the American Medical Informatics Association : JAMIA Percha, B., Zhang, Y., Bozkurt, S., Rubin, D., Altman, R. B., Langlotz, C. P. 2018

Abstract

Distributional semantics algorithms, which learn vector space representations of words and phrases from large corpora, identify related terms based on contextual usage patterns. We hypothesize that distributional semantics can speed up lexicon expansion in a clinical domain, radiology, by unearthing synonyms from the corpus. We apply word2vec, a distributional semantics software package, to the text of radiology notes to identify synonyms for RadLex, a structured lexicon of radiology terms. We stratify performance by term category, term frequency, number of tokens in the term, vector magnitude, and the context window used in vector building. Ranking candidates based on distributional similarity to a target term results in high curation efficiency: on a ranked list of 775 249 terms, >50% of synonyms occurred within the first 25 terms. Synonyms are easier to find if the target term is a phrase rather than a single word, if it occurs at least 100× in the corpus, and if its vector magnitude is between 4 and 5. Some RadLex categories, such as anatomical substances, are easier to identify synonyms for than others. The unstructured text of clinical notes contains a wealth of information about human diseases and treatment patterns. However, searching and retrieving information from clinical notes often suffer due to variations in how similar concepts are described in the text. Biomedical lexicons address this challenge, but are expensive to produce and maintain. Distributional semantics algorithms can assist lexicon curation, saving researchers time and money.

View details for PubMedID 29329435
The Role of Radiology in the Diagnostic Process: Information, Communication, and Teamwork AMERICAN JOURNAL OF ROENTGENOLOGY Larson, D. B., Langlotz, C. P. 2017; 209 (5): 992–1000

Abstract

The diagnostic radiology process represents a partnership between clinical and radiology teams. As such, breakdowns in interpersonal interactions and communication can result in patient harm. We explore the role of radiology in the diagnostic process, focusing on key concepts of information and communication, as well as key interpersonal interactions of teamwork, collaboration, and collegiality, all based on trust. We propose 10 principles to facilitate effective information flow in the diagnostic process.

View details for PubMedID 28742380
Use of Radiology Procedure Codes in Health Care: The Need for Standardization and Structure RADIOGRAPHICS Wang, K. C., Patel, J. B., Vyas, B., Toland, M., Collins, B., Vreeman, D. J., Abhyankar, S., Siegel, E. L., Rubin, D. L., Langlotz, C. P. 2017; 37 (4): 1099–1110

Abstract

Radiology procedure codes are a fundamental part of most radiology workflows, such as ordering, scheduling, billing, and image interpretation. Nonstandardized unstructured procedure codes have typically been used in radiology departments. Such codes may be sufficient for specific purposes, but they offer limited support for interoperability. As radiology workflows and the various forms of clinical data exchange have become more sophisticated, the need for more advanced interoperability with use of standardized structured codes has increased. For example, structured codes facilitate the automated identification of relevant prior imaging studies and the collection of data for radiation dose tracking. The authors review the role of imaging procedure codes in radiology departments and across the health care enterprise. Standards for radiology procedure coding are described, and the mechanisms of structured coding systems are reviewed. In particular, the structure of the RadLex™ Playbook coding system and examples of the use of this system are described. Harmonization of the RadLex Playbook system with the Logical Observation Identifiers Names and Codes standard, which is currently in progress, also is described. The benefits and challenges of adopting standardized codes, especially the difficulties in mapping local codes to standardized codes, are reviewed. Tools and strategies for mitigating these challenges, including the use of billing codes as an intermediate step in mapping, also are reviewed. In addition, the authors describe how to use the RadLex Playbook Web service application programming interface for partial automation of code mapping. © RSNA, 2017.

View details for PubMedID 28696857
Implementation of an Automated Radiology Recommendation-Tracking Engine for Abdominal Imaging Findings of Possible Cancer JOURNAL OF THE AMERICAN COLLEGE OF RADIOLOGY Cook, T. S., Lalevic, D., Sloan, C., Chadalavada, S. C., Langlotz, C. P., Schnall, M. D., Zafar, H. M. 2017; 14 (5): 629-636

View details for DOI 10.1016/j.jacr.2017.01.024

View details for Web of Science ID 000400634400015

View details for PubMedID 28325488
Medicare Imaging Demonstration: Assessing Attributes of Appropriate Use Criteria and Their Influence on Ordering Behavior. AJR. American journal of roentgenology Lacson, R., Ip, I., Hentel, K. D., Malhotra, S., Balthazar, P., Langlotz, C. P., Raja, A. S., Khorasani, R. 2017: 1-7

Abstract

Persistent concern exists about the variable and possibly inappropriate utilization of high-cost imaging tests. The purpose of this study is to assess the influence of appropriate use criteria attributes on altering ambulatory imaging orders deemed inappropriate. This secondary analysis included Medicare Imaging Demonstration data collected from three health care systems in 2011-2013 via the use of clinical decision support (CDS) during ambulatory imaging order entry. The CDS system captured whether orders were inappropriate per the appropriate use criteria of professional societies and provided advice during the intervention period. For orders deemed inappropriate, we assessed the impact of the availability of alternative test recommendations, conflicts with local best practices, and the strength of evidence for appropriate use criteria on the primary outcome of cancellation or modification of inappropriate orders. Expert review determined conflicts with local best practices for 250 recommendations for abdominal and thoracic CT orders. Strength of evidence was assessed for the 15 most commonly triggered recommendations that were deemed inappropriate. A chi-square test was used for univariate analysis. A total of 1691 of 63,222 imaging test orders (2.7%) were deemed inappropriate during the intervention period; this amount decreased from 364 of 11,675 test orders (3.1%) in the baseline period (p < 0.00001). Of 270 inappropriate recommendations with alternative test recommendations, 28 (10.4%) were modified, compared with four of 1024 inappropriate recommendations without alternatives (0.4%) (p < 0.0001). Seventy-eight of 250 recommendations (31%) conflicted with local best practices, but only six of 69 inappropriate recommendations (9%) conflicted (p < 0.001). No inappropriate recommendations that conflicted with local best practices were modified. All 15 commonly triggered recommendations had an Oxford Centre for Evidence-Based Medicine level of evidence of 5 (i.e., expert opinion). Orders for imaging tests that were deemed inappropriate were modified infrequently, more often with alternative recommendations present and only for appropriate use criteria consistent with local best practices.
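The univariate comparison of modification rates (28 of 270 with alternatives versus 4 of 1024 without) can be reproduced approximately with a Pearson chi-square test on a 2×2 table. This sketch assumes no continuity correction, which the abstract does not specify, and the function name `chi_square_2x2` is hypothetical. For one degree of freedom, the p-value equals `erfc(sqrt(chi2 / 2))`.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its p-value for 1 degree of freedom.
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    # A chi-square variable with 1 dof is a squared standard normal,
    # so its survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Rows: with / without an alternative recommendation.
# Columns: modified / not modified (counts taken from the abstract).
chi2, p_value = chi_square_2x2(28, 270 - 28, 4, 1024 - 4)
print(f"chi-square = {chi2:.1f}, p = {p_value:.2e}")
```

With these counts the p-value falls well below 0.0001, consistent with the significance level reported in the abstract.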

View details for DOI 10.2214/AJR.16.17169

View details for PubMedID 28267371
JOURNAL CLUB: Predictors of Provider Response to Clinical Decision Support: Lessons Learned From the Medicare Imaging Demonstration. AJR. American journal of roentgenology Ip, I. K., Lacson, R., Hentel, K., Malhotra, S., Darer, J., Langlotz, C., Weiss, J., Raja, A. S., Khorasani, R. 2017; 208 (2): 351-357

Abstract

The efficacy of imaging clinical decision support (CDS) varies. Our objective was to identify CDS factors contributing to imaging order cancellation or modification. This pre-post study was performed across four institutions participating in the Medicare Imaging Demonstration. The intervention was CDS at order entry for selected outpatient imaging procedures. On the basis of the information entered, computerized alerts indicated to providers whether orders were not covered by guidelines, appropriate, of uncertain appropriateness, or inappropriate according to professional society guidelines. Ordering providers could override or accept CDS. We considered actionable alerts to be those that could generate an immediate order behavior change in the ordering physician (i.e., cancellation of inappropriate orders or modification of orders of uncertain appropriateness that had a recommended alternative). Chi-square and logistic regression identified predictors of order cancellation or modification after an alert. A total of 98,894 radiology orders were entered (83,114 after the intervention). Providers ignored 98.9%, modified 1.1%, and cancelled 0.03% of orders in response to alerts. Actionable alerts had a 10-fold higher rate of order modification (8.1% vs 0.7%; p < 0.0001) or cancellation (0.2% vs 0.02%; p < 0.0001) compared with nonactionable alerts. Orders from institutions with preexisting imaging CDS had a sevenfold lower rate of cancellation or modification than was seen at sites with newly implemented CDS (1.4% vs 0.2%; p < 0.0001). In multivariate analysis, actionable alerts were 12 times more likely to result in order cancellation or modification. Orders at sites with preexisting CDS were 7.7 times less likely to be cancelled or modified (p < 0.0001). Using results from the Medicare Imaging Demonstration project, we identified potential factors that were associated with CDS effect on provider imaging ordering; these findings may have implications for future design of such computerized systems.

View details for DOI 10.2214/AJR.16.16373

View details for PubMedID 27897445
Performance of a Machine Learning Classifier of Knee MRI Reports in Two Large Academic Radiology Practices: A Tool to Estimate Diagnostic Yield. AJR. American journal of roentgenology Hassanpour, S., Langlotz, C. P., Amrhein, T. J., Befera, N. T., Lungren, M. P. 2017: 1-4

Abstract

The purpose of this study is to evaluate the performance of a natural language processing (NLP) system in classifying a database of free-text knee MRI reports at two separate academic radiology practices. An NLP system that uses terms and patterns in manually classified narrative knee MRI reports was constructed. The NLP system was trained and tested on expert-classified knee MRI reports from two major health care organizations. Radiology reports were modeled in the training set as vectors, and a support vector machine framework was used to train the classifier. A separate test set from each organization was used to evaluate the performance of the system. We evaluated the performance of the system both within and across organizations. Standard evaluation metrics, such as accuracy, precision, recall, and F1 score (i.e., the weighted average of the precision and recall), and their respective 95% CIs were used to measure the efficacy of our classification system. The accuracy for radiology reports that belonged to the model's clinically significant concept classes after training on data from the same institution was good, yielding an F1 score greater than 90% (95% CI, 84.6-97.3%). Performance of the classifier on cross-institutional application without institution-specific training data yielded F1 scores of 77.6% (95% CI, 69.5-85.7%) and 90.2% (95% CI, 84.5-95.9%) at the two organizations studied. The results show excellent accuracy by the NLP machine learning classifier in classifying free-text knee MRI reports, supporting the institution-independent reproducibility of knee MRI report classification. Furthermore, the machine learning classifier performed well on free-text knee MRI reports from another institution. These data support the feasibility of multiinstitutional classification of radiologic imaging text reports with a single machine learning classifier without requiring institution-specific training data.

View details for DOI 10.2214/AJR.16.16128

View details for PubMedID 28140627
Characterization of Change and Significance for Clinical Findings in Radiology Reports Through Natural Language Processing. Journal of digital imaging Hassanpour, S., Bay, G., Langlotz, C. P. 2017

Abstract

We built a natural language processing (NLP) method to automatically extract clinical findings in radiology reports and characterize their level of change and significance according to a radiology-specific information model. We utilized a combination of machine learning and rule-based approaches for this purpose. Our method is unique in capturing different features and levels of abstractions at surface, entity, and discourse levels in text analysis. This combination has enabled us to recognize the underlying semantics of radiology report narratives for this task. We evaluated our method on radiology reports from four major healthcare organizations. Our evaluation showed the efficacy of our method in highlighting important changes (accuracy 99.2%, precision 96.3%, recall 93.5%, and F1 score 94.7%) and identifying significant observations (accuracy 75.8%, precision 75.2%, recall 75.7%, and F1 score 75.3%) to characterize radiology reports. This method can help clinicians quickly understand the key observations in radiology reports and facilitate clinical decision support, review prioritization, and disease surveillance.

View details for DOI 10.1007/s10278-016-9931-8

View details for PubMedID 28050714
Bone Tumor Diagnosis Using a Naïve Bayesian Model of Demographic and Radiographic Features. Journal of digital imaging Do, B. H., Langlotz, C., Beaulieu, C. F. 2017

Abstract

Because many bone tumors have a variety of appearances and are uncommon, few radiologists develop sufficient expertise to guide optimal management. Bayesian inference can guide decision-making by computing probabilities of multiple diagnoses to generate a differential. We built and validated a naïve Bayes machine (NBM) that processes 18 demographic and radiographic features. We reviewed over 1664 analog radiographic cases of bone tumors and selected 811 cases (66 diagnoses) for annotation using a quantitative imaging platform. Leave-one-out cross validation was performed. Primary accuracy was defined as the correct pathological diagnosis as the top machine prediction. Differential accuracy was defined as whether the correct pathological diagnosis was within the top three predictions. For the 29 most common diagnoses (710 cases), primary accuracy was 44%, and differential accuracy was 60%. For the top 10 most common diagnoses (478 cases), primary accuracy was 62%, and differential accuracy was 80%. The machine returned relevant diagnoses for the majority of unknown test cases and may be a feasible alternative to machine learning approaches such as deep neural networks or support vector machines that typically require larger training data (our model required a minimum of five samples per diagnosis) and are "black boxes" (our model can provide details of probability calculations to identify features that most significantly contribute to truth diagnoses). Finally, our Bayes model was designed to scale and "learn" from external data, enabling incorporation of outside knowledge such as Dahlin's Bone Tumors, a reference of anatomic and demographic statistics of more than 10,000 tumors.
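The idea of a naive Bayes model that returns a ranked differential can be sketched with categorical features and Laplace smoothing. This is only an illustration under stated assumptions: the feature names, the six toy cases, the smoothing choice, and the functions `train` and `differential` are all hypothetical and do not reproduce the 18 features or the data used in the study.

```python
import math
from collections import Counter, defaultdict

def train(cases, alpha=1.0):
    """Count diagnoses and per-diagnosis feature values from labeled cases."""
    prior = Counter(dx for dx, _ in cases)
    counts = defaultdict(Counter)  # (diagnosis, feature) -> value counts
    values = defaultdict(set)      # feature -> set of observed values
    for dx, feats in cases:
        for feature, value in feats.items():
            counts[(dx, feature)][value] += 1
            values[feature].add(value)
    return prior, counts, values, alpha

def differential(model, feats, top=3):
    """Return the top diagnoses with normalized posterior probabilities."""
    prior, counts, values, alpha = model
    total = sum(prior.values())
    log_scores = {}
    for dx, n in prior.items():
        logp = math.log(n / total)
        for feature, value in feats.items():
            # Laplace-smoothed conditional probability P(value | dx).
            num = counts[(dx, feature)][value] + alpha
            den = n + alpha * len(values[feature])
            logp += math.log(num / den)
        log_scores[dx] = logp
    peak = max(log_scores.values())
    norm = sum(math.exp(s - peak) for s in log_scores.values())
    ranked = sorted(
        ((dx, math.exp(s - peak) / norm) for dx, s in log_scores.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:top]

# Toy training cases invented for illustration only.
cases = [
    ("osteosarcoma", {"age_group": "young", "location": "metaphysis", "margin": "ill-defined"}),
    ("osteosarcoma", {"age_group": "young", "location": "metaphysis", "margin": "ill-defined"}),
    ("enchondroma", {"age_group": "adult", "location": "metaphysis", "margin": "well-defined"}),
    ("enchondroma", {"age_group": "adult", "location": "diaphysis", "margin": "well-defined"}),
    ("giant cell tumor", {"age_group": "adult", "location": "epiphysis", "margin": "well-defined"}),
    ("giant cell tumor", {"age_group": "young", "location": "epiphysis", "margin": "ill-defined"}),
]

model = train(cases)
query = {"age_group": "young", "location": "metaphysis", "margin": "ill-defined"}
result = differential(model, query)
for diagnosis, probability in result:
    print(f"{diagnosis}: {probability:.3f}")
```

In this framing, "primary accuracy" corresponds to the first entry of the returned list matching the pathological diagnosis, and "differential accuracy" to the correct diagnosis appearing anywhere in the top three. Because each factor in the product is visible, the per-feature contributions can be inspected directly.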

View details for PubMedID 28752323

View details for PubMedCentralID PMC5603428
Implications of Direct Patient Online Access to Radiology Reports Through Patient Web Portals JOURNAL OF THE AMERICAN COLLEGE OF RADIOLOGY Lee, C. I., Langlotz, C. P., Elmore, J. G. 2016; 13 (12): 1608-1614

Abstract

In an era of increasing health information transparency and informed decision making, more patients are being provided with direct online access to their medical records, including radiology reports, via web-based portals. Although radiologists' narrative reports have previously been the purview of referring physicians, patients are now reading these on their own. Many potential benefits may result from patients reviewing their radiology reports, including improvements in patients' own understanding of their health, promotion of shared decision making and patient-physician communication, and, ultimately, improvements in patient outcomes. However, there may also be negative consequences, including confusion and anxiety among patients and longer patient-physician interactions. The rapid adoption of this new technology has led to major questions regarding ethics and professionalism for radiologists, including the following: Who is the intended audience of radiology reports? How should content be presented or worded? How will open access influence radiologists' relationships with patients and referring physicians? What legal ramifications may arise from increased patient access? The authors describe the current practices and research findings associated with patient online access to medical records, including radiology reports, and discuss several implications of this growing trend for the radiology profession.

View details for DOI 10.1016/j.jacr.2016.09.007

View details for Web of Science ID 000389562000013

View details for PubMedID 27888949
"Chasing a Ghost": Factors that Influence Primary Care Physicians to Follow Up on Incidental Imaging Findings. Radiology Zafar, H. M., Bugos, E. K., Langlotz, C. P., Frasso, R. 2016; 281 (2): 567-573

Abstract

Purpose: To explore provider and patient characteristics that influence how primary care providers (PCPs) communicate and manage incidental imaging findings. Materials and Methods: This HIPAA-compliant study was approved by the institutional review board. Through semistructured interviews, researchers explored concerns and perspectives of 30 PCPs on receiving and acting on incidental imaging findings. Open-ended questions were designed to elicit a range of responses rather than quantifiable data. Thematic codes were developed and explicitly defined. Three research assistants independently coded all 30 deidentified transcripts and resolved discrepancies (κ = 0.85). Codes pertaining to PCP and patient characteristics were organized into an explanatory model. Results: Some PCPs felt compelled but frustrated to pursue costly follow-up for incidental imaging findings of limited clinical importance. Other PCPs did not act on findings that were unfamiliar or occurred in an unusual clinical context when follow-up recommendations were not given; the challenges of researching the clinical importance of these findings or seeking specialist consultation led to inaction. Some PCPs reported using a uniform approach to communicate and manage incidental findings, while others adapted their approach to the patient and the finding. Sometimes PCP characteristics such as follow-up style superseded patient characteristics. At other times patient characteristics such as health literacy superseded PCP characteristics. Conclusion: PCPs cited a variety of objective and subjective factors that influence how they communicate and manage incidental imaging findings. These results suggest that some patients may receive inappropriate follow-up of incidental imaging findings and present an opportunity for radiologists to help PCPs and patients to best use the information conveyed in imaging reports. © RSNA, 2016. Online supplemental material is available for this article.

View details for PubMedID 27192458
Why Isn't There More High-fidelity Simulation Training in Diagnostic Radiology? Results of a Survey of Academic Radiologists ACADEMIC RADIOLOGY Cook, T. S., Hernandez, J., Scanlon, M., Langlotz, C., Li, C. L. 2016; 23 (7): 870-876

Abstract

Despite its increasing use in training other medical specialties, high-fidelity simulation to prepare diagnostic radiology residents for call remains an underused educational resource. To attempt to characterize the barriers toward adoption of this technology, we conducted a survey of academic radiologists and radiology trainees. An Institutional Review Board-approved survey was distributed to the Association of University Radiologists members via e-mail. Survey results were collected electronically, tabulated, and analyzed. A total of 68 survey responses representing 51 programs were received from program directors, department chairs, chief residents, and program administrators. The most common form of educational activity for resident call preparation was lectures. Faculty supervised "baby call" was also widely reported. Actual simulated call environments were quite rare with only three programs reporting this type of educational activity. Barriers to the use of simulation include lack of faculty time, lack of faculty expertise, and lack of perceived need. High-fidelity simulation can be used to mimic the high-stress, high-stakes independent call environment that the typical radiology resident encounters during the second year of training, and can provide objective data for program directors to assess the Accreditation Council of Graduate Medical Education milestones. We predict that this technology will begin to supplement traditional diagnostic radiology teaching methods and to improve patient care and safety in the next decade.

View details for DOI 10.1016/j.acra.2016.03.008

View details for Web of Science ID 000378444600014

View details for PubMedID 27212606
Health IT vendors and the academic community: The 2014 ACMI debate. Journal of biomedical informatics McCray, A. T., Glaser, J., Koppel, R., Langlotz, C. P., Silverstein, J. 2016; 60: 365-375

Abstract

The American College of Medical Informatics (ACMI) periodically hosts a debate at the American Medical Informatics Association (AMIA) fall symposium on a timely topic in biomedical informatics. In 2014 a panel of ACMI fellows debated the following proposition: "The lack of interaction and collaboration between health IT vendors and academic clinical informatics units is stifling innovation and will continue to have a detrimental effect on the evolution of commercial products." Debaters disagreed on the level of interaction and collaboration between the health IT sector and academia and disagreed on whether and by whom innovation was actually taking place. While collaboration between industry and academia was seen as desirable by all of the debaters, there was an acknowledgment that these groups have notably different roles and responsibilities. There was consensus that a path forward should be found, and that AMIA itself has an important role to play in effecting this.

View details for DOI 10.1016/j.jbi.2016.03.003

View details for PubMedID 26968349
Unsupervised Topic Modeling in a Large Free Text Radiology Report Repository. Journal of digital imaging Hassanpour, S., Langlotz, C. P. 2016; 29 (1): 59-62

Abstract

Radiology report narrative contains a large amount of information about the patient's health and the radiologist's interpretation of medical findings. Most of this critical information is entered in free text format, even when structured radiology report templates are used. The radiology report narrative varies in use of terminology and language among different radiologists and organizations. The free text format and the subtlety and variations of natural language hinder the extraction of reusable information from radiology reports for decision support, quality improvement, and biomedical research. Therefore, as the first step to organize and extract the information content in a large multi-institutional free text radiology report repository, we have designed and developed an unsupervised machine learning approach to capture the main concepts in a radiology report repository and partition the reports based on their main foci. In this approach, radiology reports are modeled in a vector space and compared to each other through a cosine similarity measure. This similarity is used to cluster radiology reports and identify the repository's underlying topics. We applied our approach to a repository of 1,899,482 radiology reports from three major healthcare organizations. Our method identified 19 major radiology report topics in the repository and clustered the reports according to these topics. Our results are verified by a domain expert radiologist and successfully explain the repository's primary topics and extract the corresponding reports. The results of our system provide a target-based corpus and framework for information extraction and retrieval systems for radiology reports.
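Modeling reports in a vector space and grouping them by cosine similarity can be sketched as follows. The abstract does not name a specific clustering algorithm, so the greedy single-pass scheme below is a swapped-in simplification, and the term-count vectors, the 0.5 threshold, the function names, and the four toy reports are all assumptions made for illustration.

```python
import math
from collections import Counter

def tf_vector(text):
    """Bag-of-words term counts for one report (whitespace tokenization)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster(reports, threshold=0.5):
    """Greedy single-pass clustering.

    Each report joins the cluster whose first member it resembles most,
    provided the similarity reaches the threshold; otherwise it starts
    a new cluster. Returns lists of report indices.
    """
    vectors = [tf_vector(report) for report in reports]
    clusters = []
    for index, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for members in clusters:
            sim = cosine(vec, vectors[members[0]])
            if sim >= best_sim:
                best, best_sim = members, sim
        if best is None:
            clusters.append([index])
        else:
            best.append(index)
    return clusters

# Toy report snippets invented for illustration only.
reports = [
    "chest radiograph shows no pneumothorax",
    "chest radiograph shows small pneumothorax",
    "knee mri shows meniscal tear",
    "knee mri shows no meniscal tear",
]

clusters = cluster(reports)
print(clusters)
```

A production system would typically weight terms (for example with TF-IDF) and use a clustering method suited to millions of documents; the sketch only shows how similarity in a vector space translates into topic groupings.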

View details for DOI 10.1007/s10278-015-9823-3

View details for PubMedID 26353748

View details for PubMedCentralID PMC4722022
Predicting High Imaging Utilization Based on Initial Radiology Reports: A Feasibility Study of Machine Learning ACADEMIC RADIOLOGY Hassanpour, S., Langlotz, C. P. 2016; 23 (1): 84-89

Abstract

Imaging utilization has significantly increased over the last two decades, and is only recently showing signs of moderating. To help healthcare providers identify patients at risk for high imaging utilization, we developed a prediction model to recognize high imaging utilizers based on their initial imaging reports. The prediction model uses a machine learning text classification framework. In this study, we used radiology reports from 18,384 patients with at least one abdomen computed tomography study in their imaging record at Stanford Health Care as the training set. We modeled the radiology reports in a vector space and trained a support vector machine classifier for this prediction task. We evaluated our model on a separate test set of 4791 patients. In addition to high prediction accuracy, in our method, we aimed at achieving high specificity to identify patients at high risk for high imaging utilization. Our results (accuracy: 94.0%, sensitivity: 74.4%, specificity: 97.9%, positive predictive value: 87.3%, negative predictive value: 95.1%) show that a prediction model can enable healthcare providers to identify in advance patients who are likely to be high utilizers of imaging services. Machine learning classifiers developed from narrative radiology reports are feasible methods to predict imaging utilization. Such systems can be used to identify high utilizers, inform future image ordering behavior, and encourage judicious use of imaging.

View details for DOI 10.1016/j.acra.2015.09.014

View details for Web of Science ID 000367279800013
Optimization of Radiology Reports for Intensive Care Unit Portable Chest Radiographs Perceptions and Preferences of Radiologists and ICU Practitioners JOURNAL OF THORACIC IMAGING Barbosa, E. J., Lynch, M. C., Langlotz, C. P., Gefter, W. B. 2016; 31 (1): 43-48

Abstract

The aim of the study was to evaluate opinions and perceptions of radiologists and referring practitioners regarding reports of portable chest radiography (pCXR) obtained in the intensive care unit (ICU). A total of 1265 referring practitioners and 76 radiologists were invited to participate in 2 internet-based surveys, containing 15 and 17 multiple choice questions, respectively, similarly presented to both groups, utilizing a Likert scale or multiple choices. Results were compared using the Fisher exact test or χ² test. One hundred ninety-two referring practitioners and 63 radiologists answered the surveys, resulting in response rates of 15% and 83%. The majority of radiologists and referring practitioners are satisfied with the quality of the reports; however, radiologists and referring practitioners disagree about the reports' clinical value and impact, the referring practitioners having a more positive view. Both groups overwhelmingly agree that pertinent clinical information is crucial for optimal image interpretation. The 2 groups differ in their preferences regarding report style and information content, with radiologists strongly supporting concise reports emphasizing temporal changes and major findings, whereas referring practitioners prefer more complete, itemized structured reports describing support devices in detail. The results substantiate the perceived clinical value of radiologist reports for pCXR, from the perspective of referring practitioners. Nonetheless, there is disagreement regarding report structure and content. Several issues were raised, offering opportunities for improvement, which may increase referring practitioners' satisfaction and positively impact patient outcomes. Any strategy to implement standardized structured reports for pCXR will have to satisfy referring practitioners' needs while optimizing radiologists' efficiency, will have to be widely accepted, and will have to fulfill the overarching goal of maximizing the value of pCXR reports.

View details for DOI 10.1097/RTI.0000000000000165

View details for Web of Science ID 000372707700008
Information extraction from multi-institutional radiology reports ARTIFICIAL INTELLIGENCE IN MEDICINE Hassanpour, S., Langlotz, C. P. 2016; 66: 29-39

Abstract

The radiology report is the most important source of clinical imaging information. It documents critical information about the patient's health and the radiologist's interpretation of medical findings. It also communicates information to the referring physicians and records that information for future clinical and research use. Although efforts to structure some radiology report information through predefined templates are beginning to bear fruit, a large portion of radiology report information is entered in free text. The free text format is a major obstacle for rapid extraction and subsequent use of information by clinicians, researchers, and healthcare information systems. This difficulty is due to the ambiguity and subtlety of natural language, complexity of described images, and variations among different radiologists and healthcare organizations. As a result, radiology reports are used only once by the clinician who ordered the study and rarely are used again for research and data mining. In this work, machine learning techniques and a large multi-institutional radiology report repository are used to extract the semantics of the radiology report and overcome the barriers to the re-use of radiology report information in clinical research and other healthcare applications. We describe a machine learning system to annotate radiology reports and extract report contents according to an information model. This information model covers the majority of clinically significant contents in radiology reports and is applicable to a wide variety of radiology study types. Our automated approach uses discriminative sequence classifiers for named-entity recognition to extract and organize clinically significant terms and phrases consistent with the information model. We evaluated our information extraction system on 150 radiology reports from three major healthcare organizations and compared its results to a commonly used non-machine learning information extraction method. We also evaluated the generalizability of our approach across different organizations by training and testing our system on data from different organizations. Our results show the efficacy of our machine learning approach in extracting the information model's elements (10-fold cross-validation average performance: precision: 87%, recall: 84%, F1 score: 85%) and its superiority and generalizability compared to the common non-machine learning approach (p-value < 0.05). Our machine learning information extraction approach provides an effective automatic method to annotate and extract clinically significant information from a large collection of free text radiology reports. This information extraction system can help clinicians better understand the radiology reports and prioritize their review process. In addition, the extracted information can be used by researchers to link radiology reports to information from other data sources such as electronic health records and the patient's genome. Extracted information also can facilitate disease surveillance, real-time clinical decision support for the radiologist, and content-based image retrieval.

View details for DOI 10.1016/j.artmed.2015.09.007

View details for Web of Science ID 000371368900003

View details for PubMedID 26481140
Conversion of Radiology Reporting Templates to the MRRT Standard JOURNAL OF DIGITAL IMAGING Kahn, C. E., Genereaux, B., Langlotz, C. P. 2015; 28 (5): 528-536

Abstract

In 2013, the Integrating the Healthcare Enterprise (IHE) Radiology workgroup developed the Management of Radiology Report Templates (MRRT) profile, which defines both the format of radiology reporting templates using an extension of Hypertext Markup Language version 5 (HTML5), and the transportation mechanism to query, retrieve, and store these templates. Of 200 English-language report templates published by the Radiological Society of North America (RSNA), initially encoded as text and in an XML schema language, 168 have been converted successfully into MRRT using a combination of automated processes and manual editing; conversion of the remaining 32 templates is in progress. The automated conversion process applied Extensible Stylesheet Language Transformation (XSLT) scripts, an XML parsing engine, and a Java servlet. The templates were validated for proper HTML5 and MRRT syntax using web-based services. The MRRT templates allow radiologists to share best-practice templates across organizations and have been uploaded to the template library to supersede the prior XML-format templates. By using MRRT transactions and MRRT-format templates, radiologists will be able to directly import and apply templates from the RSNA Report Template Library in their own MRRT-compatible vendor systems. The availability of MRRT-format reporting templates will stimulate adoption of the MRRT standard and is expected to advance the sharing and use of templates to improve the quality of radiology reports.

View details for DOI 10.1007/s10278-015-9787-3

View details for Web of Science ID 000364522100003

View details for PubMedID 25776768
Code Abdomen: An Assessment Coding Scheme for Abdominal Imaging Findings Possibly Representing Cancer. Journal of the American College of Radiology Zafar, H. M., Chadalavada, S. C., Kahn, C. E., Cook, T. S., Sloan, C. E., Lalevic, D., Langlotz, C. P., Schnall, M. D. 2015; 12 (9): 947-950

View details for DOI 10.1016/j.jacr.2015.04.005

View details for PubMedID 26130223
True "Meaningful Use": Technology Meets Both Patient and Provider Needs AMERICAN JOURNAL OF MANAGED CARE Black, H., Gonzalez, R., Priolo, C., Schapira, M. M., Sonnad, S. S., Hanson, C. W., Langlotz, C. P., Howell, J. T., Apter, A. J. 2015; 21 (5): E329-E337

View details for Web of Science ID 000358661300006
Assessment of Follow-up Completeness and Notification Preferences for Imaging Findings of Possible Cancer: What Happens After Radiologists Submit Their Reports? ACADEMIC RADIOLOGY Sloan, C. E., Chadalavada, S. C., Cook, T. S., Langlotz, C. P., Schnall, M. D., Zafar, H. M. 2014; 21 (12): 1579-1586

Abstract

To understand the reasons leading to potentially inappropriate management of imaging findings concerning for malignancy and identify optimal methods for communicating these findings to providers. We identified all abdominal imaging examinations with findings of possible cancer performed on six randomly selected days in August to December 2013. Electronic medical records (EMR) of one patient group were reviewed 3 months after the index examination to determine whether management was appropriate (completed follow-up or documented reason for no follow-up) or potentially inappropriate (no follow-up or no documented reason). Providers of a second patient group were contacted 5-6 days after imaging examinations to determine notification preferences. Among 43 patients in the first group, five (12%) received potentially inappropriate management. Reasons included patient loss to follow-up and provider failure to review imaging results, document known imaging findings, or communicate findings to providers outside the health system. Among 16 providers caring for patients in the second group, 33% were unaware of the findings, 75% preferred to be notified of abnormal findings via e-mail or EMR, 56% wanted an embedded hyperlink enabling immediate follow-up order entry, and only 25% had a system to monitor whether patients had completed ordered testing. One in eight patients did not receive potentially necessary follow-up care within 3 months of imaging findings of possible cancer. Automated notification of imaging findings and follow-up monitoring not only is desired by providers but can also address many of the reasons we found for inappropriate management.

View details for DOI 10.1016/j.acra.2014.07.006

View details for Web of Science ID 000344844400013

View details for PubMedID 25179562
Ten Commandments for Effective Clinical Decision Support for Imaging: Enabling Evidence-Based Practice to Improve Quality and Reduce Waste AMERICAN JOURNAL OF ROENTGENOLOGY Khorasani, R., Hentel, K., Darer, J., Langlotz, C., Ip, I. K., Manaker, S., Cardella, J., Min, R., Seltzer, S. 2014; 203 (5): 945-951

Abstract

We describe best practices for effective imaging clinical decision support (CDS) derived from firsthand experience, extending the Ten Commandments for CDS published a decade ago. Our collective perspective is used to set expectations for providers, health systems, policy makers, payers, and health information technology developers. Highlighting unique attributes of effective imaging CDS will help radiologists to successfully lead and optimize the value of the substantial federal and local investments in health information technology in the United States.

View details for DOI 10.2214/AJR.14.13134

View details for Web of Science ID 000347415700030

View details for PubMedID 25341131
Automated Extraction of Critical Test Values and Communications from Unstructured Radiology Reports: An Analysis of 9.3 Million Reports from 1990 to 2011 RADIOLOGY Lakhani, P., Kim, W., Langlotz, C. P. 2012; 265 (3): 809-818

Abstract

To determine the frequency of critical radiology results in 9.3 million radiology reports from our health system, to identify those containing documentation of communication by using automated text-classification algorithms, and to assess the impact of a policy requiring documentation of critical results communication. This HIPAA-compliant retrospective study received institutional review board approval. Text-mining algorithms that were previously validated to have mean accuracies of more than 90% for identifying certain critical results and documentation of communications were applied to a database of 9.3 million radiology reports. The frequency of critical results and documentation of communication were then determined from 1990 to 2011. There was an increase in documentation of communication for all critical results from 1990 to 2011. In 1990, 19.0% of reports with critical values had evidence of documentation of communication compared with 72.4% of reports in 2010. The linear trend for increasing documentation of communications began in 1997 and continued until 2011 (P < .001). From 1990 to 2011, documentation of communication was highest in acute scrotal torsion (70.6%) and ectopic pregnancy (65.4%) and lowest in unexplained free-intraperitoneal air (29.5%) and malpositioned tubes (30.4%). In 2010-2011, radiologists were least likely to document communication of results for malpositioned endotracheal and enteric tubes (2010, 58.56%; 2011, 57.50%) and unexplained free-intraperitoneal air (2010, 59.57%; 2011, 75.51%). They were most likely to document communication of results for ectopic pregnancy (2010, 94.12%; 2011, 93.48%) and acute appendicitis (2010, 86.87%; 2011, 84.31%). There was an increase in documentation of communication of critical results, which demonstrated a rising linear trend that began in 1997 and continued until 2011. The increasing trend began well before policy implementation, indicating that other factors such as heightened awareness among radiologists likely had a role.

View details for DOI 10.1148/radiol.12112438

View details for Web of Science ID 000311420300018

View details for PubMedID 22952381
Clinical Decision Support for Imaging in the Era of the Patient Protection and Affordable Care Act JOURNAL OF THE AMERICAN COLLEGE OF RADIOLOGY Zafar, H. M., Mills, A. M., Khorasani, R., Langlotz, C. P. 2012; 9 (12): 907-918

Abstract

Imaging clinical decision support (CDS) systems provide evidence for or against imaging procedures ordered within a computerized physician order entry system at the time of the image order. Depending on the pertinent clinical history provided by the ordering clinician, CDS systems can optimize imaging by educating providers on appropriate image order entry and by alerting providers to the results of prior, potentially relevant imaging procedures, thereby reducing redundant imaging. The American Recovery and Reinvestment Act (ARRA) has expedited the adoption of computerized physician order entry and CDS systems in health care through the creation of financial incentives and penalties to promote the "meaningful use" of health IT. Meaningful use represents the latest logical next step in a long chain of legislation promoting the areas of appropriate imaging utilization, accurate reporting, and IT. It is uncertain if large-scale implementation of imaging CDS will lead to improved health care quality, as seen in smaller settings, or to improved patient outcomes. However, imaging CDS enables the correlation of existing imaging evidence with outcome measures, including morbidity, mortality, and short-term imaging-relevant management outcomes (eg, biopsy, chemotherapy). The purposes of this article are to review the legislative sequence relevant to imaging CDS and to give guidance to radiology practices focused on quality and financial performance improvement during this time of accelerating regulatory change.

View details for DOI 10.1016/j.jacr.2012.09.014

View details for Web of Science ID 000312629700016

View details for PubMedID 23206649
Predictors of initial F-18-fluorodeoxyglucose-positron emission tomography indication among patients with colorectal cancer NUCLEAR MEDICINE COMMUNICATIONS Zafar, H. M., Kramer, S., Bonaccorsi, D., Langlotz, C. P., Armstrong, K. 2012; 33 (7): 739-746

Abstract

To evaluate the determinants of initial 18F-fluorodeoxyglucose (18F-FDG)-PET indication following primary colorectal cancer diagnosis among patients who underwent surgery between January 2000 and December 2007 and who were observed at a single institution for at least 2 years after diagnosis. Of the 530 patients who underwent colorectal cancer resection, 113 patients received at least one 18F-FDG-PET following diagnosis. Outcome variables included indication and time of the first 18F-FDG-PET following diagnosis. Potential predictors included disease-level and patient-level characteristics. Univariate and multivariate regression analyses were performed. Patients diagnosed later in the study period and patients with higher-stage disease were more likely to receive their first 18F-FDG-PET for initial staging (P<0.001 and P=0.016, respectively). Patients with lower-stage disease were more likely to receive their initial 18F-FDG-PET for suspected recurrence on conventional imaging. When performed more than 2 years after diagnosis, 18F-FDG-PET was more likely to be ordered for suspected recurrence either on the basis of conventional imaging or on the basis of patient symptoms/tumor markers (P=0.003 and 0.031, respectively). 18F-FDG-PET demonstrated disease progression in at least 50% of patients referred for each indication (P=0.037). Higher utilization of 18F-FDG-PET may be appropriate among patients referred for a number of indications including: initial staging, particularly among those with higher-stage disease; suspected recurrence on conventional imaging among patients with lower-stage disease; and suspected recurrence more than 2 years after diagnosis. Further research is needed to verify these findings.

View details for DOI 10.1097/MNM.0b013e328353b249

View details for Web of Science ID 000305500600009

View details for PubMedID 22531828
The Diagnostic and Economic Yield of Neuroimaging in Neuro-ophthalmology JOURNAL OF NEURO-OPHTHALMOLOGY Mehta, S., Loevner, L. A., Mikityansky, I., Langlotz, C., Ying, G., Tamhankar, M. A., Shindler, K. S., Volpe, N. J. 2012; 32 (2): 139-144

Abstract

Diagnostic studies such as computed tomography scans (CT) and magnetic resonance imaging (MRI) are ordered frequently in neuro-ophthalmic practice, although the diagnostic yield and cost-effectiveness of these tests have been studied for only a few conditions. We assessed the diagnostic and economic yield of CT and MRI across all patients evaluated in a neuro-ophthalmology practice. This retrospective review included all patients referred by the division of neuro-ophthalmology at the Scheie Eye Institute for CT, CT angiography, MRI, MRA, or magnetic resonance venography over a 12-month period. Abnormal imaging findings were categorized as significant (one that elicited changes in management) and/or relevant (one that related to the patient's neuro-ophthalmic complaint or examination findings). The diagnostic yield of the test ordered was analyzed according to the patient's chief complaint, neuro-ophthalmic examination findings, and indication for imaging. The total costs for each diagnostic group and costs per significant finding were calculated using the global Resource-Based Relative Value Units for each examination from the Centers for Medicare and Medicaid Services Web site. Two hundred eleven imaging studies in 157 patients were evaluated. 28.9% (95% confidence interval, 22.5%-36.2%) of imaging studies had significant abnormalities relevant to the neuro-ophthalmic complaint. Imaging obtained for evaluation of progressive optic nerve dysfunction and cranial nerve palsy had statistically significant higher diagnostic yield than studies performed for other reasons. Total cost of all imaging studies performed was $107,615.72. Cost per clinically significant and relevant finding was $1,764.19. In comparison to the diagnostic yield of neuroimaging studies in other specialties, CT and MRI of the brain requested by neuro-ophthalmologists provide significant and relevant data at a reasonable cost.

View details for DOI 10.1097/WNO.0b013e31824e3753

View details for Web of Science ID 000304790500010

View details for PubMedID 22510684
Automated Detection of Critical Results in Radiology Reports JOURNAL OF DIGITAL IMAGING Lakhani, P., Kim, W., Langlotz, C. P. 2012; 25 (1): 30-36

Abstract

The goal of this study was to develop and validate text-mining algorithms to automatically identify radiology reports containing critical results including tension or increasing/new large pneumothorax, acute pulmonary embolism, acute cholecystitis, acute appendicitis, ectopic pregnancy, scrotal torsion, unexplained free intraperitoneal air, new or increasing intracranial hemorrhage, and malpositioned tubes and lines. The algorithms were developed using rule-based approaches and designed to search for common words and phrases in radiology reports that indicate critical results. Certain text-mining features were utilized such as wildcards, stemming, negation detection, proximity matching, and expanded searches with applicable synonyms. To further improve accuracy, the algorithms utilized modality and exam-specific queries, searched under the "Impression" field of the radiology report, and excluded reports with a low level of diagnostic certainty. Algorithm accuracy was determined using precision, recall, and F-measure using human review as the reference standard. The overall accuracy (F-measure) of the algorithms ranged from 81% to 100%, with a mean precision and recall of 96% and 91%, respectively. These algorithms can be applied to radiology report databases for quality assurance and accreditation, integrated with existing dashboards for display and monitoring, and ported to other institutions for their own use.
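
As a rough illustration of two of the text-mining features named in this abstract (negation detection and proximity matching), the following is a minimal Python sketch. It is not the published algorithm: the patterns, the five-word window, and the function name `has_critical_finding` are assumptions chosen for illustration, and wildcards, stemming, and synonym expansion are not shown.

```python
import re

# Hypothetical patterns for illustration only; the study's actual rules are not reproduced here.
FINDING = re.compile(r"\bpneumothora(?:x|ces)\b", re.IGNORECASE)
NEGATION = re.compile(r"\b(?:no|without|negative for|resolved)\b", re.IGNORECASE)
WINDOW = 5  # assumed proximity window: words inspected before the finding


def has_critical_finding(impression):
    """Return True if a finding term appears without a nearby preceding negation."""
    # Split the impression into sentences so a negation cannot leak across sentences.
    for sentence in re.split(r"(?<=[.!?])\s+", impression):
        for match in FINDING.finditer(sentence):
            # Proximity matching: look only at the few words just before the term.
            preceding = sentence[:match.start()].split()[-WINDOW:]
            if not NEGATION.search(" ".join(preceding)):
                return True
    return False


print(has_critical_finding("New large right pneumothorax."))
print(has_critical_finding("No pneumothorax. Lungs are clear."))
```

A production system would also need the certainty filtering and exam-specific queries the abstract describes; this sketch only shows why a bare keyword search is not enough.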

View details for DOI 10.1007/s10278-011-9426-6

View details for Web of Science ID 000304113400006

View details for PubMedID 22038514
Extracting templates from radiology reports using sequence alignment INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS Wu, S., Langlotz, C. P., Lakhani, P., Ungar, L. H. 2012; 6 (6): 633-650

Abstract

Health care providers often dictate their reports by filling in slots in templates. These slots can be filled with a variety of different procedures, measurements, or findings. Many radiologists currently create their own personalised templates, costing time and leading to inconsistencies across physicians. We present a sequence alignment method, Radiology Content Alignment (RADICAL), that uses dynamic programming to efficiently extract templates that are common across sets of reports, and give examples of the extracted templates and the contents of their slots.
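
The general idea of aligning reports with dynamic programming can be sketched in Python as follows. This is not RADICAL itself: it swaps in a plain longest-common-subsequence alignment of two token sequences, and the function names and the `<SLOT>` marker are assumptions made for illustration. Words shared by both reports are kept as template text, and each run of differing words collapses into a slot.

```python
def lcs_table(a, b):
    """table[i][j] = length of the longest common subsequence of a[i:] and b[j:]."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) - 1, -1, -1):
        for j in range(len(b) - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    return table


def extract_template(report_a, report_b, slot="<SLOT>"):
    """Align two reports word by word; differing runs become a single slot marker."""
    a, b = report_a.split(), report_b.split()
    table = lcs_table(a, b)
    template, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            template.append(a[i])
            i += 1
            j += 1
        else:
            if not template or template[-1] != slot:
                template.append(slot)
            # Skip a word on whichever side preserves the longer common subsequence.
            if table[i + 1][j] >= table[i][j + 1]:
                i += 1
            else:
                j += 1
    # Leftover words on either side also form a slot.
    if (i < len(a) or j < len(b)) and (not template or template[-1] != slot):
        template.append(slot)
    return " ".join(template)


print(extract_template(
    "The liver measures 15 cm in length",
    "The spleen measures 11 cm in length",
))  # The <SLOT> measures <SLOT> cm in length
```

Extending this from two reports to a set of reports, and recording what fills each slot, is where a dedicated method such as the one described above would go further than this sketch.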

View details for DOI 10.1504/IJDMB.2012.050248

View details for Web of Science ID 000311137000005

View details for PubMedID 23356012
Informatics in Radiology An Information Model of the DICOM Standard RADIOGRAPHICS Kahn, C. E., Langlotz, C. P., Channin, D. S., Rubin, D. L. 2011; 31 (1): 295-U356

Abstract

The Digital Imaging and Communications in Medicine (DICOM) Standard is a key foundational technology for radiology. However, its complexity creates challenges for information system developers because the current DICOM specification requires human interpretation and is subject to nonstandard implementation. To address this problem, a formally sound and computationally accessible information model of the DICOM Standard was created. The DICOM Standard was modeled as an ontology, a machine-accessible and human-interpretable representation that may be viewed and manipulated by information-modeling tools. The DICOM Ontology includes a real-world model and a DICOM entity model. The real-world model describes patients, studies, images, and other features of medical imaging. The DICOM entity model describes connections between real-world entities and the classes that model the corresponding DICOM information entities. The DICOM Ontology was created to support the Cancer Biomedical Informatics Grid (caBIG) initiative, and it may be extended to encompass the entire DICOM Standard and serve as a foundation of medical imaging systems for research and patient care.

View details for DOI 10.1148/rg.311105085

View details for Web of Science ID 000286608900024

View details for PubMedID 20980665

View details for PubMedCentralID PMC3399709
Automated Detection of Radiology Reports that Document Non-routine Communication of Critical or Significant Results JOURNAL OF DIGITAL IMAGING Lakhani, P., Langlotz, C. P. 2010; 23 (6): 647-657

Abstract

The purpose of this investigation is to develop an automated method to accurately detect radiology reports that indicate non-routine communication of critical or significant results. Such a classification system would be valuable for performance monitoring and accreditation. Using a database of 2.3 million free-text radiology reports, a rule-based query algorithm was developed after analyzing hundreds of radiology reports that indicated communication of critical or significant results to a healthcare provider. This algorithm consisted of words and phrases used by radiologists to indicate such communications combined with specific handcrafted rules. This algorithm was iteratively refined and retested on hundreds of reports until the precision and recall did not significantly change between iterations. The algorithm was then validated on the entire database of 2.3 million reports, excluding those reports used during the testing and refinement process. Human review was used as the reference standard. The accuracy of this algorithm was determined using precision, recall, and F measure. Confidence intervals were calculated using the adjusted Wald method. The developed algorithm for detecting critical result communication has a precision of 97.0% (95% CI, 93.5-98.8%), recall 98.2% (95% CI, 93.4-100%), and F measure of 97.6% (β=1). Our query algorithm is accurate for identifying radiology reports that contain non-routine communication of critical or significant results. This algorithm can be applied to a radiology reports database for quality control purposes and help satisfy accreditation requirements.
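
The reported F measure can be checked from the reported precision and recall, and the adjusted Wald interval can be sketched in a few lines of Python. The function names are assumptions, the interval shown is the standard Agresti-Coull form of the adjusted Wald method, and the counts passed to `adjusted_wald` are hypothetical because the abstract does not give the underlying sample sizes.

```python
import math


def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F-beta)."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def adjusted_wald(successes, n, z=1.96):
    """Adjusted Wald (Agresti-Coull) confidence interval for a proportion."""
    n_adj = n + z * z
    p_adj = (successes + z * z / 2) / n_adj
    half_width = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)


# Precision 97.0% and recall 98.2% are taken from the abstract.
print(round(100 * f_measure(0.970, 0.982), 1))  # 97.6

# Hypothetical counts (95 correct out of 100 reviewed), for illustration only.
low, high = adjusted_wald(95, 100)
print(low, high)
```

With beta set to 1 the F measure weights precision and recall equally, which reproduces the 97.6% figure stated above.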

View details for DOI 10.1007/s10278-009-9237-1

View details for Web of Science ID 000284163300001

View details for PubMedID 19826871
Documentation of nonroutine communications of critical or significant radiology results: a multiyear experience at a tertiary hospital. Journal of the American College of Radiology Lakhani, P., Langlotz, C. P. 2010; 7 (10): 782-790

Abstract

The aim of this study was to determine the frequency of radiology reports that contain nonroutine communications of results and categorize the urgency of such communications. A rule-based text-query algorithm was applied to a database of 2.3 million radiology reports, which has an accuracy of 98% for classifying reports containing documentation of communications. The frequency of such communications by year, modality, and study type was then determined. Finally, 200 random reports selected by the algorithm were analyzed, and reports containing critical results were categorized according to ascending levels of urgency. Critical or noncritical results to health care providers were present in 5.09% of radiology reports (116,184 of 2,282,923). For common modalities, documentation of communications was most frequent in CT (14.34% [57,537 of 402,060]), followed by ultrasound (9.55% [17,814 of 186,626]), MRI (5.50% [13,697 of 248,833]), and chest radiography (1.57% [19,840 of 1,262,925]). From 1997 to 2005, there was an increase in reports containing such communications (3.04% in 1997, 6.82% in 2005). More reports contained nonroutine communications in single-view chest radiography (1.29% [5,533 of 428,377]) than frontal/lateral chest radiography (0.80% [1,815 of 226,837]), diagnostic mammography (9.42% [3,662 of 38,877]) than screening mammography (0.47% [289 of 61,114]), and head CT (26.21% [20,963 of 79,985]) than abdominal CT (15.05% [19,871 of 132,034]) or chest CT (5.33% [3,017 of 56,613]). All of these results were statistically significant (P < .00001). Of 200 random radiology reports indicating nonroutine communications, 155 (78%) had critical and 45 (22%) had noncritical results. Regarding level of urgency, 94 of 155 reports (60.6%) with critical results were categorized as high urgency, 31 (20.0%) as low urgency, 26 (16.8%) as medium urgency, and 4 (2.6%) as discrepant. From 1997 to 2005, there was a significant increase in documentation of nonroutine communications, which may be due to increasing compliance with ACR guidelines. Most reports with nonroutine communications contain critical findings.

View details for DOI 10.1016/j.jacr.2010.05.025

View details for PubMedID 20889108
Comparison of two methods to transmit clinical history information from referring providers to radiologists. Journal of the American College of Radiology Agarwal, R., Bleshman, M. H., Langlotz, C. P. 2009; 6 (11): 795-799

Abstract

At many institutions, clerical personnel manually enter clinical histories into radiology information systems during the process of scheduling examinations. For outpatients, radiologists use this information as their primary source of clinical histories. The purpose of this study was to determine the discrepancy rate between these manually recorded clinical histories and paper request slips, thereby assessing the accuracy of the clinical information used by radiologists at the time of interpretation. A total of 129 imaging request slips for CT scans were randomly selected from 7 days in February and March 2007. The clinical history on each request slip was compared with the clinical history manually entered into the radiology information system. Discrepancies between paper request slips and the electronic information available to radiologists were placed into 4 categories: 1) no discrepancy, 2) electronic or paper history incomplete, 3) disagreement between electronic and paper information, and 4) other. Incomplete or discrepant histories were further subcategorized on the basis of whether they were clinically significant. Thirty-eight percent of studies (49 of 129) had no discrepancies between the paper request slips and the manually entered electronic information. The remaining 62% of studies (80 of 129) had incomplete or discrepant clinical histories. Forty-nine percent of studies (63 of 129) had incomplete electronic or paper information. Greater than half of those incomplete histories (36 of 63) were clinically significant. Ten percent of cases (13 of 129) showed frank disagreements between paper and electronic information. Sixty-nine percent of these (9 of 13) were clinically significant. Three percent of studies (4 of 129) showed other discrepancies whose clinical significance could not be categorized. The manual entry of clinical information introduces a high rate of discrepancies, most of which are clinically significant. These discrepancies highlight the need for better communication between referring providers and radiologists.

View details for DOI 10.1016/j.jacr.2009.06.012

View details for PubMedID 19878887
Structured Radiology Reporting: Are We There Yet? RADIOLOGY Langlotz, C. P. 2009; 253 (1): 23-25

View details for DOI 10.1148/radiol.2531091088

View details for Web of Science ID 000271275200005

View details for PubMedID 19789252
Toward Best Practices in Radiology Reporting RADIOLOGY Kahn, C. E., Langlotz, C. P., Burnside, E. S., Carrino, J. A., Channin, D. S., Hovsepian, D. M., Rubin, D. L. 2009; 252 (3): 852-856

Abstract

The goals and current efforts of the Radiological Society of North America Radiology Reporting Committee are described. The committee's charter provides an opportunity to improve the organization, content, readability, and usefulness of the radiology report and to advance the efficiency and effectiveness of the reporting process.

View details for DOI 10.1148/radiol.2523081992

View details for Web of Science ID 000270809500028

View details for PubMedID 19717755
Radiologist Use of and Perceived Need for Patient Data Access JOURNAL OF DIGITAL IMAGING Boonn, W. W., Langlotz, C. P. 2009; 22 (4): 357-362

Abstract

Given the increasing volume of radiological exams, the decreasing frequency of direct communication with the referring provider, and the distribution of patient data over many clinical systems, radiologists often do not have adequate clinical information at the time of interpretation. We have performed a survey of radiologists to determine the need and actual utilization of patient data at the time of image interpretation. Our findings demonstrate that most radiologists want more clinical information when interpreting images and that this information would impact their report, but they are discouraged by the time it takes to access this information. In addition, current mechanisms for monitoring necessary patient follow-up are inadequate.

View details for DOI 10.1007/s10278-008-9115-2

View details for Web of Science ID 000267824800003

View details for PubMedID 18459002
The IR Radlex Project: An Interventional Radiology Lexicon-A Collaborative Project of the Radiological Society of North America and the Society of Interventional Radiology JOURNAL OF VASCULAR AND INTERVENTIONAL RADIOLOGY Kundu, S., Itkin, M., Gervais, D. A., Krishnamurthy, V. N., Wallace, M. J., Cardella, J. F., Rubin, D. L., Langlotz, C. P. 2009; 20 (4): 433-435

View details for DOI 10.1016/j.jvir.2008.10.022

View details for Web of Science ID 000264958300001

View details for PubMedID 19081735
Improving language models for radiology speech recognition JOURNAL OF BIOMEDICAL INFORMATICS Paulett, J. M., Langlotz, C. P. 2009; 42 (1): 53-58

Abstract

Speech recognition systems have become increasingly popular as a means to produce radiology reports, for reasons both of efficiency and of cost. However, the suboptimal recognition accuracy of these systems can affect the productivity of the radiologists creating the text reports. We analyzed a database of over two million de-identified radiology reports to determine the strongest determinants of word frequency. Our results showed that body site and imaging modality had a similar influence on the frequency of words and of three-word phrases as did the identity of the speaker. These findings suggest that the accuracy of speech recognition systems could be significantly enhanced by further tailoring their language models to body site and imaging modality, which are readily available at the time of report creation.

View details for DOI 10.1016/j.jbi.2008.08.001

View details for Web of Science ID 000263882700006

View details for PubMedID 18761109
Extracting Templates from Radiology Reports using Sequence Alignment IEEE International Conference on Bioinformatics and Biomedicine (BIBMW 2009) Wu, S., Langlotz, C. P., Lakhani, P., Ungar, L. H. IEEE. 2009: 314–318

View details for Web of Science ID 000274329200051
Structured Reporting: Patient Care Enhancement or Productivity Nightmare? RADIOLOGY Weiss, D. L., Langlotz, C. P. 2008; 249 (3): 739-747

View details for DOI 10.1148/radiol.2493080988

View details for Web of Science ID 000261139300003

View details for PubMedID 19011178
The radiology report of the future: a summary of the 2007 Intersociety Conference. Journal of the American College of Radiology Dunnick, N. R., Langlotz, C. P. 2008; 5 (5): 626-629

Abstract

A radiology report is the official record documenting the contribution of a radiologist to a patient's care. The use of structured reports and a common lexicon will help referring physicians better understand the contents of reports. These same features in electronic health records will enable radiologists to mine reports for utilization management information as well as form the basis for clinical investigations.

View details for DOI 10.1016/j.jacr.2007.12.015

View details for PubMedID 18442766
From the chair: The top 10 myths about imaging informatics certification (January 2008-SIIM news) JOURNAL OF DIGITAL IMAGING Langlotz, C. 2008; 21 (1): 1-2

View details for DOI 10.1007/s10278-007-9100-1

View details for Web of Science ID 000253626200001

View details for PubMedID 18213485

View details for PubMedCentralID PMC2257996
RadLex: A new method for indexing online educational materials RADIOGRAPHICS Langlotz, C. P. 2006; 26 (6): 1595-1597

View details for DOI 10.1148/rg.266065168

View details for Web of Science ID 000241828200002

View details for PubMedID 17102038
Mentoring the mentors: Aligning mentor and mentee expectations ACADEMIC RADIOLOGY Lee, J. M., Anzai, Y., Langlotz, C. P. 2006; 13 (5): 556-561

Abstract

The Radiology Alliance for Health Services Research sponsored a symposium at the 2005 Annual Meeting of the Association of University Radiologists, which focused on the issue of aligning mentor and mentee expectations to foster successful mentoring relationships. This article presents a summary of the informal discussion of the panelists' individual experiences, common themes, and insights gained from the panel participants.

View details for DOI 10.1016/j.acra.2006.01.050

View details for Web of Science ID 000237075400005

View details for PubMedID 16627195
Development and validation of queries using structured query language (SQL) to determine the utilization of comparison imaging in radiology reports stored on PACS JOURNAL OF DIGITAL IMAGING Lakhani, P., Menschik, E. D., Goldszal, A. F., Murray, J. P., Weiner, M. G., Langlotz, C. P. 2006; 19 (1): 52-68

Abstract

The purpose of this research was to develop queries that quantify the utilization of comparison imaging in free-text radiology reports. The queries searched for common phrases that indicate whether comparison imaging was utilized, not available, or not mentioned. The queries were iteratively refined and tested on random samples of 100 reports with human review as a reference standard until the precision and recall of the queries did not improve significantly between iterations. Then, query accuracy was assessed on a new random sample of 200 reports. Overall accuracy of the queries was 95.6%. The queries were then applied to a database of 1.8 million reports. Comparisons were made to prior images in 38.69% of the reports (693,955/1,793,754), were unavailable in 18.79% (337,028/1,793,754), and were not mentioned in 42.52% (762,771/1,793,754). The results show that queries of text reports can achieve greater than 95% accuracy in determining the utilization of prior images.
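
A phrase-matching query of the kind this abstract describes can be sketched with Python's built-in `sqlite3` module. The table name, the sample report text, and the search phrases below are all hypothetical and are not the study's queries. The second half recomputes the reported percentages from the counts given in the abstract.

```python
import sqlite3

# In-memory database with three made-up reports, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reports (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany("INSERT INTO reports (body) VALUES (?)", [
    ("Comparison: chest radiograph from the prior day. No change.",),
    ("No prior studies are available for comparison.",),
    ("The lungs are clear.",),
])

# Assumed phrases; SQLite's LIKE is case-insensitive for ASCII text by default.
CLASSIFY = """
SELECT id,
  CASE
    WHEN body LIKE '%no prior%' OR body LIKE '%not available%' THEN 'unavailable'
    WHEN body LIKE '%compar%' THEN 'compared'
    ELSE 'not mentioned'
  END AS category
FROM reports
ORDER BY id
"""
categories = [row[1] for row in conn.execute(CLASSIFY)]
print(categories)  # ['compared', 'unavailable', 'not mentioned']

# Counts as reported in the abstract; the three categories partition all reports.
counts = {"compared": 693955, "unavailable": 337028, "not mentioned": 762771}
total = sum(counts.values())
percentages = {name: round(100 * n / total, 2) for name, n in counts.items()}
print(total, percentages)
```

The order of the `CASE` branches matters: the "unavailable" phrases are tested first because a sentence such as "no prior studies are available for comparison" also contains the word "comparison".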

View details for DOI 10.1007/s10278-005-7667-y

View details for Web of Science ID 000236218500006

View details for PubMedID 16132483
A framework for improving radiology reporting. Journal of the American College of Radiology Sistrom, C. L., Langlotz, C. P. 2005; 2 (2): 159-167

Abstract

The interpretative reports rendered by radiologists are the only tangible manifestation of their expertise, training, and experience. These documents are very often the primary means by which radiologists provide patient care. Radiology reports are extremely variable in form, content, and quality. The authors propose a framework for conceptualizing the reporting process and how it might be improved. This consists of standard language, a structured format, and consistent content. These attributes will be realized by modifying the clinical reporting process, including the creation, storage, transmission, and review of interpretative documents. The authors also point out that changes in training and evaluation must be a part of the process, because they are complementary to purely technical solutions.

View details for PubMedID 17411786
Using sonography to examine adult patients at an academic medical center: Have usage patterns changed with the expansion of managed care? AMERICAN JOURNAL OF ROENTGENOLOGY Liebeskind, M. E., Arger, P. H., Liebeskind, A., Maston, K., Langlotz, C. 2002; 179 (6): 1395-?

Abstract

This study was designed to determine whether significant changes have occurred in the utilization of sonography relative to more expensive cross-sectional imaging techniques in adult patients during a time of increasing reliance on managed care. Use of sonography was compared with use of CT and MR imaging of the abdomen, pelvis, and retroperitoneum in adult patients in 1993 and 1998 at an academic medical center. Clinicians who requested the greatest number of examinations in both years were surveyed to assess their perception of changes in their practice patterns during the interim. Between 1993 and 1998, the use of sonography relative to the other cross-sectional imaging modalities decreased from 56% to 43% (p ≤ 0.001). During the same time, CT use increased from 30% to 41% (p ≤ 0.001), and MR imaging use increased from 14% to 16% (p ≤ 0.001). Survey responses indicated that potential cost saving was not a major factor in physicians' decisions to use sonography rather than other cross-sectional imaging modalities. Sonographic utilization decreased during a 5-year period in which managed care provided an increasingly large proportion of overall reimbursement. Cost did not appear to be a major factor in selection of diagnostic tests. Differences over time in referring clinicians' perception of the relative usefulness of sonography, CT, and MR imaging may have contributed to the change in usage patterns.

View details for Web of Science ID 000179407900004

View details for PubMedID 12438022
The effect of PACS on the time required for technologists to produce radiographic images in the emergency department radiology suite JOURNAL OF DIGITAL IMAGING Redfern, R. O., Langlotz, C. P., Abbuhl, S. B., Polansky, M., Horii, S. C., Kundel, H. L. 2002; 15 (3): 153-160

Abstract

The purpose of this study was to evaluate the effect of a switch to a filmless image management system on the time required for technologists to produce radiographic images in the emergency department (ED) after controlling for exam difficulty and a variable workload. Time and motion data were collected on patients who had radiographic images taken while being treated in the emergency department over the 3 1/2-year period from April 1997 to November 2000. Event times and demographic data were obtained from the radiology information system, from the hospital information system, from emergency department records, or by observation by research coordinators. Multiple least squares regression analysis identified several independent predictors of the time required for technologists to produce radiographic images. These variables included the level of technologist experience, the number of trauma-alert patient arrivals, and whether a filmless image management system was used (all P < .05). Our regression model explained 22% of the variability in technologist time (adjusted R², 0.22; F = 24.01; P < .0001). The regression model predicted a time saving of 2 to 3 minutes per patient in the elapsed time from notification of a needed examination until image availability because of the implementation of PACS, a delay of 4 to 6 minutes per patient when imaging was performed by technologists who spent less than 10% of their work assignments within the ED, and a delay of 18 to 27 minutes in radiology workflow because of the arrival of a trauma-alert patient. A filmless system decreased the amount of time required to produce radiographs. The arrival of a trauma-alert patient delayed radiology workflow in the ED. Inexperienced technologists require 4 to 6 minutes of additional time per patient to complete the same amount of work accomplished by an experienced technologist.

View details for DOI 10.1007/s10278-002-0024-5

View details for Web of Science ID 000181622100006

View details for PubMedID 12415466
Automatic structuring of radiology reports: Harbinger of a second information revolution in radiology RADIOLOGY Langlotz, C. P. 2002; 224 (1): 5-7

View details for DOI 10.1148/radiol.2241020415

View details for Web of Science ID 000176454700002

View details for PubMedID 12091655
The completeness of existing lexicons for representing radiology report information. Journal of digital imaging Langlotz, C. P., Caldwell, S. A. 2002; 15: 201-205

Abstract

Although most medical lexicons contain up to 80% of clinical terms used in an ambulatory patient medical records archive, preliminary research suggests that they may be far less complete for radiology terms. We therefore compared the likelihood that several existing medical lexicons would contain terms found in a radiology report to the likelihood they would contain terms found in an ambulatory care medical record. We used three samples of imaging terms to assess the completeness of existing lexicons for medical imaging: (1) a random sample of imaging terms from the Unified Medical Language System Large Scale Vocabulary Test (UMLS-LSVT; n = 218), (2) terms actually used in the first 80 clinical knee magnetic resonance imaging reports generated by the routine clinical use of a structured reporting system (eDictation, Marlton, NJ; n = 76), and (3) terms listed in a glossary of thoracic imaging prepared by the Fleischner Society (n = 173). Using the UMLS Web-based Knowledge Source Server (http://umlsks.nlm.nih.gov/), we measured the rate at which terms in each of the above three sources were found in the UMLS and two of its major constituent terminologies: ICD-9-CM and SNOMED International. ICD-9-CM contained matches for 3%, 8%, and 11% of terms from the Fleischner Society Glossary, eDictation, and NLM-LSVT, respectively. SNOMED International contained matches for 32%, 46%, and 36% of terms from the Fleischner Society Glossary, eDictation, and NLM-LSVT, respectively. The UMLS contained matches for 36%, 50%, and 45% of terms from the Fleischner Society Glossary, eDictation, and NLM-LSVT, respectively. The assessed vocabularies were least likely to contain a term from the Fleischner Society Glossary and most likely to contain a term from the eDictation lexicon. The UMLS was the most complete, and ICD-9 was the least complete of the three systems evaluated. No lexicon achieved greater than 50% completeness for any test set of imaging terms.
Our results show that no single lexicon is sufficiently complete to allow comprehensive indexing, search, and retrieval of radiology report information. These results confirm the few results available from the medical literature indicating that existing controlled vocabularies are insufficiently complete to represent the contents of radiology reports. A subjective analysis of these results may identify particular imaging sub-areas for which new terms should be developed.

View details for PubMedID 12105728
Evidence-based radiology: A new approach to the practice of radiology RADIOLOGY Black, W. C., Jadad, A. R., Jarvik, J. G., Kazerooni, E. A., Langlotz, C. P., Lentle, B. C., Maceneaney, P. M., Malone, D. E., Nahmias, C., Reed, M. H., Salena, B. J., Shannon, S. I., Stolberg, H. O. 2001; 220 (3): 566-575

Abstract

In this review, the principles of evidence-based health care and their application to radiology are discussed. Evidence-based health care involves the more formal integration of the best research evidence with clinical expertise and explicit acknowledgment of patient values in clinical decision making, as compared with conventional practice. Recently, many health care disciplines have adopted the principles and practice of evidence-based health care. In radiology, including its diagnostic and interventional aspects, these developments have received limited attention. This review of evidence-based health care could, therefore, be useful to radiologists at any stage of their training or career, to encourage the practice of evidence-based radiology. The development of evidence-based health care is described, and evidence-based health care and evidence-based radiology are defined. The importance of evidence-based health care as a new approach to the practice of medicine and its importance for transdisciplinary collaboration are discussed. The skills required to practice evidence-based radiology are identified, and the roles of evidence-based radiology in radiologic practice, education, and research are discussed.

View details for Web of Science ID 000170616700002

View details for PubMedID 11526249
Acute appendicitis: Comparison of helical CT diagnosis - Focused technique with oral contrast material versus nonfocused technique with oral and intravenous contrast material 86th Scientific Assembly and Annual Meeting of the Radiological-Society-of-North-America (RSNA) Jacobs, J. E., Birnbaum, B. A., Macari, M., Megibow, A. J., Israel, G., Maki, D. D., Aguiar, A. M., Langlotz, C. P. RADIOLOGICAL SOC NORTH AMERICA. 2001: 683–90

Abstract

To compare the diagnostic accuracy of focused helical computed tomography (CT) with orally administered contrast material with that of nonfocused helical CT with orally and intravenously administered contrast material. After receiving oral contrast material, 228 patients with clinically suspected appendicitis underwent focused appendiceal CT (5-mm section thickness, 15-cm coverage in the right lower quadrant). Immediately thereafter, helical CT of the entire abdomen and pelvis was performed following intravenous administration of contrast material (abdomen, 7-mm section thickness; pelvis, 5-mm section thickness). Studies were separated and independently interpreted by three observers who were blinded to patient names. Diagnoses were established by means of surgical and/or clinical follow-up findings. Fifty-one (22.4%) of 228 patients had acute appendicitis. Readers diagnosed appendicitis with 83.3%, 73.8%, and 71.4% sensitivity and 93.0%, 92.3%, and 97.9% specificity with focused nonenhanced appendiceal CT. Readers diagnosed appendicitis with 92.9%, 92.9%, and 88.1% sensitivity and 93.7%, 95.1%, and 96.5% specificity with nonfocused enhanced CT. Summary areas under the receiver operating characteristic curve estimates for focused nonenhanced and nonfocused enhanced CT were 0.916 and 0.964, respectively; the differences were statistically significant (P < .05) for two of three readers. All readers demonstrated higher sensitivities for detecting the inflamed appendix with nonfocused enhanced CT. Appendicitis was missed with focused CT in two patients whose inflamed appendix was not included in the imaging of the right lower quadrant. All readers were significantly more confident in diagnosing alternative conditions with nonfocused enhanced CT. Diagnostic accuracy of helical CT for acute appendicitis improved significantly with use of intravenous contrast material.

View details for Web of Science ID 000170616700019

View details for PubMedID 11526267
Visualization of areae gastricae on double-contrast upper gastrointestinal radiography: Relationship to age of patients AMERICAN JOURNAL OF ROENTGENOLOGY Charagundla, S. R., Levine, M. S., Langlotz, C. P., Rubesin, S. E., Laufer, I. 2001; 177 (1): 61-63

Abstract

The purpose of this study was to determine whether the frequency of visualization of areae gastricae on double-contrast upper gastrointestinal tract examinations is related to a patient's age. A total of 141 double-contrast upper gastrointestinal tract examinations with normal findings were reviewed for the presence or absence of areae gastricae on double-contrast images of the stomach. All images were evaluated by two radiologists who were blinded to the age of the patients. The data were then analyzed to determine if the frequency of visualization of areae gastricae on double-contrast studies was significantly related to the age of patients. The frequency of visualization of areae gastricae increased significantly with increasing age (p = 0.008). The youngest age group (20-29 years old) exhibited areae gastricae in only four (19%) of 21 cases, whereas the oldest age group (≥ 70 years old) exhibited areae gastricae in 19 (76%) of 25 cases. On average, the rate of visualization of areae gastricae on double-contrast studies increased by 9% per decade. Our data show that the frequency of visualization of areae gastricae on double-contrast upper gastrointestinal tract examinations increases significantly with increasing patient age. It is important for radiologists to be aware of the effect of aging on the delineation of areae gastricae on double-contrast studies.

View details for Web of Science ID 000169457900012

View details for PubMedID 11418398
Accuracy of MR imaging for staging prostate cancer: A meta-analysis to examine the effect of technologic change ACADEMIC RADIOLOGY Sonnad, S. S., Langlotz, C. P., Schwartz, J. S. 2001; 8 (2): 149-157

Abstract

The purpose of this study was to summarize the accuracy of magnetic resonance (MR) imaging for staging prostate cancer and to determine the effect of high magnetic field strength, use of the endorectal coil, use of fast spin-echo (SE) imaging, and study size on staging accuracy. A literature search and review yielded 27 studies comparing MR imaging to a pathologic standard in patients with clinically limited prostate cancer. Subgroup analyses examined magnetic field strength, use of an endorectal coil, use of fast SE imaging, publication date, and study size. A summary receiver operating characteristic curve for all studies had a maximum joint sensitivity and specificity of 74%. At a specificity of 80% on this curve, sensitivity was 69%. Subgroup analyses showed that fast SE imaging was statistically significantly more accurate than conventional SE techniques (P < .001). Unexpectedly, studies employing higher magnetic field strength and those employing an endorectal coil were less accurate. Seemingly small technologic advances may influence test accuracy. Early and small studies, however, may overstate accuracy because of publication bias, bias in small samples, or earlier studies being performed by the experts who developed the technology itself.

View details for Web of Science ID 000168770500005

View details for PubMedID 11227643
Economic consequences of diagnostic imaging for vocal cord paralysis ACADEMIC RADIOLOGY Liu, A. Y., Yousem, D. M., Chalian, A. A., Langlotz, C. P. 2001; 8 (2): 137-148

Abstract

The purpose of this retrospective study was to estimate the economic consequences of evaluating suspected vocal cord paralysis with magnetic resonance (MR) imaging and computed tomography (CT). Reports from MR imaging (n = 30) or CT (n = 19) studies of the neck in 49 patients were retrospectively reviewed for causes of vocal cord paralysis. The patients were divided into high-suspicion (n = 20) and low-suspicion (n = 29) groups, based on the presence or absence of a clinically detectable abnormality other than vocal cord immobility. Clinic and inpatient charts were examined to determine the work-up in all cases. The Medicare Resource-based Relative Value Scale was used to estimate the costs of most procedures. The high-clinical-suspicion group included nine true-positive, four false-positive, seven true-negative, and no false-negative cases. Further work-up was performed in seven true-positive, three false-positive, and one true-negative case. The total cost of immediate diagnostic work-up in these 20 patients, including MR imaging and/or CT, was $20,737 ($2,304 per true-positive case). The low-suspicion group included two true-positive, nine false-positive, 18 true-negative, and no false-negative cases. Further work-up was performed in both true-positive, four false-positive, and two true-negative cases. The total cost of immediate diagnostic work-up in these 29 patients was $21,698 (mean, $748; $10,849 per true-positive case). The average cost of finding space-occupying lesions in patients with vocal cord paralysis is more than 4.5 times higher in patients without suspicious antecedent clinical findings than in those with such a history. The benefits of obtaining negative findings and of detecting a small number of space-occupying lesions should be weighed against the costs of such examinations and of additional work-up for false-positive findings.
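The per-case figures in this abstract follow from simple division of each group's total work-up cost by its number of true-positive cases. The sketch below reproduces that arithmetic from the numbers quoted above; the function and variable names are illustrative and do not come from the paper.

```python
def cost_per_true_positive(total_cost: float, true_positives: int) -> float:
    """Total diagnostic work-up cost divided by the number of true-positive cases."""
    return total_cost / true_positives


# Figures quoted in the abstract (US dollars).
high_suspicion = cost_per_true_positive(20_737, 9)  # about $2,304 per true positive
low_suspicion = cost_per_true_positive(21_698, 2)   # $10,849 per true positive
low_suspicion_mean = 21_698 / 29                    # about $748 per patient

# Ratio behind the "more than 4.5 times higher" statement.
ratio = low_suspicion / high_suspicion
```

With these inputs the ratio comes to roughly 4.7, which is consistent with the abstract's "more than 4.5 times higher".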

View details for Web of Science ID 000168770500004

View details for PubMedID 11227642
Prostate Cancer: What is the future role for imaging? Semiannual Meeting of the American-College-of-Radiology-Imaging-Network Thornbury, J. R., Ornstein, D. K., Choyke, P. L., Langlotz, C. P., Weinreb, J. C. AMER ROENTGEN RAY SOC. 2001: 17–22

View details for Web of Science ID 000166074000003

View details for PubMedID 11133530
Readings in clinical imaging research: A structured bibliography ACADEMIC RADIOLOGY Langlotz, C. P. 2000; 7 (10): 880-890

View details for Web of Science ID 000089729500012

View details for PubMedID 11048884
Accuracy of CT angiography versus pulmonary angiography in the diagnosis of acute pulmonary embolism: Evaluation of the literature with summary ROC curve analysis ACADEMIC RADIOLOGY Harvey, R. T., Gefter, W. B., Hrung, J. M., Langlotz, C. P. 2000; 7 (10): 786-797

Abstract

The authors performed this study to estimate, by using published data, the sensitivity and specificity of computed tomographic (CT) angiography in the evaluation of suspected acute pulmonary embolism (PE). Summary receiver operating characteristic (ROC) curve analysis was used to determine the sensitivity and specificity of CT angiography in the diagnosis of acute PE. Pulmonary angiography was used as the diagnostic standard of reference. The authors reviewed the results of 11 independent studies published in the English-language literature between January 1992 and June 1999. The sensitivity of CT angiography in the diagnosis or exclusion of PE in the central pulmonary arteries (to the level of the segmental pulmonary arteries) ranged from 0.74 to 0.81 on the basis of specificities of 0.89-0.91. The sensitivity of CT angiography in the diagnosis or exclusion of PE in all pulmonary arteries (to the level of the subsegmental pulmonary arteries) was 0.68 on the basis of a specificity of 0.91. On the basis of the studies in the current literature, most of which used 5.0-mm collimation and single-detector CT, CT angiography may be less accurate in the diagnosis of PE than previously reported. With improvements in data acquisition, particularly the use of thinner section collimation and multidetector CT, and in the increased use of workstations for data analysis, the accuracy and utility of CT angiography will require continued investigation.

View details for Web of Science ID 000089729500002

View details for PubMedID 11048876
Diagnosis of primary versus secondary achalasia: Reassessment of clinical and radiographic criteria AMERICAN JOURNAL OF ROENTGENOLOGY Woodfield, C. A., Levine, M. S., Rubesin, S. E., Langlotz, C. P., Laufer, I. 2000; 175 (3): 727-731

Abstract

Our purpose was to reassess the usefulness of barium studies and various clinical parameters for differentiating primary from secondary achalasia. Radiology files from 1989 through 1999 revealed 29 patients with primary achalasia and 10 with secondary achalasia (caused by carcinoma of the esophagus in three, of the gastric cardia in three, of the lung in three, and of the uterus in one) who met our study criteria. The radiographs were reviewed to determine the morphologic features of the narrowed distal esophageal segment and gastric cardia and fundus. Medical records were also reviewed to determine the clinical presentation; endoscopic, manometric, and surgical findings; and treatment. The mean patient age was 53 years in primary achalasia versus 69 years in secondary achalasia (p = 0.03). The mean duration of dysphagia was 4.5 years in primary achalasia versus 1.9 months in secondary achalasia (p < 0.0001). The narrowed distal esophageal segment had a mean length of 1.9 cm in primary achalasia versus 4.4 cm in secondary achalasia (p < 0.0001), and the esophagus had a mean diameter of 6.2 cm in primary achalasia versus 4.1 cm in secondary achalasia (p < 0.0001). The narrowed segment was eccentric or nodular or had abrupt proximal borders in only four of 10 patients with secondary achalasia, and evidence of tumor was present in the gastric fundus in only three. When findings of achalasia are present on barium studies, a narrowed distal esophageal segment longer than 3.5 cm with little or no proximal dilatation in a patient with recent onset of dysphagia should be considered highly suggestive of secondary achalasia, even in the absence of other suspicious radiographic findings.

View details for Web of Science ID 000088910300025

View details for PubMedID 10954457
The costs of CT procedures in an academic radiology department determined by an activity-based costing (ABC) method JOURNAL OF COMPUTER ASSISTED TOMOGRAPHY Nisenbaum, H. L., Birnbaum, B. A., Myers, M. M., Grossman, R. I., Gefter, W. B., Langlotz, C. P. 2000; 24 (5): 813-823

Abstract

The purpose of this work was to determine the costs of computed tomography (CT) procedures in a large academic radiology department, including both professional (PC) and technical (TC) components, by analyzing actual resource consumption using an activity-based costing (ABC) method and comparing them with Medicare payments. Over a 12-month period from July 1, 1996, to June 30, 1997, 1,011 CT procedures, representing 16 Physicians' Current Procedural Terminology (CPT) codes and 98.3% of CT studies performed, were carefully observed by a research assistant trained in ABC methodology. Information collected during these time and motion studies included personnel/machine time and direct materials used. Actual resource units used during the different activities in each CT procedure were valued using appropriate cost drivers. Unit values for both direct and overhead costs were calculated: the cost of an individual procedure equaled the sum of component costs. Costs were compared with PC and TC payments according to the 1997 Medicare Fee Schedule. Total costs of CPT codes 70450 (CT Head unenhanced), 71260 (CT Chest enhanced), and 74160 (CT Abdomen enhanced), which represented 71.2% of CT studies performed, were $189.19, $273.53, and $343.20, respectively. For all 16 nonmodified CPT codes analyzed, Medicare's professional reimbursement was less than the professional cost, whereas its technical reimbursement exceeded respective cost in 14 of the 16 codes. In the setting and time period studied, Medicare underreimbursed professional costs while overreimbursing technical costs.

View details for Web of Science ID 000089727900026

View details for PubMedID 11045708
Enhancing the expressiveness of structured reporting systems 17th Symposium for Computer Applications in Radiology held at the Annual Meeting of the Society-for-Computer-Applications-in-Radiology (SCAR 2000) Langlotz, C. P. SPRINGER. 2000: 49–53

Abstract

The overall goal of this research is to build a structured reporting system that reduces the cost, delays, and inconvenience associated with conventional dictation and speech recognition systems. We have implemented such a structured reporting system for radiology that replaces current dictation and transcription processes by allowing radiologists and other imaging professionals to select imaging findings from a medical lexicon. The system uses an imaging-specific information model, called a "description set," to organize selected terms in a relational database. Unique features of the knowledge representation that enhance its expressiveness include its ability to codify uncertainty about an imaging observation and to represent explicitly the logical relationships among imaging findings. In addition, the system does not require the user to fill in "blanks" in a static text template. Instead, it allows entry of terms in arbitrary order and uses automated text-generation techniques to create a text report that referring physicians are accustomed to receiving. In parallel, the system also produces a multimedia report that the referring physician can use as a quick reference. Unlike the results of conventional dictation or speech recognition, each finding is coded in a relational database for later information processing. Thus, the structured report database can be used to index images by content, to provide real-time decision support, to enhance radiologists' performance, to conduct exploratory clinical research, and to transmit imaging report data to computer-based patient record systems.

View details for Web of Science ID 000087339000013

View details for PubMedID 10847362
A picture archival and communication system shortens delays in obtaining radiographic information in a medical intensive care unit CRITICAL CARE MEDICINE Redfern, R. O., Kundel, H. L., Polansky, M., Langlotz, C. P., Horii, S. C., Lanken, P. N. 2000; 28 (4): 1006-1013

Abstract

To assess whether variables such as unit occupancy and aggregate severity of illness that reflect increased work demands on physicians in medical intensive care units (MICU) are associated with increased delays in their obtaining information about nonroutine chest radiographic examinations. To determine whether the presence of a picture archiving and communication system (PACS) workstation in the MICU shortens those delays. A prospective cohort study stratified for presence or absence of PACS. MICU of a university hospital. A total of 118 patients admitted to the MICU who had nonroutine bedside chest radiographs. Multivariate analyses were conducted to determine how unit occupancy, patient acuity, the time of day the examination was taken, and the presence of a PACS workstation influenced the time from radiographic examination completion to the time when MICU physicians first obtained image information. In a multivariate analysis, patient acuity, unit occupancy, the aggregate level of severity of illness in the study cohort, whether the examination was taken at night or day, and the presence of a PACS workstation were significant predictors of the elapsed time from examination completion until review by MICU physicians. Without the PACS workstation, higher occupancy, higher aggregate severity of illness, and examinations taken during the day were associated with longer delays. Overall, the multivariate analysis showed a 24-min decrease in the elapsed time to obtain information during periods with the PACS workstation compared with periods without the workstation (p = .03). A PACS workstation significantly decreased the delays in obtaining image information that occurred with high unit occupancy and high aggregate severity of illness and may improve unit efficiency under conditions of high physician workload.

View details for Web of Science ID 000086862800016

View details for PubMedID 10809274
Enhancing the expressiveness and usability of structured image reporting systems Annual Symposium of the American-Medical-Informatics-Association Langlotz, C. P., Meininger, L. HANLEY & BELFUS INC. 2000: 467–471

Abstract

We have implemented a structured reporting system for medical imaging that replaces dictation and transcription by allowing radiologists and other imaging professionals to select imaging findings from medical lexicons. The system uses an imaging-specific information model called a Description Set to organize selected terms in a relational database. The system's expressiveness for reporting is enhanced by its ability to codify uncertainty about imaging observations and to represent explicit causal and associational relationships among imaging findings. The system promptly and automatically generates a text report that referring physicians are accustomed to receiving. Because the image report information is stored in a fully coded fashion, it can be used to provide real-time decision support to radiologists, to transmit coded imaging data to electronic patient record systems, to measure and improve radiologists' performance, and to index images by content.

View details for Web of Science ID 000170207500096

View details for PubMedID 11079927
Cost-effectiveness of MR imaging and core-needle biopsy in the preoperative work-up of suspicious breast lesions 83rd Annual Meeting of the Radiological-Society-of-North-America Hrung, J. M., Langlotz, C. P., Orel, S. G., Fox, K. R., Schnall, M. D., Schwartz, J. S. RADIOLOGICAL SOC NORTH AMERICA. 1999: 39–49

Abstract

To assess the clinical and economic consequences of the use of preoperative breast magnetic resonance (MR) imaging and core-needle biopsy (CNB) to avert excisional biopsy (EXB). A decision-analytic Markov model was constructed to compare MR imaging, CNB, and EXB without preoperative testing in a woman with a suspicious breast lesion. Stage-specific cancer prevalence, tumor recurrence, progression rates, and MR imaging and CNB sensitivity and specificity were obtained from the literature. Cost estimates were obtained from the literature and from the Medicare fee schedule. EXB without preoperative testing was associated with the greatest quality-adjusted life expectancy, followed by MR imaging and CNB; life expectancies were 17.409, 17.405, and 17.398 years, respectively. EXB resulted in the greatest lifetime treatment cost ($31,438), followed by MR imaging ($29,072) and CNB ($28,573). Results were robust over a wide range of cancer prevalence, stage distribution, tumor progression rates, and procedure and treatment costs. Incremental cost-effectiveness ratios showed that preoperative testing was cost-effective, but the choice between MR imaging and CNB was highly dependent on the accuracy of each test and on patient preferences. Preoperative testing of most suspicious breast lesions was cost-effective. More precise estimates of MR imaging and CNB test performance characteristics are needed. Until those are available, patient preferences should inform individual decisions regarding preoperative testing.
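An incremental cost-effectiveness ratio (ICER) is the extra cost of one strategy over another divided by the extra quality-adjusted life expectancy it yields. The sketch below applies that standard definition to the rounded figures quoted in this abstract. It is an illustration only: the `Strategy` type and `icer` function are hypothetical names, the inputs are rounded to three decimal places, and the resulting ratios should not be read as the values published in the paper.

```python
from typing import NamedTuple


class Strategy(NamedTuple):
    name: str
    cost: float  # lifetime treatment cost, US dollars
    qale: float  # quality-adjusted life expectancy, years


def icer(a: Strategy, b: Strategy) -> float:
    """Extra dollars spent per extra quality-adjusted life-year when choosing a over b."""
    return (a.cost - b.cost) / (a.qale - b.qale)


# Rounded figures quoted in the abstract.
exb = Strategy("EXB without preoperative testing", 31_438, 17.409)
mri = Strategy("MR imaging", 29_072, 17.405)
cnb = Strategy("CNB", 28_573, 17.398)

exb_vs_mri = icer(exb, mri)  # roughly $591,500 per quality-adjusted life-year
mri_vs_cnb = icer(mri, cnb)  # roughly $71,300 per quality-adjusted life-year
```

Because the life-expectancy differences are only a few thousandths of a year, these ratios are very sensitive to rounding, which fits the abstract's remark that the choice between MR imaging and CNB depends heavily on test accuracy and patient preferences.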

View details for Web of Science ID 000082771900010

View details for PubMedID 10540638
Accuracy of MR imaging in the work-up of suspicious breast lesions: A diagnostic meta-analysis ACADEMIC RADIOLOGY Hrung, J. M., Sonnad, S. S., Schwartz, J. S., Langlotz, C. P. 1999; 6 (7): 387-397

Abstract

The authors performed a systematic, critical review of the literature on magnetic resonance (MR) imaging for primary breast cancer detection in patients with suspicious breast lesions, analyzed MR test performance in the articles meeting study criteria, and used this information to examine the cost-effectiveness of preoperative MR imaging. A structured, predefined MEDLINE search was conducted to identify potentially relevant, peer-reviewed, English-language references from January 1996 through August 1997 on the diagnostic accuracy of breast MR imaging. This information was supplemented by manually searching bibliographies of the retrieved articles for additional potentially relevant references. All studies were independently abstracted by two reviewers using a prospectively designed worksheet. Abstraction results were analyzed with the summary receiver operating characteristic (ROC) method. Of 41 identified studies, 16 met the inclusion criteria. These studies reported sensitivities ranging from 63% to 100% and specificities ranging from 21% to 100%. Maximum joint sensitivity and specificity of the summary ROC curve was 89% (95% confidence interval [CI]: 82%, 93%). At a sensitivity of 95%, specificity was 67%. When test performance values were applied to a previous cost-effectiveness analysis, the cost-effectiveness of preoperative MR imaging relative to that of excisional biopsy was confirmed, but its cost-effectiveness relative to that of needle core biopsy varied widely. For MR imaging to be a cost-effective alternative to excisional biopsy for diagnosis of suspicious breast lesions, its diagnostic test performance must be equal to or better than the best results in recently published studies.

View details for Web of Science ID 000086033200003

View details for PubMedID 10410164
Gastrointestinal imaging: A systems analysis comparing digital and conventional techniques AMERICAN JOURNAL OF ROENTGENOLOGY Chawla, S., Levine, M. S., Laufer, I., Gingold, E. L., Kelly, T. J., Langlotz, C. P. 1999; 172 (5): 1279-1284

Abstract

The purpose of this study was to compare digital and conventional methods of gastrointestinal imaging based on the cost of image storage and estimated overall costs, radiation exposure to the patient, and duration of the examination. Our study sample consisted of 128 patients who underwent conventional gastrointestinal studies (64 double-contrast upper gastrointestinal examinations and 64 double-contrast barium enemas) and 139 patients who underwent digital gastrointestinal studies (66 double-contrast upper gastrointestinal examinations and 73 double-contrast barium enemas). The number of images and films for each study was recorded, and the mean cost of image storage and the estimated overall costs for digital versus conventional studies were calculated. Both the duration of fluoroscopy and the time from start to completion of the study were obtained from our radiology information system. From these data, we calculated mean radiation exposure to the patient and the duration of the examination. Finally, referring physicians completed a questionnaire about their level of satisfaction with paper prints generated from digital gastrointestinal studies. When digital studies were compared with conventional studies, the mean cost of image storage decreased by 45% and the estimated overall 10-year costs decreased by 8%. The mean number of spot images increased by 8% for upper gastrointestinal examinations and by 25% for barium enema examinations, whereas the mean duration of fluoroscopy decreased by 4% and by 10%, respectively. As a result, radiation exposure to patients increased by only 2%, a difference that did not approach statistical significance. Finally, the mean duration of examinations decreased by 24% for upper gastrointestinal examinations and by 33% for barium enemas. Approximately 85% of the physicians who completed the questionnaires indicated that they reviewed the paper prints generated from digital studies and that they would like to continue receiving them. Digital gastrointestinal imaging systems are associated with higher initial costs than conventional systems, but the long-term costs of these digital imaging systems are slightly less because of the lower cost of image storage, and radiation exposure to patients is comparable. The shorter duration of digital examinations is a potential benefit of this technology, allowing improved patient throughput. Finally, referring physicians have a high level of satisfaction with paper prints generated from digital imaging.

View details for Web of Science ID 000079919700020

View details for PubMedID 10227502
Assessment of a bolus-tracking technique in helical renal CT to optimize nephrographic phase imaging RADIOLOGY Birnbaum, B. A., Jacobs, J. E., Langlotz, C. P., Ramchandani, P. 1999; 211 (1): 87-94

Abstract

To evaluate a bolus-tracking technique in helical computed tomography (CT) for identifying the onset of the nephrographic phase and to determine the effect of varying the volume and injection rate of contrast material on nephrographic phase onset. Seventy-five patients underwent bolus tracking of contrast material followed by helical renal CT. In 50 patients, 150 mL of 60% iodinated contrast material (iohexol or iothalamate meglumine) was injected at either 2 mL/sec (25 patients [group 1]) or 3 mL/sec (25 patients [group 2]). In 25 patients who had previously undergone nephrectomy, 100 mL of 60% iodinated contrast material was injected at 3 mL/sec (group 3). Nephrographic phase onset was determined by visually assessing the transition to a homogeneous nephrogram during a monitoring scan series starting 40 seconds after injection. Nephrographic phase onset ranged from 60 to 136 seconds (mean, 89 seconds +/- 17 [+/- SD]). Statistically significant differences in mean onset times were observed among groups 1 (103 seconds +/- 12), 2 (91 seconds +/- 16), and 3 (75 seconds +/- 9) (P < .001). Multiple regression analysis showed patient age, contrast material volume, and injection rate to be independent predictors of nephrographic phase onset. Contrast material volume, patient age, and patient weight were independent predictors of the degree of renal enhancement. Nephrographic phase onset is highly dependent on methods of contrast material administration and patient characteristics.

View details for Web of Science ID 000079323200013

View details for PubMedID 10189457
Overcoming barriers to outcomes research on imaging: lessons from an abstract decision model. Academic radiology Langlotz, C. P. 1999; 6: S29-34

View details for PubMedID 9891164
A critical synopsis of the diagnostic and screening radiology outcomes literature. Academic radiology Blackmore, C. C., Black, W. C., Jarvik, J. G., Langlotz, C. P. 1999; 6: S8-18

Abstract

In summary, the radiology outcomes research literature is both extensive and broad. The methodologic quality, however, is quite variable. Overall, this quality could be improved by intervention in two areas: methodologic dissemination and development. The number of researchers investigating radiology-related outcomes is high, and presently there are over 20 journals devoted exclusively to radiology research. Even with a relatively narrow definition of "outcomes," we identified over 200 radiology outcomes studies, most from the past few years. However, the methodologic quality of most of these articles was relatively low, with important design flaws and biases. Nonetheless, a substantial number of radiology publications do employ state-of-the-art research methods and innovative approaches to methodologic challenges. The quality of radiology outcomes research overall would benefit tremendously from dissemination of such research methods. Instruction in outcomes research methods is accessible to radiologists. For example, there have been several recent articles and series of articles on outcomes research methods in JAMA, including guidelines for the performance and reporting of cost-effectiveness analyses (38-40) and for developing clinical prediction rules (57). Within radiology, several recent articles have appeared on, among other things, cost-effectiveness analysis (34,59,60), assessing quality of life (43), screening for disease (53), and defining the study population (61). The research compendium compiled for the GERRAF (General Electric-Association of University Radiologists Radiology Research Academic Fellowships) program remains a comprehensive methodologic source for many of the issues in radiology outcomes research, and outcomes research methods courses are offered every year at the Society for Health Services Research in Radiology and Society for Medical Decision Making meetings, as well as at the meeting of the Radiological Society of North America. 
Even so, awareness of the need for such research techniques remains limited. Dissemination of sound research methods is limited at least in part by the current incentives in radiology research. At many institutions, the number of research publications produced, rather than their quality, determines promotion or academic success. Unfortunately, more rigorous study designs often require more time and resources. Further, because peer reviewers are often as uninformed about research methods as the bulk of those who are submitting manuscripts, it may actually be more difficult to publish articles with more advanced methodologic designs. The standard in radiology is the uncontrolled case series, and deviation from the standard may make acceptance for publication more difficult. On a more optimistic note, recent publication of a number of methodology articles suggests that at least some journals are promoting improved research in methodology (43,53,59-61). We hope that time will be available for manuscript reviewers to learn to understand the strengths and weaknesses of various research approaches. If more rigorous study designs were required for publication, radiology outcomes research would probably improve drastically. Nevertheless, the current peer-review system does not effectively promote sound research design. The other great incentive in research is funding. Clearly, if advanced research design is required for funding, then there is incentive for improvement in research quality. Traditionally, National Cancer Institute and other National Institutes of Health and public sector funding has been predicated on a high level of research sophistication. Undoubtedly, availability of grants for diagnostic and screening imaging clinical trials and other research will go far to improve radiology research methods. The other traditional source of research funding is industry.

View details for PubMedID 9891161
Correlation of lesion appearance and histologic findings for the nodes of a breast MR imaging interpretation model 83rd Annual Meeting of the Radiological-Society-of-North-America Nunes, L. W., Schnall, M. D., Orel, S. G., Hochman, M. G., Langlotz, C. P., Reynolds, C. A., Torosian, M. H. RADIOLOGICAL SOC NORTH AMERICA. 1999: 79–92

Abstract

An interpretation model for evaluating magnetic resonance (MR) images of the breast was constructed that allowed differentiation of benign from malignant palpable or mammographically visible abnormalities. Architectural features define each node of the model. Investigation was subsequently made of the histologic findings in individuals within each node and of the frequency with which each histologic finding manifested as a particular architectural feature to determine whether nodal location and specific histologic findings are mutually predictive. The strongest associations were found between fibrocystic change and smooth masses, fibroadenoma and lobulated masses with nonenhancing internal septations, invasive ductal carcinoma (with or without ductal carcinoma in situ [DCIS]) and enhancing irregular or spiculated masses, invasive tubular carcinoma or radial scar and spiculated masses, medullary or colloid carcinoma and enhancing lobulated masses, invasive lobular carcinoma and the absence of a focal mass, DCIS and ductal enhancement, and DCIS (with or without invasive ductal carcinoma) and regional enhancement. Nodal location and histologic findings proved to be mutually predictive within the model; that is, the nodal location of MR imaging features within the model can be used to predict histologic findings and vice versa.

View details for Web of Science ID 000078106700010

View details for PubMedID 9925393
Use of endorectal MR imaging to predict prostate carcinoma recurrence after radical prostatectomy 83rd Scientific Assembly and Annual Meeting of the Radiological-Society-of-North-America Manzone, T. A., Malkowicz, S. B., Tomaszewski, J. E., Schnall, M. D., Langlotz, C. P. RADIOLOGICAL SOC NORTH AMERICA. 1998: 537–42

Abstract

To determine the ability of endorectal magnetic resonance (MR) imaging to help predict postprostatectomy disease recurrence and, thereby, patient outcome. The authors evaluated 116 patients referred for prostate MR imaging during 1991 and 1992 who subsequently underwent radical prostatectomy and for whom follow-up data through 1996 could be obtained. The MR reports, clinic charts, and pathology reports were reviewed. Disease recurrence was established by means of detectable levels of serum prostate-specific antigen (PSA) after surgery. Thirty-four patients (29%) had postoperative disease recurrence. Patients with recurrence had higher preoperative PSA values (P < .0001). These patients also more frequently had positive surgical margins (P = .0005), transcapsular tumor spread (P < .0001), seminal vesicle involvement (P = .0012), and tumors of advanced stage (P < .0001) and high grade (P = .0058). Of 13 patients whose MR examinations showed definite extracapsular disease, eight (62%) had disease recurrence. The recurrence rate when MR imaging indicated limited disease (24%) was similar to that when MR imaging showed possible microscopic extension (27%). An MR finding of definite extracapsular disease was 24% sensitive and 94% specific for the prediction of disease recurrence. MR imaging findings of definite extracapsular spread of disease helped predict prostate tumor recurrence with high specificity, although with low sensitivity.

View details for Web of Science ID 000076618000039

View details for PubMedID 9807586
Contrast media reactions and extravasation: Relationship to intravenous injection rates RADIOLOGY Jacobs, J. E., Birnbaum, B. A., Langlotz, C. P. 1998; 209 (2): 411-416

Abstract

To evaluate the belief that the frequencies of contrast material extravasation and minor, nonidiosyncratic contrast material reactions correlate with intravenous injection rates. Complications of 6,660 consecutive injections of contrast material for computed tomography were prospectively recorded. Ionic (n = 4,851) or nonionic (n = 1,809) contrast material was injected at 0.5-4.0 mL/sec. The injection rate was 1.9 mL/sec or less in group 1 (n = 2,899), 2.0-2.9 mL/sec in group 2 (n = 2,475), and 3.0-4.0 mL/sec in group 3 (n = 1,286). The extravasation rate (0.6%) did not differ significantly between the groups. The reaction rate (8.4%) also did not differ significantly between the groups. The rate of minor reactions (8.0%) was higher with ionic (9.9%) than nonionic (2.9%) contrast material (relative risk = 3.4). The rate of major reactions (0.4%) did not vary significantly with type of contrast material. The rate of nausea or vomiting (3.8%) did not differ significantly between the groups but was higher with ionic (4.9%) than nonionic (1.1%) contrast material (relative risk = 4.5). The rate of severe warmth (2.1%) was significantly higher in group 3 (2.8%) than group 1 (2.0%) or 2 (1.8%). No correlations exist between injection rate and extravasation rate or overall reaction rate.

View details for Web of Science ID 000076618000021

View details for PubMedID 9807567
Meta-analysis of diagnostic procedures: A brief overview ACADEMIC RADIOLOGY Langlotz, C. P., Sonnad, S. S. 1998; 5: S269-S273

View details for Web of Science ID 000075665200008

View details for PubMedID 9750829
MR identification of white matter abnormalities in multiple sclerosis: A comparison between 1.5 T and 4 T AMERICAN JOURNAL OF NEURORADIOLOGY Keiper, M. D., Grossman, R. I., Hirsch, J. A., Bolinger, L., Ott, I. L., Mannon, L. J., Langlotz, C. P., Kolson, D. L. 1998; 19 (8): 1489-1493

Abstract

Although MR spectroscopy and functional MR imaging of the brain have been successful at 4 T, conventional fast spin-echo imaging of the brain at 4 T has not been adequately evaluated. The purpose of this study was to compare the detection of white matter abnormalities in multiple sclerosis (MS) at 1.5 T and 4 T. Fifteen patients with clinically definite MS were imaged at both 1.5 T and 4 T within a 1-week period. Comparison was made between fast spin-echo long-TR images at both field strengths. Pulse sequences were tailored to maximize resolution and signal-to-noise ratio in clinically relevant imaging times (< 7 min). Four interpreters independently reviewed the images obtained at both field strengths in separate sessions and evaluated them for lesion identification, size, characterization, and subjective resolution. Differences in interpretations at 1.5 T and 4 T were subsequently recorded. Images obtained at 4 T showed a mean of 88 more lesions as compared with images obtained at 1.5 T. All the lesions measured less than 5 mm and were typically aligned along perivascular spaces. Twenty-five consensually identified lesions on 4-T images were not seen at all on 1.5-T images. Moreover, 4-T images showed 56 additional consensually identified lesions, which were indistinct and seen only in retrospect on 1.5-T images. These lesions were frequently (n = 48) identified in large confluent areas of white matter signal intensity abnormality at 1.5 T. All observers also agreed that 4-T images subjectively enhanced the perception of normal perivascular spaces and small perivascular lesions. MR imaging at 4 T can depict white matter abnormalities in MS patients not detectable at 1.5 T through higher resolution with comparable signal-to-noise ratio and imaging times.

View details for Web of Science ID 000076039600024

View details for PubMedID 9763383
Patient preference for magnetic resonance versus conventional angiography - Assessment methods and implications for cost-effectiveness analysis: An overview INVESTIGATIVE RADIOLOGY Swan, J. S., Langlotz, C. P. 1998; 33 (9): 553-559

View details for Web of Science ID 000075842800011

View details for PubMedID 9766040
Diagnostic criteria for fatty infiltration of the liver on contrast-enhanced helical CT AMERICAN JOURNAL OF ROENTGENOLOGY Jacobs, J. E., Birnbaum, B. A., Shapiro, M. A., Langlotz, C. P., Slosman, F., Rubesin, S. E., Horii, S. C. 1998; 171 (3): 659-664

Abstract

The purpose of the study was to develop quantitative and qualitative criteria for diagnosing fatty liver on contrast-enhanced helical CT. Differential liver-spleen attenuation was evaluated between 80 and 120 sec after injection in 76 patients who underwent contrast-enhanced helical CT. Unenhanced CT images had earlier established fatty liver when the liver minus spleen attenuation difference was less than or equal to -10 H (n = 18). Four observers who had not seen the unenhanced images used contrast-enhanced CT images to assess the presence of fatty liver on a five-point Likert scale, the presence of geographic areas spared from fatty infiltration, and the relative liver-spleen attenuation. The diagnostic accuracies of various imaging criteria were compared using McNemar's chi-square test (for sensitivity and specificity) and analysis of receiver operating characteristic curves. Sensitivity, specificity, and receiver operating characteristic curve areas for observers' qualitative judgments were 54%, 95%, and .91, respectively; for quantitative differential liver-spleen attenuation (80-100 sec; -20.5 H discriminatory value), the values were 86%, 87%, and .94, respectively; and for quantitative differential liver-spleen attenuation (101-120 sec; -18.5 H discriminatory value), the values were 93%, 93%, and .98, respectively. Differential liver-spleen attenuation was time-dependent; overlap was noted between healthy subjects and patients with fatty liver. Qualitatively, geographic sparing was highly specific (94%) for fatty liver, whereas liver attenuation greater than or equal to spleen attenuation excluded fatty liver in all but one case. Although quantitative and qualitative criteria for diagnosing fatty liver on helical CT can be determined, they are protocol-specific. Limited unenhanced hepatic CT remains the optimal technique for detection of fatty infiltration of the liver.

View details for Web of Science ID 000075496700026

View details for PubMedID 9725292
Reperfusion edema after thromboendarterectomy: Radiographic patterns of disease JOURNAL OF THORACIC IMAGING Miller, W. T., Osiason, A. W., Langlotz, C. P., Palevsky, H. I. 1998; 13 (3): 178-183

Abstract

In patients with chronic pulmonary embolism, pulmonary thromboendarterectomy may result in a unique form of noncardiogenic pulmonary edema termed reperfusion edema. This report reviews the authors' experience after pulmonary thromboendarterectomy with particular emphasis on the radiographic manifestations of reperfusion edema. The clinical and radiographic records of 25 patients who underwent pulmonary thromboendarterectomy at the University of Pennsylvania from 1985 through 1995 were reviewed. The zonal distribution of radiographic opacity, time to maximal opacity, and the time to clearance of reperfusion edema were determined. The relationship of these radiographic manifestations to clinical severity of disease and clinical outcome was examined. Reperfusion edema, characterized by patchy bilateral perihilar alveolar opacities, occurred in all but one patient. There is a lower lung zone predominance of opacities, but in individual cases, striking unilateral or haphazard arrangements of opacities may be seen. In this small sample of patients, no association between preoperative pulmonary arterial pressures and radiographic appearance or clinical outcome was found. However, severity of radiographic opacities, as measured by the extent of involved lung, correlated with disease severity, as measured by time to extubation and time to discharge. Pneumonia, defined as a radiographic opacity that evolves discordantly with the reperfusion edema opacities, occurred in 20% of cases. Reperfusion edema is a common consequence of pulmonary thromboendarterectomy. The severity of radiographic manifestations and clinical severity of disease are related. This characteristically appears as perihilar alveolar opacities.

View details for Web of Science ID 000074500700003

View details for PubMedID 9671419
Extracranial atherosclerotic carotid artery disease: Evaluation of non-breath-hold three-dimensional gadolinium-enhanced MR angiography AMERICAN JOURNAL OF ROENTGENOLOGY Slosman, F., Stolpen, A. H., Lexa, F. J., Schnall, M. D., Langlotz, C. P., Carpenter, J. P., Goldberg, H. I. 1998; 170 (2): 489-495

Abstract

The purpose of this study was to compare the diagnostic information provided by a combination of two-dimensional and three-dimensional (3D) time-of-flight (TOF) techniques with that provided by non-breath-hold 3D spoiled gradient-echo gadolinium-enhanced MR angiography. Fifty patients suspected of having extracranial atherosclerotic carotid artery disease were examined with all three imaging techniques using a 1.5-T MR imaging system. Three observers independently and retrospectively measured the degree of stenosis according to the North American Symptomatic Carotid Endarterectomy trial criteria. The observers were unaware of the results of other MR imaging pulse sequences and digital subtraction angiography. The standard of reference was established by digital subtraction angiography. Results were evaluated with receiver operating characteristic curve analysis. The degree of interobserver agreement was determined using pairwise kappa statistics. The grading of carotid artery stenosis as measured by the area under the receiver operating characteristic curve was less accurate with non-breath-hold 3D gadolinium-enhanced MR angiography than with TOF imaging. Interobserver variability was greater for non-breath-hold 3D gadolinium-enhanced MR angiography than for TOF techniques. Routine evaluation of carotid artery stenosis at the level of the bifurcation using non-breath-hold 3D gadolinium-enhanced MR angiography is less accurate than is TOF imaging and is therefore not recommended. The weakness of this technique may be due to problems in timing the injection of gadolinium and the masking of the carotid bifurcation by the venous jugular system.

View details for Web of Science ID 000071619500049

View details for PubMedID 9456971
Incremental cost of department-wide implementation of a picture archiving and communication system and computed radiography RADIOLOGY Pratt, H. M., Langlotz, C. P., Feingold, E. R., Schwartz, J. S., Kundel, H. L. 1998; 206 (1): 245-252

Abstract

To determine the incremental cash flows associated with department-wide implementation of a picture archiving and communication system (PACS) and computed radiography (CR) at a large academic medical center. The authors determined all capital and operational costs associated with PACS implementation during an 8-year time horizon. Economic effects were identified, adjusted for time value, and used to calculate net present values (NPVs) for each section of the department of radiology and for the department as a whole. The chest-bone section used the most resources. Changes in cost assumptions for the chest-bone section had a dominant effect on the department-wide NPV. The base-case NPV (i.e., that determined by using the initial assumptions) was negative, indicating that additional net costs are incurred by the radiology department from PACS implementation. PACS and CR provide cost savings only when a 12-year hardware life span is assumed, when CR equipment is removed from the analysis, or when digitized long-term archives are compressed at a rate of 10:1. Full PACS-CR implementation would not provide cost savings for a large, subspecialized department. However, institutions that are committed to CR implementation (for whom CR implementation would represent a sunk cost) or institutions that are able to archive images by using image compression will experience cost savings from PACS.

View details for Web of Science ID 000071093500042

View details for PubMedID 9423679
Factors influencing the adoption of digital imaging systems PACS Design and Evaluation Conference Langlotz, C. P., Pratt, H. M., Feingold, E. R., Horii, S. C., Kundel, H. L. SPIE - INT SOC OPTICAL ENGINEERING. 1998: 421–428

View details for Web of Science ID 000075701600045
Assessing the impact of a radiology information management system in the emergency department PACS Design and Evaluation Conference Redfern, R., Langlotz, C. P., Lowe, R. A., Horii, S. C., Abbuhl, S. B., Kundel, H. L. SPIE - INT SOC OPTICAL ENGINEERING. 1998: 414–420

View details for Web of Science ID 000075701600044
Prototype controls for a plain radiography workstation PACS Design and Evaluation Conference Horii, S. C., Grevera, G., Feingold, E., Kundel, H., Mezrich, R., Nodine, C., Langlotz, C. P., Redfern, R., Muck, J., Phelan, M., Scoleri, S. SPIE - INT SOC OPTICAL ENGINEERING. 1998: 87–91

View details for Web of Science ID 000075701600009
Clinical and economic impact of incidental thyroid lesions found with CT and MR AMERICAN JOURNAL OF NEURORADIOLOGY Yousem, D. M., Huang, T., Loevner, L. A., Langlotz, C. P. 1997; 18 (8): 1423-1428

Abstract

To estimate the prevalence and the clinical and economic consequences of management strategies for thyroid lesions detected incidentally on cross-sectional imaging of the head and neck. Two hundred consecutive CT scans and 200 consecutive MR images of the neck performed over a 1-year period in patients being examined for other purposes were reviewed retrospectively to determine the prevalence of unexpected thyroid lesions. After excluding patients with prior thyroidectomies, known thyroid disease, and inadequate examinations, 231 imaging studies were analyzed. Incidental thyroid lesions were originally reported in 14 (6%) of the 231 patients, but an additional 22 (9.5%) were found on retrospective review for a total of nearly 16% (36 of 231). Six of the 36 patients received further workup, consisting of nuclear medicine scintigraphy (n = 3), sonography (n = 3), thyroid function tests (n = 5), fine-needle aspiration (n = 4), and thyroid lobectomy (n = 1). Final diagnoses, obtained in four of the six patients, included three multinodular goiters and one follicular adenoma. Two patients, one with nondiagnostic findings at fine-needle aspiration and a second with normal thyroid function test results, are being followed up. The mean cost of the workup and treatment per examined patient was $1158. Incidental thyroid lesions are frequently present and often overlooked on cross-sectional images of the neck in patients being examined for other reasons. The cost of pursuing a workup of these lesions and their high prevalence in the population raise questions regarding appropriate management strategies.

View details for Web of Science ID A1997XV50000005

View details for PubMedID 9296181
Prostate imaging may not be necessary in nonpalpable carcinoma of the prostate 1995 Annual Meeting of the Radiological-Society-of-North-America Werner-Wasik, M., Whittington, R., Malkowicz, S. B., Corn, B. W., Arger, P., Reisinger, S., Langlotz, C., Alexander, A., D'Amico, A. V., Hyslop, T., Gomella, L., Brownstein, K., Wein, A. J. ELSEVIER SCIENCE INC. 1997: 385–89

Abstract

Stage T1c carcinoma of the prostate is defined as a nonpalpable carcinoma (NPC-P) that is not visible by imaging and is identified by needle biopsy performed because of elevated prostate-specific antigen (PSA) concentrations. The purpose of this study was to define the incidence of normal findings on transrectal ultrasound (TRUS) and/or endorectal coil magnetic resonance imaging (EMRI) among patients with NPC-P, as well as to investigate the value of differentiating patients with Stage T1c disease from all other patients with NPC-P. The records of 2211 patients diagnosed with prostate carcinoma between 1988 and 1995 were reviewed to identify 291 men with NPC-P. TRUS and EMRI reports were analyzed with regard to the presence and laterality of hypoechoic nodules or low-signal areas reported on T2-weighted images, respectively. Ninety percent of patients (n = 262) had at least six prostate biopsies, 185 patients (64%) underwent both TRUS and EMRI, 224 (77%) had TRUS, and 251 (86%) had an EMRI study. Results were considered normal in 101 (47%) of 214 patients undergoing TRUS, in 58 (23%) of 249 undergoing EMRI, and in 22 (12%) of 185 undergoing both TRUS and EMRI. For the side of the prostate with positive biopsy results, correlation with imaging abnormalities was better for EMRI than for TRUS (39% versus 24%). There was no significant difference in mean PSA value, distribution of Gleason score, or unilateral versus bilateral positive biopsy results among patients with normal versus abnormal findings on both TRUS and EMRI. (1) Only 12% of men with NPC-P have no TRUS or EMRI abnormalities, fulfilling the criteria for Stage T1c prostate carcinoma. (2) Those patients with Stage T1c disease do not differ from patients with NPC-P up-staged by TRUS or EMRI, with regard to pretreatment PSA levels, Gleason scores, and the probability of having bilateral rather than unilateral positive biopsy results. (3) The value of classifying patients with NPC-P into Stage T1c versus higher stages of prostate carcinoma on the basis of imaging should be questioned.

View details for Web of Science ID A1997XW20100012

View details for PubMedID 9301702
Diagnostic performance characteristics of architectural features revealed by high spatial-resolution MR imaging of the breast AMERICAN JOURNAL OF ROENTGENOLOGY Nunes, L. W., Schnall, M. D., Siegelman, E. S., Langlotz, C. P., Orel, S. G., Sullivan, D., Muenz, L. A., Reynolds, C. A., Torosian, M. H. 1997; 169 (2): 409-415

Abstract

Our objective was twofold: to determine which architectural features revealed by high spatial-resolution MR imaging of the breast contribute to diagnostic accuracy and to evaluate the diagnostic performance characteristics of those architectural features. Eligible patients with suspicious mammographic or palpable findings or both underwent MR imaging. Ninety-three patients whose MR images revealed lesions that corresponded to the mammographically visible or palpable findings were included in the study. Patients were examined with sagittal T1-weighted spin-echo MR imaging, fat-saturated T2-weighted fast spin-echo MR imaging, and dynamically enhanced fat-saturated fast gradient-echo MR imaging. All patients underwent subsequent excisional biopsy or cyst aspiration. Lesions were identified initially by an experienced radiologist who was aware of the patient's clinical or mammographic information. Two radiologists who were unaware of the patients' histories and who had less experience in MR imaging of the breast then independently evaluated each lesion for the architectural features and predicted each lesion's potential for malignancy. Architectural features that were highly predictive of benign disease included smooth or lobulated borders (97-100%), the absence of mass enhancement (100%), and enhancement that was less than the enhancement of surrounding breast fibroglandular tissue (93-100%). Nonenhancing internal septations were specific for the diagnosis of fibroadenoma. Architectural features that were highly predictive of malignant disease included spiculated borders (76-88%) and peripheral rim enhancement in the presence of central lesion enhancement (79-92%). Architectural features revealed by high spatial-resolution MR imaging of the breast can help distinguish benign from malignant disease.

View details for Web of Science ID A1997XM75800019

View details for PubMedID 9242744
Endo-rectal coil magnetic resonance imaging in clinically localized prostate cancer: Is it accurate? JOURNAL OF UROLOGY Tempany, C. M., Langlotz, C. P. 1997; 157 (4): 1371-1372

View details for Web of Science ID A1997WN12900068

View details for PubMedID 9120955
Breast MR imaging: Interpretation model 1995 Annual Meeting of the Radiological-Society-of-North-America Nunes, L. W., Schnall, M. D., Orel, S. G., Hochman, M. G., Langlotz, C. P., Reynolds, C. A., Torosian, M. H. RADIOLOGICAL SOC NORTH AMERICA. 1997: 833–41

Abstract

To develop an interpretation model based on architectural features of suspicious breast findings on magnetic resonance (MR) images. One hundred ninety-two patients with mammographically visible or palpable findings underwent T1- and fat-saturated T2-weighted spin-echo and contrast agent-enhanced fat-saturated gradient-echo MR imaging. Patients underwent subsequent excisional biopsy for histopathologic confirmation. An interpretation model was constructed by using 98 cases and was tested prospectively and expanded by using 94 different cases. Sensitivity, specificity, predictive values, and receiver operating characteristic curves were computed for all models. Individual features with high predictive values were MR visibility, enhancement degree and pattern, focal mass border characteristics, and focal mass internal septations. Feature combinations with high negative predictive values for malignancy were absence of an MR-visible abnormality, focal masses with smooth borders, lobulated or irregular masses with nonenhancing internal septations, and focal masses with no (or minimal) enhancement. The validated- and revised-model performance characteristics were, respectively, as follows: sensitivity, 100% and 96%; specificity, 69% and 79%; positive predictive value, 75% and 76%; negative predictive value, 100% and 97%; and overall accuracy, 83% and 86%. An interpretation model that incorporates breast MR architectural features can achieve high sensitivity and improve specificity for diagnosing breast cancer.

View details for Web of Science ID A1997WJ45600044

View details for PubMedID 9051042
Barium enema and colonoscopy: Appropriateness of utilization in a Medicaid population ABDOMINAL IMAGING Levine, M. S., Sor, S., Yin, D., Langlotz, C. P., Bachwich, D. 1997; 22 (1): 41-44

Abstract

To assess the appropriateness of utilization patterns for the barium enema and colonoscopy in a Medicaid population. From 1987 to 1991, a Medicaid managed-care database in Philadelphia revealed claims for a total of 2357 outpatient barium enemas and 896 outpatient colonoscopic examinations. The database was reviewed to determine the primary diagnostic (ICD-9-CM) codes assigned to patients who underwent these procedures. These codes were used as a proxy for indications. Each of the diagnostic codes for barium enema and colonoscopy was then classified as appropriate, inappropriate, equivocal, or miscoded based on current guidelines in the medical literature. A total of 1962 claims (83%) for barium enema were classified as appropriate, 126 (5%) as inappropriate, 84 (4%) as equivocal, and 185 (8%) as miscoded, whereas 645 claims (72%) for colonoscopy were classified as appropriate, 176 (20%) as inappropriate, 65 (7%) as equivocal, and 10 (1%) as miscoded. Thus, significantly more colonoscopic examinations were rated as inappropriate (p < 0.001). Our study suggests that more stringent criteria need to be used by physicians in ordering diagnostic examinations of the colon, particularly colonoscopy. Further investigation of the appropriateness of these procedures and the development and dissemination of guidelines seem warranted.

View details for Web of Science ID A1997VZ65600009

View details for PubMedID 9000352
PACS workstation usage and patient outcome surrogates. Conference on PACS Design and Evaluation - Engineering and Clinical Issues, at the Medical Imaging 1997 Meeting Redfern, R. O., Kundel, H. L., Seshadri, S. B., Langlotz, C., Horii, S. C., Nodine, C., Lanken, P. N., Polansky, M., Brikman, I., Bozzo, M. SPIE-INT SOC OPTICAL ENGINEERING. 1997: 424–430

View details for Web of Science ID A1997BH98D00051
Incremental cost of a department-wide PACS/CR implementation Conference on PACS Design and Evaluation - Engineering and Clinical Issues, at the Medical Imaging 1997 Meeting Pratt, H. M., Langlotz, C. P., Feingold, E. R., Schwartz, J. S., Kundel, H. L. SPIE-INT SOC OPTICAL ENGINEERING. 1997: 413–423

View details for Web of Science ID A1997BH98D00050
What do we need to advance PACS workstations: A critical review with suggestions Conference on PACS Design and Evaluation - Engineering and Clinical Issues, at the Medical Imaging 1997 Meeting Horii, S. C., Kundel, H. L., Feingold, E., Grevera, G., Nodine, C. F., Langlotz, C. P., Mezrich, R., Redfern, R., Muck, J. SPIE-INT SOC OPTICAL ENGINEERING. 1997: 6–14

View details for Web of Science ID A1997BH98D00002
Benefits and costs of MR imaging of prostate cancer. Magnetic resonance imaging clinics of North America Langlotz, C. P. 1996; 4 (3): 533-544

Abstract

This article answers several important questions about the ultimate clinical usefulness of prostate MR imaging. How accurate is prostate MR imaging? What are the optimal methods for performance and interpretation of the study, considering the tradeoffs between false-positive and false-negative results? Is endorectal-coil imaging a cost-effective part of the prostate examination? And, which men are likely to benefit the most from an endorectal prostate examination?

View details for PubMedID 8873018
Clinical assessment of MR of the brain in nonsurgical inpatients AMERICAN JOURNAL OF NEURORADIOLOGY Hirsch, J. A., Langlotz, C. P., Lee, J., Tanio, C. P., Grossman, R. I., Schulman, K. A. 1996; 17 (7): 1245-1253

Abstract

To evaluate the effect of MR imaging of the brain on four domains of patient care: diagnosis, diagnostic workup, therapy, and prognosis. Pre- and post-MR written questionnaires and oral interviews were administered to the referring clinicians of 103 medical and neurologic inpatients at a tertiary care institution. Additional information was obtained from radiologic reports and records. The study population had a diverse array of signs and symptoms and of presumptive clinical diagnoses, reflecting the breadth of disease seen at our institution. The vast majority of physicians (89%) reported that MR imaging added significant diagnostic information, playing an important role in guiding diagnostic workup (24%), planning treatment (34%), and estimating prognosis (47%). MR imaging was significantly more likely to decrease than to increase confidence in the presumptive clinical diagnosis. Thus, MR imaging may be most useful in the setting of diagnostic uncertainty. Our results show that MR imaging of the brain has important effects on each of the four domains of care for medical inpatients.

View details for Web of Science ID A1996VC32900008

View details for PubMedID 8871707
Technology assessment methods for radiology systems RADIOLOGIC CLINICS OF NORTH AMERICA Langlotz, C. P., Seshadri, S. 1996; 34 (3): 667-?

Abstract

This article discusses the strengths and weaknesses of technology assessment methods for the evaluation of novel and complex radiology systems, including picture archiving and communication systems (PACS), computed radiography (CR), teleradiology, and other new models for the delivery of radiology services. Using examples from PACS and CR, we review early economic assessments of PACS from the radiology department. We then broaden our perspective to discuss the analytic criteria that can be used to evaluate economic analyses of PACS as the health care delivery system shifts toward managed care. We close with a proposal for optimizing the integration of information technology into the clinical environment through ongoing target data collection during the implementation of new radiology systems.

View details for Web of Science ID A1996UN66400013

View details for PubMedID 8657877
Cost-effectiveness of endorectal magnetic resonance imaging for the staging of prostate cancer International Symposium on Costs and Benefits of Radiology Langlotz, C. P., Schnall, M. D., Malkowicz, S. B., Schwartz, J. S. ELSEVIER SCIENCE INC. 1996: S24–S27

View details for Web of Science ID A1996UE51900011

View details for PubMedID 8796502
Prospective study of PACS: Information flow and clinical action in a medical intensive care unit RADIOLOGY Kundel, H. L., Seshadri, S. B., Langlotz, C. P., Lanken, P. N., Horii, S. C., Nodine, C. F., Polansky, M., Feingold, E., Brikman, I., Bozzo, M., Redfern, R. 1996; 199 (1): 143-149

Abstract

To prospectively compare efficiency and outcome of a standard film-only system with those of a digital picture archiving and communication system (PACS). The film-only system, which used either analog film or computed radiography (CR) hard copy, was compared with a PACS, which used CR images displayed on a multiviewer in the radiology department and a workstation in the medical intensive care unit. A random sample of nonroutine, bedside chest radiographs was studied. Within 20 minutes of completion of radiography, 246 of 328 (75%) of the images were available at the workstations; it took 1.8 hours for 238 of 317 (75%) of the images to be displayed on the multiviewer. When the workstation was used, the staff did not access the image information earlier, but clinical actions were initiated more promptly in response to imaging findings. Consultation with radiologists decreased from 507 of 561 (90%) images with hard copies to 70 of 249 (28%) with the workstation. Use of a PACS improves the delivery of chest images, facilitates the initiation of clinical actions, and decreases input by radiologists.

View details for Web of Science ID A1996UB58400025

View details for PubMedID 8633138
An image workstation in a medical intensive care unit changes viewing patterns and timing of image based clinical actions in routine portable chest radiographs. 1996 Medical Imaging Symposium on PACS Design and Evaluation - Engineering and Clinical Issues Redfern, R., Kundel, H. L., Polansky, M., Langlotz, C., Lanken, P. N., Brikman, I., Horii, S., Bozzo, M., Feingold, E., Nodine, C. F. SPIE - INT SOC OPTICAL ENGINEERING. 1996: 298–306

View details for Web of Science ID A1996BF84P00036
Workflow in a neuroradiology reading room using multiviewers 1996 Medical Imaging Symposium on PACS Design and Evaluation - Engineering and Clinical Issues Kundel, H. L., Redfern, R., Langlotz, C., Grossman, R., Brikman, I., Horii, S. C., Feingold, E., Nodine, C. F. SPIE - INT SOC OPTICAL ENGINEERING. 1996: 232–235

View details for Web of Science ID A1996BF84P00027
PACS workstation functions: Usage differences between radiologists and MICU physicians 1996 Medical Imaging Symposium on PACS Design and Evaluation - Engineering and Clinical Issues Horii, S., Feingold, E., Kundel, H., Nodine, C., Langlotz, C., Redfern, R., Grevera, G., Brikman, I., Muck, J. SPIE - INT SOC OPTICAL ENGINEERING. 1996: 266–271

View details for Web of Science ID A1996BF84P00031
The effect of PACS/CR on cost of care and length of stay in a medical intensive care unit 1996 Medical Imaging Symposium on PACS Design and Evaluation - Engineering and Clinical Issues Langlotz, C. P., Kundel, H. L., Brikman, I., Pratt, H. M., Redfern, R. R., Horii, S. C., Schwartz, J. S. SPIE - INT SOC OPTICAL ENGINEERING. 1996: 272–280

View details for Web of Science ID A1996BF84P00032
EVALUATING HEALTH-SERVICES - THE IMPORTANCE OF PATIENTS PREFERENCES AND QUALITY-OF-LIFE AMERICAN JOURNAL OF ROENTGENOLOGY Yin, D. P., Forman, H. P., Langlotz, C. P. 1995; 165 (6): 1323-1328

Abstract

With limited resources available, we all would like to allocate health care dollars to do the most good. Clinical research tells us what outcomes to expect (in many cases) from the introduction of a health care program, a test, or a therapy. Even primitive cost analysis can assess the cost of such programs. The ability to place a value on health states is vital when assessing how patient outcomes influence the relative cost-effectiveness of medical procedures, therapies, and programs. Without a means to measure the value of a particular health state, one is left to compare apples with oranges and oranges with vacuum cleaners. In fact, comparison of fruit and home appliances is relatively easy, because one can readily apply monetary values to apples, oranges, and vacuum cleaners and compare dollar amounts. How can one do the same for the outcomes of medical procedures and diagnostic tests? This is the challenge for health services and outcomes researchers throughout the world and, more urgently, the focus of policy makers, governments, and health insurers. The purpose of this paper is to describe quality-adjusted life-years (QALYs), a method that has successfully measured the outcomes of disparate health programs. We will introduce the QALY method, summarize the various methods of measuring and classifying health states, describe three methods that have been used to measure patients' preferences (utilities) for health states, and discuss the limitations of utility assessment and some controversies that result from the measurement and use of utilities and concerning health-related quality of life. Readers who are interested in general topics of radiology technology assessment and cost-effectiveness analysis should consult other review articles [1-4].

View details for Web of Science ID A1995TF76800001

View details for PubMedID 7484556
COLON-CANCER - MORPHOLOGY DETECTED WITH BARIUM ENEMA EXAMINATION VERSUS HISTOPATHOLOGIC STAGE RADIOLOGY McCarthy, P. A., Rubesin, S. E., Levine, M. S., Langlotz, C. P., Laufer, I., Furth, E. E., Herlinger, H. 1995; 197 (3): 683-687

Abstract

To determine the relationship between the morphology of colon carcinomas detected with barium enema examination and the cancer stage. Clinical, radiographic, endoscopic, surgical, and histopathologic findings were retrospectively reviewed in 152 patients with colon cancer detected with barium enema examination during a 2-year period. Eighty-six patients (57%) had lesions in the rectum and sigmoid and descending colon, and 66 (43%) patients had lesions more proximally in the colon. Lesions on the right side of the colon were less likely to cause symptoms than those on the left side. Eighty-one patients (53%) had annular or semiannular lesions, 57 (38%) had polypoid lesions, and 14 (9%) had plaquelike or carpet lesions. Six patients (4%) had Dukes stage A lesions; 84 (55%), Dukes stage B lesions; 42 (28%), Dukes stage C lesions; and 20 (13%), Dukes stage D lesions. Annular or semiannular carcinomas had higher rates of serosal invasion and lymph node metastasis than polypoid carcinomas, but the rates of liver metastases were comparable.

View details for Web of Science ID A1995TG33300023

View details for PubMedID 7480739
A METHODOLOGY FOR THE ECONOMIC-ASSESSMENT OF PICTURE ARCHIVING AND COMMUNICATION-SYSTEMS JOURNAL OF DIGITAL IMAGING Langlotz, C. P., EVENSHOSHAN, O., SESHADRI, S. S., Brikman, I., Kishore, S., Kundel, H. L., Schwartz, J. S. 1995; 8 (2): 95-102

Abstract

Most economic studies of picture archiving and communication systems (PACS) to date, including our own, have focused on the perspectives of the radiology department and its direct costs. However, many researchers have suggested additional cost savings that may accrue to the medical center as a whole through increased operational capacity, fewer lost images, rapid simultaneous access to images, and other decreases in resource utilization. We describe here an economic analysis framework we have developed to estimate these potential additional savings. Our framework is comprised of two parallel measurement methods. The first method estimates the cost of care actually delivered through online capture of charge entries from the hospital's billing computer and from the clinical practices' billing database. Multiple regression analyses will be used to model cost of care, length of stay, and other estimates of resource utilization. The second method is the observational measurement of actual resource utilization, such as technologist time, frequency and duration of film searches, and equipment utilization rates. The costs associated with changes in resource use will be estimated using wage rates and other standard economic methods. Our working hypothesis is that after controlling for the underlying clinical and demographic differences among patients, patients imaged using a PACS will have shorter lengths of stay, shorter exam performance times, and decreased costs of care. We expect the results of our analysis to explain and resolve some of the conflicting views of the cost-effectiveness of PACS.

View details for Web of Science ID A1995QY34300006

View details for PubMedID 7612707
STAGING OF PROSTATIC-CANCER - ACCURACY OF MR-IMAGING RADIOLOGY Langlotz, C., Schnall, M., Pollack, H. 1995; 194 (3): 645-646

View details for Web of Science ID A1995QG91800004

View details for PubMedID 7862957
COST-EFFECTIVENESS OF MR-ANGIOGRAPHY IN CASES OF LIMB-THREATENING PERIPHERAL VASCULAR-DISEASE RADIOLOGY Yin, D. P., Baum, R. A., Carpenter, J. P., Langlotz, C. P., Pentecost, M. J. 1995; 194 (3): 757-764

Abstract

To evaluate the cost-effectiveness of magnetic resonance (MR) angiography in the preoperative planning of treatment in patients with limb-threatening peripheral vascular disease (PVD). A decision model was developed to study the effects of MR angiography on the outcome and cost of treatment. The authors calculated the incremental cost per quality-adjusted life-year gained (ie, cost-effectiveness ratio) when conventional angiography was replaced or supplemented with MR angiography. Previously reported data regarding the accuracies of MR and conventional angiography were used in the analysis. The cost-effectiveness ratio of MR angiography ranged from negative (cost-reducing) values to $78,000. For the base case in which the sensitivity and specificity of MR angiography for the evaluation of inflow vessels were 92% and 88% and those of conventional angiography were 97% and 97%, respectively, the cost-effectiveness ratio was $25,895. MR angiography may be a cost-effective alternative to conventional angiography in patients with limb-threatening PVD if its accuracy for the inflow evaluation reaches certain thresholds. Further prospective investigation is warranted.

View details for Web of Science ID A1995QG91800023

View details for PubMedID 7862975
CD4 T-LYMPHOCYTE COUNT AND THE RADIOGRAPHIC PRESENTATION OF PULMONARY TUBERCULOSIS - A STUDY OF THE RELATIONSHIP BETWEEN THESE FACTORS IN PATIENTS WITH HUMAN-IMMUNODEFICIENCY-VIRUS INFECTION CHEST Keiper, M. D., Beumont, M., Elshami, A., Langlotz, C. P., Miller, W. T. 1995; 107 (1): 74-80

Abstract

Pulmonary infection and tumor in the AIDS population have a variable clinical and radiographic presentation. The association between the radiographic presentation of pulmonary tuberculosis and CD4 T lymphocyte count in the HIV-infected patient is investigated in order to provide an empirical approach for early diagnosis, treatment, and isolation of infected subjects. A retrospective analysis of chest radiographs, CD4 T lymphocyte counts, and clinical history of 35 subjects from 3 urban hospitals was performed. All subjects were HIV-seropositive and had culture-proven pulmonary tuberculosis. Radiographs were evaluated for the presence of either a pattern characteristic of post-primary tuberculosis (typical pattern) or a pattern uncharacteristic of post-primary infection (atypical pattern). Twenty-one of 26 subjects with a CD4 T lymphocyte count less than 0.20 x 10^9 cells/L presented with an atypical pattern of pulmonary tuberculosis, whereas only 1 of 9 subjects with a CD4 T lymphocyte count of 0.20 x 10^9 cells/L or more did so (p < 0.001). The mean CD4 T lymphocyte counts of those subjects presenting with atypical versus typical radiographic pattern of post-primary pulmonary tuberculosis were 0.069 x 10^9 cells/L (n = 22) and 0.323 x 10^9 cells/L (n = 13), respectively (p < 0.01). Twenty-one of the 22 subjects with an atypical radiographic pattern of pulmonary tuberculosis were significantly immunosuppressed (CD4 < 0.20 x 10^9 cells/L). Atypical radiographic pattern included diffuse and lower lobar opacities, pleural effusion, mediastinal adenopathy, interstitial nodules, and a normal chest radiograph. AIDS patients presenting with CD4 count less than 0.20 x 10^9 cells/L and an atypical radiographic pattern for pulmonary tuberculosis are at risk for tuberculous infection requiring appropriate treatment and isolation until the diagnosis of pulmonary tuberculosis has been excluded.

View details for Web of Science ID A1995QC15800018

View details for PubMedID 7813316
PROSPECTIVE COMPARISON OF THE USAGE OF CONVENTIONAL FILM AND PACS BASED COMPUTED RADIOGRAPHY FOR PORTABLE CHEST X-RAY IMAGING IN A MEDICAL INTENSIVE CARE UNIT Conference on PACS Design and Evaluation - Engineering and Clinical Issues Kundel, H. L., SESHADRI, S. S., Langlotz, C. P., Lanken, P. N., Horii, S., Polansky, M., Kishore, S., FINEGOLD, E., Brikman, I., Bozzo, M., Redfern, R. SPIE-INT SOC OPTICAL ENGINEERING. 1995: 302–309

View details for Web of Science ID A1995BD32H00036
INTENSIVE CARE UNIT WORKSTATION USAGE - DIGITIZED FILM VERSUS PHOSPHOR PLATE IMAGING Conference on PACS Design and Evaluation - Engineering and Clinical Issues Horii, S., Kishore, S., Feingold, E., Stevens, J. F., Seshadri, S., Langlotz, C., Kundel, H., Bozzo, M., Redfern, R., Brikman, I. SPIE-INT SOC OPTICAL ENGINEERING. 1995: 286–293

View details for Web of Science ID A1995BD32H00034
THE INCREMENTAL COST OF PACS IN A MEDICAL INTENSIVE CARE UNIT Conference on PACS Design and Evaluation - Engineering and Clinical Issues Langlotz, C. P., Cleff, B., EVENSHOSHAN, O., Bozzo, M., Redfern, R., SESHADRI, S. S., Horii, S., Kundel, H. L. SPIE-INT SOC OPTICAL ENGINEERING. 1995: 294–301

View details for Web of Science ID A1995BD32H00035
CATEGORIZATION OF ACROMIAL SHAPE - INTEROBSERVER VARIABILITY WITH MR-IMAGING AND CONVENTIONAL RADIOGRAPHY AMERICAN JOURNAL OF ROENTGENOLOGY Haygood, T. M., Langlotz, C. P., Kneeland, J. B., Iannotti, J. P., Williams, G. R., Dalinka, M. K. 1994; 162 (6): 1377-1382

Abstract

Our purpose was to determine interobserver variability in the interpretation of the shape of the acromion on sagittal oblique MR images and conventional radiographs. The shape of the acromion was defined according to a previously described classification scheme. We reviewed 26 sets of sagittal oblique MR images and corresponding conventional Y- or outlet-view radiographs of the shoulder. The shape of the acromion was graded for each study independently by four reviewers. Interobserver agreement was measured by using the kappa statistic. Analysis of variance and the chi-square test were used for univariate analysis. The acromion was interpreted most often as being curved. The observers scored 9% of MR images and 28% of conventional radiographs as nondiagnostic (p < .001) (41% of transscapular Y views and 3% of supraspinatus outlet views were also considered nondiagnostic [p < .0001]). Kappa values were .23 for MR images and .43 for conventional radiographs. Variability in interpretation between techniques when controlled for observer was not statistically significant. Although sagittal oblique MR images were significantly more likely than conventional radiographs to be considered diagnostic by observers, interobserver agreement for MR examinations was poor. There was moderate agreement with conventional radiographs. This calls into question the usefulness of the previous system of interpretation and suggests that it might be more applicable with conventional radiographs than with MR images.

View details for Web of Science ID A1994NN22800024

View details for PubMedID 8192003
EVALUATION OF PACS IN A MEDICAL INTENSIVE-CARE UNIT - THE EFFECT OF COMPUTED RADIOGRAPHY Conference on PACS: Design and Evaluation Kundel, H. L., SESHADRI, S. S., Shile, P. E., Polansky, M., Langlotz, C., Lanken, P. N., Horii, S. C., Grossman, R. I., Purcell, J. A., Kishore, S., Brikman, I., BOZZO, M. T., Redfern, R. SPIE-INT SOC OPTICAL ENGINEERING. 1994: 481–487

View details for Web of Science ID A1994BB27K00052
A METHODOLOGY FOR THE ECONOMIC-ASSESSMENT OF PACS Conference on PACS: Design and Evaluation Langlotz, C. P., EVENSHOSHAN, O., SESHADRI, S. S., Brikman, I., Kishore, S., Kundel, H. L., Schwartz, J. S. SPIE-INT SOC OPTICAL ENGINEERING. 1994: 584–592

View details for Web of Science ID A1994BB27K00062
THE FEASIBILITY OF AXIOMATICALLY-BASED EXPERT SYSTEMS COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE Langlotz, C. P. 1989; 30 (2-3): 85-95

Abstract

We distinguish axiomatically-based expert systems, whose design and implementation are guided by one or more axiomatically-based theories of decision-making (e.g., decision theory, Bayesian probability theory, maximum entropy theory), from traditional expert systems. An analysis of the knowledge acquisition and computational needs of axiomatically-based expert systems is presented. An explicit quantitative comparison is made between the actual knowledge acquisition effort required to build an existing expert system, and the effort that would be required to build an analogous axiomatically-based advice system. The costs and benefits of the axiomatic approach are discussed. The analysis suggests that the small additional cost of knowledge acquisition for the axiomatic approach is outweighed by the long-term benefits this approach provides.

View details for Web of Science ID A1989CA90600003

View details for PubMedID 2684496
LOGICAL AND DECISION-THEORETIC METHODS FOR PLANNING UNDER UNCERTAINTY AI MAGAZINE Langlotz, C. P., Shortliffe, E. H. 1989; 10 (1): 39-47

View details for Web of Science ID A1989T870700002
A THERAPY PLANNING ARCHITECTURE THAT COMBINES DECISION-THEORY AND ARTIFICIAL-INTELLIGENCE TECHNIQUES COMPUTERS AND BIOMEDICAL RESEARCH Langlotz, C. P., Fagan, L. M., Tu, S. W., Sikic, B. I., Shortliffe, E. H. 1987; 20 (3): 279-303

Abstract

Through our experience with the ONCOCIN cancer therapy consultation system, we have identified a set of medical planning problems to which no single existing computer-based reasoning technique readily applies. In response to the need for automated assistance with this class of problems, we have devised a computer program called ONYX that combines decision-theoretic and artificial intelligence approaches to planning. We discuss our rationale for devising a new planning architecture and describe in detail how that architecture is implemented. The program's planning process consists of three steps: (i) the use of rules derived from therapy planning strategies to generate a small set of plausible plans, (ii) the use of knowledge about the structure and behavior of the human body to create simulations that predict possible consequences of each plan for the patient, and (iii) the use of decision theory to rank the plans according to how well the results of each simulation meet the treatment goals. This architecture explicitly manages the uncertainty inherent in many planning tasks, introduces a possible mechanism for the dissemination of decision-theoretic therapy advice, and potentially increases the number of problem solving domains in which expert system techniques can be effectively applied.

View details for Web of Science ID A1987H761400006

View details for PubMedID 3301187
