Clinical Focus
- Internal Medicine
Academic Appointments
- Clinical Associate Professor, Medicine
Honors & Awards
- Award for Excellence in Promotion of the Learning Environment and Student Wellness, Stanford University, School of Medicine (2024)
- Department of Medicine Teaching Award, Stanford University, School of Medicine (2022)
- Lawrence H. Mathers Award for Exceptional Commitment to Teaching & Active Involvement in Medical Education, Stanford University, School of Medicine (2021)
- Henry J. Kaiser Family Foundation Award for Excellence in Clinical Teaching, Stanford University, School of Medicine (2017)
- Award for Outstanding Teaching of Medical Residents and Students, VA Palo Alto Health Care System, Department of Medicine (2015)
- Henry J. Kaiser Family Foundation Award for Outstanding & Innovative Contributions to Medical Education, Stanford University, School of Medicine (2014)
- Henry J. Kaiser Family Foundation Teaching Award for Pre-Clerkship Instruction, Stanford University, School of Medicine (2014)
Professional Education
- Medical Education: NYU Grossman School of Medicine, NY (2003)
- Board Certification: American Board of Internal Medicine, Internal Medicine (2019)
- Residency: Stanford University Internal Medicine Residency, CA (2006)
- B.S., Massachusetts Institute of Technology, Biology (1998)
Community and International Work
- Domestic Violence in New Guinea, Papua New Guinea
Topic: Domestic Violence
Partnering Organization(s): NYU
Populations Served: Rural
Location: International
Ongoing Project: No
Opportunities for Student Involvement: No
Current Research and Scholarly Interests
Physical diagnosis, ECG interpretation, generative AI, and clinical reasoning.
Stanford Advisees
- E4C Mentor
Alan Anaya Gallegos, Andrew Beel, Sahar Caravan, Imilce Castro Paz, Laura Chang, Iris Cong, Leighton Daigh, Drew Daniel, Saachi Datta, Lori Dershowitz, Noelle Gorka, Victoria Harbour, Charlotte Herber, Xiaoyi Hu, Kelly Hyles, Isaac Jackson, Bhav Jain, Anushka Jetly, Ricardo Jimenez, Joey Kang, Mehr Kashyap, Jenny Kim, Taimur Kouser, Alice Li, Justin Liu, Sharon Loa, Isha Mehrotra, Trishna Narula, Teresa Nguyen, Zane Norville, Elisa Padron, Teja Peddada, Maya Ramachandran, Torsten Rotto, Yoni Samuel Rubin, Dina Sheira, Joshua Taylor, Brandon Wesley, Sierra Willens, Ezra Yoseph, Ada Zhang, Yuan Zhang, Elizabeth Zudock
All Publications
- Large Language Model Influence on Management Reasoning: A Randomized Controlled Trial.
medRxiv: the preprint server for health sciences
2024
Abstract
Large language model (LLM) artificial intelligence (AI) systems have shown promise in diagnostic reasoning, but their utility in management reasoning, where there are no clear right answers, is unknown. The objective was to determine whether LLM assistance improves physician performance on open-ended management reasoning tasks compared to conventional resources. This prospective, randomized controlled trial was conducted from 30 November 2023 to 21 April 2024 as a multi-institutional study from Stanford University, Beth Israel Deaconess Medical Center, and the University of Virginia, involving physicians from across the United States. Participants were 92 practicing attending physicians and residents with training in internal medicine, family medicine, or emergency medicine. Five expert-developed clinical case vignettes were presented with multiple open-ended management questions and scoring rubrics created through a Delphi process. Physicians were randomized to use either GPT-4 via ChatGPT Plus in addition to conventional resources (e.g., UpToDate, Google) or conventional resources alone. The primary outcome was the difference in total score between groups on the expert-developed scoring rubrics; secondary outcomes included domain-specific scores and time spent per case. Physicians using the LLM scored higher than those using conventional resources (mean difference 6.5%, 95% CI 2.7-10.2, p<0.001). Significant improvements were seen in the management decisions (6.1%, 95% CI 2.5-9.7, p=0.001), diagnostic decisions (12.1%, 95% CI 3.1-21.0, p=0.009), and case-specific (6.2%, 95% CI 2.4-9.9, p=0.002) domains. GPT-4 users spent more time per case (mean difference 119.3 seconds, 95% CI 17.4-221.2, p=0.02). There was no significant difference between GPT-4-augmented physicians and GPT-4 alone (-0.9%, 95% CI -9.0 to 7.2, p=0.8). LLM assistance improved physician management reasoning compared to conventional resources, with particular gains in contextual and patient-specific decision-making; these findings indicate that LLMs can augment management decision-making in complex cases. ClinicalTrials.gov Identifier: NCT06208423; https://classic.clinicaltrials.gov/ct2/show/NCT06208423.
Question: Does large language model (LLM) assistance improve physician performance on complex management reasoning tasks compared to conventional resources?
Findings: In this randomized controlled trial of 92 physicians, participants using GPT-4 achieved higher scores on management reasoning compared to those using conventional resources (e.g., UpToDate).
Meaning: LLM assistance enhances physician management reasoning performance in complex cases with no clear right answers.
DOI: 10.1101/2024.08.05.24311485
PubMedID: 39148822
PubMedCentralID: PMC11326321
- Influence of a Large Language Model on Diagnostic Reasoning: A Randomized Clinical Vignette Study.
medRxiv: the preprint server for health sciences
2024
Abstract
Diagnostic errors are common and cause significant morbidity. Large language models (LLMs) have shown promise in their performance on both multiple-choice and open-ended medical reasoning examinations, but it remains unknown whether the use of such tools improves diagnostic reasoning. The objective was to assess the impact of the GPT-4 LLM on physicians' diagnostic reasoning compared to conventional resources. This multi-center, randomized clinical vignette study was conducted using remote video conferencing with physicians across the country and in-person participation across multiple academic medical institutions. Participants were resident and attending physicians with training in family medicine, internal medicine, or emergency medicine. They were randomized to access GPT-4 in addition to conventional diagnostic resources or conventional resources alone, and were allocated 60 minutes to review up to six clinical vignettes adapted from established diagnostic reasoning exams. The primary outcome was diagnostic performance based on differential diagnosis accuracy, appropriateness of supporting and opposing factors, and next diagnostic evaluation steps; secondary outcomes included time spent per case and final diagnosis. Fifty physicians (26 attendings, 24 residents) participated, with an average of 5.2 cases completed per participant. The median diagnostic reasoning score per case was 76.3 percent (IQR 65.8 to 86.8) for the GPT-4 group and 73.7 percent (IQR 63.2 to 84.2) for the conventional resources group, with an adjusted difference of 1.6 percentage points (95% CI -4.4 to 7.6; p=0.60). The median time spent on cases for the GPT-4 group was 519 seconds (IQR 371 to 668 seconds), compared to 565 seconds (IQR 456 to 788 seconds) for the conventional resources group, with a time difference of -82 seconds (95% CI -195 to 31; p=0.20). GPT-4 alone scored 15.5 percentage points (95% CI 1.5 to 29, p=0.03) higher than the conventional resources group. In this clinical vignette-based study, the availability of GPT-4 to physicians as a diagnostic aid did not significantly improve clinical reasoning compared to conventional resources, although it may improve components of clinical reasoning such as efficiency. GPT-4 alone demonstrated higher performance than both physician groups, suggesting opportunities for further improvement in physician-AI collaboration in clinical practice.
DOI: 10.1101/2024.03.12.24303785
PubMedID: 38559045
PubMedCentralID: PMC10980135
- Chatbot vs Medical Student Performance on Free-Response Clinical Reasoning Examinations.
JAMA Internal Medicine
2023
DOI: 10.1001/jamainternmed.2023.2909
PubMedID: 37459090
- YouTube as an educational resource for learning ECGs.
Journal of Electrocardiology
2014; 47 (5): 758-759
DOI: 10.1016/j.jelectrocard.2014.05.005
PubMedID: 24973138