Clinical Focus


  • Regional Anesthesia and Acute Pain Medicine


Professional Education


  • A.B., Harvard College, Economics (2012)
  • M.D., Perelman School of Medicine, University of Pennsylvania (2019)
  • M.B.A., The Wharton School, Health Care Management (2019)
  • Internship, Stanford Health Care, Internal Medicine (2020)
  • Residency, Stanford Health Care, Anesthesiology (2023)
  • Fellowship, Stanford University, Anesthesiology (2024)
  • Board Certification, American Board of Anesthesiology, Anesthesiology (2024)

All Publications


  • Testing and Evaluation of Health Care Applications of Large Language Models: A Systematic Review. JAMA Bedi, S., Liu, Y., Orr-Ewing, L., Dash, D., Koyejo, S., Callahan, A., Fries, J. A., Wornow, M., Swaminathan, A., Lehmann, L. S., Hong, H. J., Kashyap, M., Chaurasia, A. R., Shah, N. R., Singh, K., Tazbaz, T., Milstein, A., Pfeffer, M. A., Shah, N. H. 2024

    Abstract

    Large language models (LLMs) can assist in various health care activities, but current evaluation approaches may not adequately identify the most useful application areas. This review summarizes existing evaluations of LLMs in health care in terms of 5 components: (1) evaluation data type, (2) health care task, (3) natural language processing (NLP) and natural language understanding (NLU) tasks, (4) dimension of evaluation, and (5) medical specialty. A systematic search of PubMed and Web of Science was performed for studies published between January 1, 2022, and February 19, 2024, that evaluated 1 or more LLMs in health care. Three independent reviewers categorized studies via keyword searches based on the data used, the health care tasks, the NLP and NLU tasks, the dimensions of evaluation, and the medical specialty.

    Of the 519 studies reviewed, only 5% used real patient care data for LLM evaluation. The most common health care tasks were assessing medical knowledge, such as answering medical licensing examination questions (44.5%), and making diagnoses (19.5%). Administrative tasks such as assigning billing codes (0.2%) and writing prescriptions (0.2%) were less studied. For NLP and NLU tasks, most studies focused on question answering (84.2%), while tasks such as summarization (8.9%) and conversational dialogue (3.3%) were infrequent. Almost all studies (95.4%) used accuracy as the primary dimension of evaluation; fairness, bias, and toxicity (15.8%), deployment considerations (4.6%), and calibration and uncertainty (1.2%) were infrequently measured. In terms of medical specialty, most studies addressed generic health care applications (25.6%), internal medicine (16.4%), surgery (11.4%), and ophthalmology (6.9%), with nuclear medicine (0.6%), physical medicine (0.4%), and medical genetics (0.2%) being the least represented.

    Existing evaluations of LLMs mostly focus on accuracy of question answering for medical examinations, without consideration of real patient care data. Dimensions such as fairness, bias, and toxicity and deployment considerations received limited attention. Future evaluations should adopt standardized applications and metrics, use clinical data, and broaden focus to include a wider range of tasks and specialties.

    DOI: 10.1001/jama.2024.21700

    PubMedID: 39405325

    PubMedCentralID: PMC11480901

  • Enhancing the Readability of Preoperative Patient Instructions Using Large Language Models. Anesthesiology Hong, H. J., Schmiesing, C. A., Goodell, A. J. 2024; 141 (3): 608-610

    DOI: 10.1097/ALN.0000000000005122

    PubMedID: 39136480

  • Artificial Intelligence in Perioperative Care: Opportunities and Challenges. Anesthesiology Han, L., Char, D. S., Aghaeepour, N. 2024; 141 (2): 379-387

    DOI: 10.1097/ALN.0000000000005013

    PubMedID: 38980160

  • Engaging Housestaff as Informatics Collaborators: Educational and Operational Opportunities. Applied clinical informatics Shenson, J. A., Jankovic, I., Hong, H. J., Weia, B., White, L., Chen, J. H., Eisenberg, M. 2021; 12 (5): 1150-1156

    Abstract

    BACKGROUND: In academic hospitals, housestaff (interns, residents, and fellows) are a core user group of clinical information technology (IT) systems, yet they are often relegated to being recipients of change rather than active partners in system improvement. These information systems are an integral part of health care delivery, and formal efforts to involve and educate housestaff are nascent.

    OBJECTIVE: This article describes the development of a sustainable forum for effective engagement of housestaff in hospital informatics initiatives that also creates opportunities for professional development.

    METHODS: A housestaff-led IT council was created within an academic medical center and integrated with informatics and graduate medical education leadership. The Council was designed to provide a venue for hands-on clinical informatics educational experiences to housestaff across all specialties.

    RESULTS: In the first year, five housestaff co-chairs and 50 members were recruited. More than 15 projects were completed, with substantial improvements made to clinical systems impacting more than 1,300 housestaff and with touchpoints to nearly 3,000 staff members. Council leadership was integrally involved in hospital governance committees and became the go-to source for housestaff input on informatics efforts. Positive experiences informed members' career development toward informatics roles. Key lessons learned in building for success are discussed.

    CONCLUSION: The council model has effectively engaged housestaff as learners, local champions, and key informatics collaborators, with positive impact for the participating members and the institution. Requiring few resources for implementation, the model should be replicable at other institutions.

    DOI: 10.1055/s-0041-1740258

    PubMedID: 34879406