Lauren Yu's Profile | Stanford Profiles

Contact

Academic
laurenyu@stanford.edu

University - Student Department: Computer Science Position: Undergraduate

University - Student Department: English Position: Undergraduate

University - Student Department: Computer Science Position: Graduate

Additional Info

Mail Code: 9000

All Publications

Large Language Models for Maternal and Neonatal Healthcare in Low- and Middle-Income Countries. The Journal of Pediatrics. Yu, L., Darmstadt, G. L., Ward, V., Wong, R. J., Stevenson, D. K., Maric, I. 2026: 115037

Abstract

To investigate whether LLMs can assist with maternal and neonatal healthcare in low- and middle-income countries (LMICs). We evaluated the ability of GPT-4o to generate accurate answers across countries in 4 domains related to maternal and neonatal health: 1) prevalence of conditions when generating medical case examples; 2) prevalence of conditions in countries without reliable national prevalence data; 3) standardized medical examination questions; and 4) subjective health-related questions. We used the GPT-4o Application Programming Interface (API) except for domain 2, for which we used ChatGPT, and used repeated prompts to guarantee statistical significance of answers. We utilized publicly available data from 6 WHO regions and 204 countries. We observed challenges for LLMs to provide accurate answers on a global scale. Medical cases generated by GPT-4o did not reflect true prevalences of outcomes, overrepresenting the Americas. GPT-4o demonstrated explicit bias, giving lower rankings for subjective health-related topics to countries with high infant mortality rates. In 44% of cases, GPT-4o provided pregnancy-related statistics in regions where those statistics were not available, while not acknowledging the uncertainty, and nearly half (46.7%) of source citations were erroneous. GPT-4o answered the majority (79%) of pregnancy-related medical examination questions correctly but made errors when answering based on prevalent health issues in specific regions while overlooking symptoms. Identified challenges in using GPT-4o highlight important limitations in applying general-purpose LLMs to guide maternal and neonatal healthcare in LMICs. These findings can guide further studies and solutions in fine-tuning LLMs on contextualized data.

View details for DOI 10.1016/j.jpeds.2026.115037

View details for PubMedID 41692226

Lauren Yu

Masters Student in Computer Science, admitted Autumn 2023
