Teresa Phuongtram Nguyen

Clinical Assistant Professor, Anesthesiology, Perioperative and Pain Medicine

Practices at Stanford Health Care

Bio

Dr. Teresa Nguyen is a physician in Anesthesiology at Stanford Medicine and affiliated faculty at the Stanford Institute for Human Centered Artificial Intelligence (HAI). She is passionate about medical innovation and is committed to advancing science education and mentorship. Her research is focused on the intersection of AI, robotics, and medicine. She is co-Principal Investigator, through Stanford HAI and in collaboration with the Department of Computer Science, on research efforts for the development and application of AI-enabled quadruped robots to improve patient outcomes. Her research in AI also focuses on the applications of large language models in healthcare and their subsequent impacts on society. She is the instructor for Chem 93: "Chemistry Unleashed: Exploring the Chemistry that Transforms Our World" at the Stanford Department of Chemistry and is a helicopter pilot.

Dr. Nguyen completed her Bachelor of Science degree in Chemistry at Stanford University, where she was awarded a U.S. Department of State Critical Language Scholarship in Arabic and the Bing Fellowship for her research in Chemistry. She then became a Scientific Researcher at Genentech, where she co-invented and patented a series of drugs for the potential treatment of chronic and neuropathic pain. She received her MD from Stanford University School of Medicine, where she was a Medical Scholars Research Fellow under the mentorship of Professor Carolyn Bertozzi (Nobel laureate in Chemistry 2022). She has published across several medical subspecialties, including head and neck surgery, rhinology, urology, and orthopedic surgery.

Dr. Nguyen is deeply passionate about diversity, equity, and inclusion initiatives. She is the founder of the Lighthouse Initiative, a nonprofit organization whose mission is to provide resources and mentorship to first-generation, low-income, and minority individuals, with a 100% success rate in aiding college admissions for its members. She is also the co-founder of Hands-On Robotics, a nonprofit organization which supports robotics initiatives and education.

Clinical Focus

Anesthesiology, Perioperative, and Pain Medicine
Anesthesia

Academic Appointments

Clinical Assistant Professor, Anesthesiology, Perioperative and Pain Medicine

Administrative Appointments

Affiliated Faculty, Stanford Institute for Human Centered Artificial Intelligence (2024 - Present)

Honors & Awards

Critical Language Scholarship - Arabic, United States Department of State
Bing Fellowship, Stanford University Department of Chemistry, Prof. Barry Trost
Medical Scholars, Stanford School of Medicine, Prof. Carolyn Bertozzi (2017)

Professional Education

Board Certification: American Board of Anesthesiology, Anesthesia (2025)
Medical Education: Stanford University School of Medicine (2020) CA
Residency: Stanford University Anesthesiology Residency (2024) CA
Internship: Kaiser Permanente at Santa Clara (2021) CA
Medical Doctorate, Stanford University School of Medicine (2020)
Bachelor of Science, Stanford University, Chemistry (2014)

Community and International Work

Co-Founder

Topic

Hands On Robotics

Ongoing Project

Yes

Opportunities for Student Involvement

Yes

Founder

Partnering Organization(s)

The Lighthouse Initiative

Ongoing Project

Yes

Opportunities for Student Involvement

Yes

Patents

Bergeron, P, Chowdhury, S, Dehnhardt, CM, Focken, T, Grimwood, ME, Hasan, A, Lai, KW, Liu, Z, McKerrall, S, Nguyen, TP, Safina, B, Sutherlin, D, Tan, WT. "United States Patent WO 2017058821 A1 Therapeutic Compounds and Methods of Use Thereof", Apr 16, 2017

Contact

Academic
Department: Anesthesia - MSD Position: Clinical Assistant Professor

Clinical (Primary) Office of Student Affairs 251 Campus Dr Rm 323 MSOB Stanford, CA 94305
(650) 725-3944 (office)
(650) 725-8544 (fax)

Additional Clinical Info

Stanford Health Care

2025-26 Courses

Chemistry Unleashed: Exploring the Chemistry that Transforms Our World
CHEM 93 (Win)
Prior Year Courses
2024-25 Courses
- Chemistry Unleashed: Exploring the Chemistry that Transforms Our World
  CHEM 93 (Spr)
2023-24 Courses
- Chemistry Unleashed: Exploring the Chemistry that Transforms Our World
  CHEM 93 (Spr)

All Publications

Comparison of artificial intelligence large language model chatbots in answering frequently asked questions in anaesthesia. BJA open Nguyen, T. P., Carvalho, B., Sukhdeo, H., Joudi, K., Guo, N., Chen, M., Wolpaw, J. T., Kiefer, J. J., Byrne, M., Jamroz, T., Mootz, A. A., Reale, S. C., Zou, J., Sultan, P. 2024; 10: 100280

Abstract

Patients are increasingly using artificial intelligence (AI) chatbots to seek answers to medical queries. Ten frequently asked questions in anaesthesia were posed to three AI chatbots: ChatGPT4 (OpenAI), Bard (Google), and Bing Chat (Microsoft). Each chatbot's answers were evaluated in a randomised, blinded order by five residency programme directors from 15 medical institutions in the USA. Three medical content quality categories (accuracy, comprehensiveness, safety) and three communication quality categories (understandability, empathy/respect, and ethics) were scored between 1 and 5 (1 representing worst, 5 representing best). ChatGPT4 and Bard outperformed Bing Chat (median [inter-quartile range] scores: 4 [3-4], 4 [3-4], and 3 [2-4], respectively; P<0.001 with all metrics combined). All AI chatbots performed poorly in accuracy (score of ≥4 by 58%, 48%, and 36% of experts for ChatGPT4, Bard, and Bing Chat, respectively), comprehensiveness (score ≥4 by 42%, 30%, and 12% of experts for ChatGPT4, Bard, and Bing Chat, respectively), and safety (score ≥4 by 50%, 40%, and 28% of experts for ChatGPT4, Bard, and Bing Chat, respectively). Notably, answers from ChatGPT4, Bard, and Bing Chat differed statistically in comprehensiveness (ChatGPT4, 3 [2-4] vs Bing Chat, 2 [2-3], P<0.001; and Bard 3 [2-4] vs Bing Chat, 2 [2-3], P=0.002). All large language model chatbots performed well with no statistical difference for understandability (P=0.24), empathy (P=0.032), and ethics (P=0.465). In answering anaesthesia patient frequently asked questions, the chatbots perform well on communication metrics but are suboptimal for medical content metrics. Overall, ChatGPT4 and Bard were comparable to each other, both outperforming Bing Chat.

View details for DOI 10.1016/j.bjao.2024.100280

View details for PubMedID 38764485

View details for PubMedCentralID PMC11099318
Innovating pediatric care with social robots to alleviate anxiety. Paediatric anaesthesia Nguyen, T., Yamaguchi, C., Tilton, L., Caruso, T. 2023

View details for DOI 10.1111/pan.14798

View details for PubMedID 37936541
Structure- and Ligand-Based Discovery of Chromane Arylsulfonamide Nav1.7 Inhibitors for the Treatment of Chronic Pain. Journal of medicinal chemistry McKerrall, S. J., Nguyen, T., Lai, K. W., Bergeron, P., Deng, L., DiPasquale, A., Chang, J. H., Chen, J., Chernov-Rogan, T., Hackos, D. H., Maher, J., Ortwine, D. F., Pang, J., Payandeh, J., Proctor, W. R., Shields, S. D., Vogt, J., Ji, P., Liu, W., Ballini, E., Schumann, L., Tarozzo, G., Bankar, G., Chowdhury, S., Hasan, A., Johnson, J. P., Khakh, K., Lin, S., Cohen, C. J., Dehnhardt, C. M., Safina, B. S., Sutherlin, D. P. 2019; 62 (8): 4091-4109

Abstract

Using structure- and ligand-based design principles, a novel series of piperidyl chromane arylsulfonamide Nav1.7 inhibitors was discovered. Early optimization focused on improvement of potency through refinement of the low energy ligand conformation and mitigation of high in vivo clearance. An in vitro hepatotoxicity hazard was identified and resolved through optimization of lipophilicity and lipophilic ligand efficiency to arrive at GNE-616 (24), a highly potent, metabolically stable, subtype selective inhibitor of Nav1.7. Compound 24 showed a robust PK/PD response in a Nav1.7-dependent mouse model, and site-directed mutagenesis was used to identify residues critical for the isoform selectivity profile of 24.

View details for DOI 10.1021/acs.jmedchem.9b00141

View details for PubMedID 30943032
An automated framework for assessing how well LLMs cite relevant medical references. Nature communications Wu, K., Wu, E., Wei, K., Zhang, A., Casasola, A., Nguyen, T., Riantawan, S., Shi, P., Ho, D., Zou, J. 2025; 16 (1): 3615

Abstract

As large language models (LLMs) are increasingly used to address health-related queries, it is crucial that they support their conclusions with credible references. While models can cite sources, the extent to which these support claims remains unclear. To address this gap, we introduce SourceCheckup, an automated agent-based pipeline that evaluates the relevance and supportiveness of sources in LLM responses. We evaluate seven popular LLMs on a dataset of 800 questions and 58,000 pairs of statements and sources on data that represent common medical queries. Our findings reveal that between 50% and 90% of LLM responses are not fully supported, and sometimes contradicted, by the sources they cite. Even for GPT-4o with Web Search, approximately 30% of individual statements are unsupported, and nearly half of its responses are not fully supported. Independent assessments by doctors further validate these results. Our research underscores significant limitations in current LLMs to produce trustworthy medical references.

View details for DOI 10.1038/s41467-025-58551-6

View details for PubMedID 40240349

View details for PubMedCentralID 10543445
The evaluation of the performance of ChatGPT in the management of labor analgesia. Journal of clinical anesthesia Ismaiel, N., Nguyen, T. P., Guo, N., Carvalho, B., Sultan, P. 2024; 98: 111582

Abstract

ChatGPT4 is a leading large language model (LLM) chatbot released by OpenAI in 2023. ChatGPT4 can respond to free-text queries, answer questions and make suggestions regarding virtually any topic. ChatGPT4 has successfully answered anesthesia and even obstetric anesthesia knowledge-based questions with reasonable accuracy. However, ChatGPT4 has yet to be challenged in obstetric anesthesia clinical decision-making. In this study, we evaluated the performance of ChatGPT4 in the management of clinical labor analgesia scenarios compared to expert obstetric anesthesiologists. Eight clinical questions with progressively increasing medical complexity were posed to ChatGPT4. The ChatGPT4 responses were rated by seven expert obstetric anesthesiologists based on safety, accuracy and completeness of each response using a five-point Likert rating scale. ChatGPT4 was deemed safe in 73% of responses to the presented obstetric anesthesia clinical scenarios (27% of responses were deemed unsafe). None of the ChatGPT4 responses were unanimously deemed to be safe by all seven expert obstetric anesthesiologists. Moreover, ChatGPT4 responses were overall partly accurate (score 4 out of 5) and somewhat incomplete (score 3.5 out of 5). In summary, approximately one quarter of all responses by ChatGPT4 were deemed unsafe by expert obstetric anesthesiologists. These findings may suggest the need for more fine-tuning and training of LLMs such as ChatGPT4 specifically for clinical decision making in obstetric anesthesia or other specialized medical fields. These LLMs may come to play an important future role in assisting obstetric anesthesiologists in clinical decision making and enhancing overall patient care.

View details for DOI 10.1016/j.jclinane.2024.111582

View details for PubMedID 39167880
In Response. Anesthesia and analgesia Mootz, A. A., Carvalho, B., Sultan, P., Nguyen, T. P., Reale, S. C. 2024; 138 (6): e37-e38

View details for DOI 10.1213/ANE.0000000000006979

View details for PubMedID 38771606
A comparative study of English and Japanese ChatGPT responses to anaesthesia-related medical questions. BJA open Ando, K., Sato, M., Wakatsuki, S., Nagai, R., Chino, K., Kai, H., Sasaki, T., Kato, R., Nguyen, T. P., Guo, N., Sultan, P. 2024; 10: 100296

Abstract

The expansion of artificial intelligence (AI) within large language models (LLMs) has the potential to streamline healthcare delivery. Despite the increased use of LLMs, disparities in their performance, particularly in different languages, remain underexplored. This study examines the quality of ChatGPT responses in English and Japanese, specifically to questions related to anaesthesiology. Anaesthesiologists proficient in both languages were recruited as experts in this study. Ten frequently asked questions in anaesthesia were selected and translated for evaluation. Three non-sequential responses from ChatGPT were assessed for content quality (accuracy, comprehensiveness, and safety) and communication quality (understanding, empathy/tone, and ethics) by expert evaluators. Eight anaesthesiologists evaluated English and Japanese LLM responses. The overall quality for all questions combined was higher in English compared with Japanese responses. Content and communication quality were significantly higher in English compared with Japanese LLM responses (both P<0.001) in all three responses. Comprehensiveness, safety, and understanding scores were higher in English LLM responses. In all three responses, more than half of the evaluators marked overall English responses as better than Japanese responses. English LLM responses to anaesthesia-related frequently asked questions were superior in quality to Japanese responses when assessed by bilingual anaesthesia experts in this report. This study highlights the potential for language-related disparities in healthcare information and the need to improve the quality of AI responses in underrepresented languages. Future studies are needed to explore these disparities in other commonly spoken languages and to compare the performance of different LLMs.

View details for DOI 10.1016/j.bjao.2024.100296

View details for PubMedID 38975242

View details for PubMedCentralID PMC11225650
The Accuracy of ChatGPT-Generated Responses in Answering Commonly Asked Patient Questions About Labor Epidurals: A Survey-Based Study. Anesthesia and analgesia Mootz, A. A., Carvalho, B., Sultan, P., Nguyen, T. P., Reale, S. C. 2024

View details for DOI 10.1213/ANE.0000000000006801

View details for PubMedID 38180897
Consumption of cruciferous vegetables and the risk of bladder cancer in a prospective US cohort: data from the NIH-AARP diet and health study AMERICAN JOURNAL OF CLINICAL AND EXPERIMENTAL UROLOGY Nguyen, T. P., Zhang, C. A., Sonn, G. A., Eisenberg, M. L., Brooks, J. D. 2021; 9 (3): 229-238

View details for Web of Science ID 000672671600004
Hemodynamic changes in patients undergoing office-based sinus procedures under local anesthesia. International forum of allergy & rhinology Chang, M. T., Jitaroon, K., Nguyen, T., Yan, C. H., Overdevest, J. B., Nayak, J. V., Hwang, P. H., Patel, Z. M. 2020; 10 (1): 114–20

Abstract

The objective of this study is to characterize changes in hemodynamics, pain, and anxiety during office-based endoscopic sinus procedures performed under local anesthesia. We conducted a prospective study of adults undergoing in-office endoscopic sinus procedures under local anesthesia. Patients with American Society of Anesthesiologists (ASA) Physical Status Classification System class 1 or 2 were included. Anesthesia was administered by topical 4% lidocaine/oxymetazoline and submucosal injection of 1% lidocaine/1:200,000 epinephrine. Vital signs and pain were measured at baseline, postinjection, and 5-minute intervals throughout the procedure. Anxiety levels were scored using the State-Trait Anxiety Inventory (STAI). Univariate and multivariate regression analyses were performed to identify factors significantly associated with changes in each hemodynamic metric. Twenty-five patients were studied. This cohort was 52% male, mean age of 57.8 ± 14.4 years, and Charlson Comorbidity Index (CCI) median of 2. Mean procedure duration was 25.0 ± 10.3 minutes. Mean maximal increase in systolic blood pressure (SBP) was 24.6 ± 17.8 mmHg from baseline. Mean maximal heart rate increase was 22.8 ± 10.8 beats per minute (bpm) from baseline. In multivariate regression analysis, when accounting for patient age, cardiac comorbidity, CCI, and ASA, older age was significantly associated with an increase of >20 mmHg in SBP (p = 0.043). Mean pain score during procedures was 1.5 ± 1.3 with a mean maximum of 4.0 ± 2.6. STAI anxiety scores did not change significantly from preprocedure to postprocedure (32.8 ± 11.6 to 31.0 ± 12.6, p = 0.46). No medical complications occurred. Although patients appear to tolerate office procedures well, providers should recognize the potential for significant fluctuations in blood pressure during the procedure, especially in older patients.

View details for DOI 10.1002/alr.22460

View details for PubMedID 31899857
Biomechanical Study of a Multifilament Stainless Steel Cable Crimp System Versus a Multistrand Ultra-High Molecular Weight Polyethylene Polyester Suture Krackow Technique for Achilles Tendon Rupture Repair. The Journal of foot and ankle surgery : official publication of the American College of Foot and Ankle Surgeons Nguyen, T. P., Keyt, L. K., Herfat, S., Gordon, L., Palanca, A. 2019; 59 (1): 86–90

Abstract

Currently, Achilles tendon rupture repair is surgically addressed with an open or minimally invasive approach using a heavy, nonabsorbable suture in a locking stitch configuration. However, these sutures have low stiffness and a propensity to stretch, which can result in gapping at the repair site. Our study compares a new multifilament stainless steel cable-crimp repair method to a standard Krackow repair using multistrand, ultra-high molecular weight polyethylene polyester sutures. Eight matched pairs of cadavers were randomly assigned for Achilles tendon repair using either Krackow technique with polyethylene polyester sutures or the multifilament stainless steel cable-crimp technique. Each repair was cyclically loaded from 10 to 50 N for 100 loading cycles, followed by a linear increase in load until complete failure of the repair. During cyclic loading, 4 of the 8 Krackow polyethylene polyester suture repairs failed, whereas none of the multifilament stainless steel cable crimp repairs failed. Load to failure was greater for the multifilament stainless steel cable crimp repairs (321.03 ± 118.71 N) than for the Krackow polyethylene polyester suture repairs (132.47 ± 103.39 N, p = .0078). The ultimate tensile strength of the multifilament stainless steel cable crimp repairs was also greater than that of the Krackow polyethylene polyester suture repairs (485.69 ± 47.93 N vs 378.71 ± 107.23 N, respectively, p = .12). The mode of failure was by suture breakage at the crimp for all cable-crimp repairs and by suture breakage at the knot, within the tendon, or suture pullout for the polyethylene polyester suture repairs. The multifilament stainless steel cable crimp construct may be a better alternative for Achilles tendon rupture repairs.

View details for DOI 10.1053/j.jfas.2019.01.022

View details for PubMedID 31882153
Budesonide irrigation with olfactory training improves outcomes compared with olfactory training alone in patients with olfactory loss Nguyen, T. P., Patel, Z. M. WILEY. 2018: 977–81

View details for DOI 10.1002/alr.22140

View details for Web of Science ID 000443132000002
