Bio


April S. Liang, M.D., is a Board-Certified internist and Clinical Informaticist. She serves as Clinical Assistant Professor in the Stanford Division of Hospital Medicine as well as Medical Informatics Director. Dr. Liang holds a B.S.E. in Computer Science from Princeton University and an M.D. from UCSF School of Medicine. She completed Internal Medicine residency at UCSF and Clinical Informatics fellowship at Stanford. Dr. Liang’s informatics interests include the implementation of AI tools in healthcare and data-driven quality improvement. Her past work includes integrating a machine learning-driven clinical decision support tool targeting lab overutilization into the EHR and measuring the impact of ambient AI scribes on clinician documentation time.

Clinical Focus


  • Internal Medicine
  • Hospital Medicine
  • Medical Informatics
  • Machine Learning
  • Artificial Intelligence
  • Clinical Decision Support

Academic Appointments


  • Clinical Assistant Professor, Medicine

Honors & Awards


  • Research Abstracts Overall Winner, Society of Hospital Medicine Converge Conference (April 2025)
  • 2nd Place Winner, AMIA/HL7 FHIR App Competition (November 2024)
  • Pediatrics Fellow Scholarship Award, Stanford Department of Pediatrics (December 2023)

Boards, Advisory Committees, Professional Organizations


  • Interim Secretary, Physicians in AMIA (PINA) (2025 - Present)
  • Executive Board, Chair of Networking, AMIA Clinical Informatics Fellows (ACIF) (2024 - 2025)
  • Project Leader, Stanford Resident Safety Council (2024 - 2025)
  • Co-Chair, Housestaff Information Technology Enhancement Council (HITEC) (2024 - 2025)

Professional Education


  • Fellowship: Stanford University Clinical Informatics Fellowship (2025) CA
  • Board Certification: American Board of Internal Medicine, Internal Medicine (2023)
  • Residency: UCSF Dept of Internal Medicine (2023) CA
  • Medical Education: University of California at San Francisco School of Medicine (2020) CA
  • Fellowship, Stanford University, Clinical Informatics (2025)
  • Residency, UCSF, Internal Medicine (2023)
  • MD, UCSF (2020)
  • BSE, Princeton University, Computer Science (2015)

All Publications


  • Artificial intelligence-generated draft replies to patient messages in pediatrics. JAMIA open Liang, A. S., Vedak, S., Dussaq, A., Yao, D., Villarreal, J. A., Thomas, S., Chen, N., Townsend, T., Pageler, N. M., Morse, K. 2025; 8 (6): ooaf159

    Abstract

    Objectives: This study describes the utilization and experiences of artificial intelligence (AI)-generated draft responses to patient messages in pediatric ambulatory clinicians and contextualizes their experiences in relation to those of adult specialty clinicians. Materials and Methods: A prospective pilot was conducted from September 2023 to August 2024 in 2 pediatric clinics (General Pediatric and Adolescent Medicine) and 2 obstetric clinics (Reproductive Endocrinology and Infertility and General Obstetrics) within an academic health system in Northern California. Participants included physician, nurse, and medical assistant volunteers. The intervention involved a feature utilizing large language models embedded in the electronic health record to generate draft responses. Proportion of AI-generated draft used was collected, as were prepilot and follow-up surveys. Results: A total of 61 clinicians (26 pediatric, 35 obstetric) enrolled, with 46 (75%) completing both surveys. Pediatric clinicians utilized 13.3% (95% CI, 12.3%-14.4%) of AI-generated drafts, and usage rates when responding to patients vs their proxies was similar (15% vs 12.9%, P=.24). Despite using AI-generated drafts significantly less than obstetric clinicians (18.3% [17.2%-19.5%], P<.0001), pediatric clinicians reported a significant reduction in perceived task load (NASA Task Load Index: 59.9-50.9, P=.04) and were more likely to recommend the tool (LTR: 7.0 vs 5.2, P=.04). Discussion and Conclusion: Pediatric clinicians used AI-generated drafts at a rate within previously reported ranges in adult specialties and experienced utility. These findings suggest this tool has potential for enhancing efficiency and reducing task load in pediatric care.

    View details for DOI 10.1093/jamiaopen/ooaf159

    View details for PubMedID 41293120

  • Physician Perspectives on Ambient AI Scribes. JAMA network open Shah, S. J., Crowell, T., Jeong, Y., Devon-Sand, A., Smith, M., Yang, B., Ma, S. P., Liang, A. S., Delahaie, C., Hsia, C., Shanafelt, T., Pfeffer, M. A., Sharp, C., Lin, S., Garcia, P. 2025; 8 (3): e251904

    Abstract

    Limited qualitative studies exist evaluating ambient artificial intelligence (AI) scribe tools. Such studies can provide deeper insights into ambient AI implementations by capturing lived experiences. To evaluate physician perspectives on ambient AI scribes. A qualitative study using semistructured interviews guided by the Reach, Efficacy, Adoption, Implementation, Maintenance/Practical, Robust Implementation, and Sustainability Model (RE-AIM/PRISM) framework, with thematic analysis using both inductive and deductive approaches. Physicians participating in an AI scribe pilot that included community and faculty practices, across primary care and ambulatory specialties, were invited to participate in interviews. This ambient AI scribe pilot at a health care organization in California was conducted from November 2023 to January 2024. Facilitators and barriers to adoption, practical effectiveness, and suggestions for improvement to enhance sustainability. Twenty-two semistructured interviews were conducted with AI pilot physicians from primary care (13 [59%]) and ambulatory specialties (9 [41%]), including physicians from community practices (12 [55%]) and faculty practices (10 [45%]). Facilitators to adoption included ease of use, ease of editing, and generally positive perspectives of tool quality. Physicians expressed positive sentiments about the impact of the ambient AI scribe tool on cognitive demand (16 of 16 comments [100%]), temporal demand (28 comments [62%]), work-life integration (10 of 11 comments [91%]), and overall workload (8 of 9 comments [89%]). Physician perspectives of the impact of the ambient AI scribe tool on their engagement with patients were mostly positive (38 of 56 comments [68%]). Barriers to adoption included limited functionality with non-English speaking patients and lack of access for physicians without a specific device. Physician perspectives on accuracy and style were largely negative, particularly regarding note length and editing requirements. Several specific suggestions for tool improvement were identified, and physicians were optimistic regarding the potential for long-term use of ambient AI scribes. In this qualitative study, ambient AI scribes were found to positively impact physician workload, work-life integration, and patient engagement. Key facilitators and barriers to adoption were identified, along with specific suggestions for tool improvement. These findings suggest the potential for ambient AI scribes to reduce clinician burden, with user-centered recommendations offering practical guidance on ways to improve future iterations and improve adoption.

    View details for DOI 10.1001/jamanetworkopen.2025.1904

    View details for PubMedID 40126477

  • Clinical entity augmented retrieval for clinical information extraction. NPJ digital medicine Lopez, I., Swaminathan, A., Vedula, K., Narayanan, S., Nateghi Haredasht, F., Ma, S. P., Liang, A. S., Tate, S., Maddali, M., Gallo, R. J., Shah, N. H., Chen, J. H. 2025; 8 (1): 45

    Abstract

    Large language models (LLMs) with retrieval-augmented generation (RAG) have improved information extraction over previous methods, yet their reliance on embeddings often leads to inefficient retrieval. We introduce CLinical Entity Augmented Retrieval (CLEAR), a RAG pipeline that retrieves information using entities. We compared CLEAR to embedding RAG and full-note approaches for extracting 18 variables using six LLMs across 20,000 clinical notes. Average F1 scores were 0.90, 0.86, and 0.79; inference times were 4.95, 17.41, and 20.08 s per note; average model queries were 1.68, 4.94, and 4.18 per note; and average input tokens were 1.1k, 3.8k, and 6.1k per note for CLEAR, embedding RAG, and full-note approaches, respectively. In conclusion, CLEAR utilizes clinical entities for information retrieval and achieves >70% reduction in token usage and inference time with improved performance compared to modern methods.

    View details for DOI 10.1038/s41746-024-01377-1

    View details for PubMedID 39828800

    View details for PubMedCentralID 4287068

  • Feasibility of Automated Precharting using GPT-4 in New Specialty Referrals. AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science Liang, A. S., Banda, J. M., Savage, T., Pandya, A., Carey, R., Megwalu, U. C., Chang, M. T., Dash, D., Corbin, C. K., Sharma, A., Thapa, R., Kotecha, N., Shah, N. H., Lee, J. Y., Chen, J. H. 2025; 2025: 312-321

    Abstract

    This study evaluates the feasibility of using GPT-4 to automate precharting for specialty referrals, focusing on new patients referred to an otolaryngology clinic for nasal congestion. We describe the design decisions and strategies tested in creating this precharting utility, including methods for prompt design and token limit handling. Through iterative testing and building, our tool achieved 95.0% agreement with physician consensus in a small retrospective test sample. Results from a small prospective pilot showed favorable feedback of summaries in a real-world clinical setting, though there was a discrepancy between high intention to use the summary but lower perception of time savings. Our results demonstrate that automated pre-charting with accuracy and clinical relevance can be feasible with large language models such as GPT-4. Our design features can inform the development of vendor chart summarization solutions.

    View details for PubMedID 40502261

  • Ambient artificial intelligence scribes: utilization and impact on documentation time. Journal of the American Medical Informatics Association : JAMIA Ma, S. P., Liang, A. S., Shah, S. J., Smith, M., Jeong, Y., Devon-Sand, A., Crowell, T., Delahaie, C., Hsia, C., Lin, S., Shanafelt, T., Pfeffer, M. A., Sharp, C., Garcia, P. 2024

    Abstract

    To quantify utilization and impact on documentation time of a large language model-powered ambient artificial intelligence (AI) scribe. This prospective quality improvement study was conducted at a large academic medical center with 45 physicians from 8 ambulatory disciplines over 3 months. Utilization and documentation times were derived from electronic health record (EHR) use measures. The ambient AI scribe was utilized in 9629 of 17 428 encounters (55.25%) with significant interuser heterogeneity. Compared to baseline, median time per note reduced significantly by 0.57 minutes. Median daily documentation, afterhours, and total EHR time also decreased significantly by 6.89, 5.17, and 19.95 minutes/day, respectively. An early pilot of an ambient AI scribe demonstrated robust utilization and reduced time spent on documentation and in the EHR. There was notable individual-level heterogeneity. Large language model-powered ambient AI scribes may reduce documentation burden. Further studies are needed to identify which users benefit most from current technology and how future iterations can support a broader audience.

    View details for DOI 10.1093/jamia/ocae304

    View details for PubMedID 39688515

  • Ambient artificial intelligence scribes: physician burnout and perspectives on usability and documentation burden. Journal of the American Medical Informatics Association : JAMIA Shah, S. J., Devon-Sand, A., Ma, S. P., Jeong, Y., Crowell, T., Smith, M., Liang, A. S., Delahaie, C., Hsia, C., Shanafelt, T., Pfeffer, M. A., Sharp, C., Lin, S., Garcia, P. 2024

    Abstract

    This study evaluates the pilot implementation of ambient AI scribe technology to assess physician perspectives on usability and the impact on physician burden and burnout. This prospective quality improvement study was conducted at Stanford Health Care with 48 physicians over a 3-month period. Outcome measures included burden, burnout, usability, and perceived time savings. Paired survey analysis (n = 38) revealed large statistically significant reductions in task load (-24.42, p <.001) and burnout (-1.94, p <.001), and moderate statistically significant improvements in usability scores (+10.9, p <.001). Post-survey responses (n = 46) indicated favorable utility with improved perceptions of efficiency, documentation quality, and ease of use. In one of the first pilot implementations of ambient AI scribe technology, improvements in physician task load, burnout, and usability were demonstrated. Ambient AI scribes like DAX Copilot may enhance clinical workflows. Further research is needed to optimize widespread implementation and evaluate long-term impacts.

    View details for DOI 10.1093/jamia/ocae295

    View details for PubMedID 39657021

  • Perspectives on Artificial Intelligence-Generated Responses to Patient Messages. JAMA network open Kim, J., Chen, M. L., Rezaei, S. J., Liang, A. S., Seav, S. M., Onyeka, S., Lee, J. J., Vedak, S. C., Mui, D., Lal, R. A., Pfeffer, M. A., Sharp, C., Pageler, N. M., Asch, S. M., Linos, E. 2024; 7 (10): e2438535

    View details for DOI 10.1001/jamanetworkopen.2024.38535

    View details for PubMedID 39412810

  • Using a Large Language Model to Identify Adolescent Patient Portal Account Access by Guardians. JAMA network open Liang, A. S., Vedak, S., Dussaq, A., Yao, D. H., Morse, K., Ip, W., Pageler, N. M. 2024; 7 (6): e2418454

    View details for DOI 10.1001/jamanetworkopen.2024.18454

    View details for PubMedID 38916895

  • SMARTALERT: DESIGNING A MACHINE LEARNING-DRIVEN CLINICAL ALERT BASED ON PROVIDER ATTITUDES Liang, A. S., Ma, S. P., Pham, T. D., Shieh, L., Sharp, C., Chen, J. SPRINGER. 2024: S925

  • Transcription-independent TFIIIC-bound sites cluster near heterochromatin boundaries within lamina-associated domains in C. elegans. Epigenetics & chromatin Stutzman, A. V., Liang, A. S., Beilinson, V., Ikegami, K. 2020; 13 (1): 1

    Abstract

    Chromatin organization is central to precise control of gene expression. In various eukaryotic species, domains of pervasive cis-chromatin interactions demarcate functional domains of the genomes. In nematode Caenorhabditis elegans, however, pervasive chromatin contact domains are limited to the dosage-compensated sex chromosome, leaving the principle of C. elegans chromatin organization unclear. Transcription factor III C (TFIIIC) is a basal transcription factor complex for RNA polymerase III, and is implicated in chromatin organization. TFIIIC binding without RNA polymerase III co-occupancy, referred to as extra-TFIIIC binding, has been implicated in insulating active and inactive chromatin domains in yeasts, flies, and mammalian cells. Whether extra-TFIIIC sites are present and contribute to chromatin organization in C. elegans remains unknown. We identified 504 TFIIIC-bound sites absent of RNA polymerase III and TATA-binding protein co-occupancy characteristic of extra-TFIIIC sites in C. elegans embryos. Extra-TFIIIC sites constituted half of all identified TFIIIC binding sites in the genome. Extra-TFIIIC sites formed dense clusters in cis. The clusters of extra-TFIIIC sites were highly over-represented within the distal arm domains of the autosomes that presented a high level of heterochromatin-associated histone H3K9 trimethylation (H3K9me3). Furthermore, extra-TFIIIC clusters were embedded in the lamina-associated domains. Despite the heterochromatin environment of extra-TFIIIC sites, the individual clusters of extra-TFIIIC sites were devoid of and resided near the individual H3K9me3-marked regions. Clusters of extra-TFIIIC sites were pervasive in the arm domains of C. elegans autosomes, near the outer boundaries of H3K9me3-marked regions. Given the reported activity of extra-TFIIIC sites in heterochromatin insulation in yeasts, our observation raised the possibility that TFIIIC may also demarcate heterochromatin in C. elegans.

    View details for DOI 10.1186/s13072-019-0325-2

    View details for PubMedID 31918747

    View details for PubMedCentralID PMC6950938

  • Large Scale Semi-Automated Labeling of Routine Free-Text Clinical Records for Deep Learning. Journal of digital imaging Trivedi, H. M., Panahiazar, M., Liang, A., Lituiev, D., Chang, P., Sohn, J. H., Chen, Y. Y., Franc, B. L., Joe, B., Hadley, D. 2019; 32 (1): 30-37

    Abstract

    Breast cancer is a leading cause of cancer death among women in the USA. Screening mammography is effective in reducing mortality, but has a high rate of unnecessary recalls and biopsies. While deep learning can be applied to mammography, large-scale labeled datasets, which are difficult to obtain, are required. We aim to remove many barriers of dataset development by automatically harvesting data from existing clinical records using a hybrid framework combining traditional NLP and IBM Watson. An expert reviewer manually annotated 3521 breast pathology reports with one of four outcomes: left positive, right positive, bilateral positive, negative. Traditional NLP techniques using seven different machine learning classifiers were compared to IBM Watson's automated natural language classifier. Techniques were evaluated using precision, recall, and F-measure. Logistic regression outperformed all other traditional machine learning classifiers and was used for subsequent comparisons. Both traditional NLP and Watson's NLC performed well for cases under 1024 characters with weighted average F-measures above 0.96 across all classes. Performance of traditional NLP was lower for cases over 1024 characters with an F-measure of 0.83. We demonstrate a hybrid framework using traditional NLP techniques combined with IBM Watson to annotate over 10,000 breast pathology reports for development of a large-scale database to be used for deep learning in mammography. Our work shows that traditional NLP and IBM Watson perform extremely well for cases under 1024 characters and can accelerate the rate of data annotation.

    View details for DOI 10.1007/s10278-018-0105-8

    View details for PubMedID 30128778

    View details for PubMedCentralID PMC6382632

  • Development and Validation of an Electronic Health Record-Based Machine Learning Model to Estimate Delirium Risk in Newly Hospitalized Patients Without Known Cognitive Impairment. JAMA network open Wong, A., Young, A. T., Liang, A. S., Gonzales, R., Douglas, V. C., Hadley, D. 2018; 1 (4): e181018

    Abstract

    Current methods for identifying hospitalized patients at increased risk of delirium require nurse-administered questionnaires with moderate accuracy. To develop and validate a machine learning model that predicts incident delirium risk based on electronic health data available on admission. Retrospective cohort study evaluating 5 machine learning algorithms to predict delirium using 796 clinical variables identified by an expert panel as relevant to delirium prediction and consistently available in electronic health records within 24 hours of admission. The training set comprised 14 227 adult patients with non-intensive care unit hospital stays and no delirium on admission who were discharged between January 1, 2016, and August 31, 2017, from UCSF Health, a large academic health institution. The test set comprised 3996 patients with hospital stays who were discharged between August 1, 2017, and November 30, 2017. Patient demographic characteristics, diagnoses, nursing records, laboratory results, and medications available in electronic health records during hospitalization. Delirium was defined as a positive Nursing Delirium Screening Scale or Confusion Assessment Method for the Intensive Care Unit score. Models were assessed using the area under the receiver operating characteristic curve (AUC) and compared against the 4-point scoring system AWOL (age >79 years, failure to spell world backward, disorientation to place, and higher nurse-rated illness severity), a validated delirium risk-assessment tool routinely administered in this cohort. The training set included 14 227 patients (5113 [35.9%] aged >64 years; 7335 [51.6%] female; 687 [4.8%] with delirium), and the test set included 3996 patients (1491 [37.3%] aged >64 years; 1966 [49.2%] female; 191 [4.8%] with delirium). In total, the analysis included 18 223 hospital admissions (6604 [36.2%] aged >64 years; 9301 [51.0%] female; 878 [4.8%] with delirium). The AWOL system achieved a baseline AUC of 0.678. The gradient boosting machine model performed best, with an AUC of 0.855. Setting specificity at 90%, the model had a 59.7% (95% CI, 52.4%-66.7%) sensitivity, 23.1% (95% CI, 20.5%-25.9%) positive predictive value, 97.8% (95% CI, 97.4%-98.1%) negative predictive value, and a number needed to screen of 4.8. Penalized logistic regression and random forest models also performed well, with AUCs of 0.854 and 0.848, respectively. Machine learning can be used to estimate hospital-acquired delirium risk using electronic health record data available within 24 hours of hospital admission. Such a model may allow more precise targeting of delirium prevention resources without increasing the burden on health care professionals.

    View details for DOI 10.1001/jamanetworkopen.2018.1018

    View details for PubMedID 30646095

    View details for PubMedCentralID PMC6324291