Malvika Pillai is a postdoctoral research fellow in the VA Big Data Scientist Training Enhancement Program (BD-STEP), jointly at Stanford University in Medicine (Biomedical Informatics), in the Boussard Lab, and at VA Palo Alto. She received her BS in Quantitative Biology and PhD in Health Informatics from the University of North Carolina at Chapel Hill. Her current work focuses on the development, evaluation, and implementation of machine learning algorithms for clinical decision support.

Professional Education

  • PhD, University of North Carolina at Chapel Hill, North Carolina, Health Informatics (2022)
  • BS, University of North Carolina at Chapel Hill, North Carolina, Quantitative Biology (2017)

All Publications

  • Using an explainable machine learning approach to prioritize factors contributing to healthcare professionals' burnout JOURNAL OF INTELLIGENT INFORMATION SYSTEMS Pillai, M., Liu, C., Kwong, E., Kratzke, I., Charguia, N., Mazur, L., Adapa, K. 2024
  • Leveraging a Large Language Model to Assess Quality-of-Care: Monitoring ADHD Medication Side Effects. medRxiv : the preprint server for health sciences Bannett, Y., Gunturkun, F., Pillai, M., Herrmann, J. E., Luo, I., Huffman, L. C., Feldman, H. M. 2024


    To assess the accuracy of a large language model (LLM) in measuring clinician adherence to practice guidelines for monitoring side effects after prescribing medications for children with attention-deficit/hyperactivity disorder (ADHD). Retrospective population-based cohort study of electronic health records. Cohort included children aged 6-11 years with ADHD diagnosis and ≥2 ADHD medication encounters (stimulants or non-stimulants prescribed) between 2015-2022 in a community-based primary healthcare network (n=1247). To identify documentation of side effects inquiry, we trained, tested, and deployed an open-source LLM (LLaMA) on all clinical notes from ADHD-related encounters (ADHD diagnosis or ADHD medication prescription), including in-clinic/telehealth and telephone encounters (n=15,593 notes). Model performance was assessed using holdout and deployment test sets, compared to manual chart review. The LLaMA model achieved excellent performance in classifying notes that contain side effects inquiry (sensitivity=87.2%, specificity=86.3/90.3%, area under curve (AUC)=0.93/0.92 on holdout/deployment test sets). Analyses revealed no model bias in relation to patient age, sex, or insurance. Mean age (SD) at first prescription was 8.8 (1.6) years; patient characteristics were similar across patients with and without documented side effects inquiry. Rates of documented side effects inquiry were lower in telephone encounters than in-clinic/telehealth encounters (51.9% vs. 73.0%, p<0.01). Side effects inquiry was documented in 61% of encounters following stimulant prescriptions and 48% of encounters following non-stimulant prescriptions (p<0.01). Deploying an LLM on a variable set of clinical notes, including telephone notes, offered scalable measurement of quality-of-care and uncovered opportunities to improve psychopharmacological medication management in primary care.

    View details for DOI 10.1101/2024.04.23.24306225

    View details for PubMedID 38712037

    View details for PubMedCentralID PMC11071552

  • Patient Characteristics Associated With Phone and Video Visits at a Tele-Urgent Care Center During the Initial COVID-19 Response: Cross-Sectional Study. Online journal of public health informatics Khairat, S., John, R., Pillai, M., McDaniel, P., Edson, B. 2024; 16: e50962


    BACKGROUND: Health systems rapidly adopted telemedicine as an alternative health care delivery modality in response to the COVID-19 pandemic. Demographic factors, such as age and gender, may play a role in patients' choice of a phone or video visit. However, it is unknown whether there are differences in utilization between phone and video visits. OBJECTIVE: This study aimed to investigate patients' characteristics, patient utilization, and service characteristics of a tele-urgent care clinic during the initial response to the pandemic. METHODS: We conducted a cross-sectional study of urgent care patients using a statewide, on-demand telemedicine clinic with board-certified physicians during the initial phases of the pandemic. The study data were collected from March 3, 2020, through May 3, 2020. RESULTS: Of 1803 telemedicine visits, 1278 (70.9%) patients were women, 730 (40.5%) were aged 18 to 34 years, and 1423 (78.9%) were uninsured. There were significant differences between telemedicine modalities and gender (P<.001), age (P<.001), insurance status (P<.001), prescriptions given (P<.001), and wait times (P<.001). Phone visits provided significantly more access to rural areas than video visits (P<.001). CONCLUSIONS: Our findings suggest that offering patients a combination of phone and video options provided additional flexibility for various patient subgroups, particularly patients living in rural regions with limited internet bandwidth. Differences in utilization were significant based on patient gender, age, and insurance status. We also found differences in prescription administration between phone and video visits that require additional investigation.

    View details for DOI 10.2196/50962

    View details for PubMedID 38241073

  • Measuring quality-of-care in treatment of young children with attention-deficit/hyperactivity disorder using pre-trained language models. Journal of the American Medical Informatics Association : JAMIA Pillai, M., Posada, J., Gardner, R. M., Hernandez-Boussard, T., Bannett, Y. 2024


    To measure pediatrician adherence to evidence-based guidelines in the treatment of young children with attention-deficit/hyperactivity disorder (ADHD) in a diverse healthcare system using natural language processing (NLP) techniques. We extracted structured and free-text data from electronic health records (EHRs) of all office visits (2015-2019) of children aged 4-6 years in a community-based primary healthcare network in California, who had ≥1 visits with an ICD-10 diagnosis of ADHD. Two pediatricians annotated clinical notes of the first ADHD visit for 423 patients. Inter-annotator agreement (IAA) was assessed for the recommendation for the first-line behavioral treatment (F-measure = 0.89). Four pre-trained language models, including BioClinical Bidirectional Encoder Representations from Transformers (BioClinicalBERT), were used to identify behavioral treatment recommendations using a 70/30 train/test split. For temporal validation, we deployed BioClinicalBERT on 1,020 unannotated notes from other ADHD visits and well-care visits; all positively classified notes (n = 53) and 5% of negatively classified notes (n = 50) were manually reviewed. Of 423 patients, 313 (74%) were male; 298 (70%) were privately insured; 138 (33%) were White; 61 (14%) were Hispanic. The BioClinicalBERT model trained on the first ADHD visits achieved F1 = 0.76, precision = 0.81, recall = 0.72, and AUC = 0.81 [0.72-0.89]. Temporal validation achieved F1 = 0.77, precision = 0.68, and recall = 0.88. Fairness analysis revealed low model performance in publicly insured patients (F1 = 0.53). Deploying pre-trained language models on a variable set of clinical notes accurately captured pediatrician adherence to guidelines in the treatment of children with ADHD. Validating this approach in other patient populations is needed to achieve equitable measurement of quality of care at scale and improve clinical care for mental health conditions.

    View details for DOI 10.1093/jamia/ocae001

    View details for PubMedID 38244997

  • Leveraging Large Language Models to Assess Medication Side Effects Documentation in Children with Attention-Deficit/Hyperactivity Disorder Bannett, Y., Gunturkun, F., Pillai, M., Huffman, L. C., Feldman, H. M. LIPPINCOTT WILLIAMS & WILKINS. 2024: E119
  • Augmenting Quality Assurance Measures in Treatment Review with Machine Learning in Radiation Oncology ADVANCES IN RADIATION ONCOLOGY Pillai, M., Shumway, J. W., Adapa, K., Dooley, J., McGurk, R., Mazur, L. M., Das, S. K., Chera, B. S. 2023; 8 (6): 101234


    Pretreatment quality assurance (QA) of treatment plans often requires a high cognitive workload and considerable time expenditure. This study explores the use of machine learning to classify pretreatment chart check QA for a given radiation plan as difficult or less difficult, thereby alerting the physicists to increase scrutiny on difficult plans. Pretreatment QA data were collected for 973 cases between July 2018 and October 2020. The outcome variable, a degree of difficulty, was collected as a subjective rating by physicists who performed the pretreatment chart checks. Potential features were identified based on clinical relevance, contribution to plan complexity, and QA metrics. Five machine learning models were developed: support vector machine, random forest classifier, adaboost classifier, decision tree classifier, and neural network. These were incorporated into a voting classifier, where at least 2 algorithms needed to predict a case as difficult for it to be classified as such. Sensitivity analyses were conducted to evaluate feature importance. The voting classifier achieved an overall accuracy of 77.4% on the test set, with 76.5% accuracy on difficult cases and 78.4% accuracy on less difficult cases. Sensitivity analysis showed features associated with plan complexity (number of fractions, dose per monitor unit, number of planning structures, and number of image sets) and clinical relevance (patient age) were sensitive across at least 3 algorithms. This approach can be used to equitably allocate plans to physicists rather than randomly allocate them, potentially improving pretreatment chart check effectiveness by reducing errors propagating downstream.

    View details for DOI 10.1016/j.adro.2023.101234

    View details for Web of Science ID 001054543200001

    View details for PubMedID 37205277

    View details for PubMedCentralID PMC10185740

  • Toward Community-Based Natural Language Processing (CBNLP): Cocreating With Communities. Journal of medical Internet research Pillai, M., Griffin, A. C., Kronk, C. A., McCall, T. 2023; 25: e48498


    Rapid development and adoption of natural language processing (NLP) techniques has led to a multitude of exciting and innovative societal and health care applications. These advancements have also generated concerns around perpetuation of historical injustices and that these tools lack cultural considerations. While traditional health care NLP techniques typically include clinical subject matter experts to extract health information or aid in interpretation, few NLP tools involve community stakeholders with lived experiences. In this perspective paper, we draw upon the field of community-based participatory research, which gathers input from community members for development of public health interventions, to identify and examine ways to equitably involve communities in developing health care NLP tools. To realize the potential of community-based NLP (CBNLP), research and development teams must thoughtfully consider mechanisms and resources needed to effectively collaborate with community members for maximal societal and ethical impact of NLP-based tools.

    View details for DOI 10.2196/48498

    View details for PubMedID 37540551

  • Validation approaches for computational drug repurposing: a review. AMIA ... Annual Symposium proceedings. AMIA Symposium Pillai, M., Wu, D. 2023; 2023: 559-568

    View details for PubMedID 38222367

    View details for PubMedCentralID PMC10785886

  • Recommendations for design of a mobile application to support management of anxiety and depression among Black American women. Frontiers in digital health McCall, T., Threats, M., Pillai, M., Lakdawala, A., Bolton, C. S. 2022; 4: 1028408


    Black American women experience adverse health outcomes due to anxiety and depression. They face systemic barriers to accessing culturally appropriate mental health care leading to the underutilization of mental health services and resources. Mobile technology can be leveraged to increase access to culturally relevant resources; however, the specific needs and preferences that Black women feel are useful in an app to support management of anxiety and depression are rarely reflected in existing digital health tools. This study aims to assess what types of content, features, and important considerations should be included in the design of a mobile app tailored to support management of anxiety and depression among Black women. Focus groups were conducted with 20 women (mean age 36.6 years, SD 17.8 years), with 5 participants per group. Focus groups were led by a moderator, with a notetaker present, using an interview guide to discuss topics, such as participants' attitudes and perceptions towards mental health and use of mental health services, and content, features, and concerns for design of a mobile app to support management of anxiety and depression. Descriptive qualitative content analysis was conducted. Recommendations for content were either informational (e.g., information to find a Black woman therapist) or inspirational (e.g., encouraging stories about overcoming adversity). Suggested features allow users to monitor their progress, practice healthy coping techniques, and connect with others. The importance of feeling "a sense of community" was emphasized. Transparency about who created and owns the app, and how users' data will be used and protected was recommended to establish trust. The findings from this study were consistent with previous literature, which highlighted the need for educational, psychotherapy, and personal development components for mental health apps. There has been exponential growth in the digital mental health space due to the COVID-19 pandemic; however, a one-size-fits-all approach may lead to more options but continued disparity in receiving mental health care. Designing a mental health app for and with Black women may help to advance digital health equity by providing a tool that addresses their specific needs and preferences, and increase engagement.

    View details for DOI 10.3389/fdgth.2022.1028408

    View details for PubMedID 36620185

    View details for PubMedCentralID PMC9816326

  • An Interpretable Machine Learning Approach to Prioritizing Factors Contributing to Clinician Burnout Pillai, M., Adapa, K., Foster, M., Kratzke, I., Charguia, N., Mazur, L., Ceci, M., Flesca, S., Masciari, E., Manco, G., Ras, Z. W. SPRINGER INTERNATIONAL PUBLISHING AG. 2022: 149-161