Malvika Pillai is a postdoctoral research fellow in the VA Big Data Scientist Training Enhancement Program (BD-STEP), jointly at Stanford University School of Medicine (Biomedical Informatics) in the Boussard Lab and at VA Palo Alto. She received her BS in Quantitative Biology and her PhD in Health Informatics from the University of North Carolina at Chapel Hill. Her current work focuses on the development, evaluation, and implementation of machine learning algorithms for clinical decision support.

Professional Education

  • Doctor of Philosophy (PhD), Health Informatics, University of North Carolina at Chapel Hill (2022)
  • Bachelor of Science (BS), Quantitative Biology, University of North Carolina at Chapel Hill (2017)

Stanford Advisors

All Publications

  • Measuring quality-of-care in treatment of young children with attention-deficit/hyperactivity disorder using pre-trained language models. Journal of the American Medical Informatics Association: JAMIA Pillai, M., Posada, J., Gardner, R. M., Hernandez-Boussard, T., Bannett, Y. 2024


    To measure pediatrician adherence to evidence-based guidelines in the treatment of young children with attention-deficit/hyperactivity disorder (ADHD) in a diverse healthcare system using natural language processing (NLP) techniques. We extracted structured and free-text data from electronic health records (EHRs) of all office visits (2015-2019) of children aged 4-6 years in a community-based primary healthcare network in California who had ≥1 visit with an ICD-10 diagnosis of ADHD. Two pediatricians annotated clinical notes of the first ADHD visit for 423 patients. Inter-annotator agreement (IAA) was assessed for the recommendation for first-line behavioral treatment (F-measure = 0.89). Four pre-trained language models, including BioClinical Bidirectional Encoder Representations from Transformers (BioClinicalBERT), were used to identify behavioral treatment recommendations using a 70/30 train/test split. For temporal validation, we deployed BioClinicalBERT on 1,020 unannotated notes from other ADHD visits and well-care visits; all positively classified notes (n = 53) and 5% of negatively classified notes (n = 50) were manually reviewed. Of 423 patients, 313 (74%) were male; 298 (70%) were privately insured; 138 (33%) were White; 61 (14%) were Hispanic. The BioClinicalBERT model trained on the first ADHD visits achieved F1 = 0.76, precision = 0.81, recall = 0.72, and AUC = 0.81 [0.72-0.89]. Temporal validation achieved F1 = 0.77, precision = 0.68, and recall = 0.88. Fairness analysis revealed low model performance in publicly insured patients (F1 = 0.53). Deploying pre-trained language models on a variable set of clinical notes accurately captured pediatrician adherence to guidelines in the treatment of children with ADHD. Validating this approach in other patient populations is needed to achieve equitable measurement of quality of care at scale and to improve clinical care for mental health conditions.

    View details for DOI 10.1093/jamia/ocae001

    View details for PubMedID 38244997
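    The precision, recall, and F1 figures reported above follow the standard definitions for binary classification. A minimal sketch of that arithmetic (not the authors' code; the function name and example labels are illustrative):

    ```python
    # Standard binary-classification metrics, as reported for the
    # BioClinicalBERT classifier on the held-out 30% test split.
    # This is an illustrative sketch, not the study's actual code.

    def precision_recall_f1(y_true, y_pred):
        """Compute precision, recall, and F1 from 0/1 label lists."""
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        return precision, recall, f1

    # Toy example: one true positive, one false positive, one false negative
    p, r, f1 = precision_recall_f1([1, 1, 0, 0], [1, 0, 1, 0])
    ```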

  • Augmenting Quality Assurance Measures in Treatment Review with Machine Learning in Radiation Oncology. Advances in Radiation Oncology Pillai, M., Shumway, J. W., Adapa, K., Dooley, J., McGurk, R., Mazur, L. M., Das, S. K., Chera, B. S. 2023; 8 (6): 101234


    Pretreatment quality assurance (QA) of treatment plans often requires a high cognitive workload and considerable time expenditure. This study explores the use of machine learning to classify pretreatment chart check QA for a given radiation plan as difficult or less difficult, thereby alerting physicists to increase scrutiny on difficult plans. Pretreatment QA data were collected for 973 cases between July 2018 and October 2020. The outcome variable, degree of difficulty, was collected as a subjective rating by the physicists who performed the pretreatment chart checks. Potential features were identified based on clinical relevance, contribution to plan complexity, and QA metrics. Five machine learning models were developed: a support vector machine, random forest classifier, AdaBoost classifier, decision tree classifier, and neural network. These were incorporated into a voting classifier, in which at least 2 algorithms needed to predict a case as difficult for it to be classified as such. Sensitivity analyses were conducted to evaluate feature importance. The voting classifier achieved an overall accuracy of 77.4% on the test set, with 76.5% accuracy on difficult cases and 78.4% accuracy on less difficult cases. Sensitivity analysis showed that features associated with plan complexity (number of fractions, dose per monitor unit, number of planning structures, and number of image sets) and clinical relevance (patient age) were sensitive across at least 3 algorithms. This approach can be used to equitably allocate plans to physicists rather than allocate them randomly, potentially improving pretreatment chart check effectiveness by reducing errors that propagate downstream.

    View details for DOI 10.1016/j.adro.2023.101234

    View details for Web of Science ID 001054543200001

    View details for PubMedID 37205277

    View details for PubMedCentralID PMC10185740

  • Toward Community-Based Natural Language Processing (CBNLP): Cocreating With Communities. Journal of Medical Internet Research Pillai, M., Griffin, A. C., Kronk, C. A., McCall, T. 2023; 25: e48498


    Rapid development and adoption of natural language processing (NLP) techniques have led to a multitude of exciting and innovative societal and health care applications. These advancements have also generated concerns that such tools perpetuate historical injustices and lack cultural considerations. While traditional health care NLP techniques typically include clinical subject matter experts to extract health information or aid in interpretation, few NLP tools involve community stakeholders with lived experiences. In this perspective paper, we draw upon the field of community-based participatory research, which gathers input from community members for the development of public health interventions, to identify and examine ways to equitably involve communities in developing health care NLP tools. To realize the potential of community-based NLP (CBNLP), research and development teams must thoughtfully consider the mechanisms and resources needed to collaborate effectively with community members for maximal societal and ethical impact of NLP-based tools.

    View details for DOI 10.2196/48498

    View details for PubMedID 37540551

  • Validation approaches for computational drug repurposing: a review. AMIA Annual Symposium Proceedings Pillai, M., Wu, D. 2023; 2023: 559-568

    View details for PubMedID 38222367

    View details for PubMedCentralID PMC10785886

  • Recommendations for design of a mobile application to support management of anxiety and depression among Black American women. Frontiers in digital health McCall, T., Threats, M., Pillai, M., Lakdawala, A., Bolton, C. S. 2022; 4: 1028408


    Black American women experience adverse health outcomes due to anxiety and depression. They face systemic barriers to accessing culturally appropriate mental health care, leading to underutilization of mental health services and resources. Mobile technology can be leveraged to increase access to culturally relevant resources; however, the specific needs and preferences that Black women find useful in an app to support management of anxiety and depression are rarely reflected in existing digital health tools. This study aims to assess what types of content, features, and other important considerations should be included in the design of a mobile app tailored to support management of anxiety and depression among Black women. Focus groups were conducted with 20 women (mean age 36.6 years, SD 17.8 years), with 5 participants per group. Focus groups were led by a moderator, with a notetaker present, using an interview guide to discuss topics such as participants' attitudes and perceptions toward mental health and use of mental health services, and content, features, and concerns for the design of a mobile app to support management of anxiety and depression. Descriptive qualitative content analysis was conducted. Recommendations for content were either informational (e.g., information on finding a Black woman therapist) or inspirational (e.g., encouraging stories about overcoming adversity). Suggested features would allow users to monitor their progress, practice healthy coping techniques, and connect with others. The importance of feeling "a sense of community" was emphasized. Transparency about who created and owns the app, and how users' data will be used and protected, was recommended to establish trust. The findings from this study were consistent with previous literature, which highlighted the need for educational, psychotherapy, and personal development components in mental health apps. There has been exponential growth in the digital mental health space due to the COVID-19 pandemic; however, a one-size-fits-all approach may lead to more options but continued disparity in receiving mental health care. Designing a mental health app for and with Black women may help advance digital health equity by providing a tool that addresses their specific needs and preferences and increases engagement.

    View details for DOI 10.3389/fdgth.2022.1028408

    View details for PubMedID 36620185

    View details for PubMedCentralID PMC9816326

  • An Interpretable Machine Learning Approach to Prioritizing Factors Contributing to Clinician Burnout Pillai, M., Adapa, K., Foster, M., Kratzke, I., Charguia, N., Mazur, L., Ceci, M., Flesca, S., Masciari, E., Manco, G., Ras, Z. W. Springer International Publishing AG. 2022: 149-161