Academic Appointments

  • Lecturer, Computer Science

Professional Education

  • MS, Stanford University, Computer Science (2019)
  • MS, Stanford University, Statistics (2019)

All Publications

  • Gap-filling eddy covariance methane fluxes: Comparison of machine learning model predictions and uncertainties at FLUXNET-CH4 wetlands AGRICULTURAL AND FOREST METEOROLOGY Irvin, J., Zhou, S., McNicol, G., Lu, F., Liu, V., Fluet-Chouinard, E., Ouyang, Z., Knox, S., Lucas-Moffat, A., Trotta, C., Papale, D., Vitale, D., Mammarella, I., Alekseychik, P., Aurela, M., Avati, A., Baldocchi, D., Bansal, S., Bohrer, G., Campbell, D., Chen, J., Chu, H., Dalmagro, H. J., Delwiche, K. B., Desai, A. R., Euskirchen, E., Feron, S., Goeckede, M., Heimann, M., Helbig, M., Helfter, C., Hemes, K. S., Hirano, T., Iwata, H., Jurasinski, G., Kalhori, A., Kondrich, A., Lai, D. F., Lohila, A., Malhotra, A., Merbold, L., Mitra, B., Ng, A., Nilsson, M. B., Noormets, A., Peichl, M., Rey-Sanchez, A., Richardson, A. D., Runkle, B. K., Schafer, K. R., Sonnentag, O., Stuart-Haentjens, E., Sturtevant, C., Ueyama, M., Valach, A. C., Vargas, R., Vourlitis, G. L., Ward, E. J., Wong, G., Zona, D., Alberto, M. R., Billesbach, D. P., Celis, G., Dolman, H., Friborg, T., Fuchs, K., Gogo, S., Gondwe, M. J., Goodrich, J. P., Gottschalk, P., Hortnagl, L., Jacotot, A., Koebsch, F., Kasak, K., Maier, R., Morin, T. H., Nemitz, E., Oechel, W. C., Oikawa, P. Y., Ono, K., Sachs, T., Sakabe, A., Schuur, E. A., Shortt, R., Sullivan, R. C., Szutu, D. J., Tuittila, E., Varlagin, A., Verfaillie, J. G., Wille, C., Windham-Myers, L., Poulter, B., Jackson, R. B. 2021; 308
  • Improving Hospital Readmission Prediction using Individualized Utility Analysis. Journal of biomedical informatics Ko, M., Chen, E., Agrawal, A., Rajpurkar, P., Avati, A., Ng, A., Basu, S., Shah, N. H. 2021: 103826


    Machine learning (ML) models for allocating readmission-mitigating interventions are typically selected according to their discriminative ability, which may not necessarily translate into utility in allocation of resources. Our objective was to determine whether ML models for allocating readmission-mitigating interventions have different usefulness based on their overall utility and discriminative ability. We conducted a retrospective utility analysis of ML models using claims data acquired from the Optum Clinformatics Data Mart, including 513,495 commercially-insured inpatients (mean [SD] age 69 [19] years; 294,895 [57%] female) over the period January 2016 through January 2017 from all 50 states with mean 90-day cost of $11,552. Utility analysis estimates the cost, in dollars, of allocating interventions for lowering readmission risk based on the reduction in the 90-day cost. Allocating readmission-mitigating interventions based on a GBDT model trained to predict readmissions achieved an estimated utility gain of $104 per patient, and an AUC of 0.76 (95% CI 0.76, 0.77); allocating interventions based on a model trained to predict cost as a proxy achieved a higher utility of $175.94 per patient, and an AUC of 0.62 (95% CI 0.61, 0.62). A hybrid model combining both intervention strategies is comparable with the best models on either metric. Estimated utility varies by intervention cost and efficacy, with each model performing the best under different intervention settings. We demonstrate that machine learning models may be ranked differently based on overall utility and discriminative ability. Machine learning models for allocation of limited health resources should consider directly optimizing for utility.

    View details for DOI 10.1016/j.jbi.2021.103826

    View details for PubMedID 34087428

  • A framework for making predictive models useful in practice. Journal of the American Medical Informatics Association : JAMIA Jung, K., Kashyap, S., Avati, A., Harman, S., Shaw, H., Li, R., Smith, M., Shum, K., Javitz, J., Vetteth, Y., Seto, T., Bagley, S. C., Shah, N. H. 2020


    OBJECTIVE: To analyze the impact of factors in healthcare delivery on the net benefit of triggering an Advanced Care Planning (ACP) workflow based on predictions of 12-month mortality. MATERIALS AND METHODS: We built a predictive model of 12-month mortality using electronic health record data and evaluated the impact of healthcare delivery factors on the net benefit of triggering an ACP workflow based on the models' predictions. Factors included nonclinical reasons that make ACP inappropriate: limited capacity for ACP, inability to follow up due to patient discharge, and availability of an outpatient workflow to follow up on missed cases. We also quantified the relative benefits of increasing capacity for inpatient ACP versus outpatient ACP. RESULTS: Work capacity constraints and discharge timing can significantly reduce the net benefit of triggering the ACP workflow based on a model's predictions. However, the reduction can be mitigated by creating an outpatient ACP workflow. Given limited resources to either add capacity for inpatient ACP versus developing outpatient ACP capability, the latter is likely to provide more benefit to patient care. DISCUSSION: The benefit of using a predictive model for identifying patients for interventions is highly dependent on the capacity to execute the workflow triggered by the model. We provide a framework for quantifying the impact of healthcare delivery factors and work capacity constraints on achieved benefit. CONCLUSION: An analysis of the sensitivity of the net benefit realized by a predictive model triggered clinical workflow to various healthcare delivery factors is necessary for making predictive models useful in practice.

    View details for DOI 10.1093/jamia/ocaa318

    View details for PubMedID 33355350

  • Countdown Regression: Sharp and Calibrated Survival Predictions Avati, A., Duan, T., Zhou, S., Jung, K., Shah, N. H., Ng, A. Y., Adams, R. P., Gogate JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2020: 145-155
  • NGBoost: Natural Gradient Boosting for Probabilistic Prediction Duan, T., Avati, A., Ding, D., Thai, K. K., Basu, S., Ng, A., Schuler, A., Daume, H., Singh, A. JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2020
  • Ambulatory Atrial Fibrillation Monitoring Using Wearable Photoplethysmography with Deep Learning Shen, Y., Voisin, M., Aliamiri, A., Avati, A., Hannun, A., Ng, A., Assoc Comp Machinery ASSOC COMPUTING MACHINERY. 2019: 1909–16
  • Improving palliative care with deep learning. BMC medical informatics and decision making Avati, A., Jung, K., Harman, S., Downing, L., Ng, A., Shah, N. H. 2018; 18 (Suppl 4): 122


    BACKGROUND: Access to palliative care is a key quality metric which most healthcare organizations strive to improve. The primary challenges to increasing palliative care access are a combination of physicians over-estimating patient prognoses, and a shortage of palliative staff in general. This, in combination with treatment inertia, can result in a mismatch between patient wishes and their actual care towards the end of life. METHODS: In this work, we address this problem, with Institutional Review Board approval, using machine learning and Electronic Health Record (EHR) data of patients. We train a Deep Neural Network model on the EHR data of patients from previous years, to predict mortality of patients within the next 3-12 month period. This prediction is used as a proxy decision for identifying patients who could benefit from palliative care. RESULTS: The EHR data of all admitted patients are evaluated every night by this algorithm, and the palliative care team is automatically notified of the list of patients with a positive prediction. In addition, we present a novel technique for decision interpretation, using which we provide explanations for the model's predictions. CONCLUSION: The automatic screening and notification saves the palliative care team the burden of time-consuming chart reviews of all patients, and allows them to take a proactive approach in reaching out to such patients rather than relying on referrals from the treating physicians.

    View details for PubMedID 30537977

  • Automated and flexible identification of complex disease: building a model for systemic lupus erythematosus using noisy labeling. Journal of the American Medical Informatics Association : JAMIA Murray, S. G., Avati, A., Schmajuk, G., Yazdany, J. 2018


    Accurate and efficient identification of complex chronic conditions in the electronic health record (EHR) is an important but challenging task that has historically relied on tedious clinician review and oversimplification of the disease. Here we adapt methods that allow for automated "noisy labeling" of positive and negative controls to create a "silver standard" for machine learning to automate identification of systemic lupus erythematosus (SLE). Our final model, which includes both structured data as well as text processing of clinical notes, outperformed all existing algorithms for SLE (AUC 0.97). In addition, we demonstrate how the probabilistic outputs of this model can be adapted to various clinical needs, selecting high thresholds when specificity is the priority and lower thresholds when a more inclusive patient population is desired. Deploying a similar methodology to other complex diseases has the potential to dramatically simplify the landscape of population identification in the EHR. MeSH terms: Electronic Health Records, Machine Learning, Lupus Erythematosus, Phenotype, Algorithms.

    View details for PubMedID 30476175

  • Performance of Machine Learning Methods Using Electronic Medical Records to Predict Varicella Zoster Virus Infection Gianfrancesco, M., Schmajuk, G., Murray, S., Ludwig, D., Hannun, A., Avati, A., Tamang, S., Yazdany, J. WILEY. 2017