I work on problems at the intersection of machine learning and medicine, with a focus on medical imaging. I am interested in how new technology can improve the affordability and accessibility of quality healthcare. At Stanford, I am advised by Chris Ré in computer science and Curt Langlotz and Sanjiv Sam Gambhir in radiology.
Honors & Awards
Fellow, Hertz Foundation
Fellow, National Science Foundation Graduate Research Fellowship
Fellow, Stanford Graduate Fellowship in Science and Engineering
Education & Certifications
B.S., Rice University, Electrical Engineering (2017)
Minor, Rice University, Global Health Technologies (2017)
Current Research and Scholarly Interests
Medical and molecular imaging, machine learning to improve healthcare
Sanjiv Gambhir (1/8/2018)
Evaluating semi-supervision methods for medical image segmentation: applications in cardiac magnetic resonance imaging.
Journal of medical imaging (Bellingham, Wash.)
2023; 10 (2): 024007
Neural networks have potential to automate medical image segmentation but require expensive labeling efforts. While methods have been proposed to reduce the labeling burden, most have not been thoroughly evaluated on large, clinical datasets or clinical tasks. We propose a method to train segmentation networks with limited labeled data and focus on thorough network evaluation. We propose a semi-supervised method that leverages data augmentation, consistency regularization, and pseudolabeling and train four cardiac magnetic resonance (MR) segmentation networks. We evaluate the models on multiinstitutional, multiscanner, multidisease cardiac MR datasets using five cardiac functional biomarkers, which are compared to an expert's measurements using Lin's concordance correlation coefficient (CCC), the within-subject coefficient of variation (CV), and the Dice coefficient. The semi-supervised networks achieve strong agreement using Lin's CCC (> 0.8), CV similar to an expert, and strong generalization performance. We compare the error modes of the semi-supervised networks against fully supervised networks. We evaluate semi-supervised model performance as a function of labeled training data and with different types of model supervision, showing that a model trained with 100 labeled image slices can achieve a Dice coefficient within 1.10% of a network trained with 16,000+ labeled image slices. We evaluate semi-supervision for medical image segmentation using heterogeneous datasets and clinical metrics. As methods for training models with little labeled data become more common, knowledge about how they perform on clinical tasks, how they fail, and how they perform with different amounts of labeled data is useful to model developers and users.
View details for DOI 10.1117/1.JMI.10.2.024007
View details for PubMedID 37009059
View details for PubMedCentralID PMC10061343
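As a rough illustration of two of the agreement metrics named in the abstract above, the following sketch computes the Dice coefficient for a pair of binary masks and Lin's CCC for a pair of measurement series. It is not the paper's evaluation code: the function names are hypothetical, masks are assumed to be flattened into sequences of 0/1 values, and Lin's CCC uses the population (1/n) variance and covariance.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks given as flat 0/1 sequences."""
    pred = [bool(p) for p in pred]
    truth = [bool(t) for t in truth]
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of elements")
    overlap = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * overlap / total


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient between two series."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("series must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Undefined (division by zero) when both series are constant and equal.
    return 2.0 * cov / (var_x + var_y + (mean_x - mean_y) ** 2)
```

Unlike Pearson correlation, Lin's CCC penalizes a systematic offset: `lins_ccc([1, 2, 3], [2, 3, 4])` is 4/7 even though the two series are perfectly correlated.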
Multi-tracer PET Imaging Using Deep Learning: Applications in Patients with High-Grade Gliomas
SPRINGER INTERNATIONAL PUBLISHING AG. 2022: 24-35
View details for DOI 10.1007/978-3-031-16919-9_3
View details for Web of Science ID 000867616800003
Impact of Upstream Medical Image Processing on Downstream Performance of a Head CT Triage Neural Network.
Radiology. Artificial intelligence
2021; 3 (4): e200229
Purpose: To develop a convolutional neural network (CNN) to triage head CT (HCT) studies and investigate the effect of upstream medical image processing on the CNN's performance. Materials and Methods: A total of 9776 HCT studies were retrospectively collected from 2001 through 2014, and a CNN was trained to triage them as normal or abnormal. CNN performance was evaluated on a held-out test set, assessing triage performance and sensitivity to 20 disorders to assess differential model performance, with 7856 CT studies in the training set, 936 in the validation set, and 984 in the test set. This CNN was used to understand how the upstream imaging chain affects CNN performance by evaluating performance after altering three variables: image acquisition by reducing the number of x-ray projections, image reconstruction by inputting sinogram data into the CNN, and image preprocessing. To evaluate performance, the DeLong test was used to assess differences in the area under the receiver operating characteristic curve (AUROC), and the McNemar test was used to compare sensitivities. Results: The CNN achieved a mean AUROC of 0.84 (95% CI: 0.83, 0.84) in discriminating normal and abnormal HCT studies. The number of x-ray projections could be reduced by 16 times and the raw sensor data could be input into the CNN with no statistically significant difference in classification performance. Additionally, CT windowing consistently improved CNN performance, increasing the mean triage AUROC by 0.07 points. Conclusion: A CNN was developed to triage HCT studies, which may help streamline image evaluation, and the means by which upstream image acquisition, reconstruction, and preprocessing affect downstream CNN performance was investigated, bringing focus to this important part of the imaging chain. Keywords: Head CT, Automated Triage, Deep Learning, Sinogram, Dataset. Supplemental material is available for this article. © RSNA, 2021.
View details for DOI 10.1148/ryai.2021200229
View details for PubMedID 34350412
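The CT windowing step mentioned in the abstract above can be sketched generically as follows. This is a minimal illustration of intensity windowing, not the study's preprocessing pipeline: the function name is hypothetical, input is assumed to be a flat sequence of Hounsfield unit (HU) values, and the level and width in the usage note are a commonly used brain window rather than settings reported in the abstract.

```python
def apply_ct_window(hu_values, level, width):
    """Clip HU values to a display window and rescale them to [0, 1].

    The window spans [level - width / 2, level + width / 2]; values
    outside that range saturate at 0.0 or 1.0.
    """
    if width <= 0:
        raise ValueError("window width must be positive")
    low = level - width / 2.0
    high = level + width / 2.0
    return [(min(max(v, low), high) - low) / (high - low) for v in hu_values]
```

For example, with an assumed brain window (level 40 HU, width 80 HU), `apply_ct_window([-1000, 0, 40, 80, 1000], level=40, width=80)` maps air and dense bone to the extremes and spreads the 0 to 80 HU soft-tissue range across the full output scale.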
Multiparametric Photoacoustic Analysis of Human Thyroid Cancers In Vivo.
Thyroid cancer is one of the most common cancers, with a global increase in incidence rate for both genders. Ultrasound-guided fine-needle aspiration is the current gold standard to diagnose thyroid cancers, but the results are inaccurate, leading to repeated biopsies and unnecessary surgeries. To reduce the number of unnecessary biopsies, we explored the use of multiparametric photoacoustic (PA) analysis in combination with the American Thyroid Association (ATA) Guideline (ATAP). In this study, we performed in vivo multispectral PA imaging on thyroid nodules from 52 patients, comprising 23 papillary thyroid cancer (PTC) and 29 benign cases. From the multispectral PA data, we calculated hemoglobin oxygen saturation level in the nodule area, then classified the PTC and benign nodules with multiparametric analysis. Statistical analyses showed that this multiparametric analysis of multispectral PA responses could classify PTC nodules. Combining the photoacoustically indicated probability of PTC and the ATAP led to a new scoring method that achieved a sensitivity of 83% and a specificity of 93%. This study is the first multiparametric analysis of multispectral PA data of thyroid nodules with statistical significance. As a proof of concept, the results show that the proposed new ATAP scoring can help physicians examine thyroid nodules for fine-needle aspiration biopsy, thus reducing unnecessary biopsies.
View details for DOI 10.1158/0008-5472.CAN-20-3334
View details for PubMedID 34185675
Observational Supervision for Medical Image Classification Using Gaze Data
SPRINGER INTERNATIONAL PUBLISHING AG. 2021: 603-614
View details for DOI 10.1007/978-3-030-87196-3_56
View details for Web of Science ID 000712020700056
Multispectral Photoacoustic Assessment of Thyroid Cancer Nodules In Vivo
SPIE-INT SOC OPTICAL ENGINEERING. 2020
View details for DOI 10.1117/12.2546616
View details for Web of Science ID 000558347500001
Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2020
View details for Web of Science ID 000683178503037