Zhi Huang received his Bachelor of Science degree in Automation (BS-MS straight entrance class) from the School of Electronic and Information Engineering at Xi'an Jiaotong University in June 2015. In August 2021, he received a Ph.D. degree from Purdue University, majoring in Electrical and Computer Engineering (ECE).
His background is in machine learning and deep learning, computational pathology, computational biology, and bioinformatics.
From May 2019 to August 2019, he was at Philips Research North America as a Research Intern.

All Publications

  • A pathologist-AI collaboration framework for enhancing diagnostic accuracies and efficiencies. Nature biomedical engineering Huang, Z., Yang, E., Shen, J., Gratzinger, D., Eyerer, F., Liang, B., Nirschl, J., Bingham, D., Dussaq, A. M., Kunder, C., Rojansky, R., Gilbert, A., Chang-Graham, A. L., Howitt, B. E., Liu, Y., Ryan, E. E., Tenney, T. B., Zhang, X., Folkins, A., Fox, E. J., Montine, K. S., Montine, T. J., Zou, J. 2024


    In pathology, the deployment of artificial intelligence (AI) in clinical settings is constrained by limitations in data collection and in model transparency and interpretability. Here we describe a digital pathology framework that incorporates active learning and human-in-the-loop real-time feedback for the rapid creation of diverse datasets and models. We validate the effectiveness of the framework via two crossover user studies that leveraged collaboration between the AI and the pathologist, including the identification of plasma cells in endometrial biopsies and the detection of colorectal cancer metastasis in lymph nodes. In both studies, the framework yielded considerable diagnostic performance improvements. Collaboration between clinicians and AI will aid digital pathology by enhancing accuracies and efficiencies.

    View details for DOI 10.1038/s41551-024-01223-5

    View details for PubMedID 38898173

    View details for PubMedCentralID 6345440

  • Systematic analysis of off-label and off-guideline cancer therapy usage in a real-world cohort of 165,912 US patients. Cell reports. Medicine Liu, R., Wang, L., Rizzo, S., Garmhausen, M. R., Pal, N., Waliany, S., McGough, S., Lin, Y. G., Huang, Z., Neal, J., Copping, R., Zou, J. 2024: 101444


    Patients with cancer may be given treatments that are not officially approved (off-label) or recommended by guidelines (off-guideline). Here we present a data science framework to systematically characterize off-label and off-guideline usages using real-world data from de-identified electronic health records (EHR). We analyze treatment patterns in 165,912 US patients with 14 common cancer types. We find that 18.6% and 4.4% of patients have received at least one line of off-label and off-guideline cancer drugs, respectively. Patients with worse performance status, in later lines, or treated at academic hospitals are significantly more likely to receive off-label and off-guideline drugs. To quantify how predictable off-guideline usage is, we developed machine learning models to predict which drug a patient is likely to receive based on their clinical characteristics and previous treatments. Finally, we demonstrate that our systematic analyses generate hypotheses about patients' response to treatments.

    View details for DOI 10.1016/j.xcrm.2024.101444

    View details for PubMedID 38428426

  • Understanding the molecular basis of resilience to Alzheimer's disease. Frontiers in neuroscience Montine, K. S., Berson, E., Phongpreecha, T., Huang, Z., Aghaeepour, N., Zou, J. Y., MacCoss, M. J., Montine, T. J. 2023; 17: 1311157


    The cellular and molecular distinction between brain aging and neurodegenerative disease begins to blur in the oldest old. Approximately 15-25% of observations in humans do not fit predicted clinical manifestations, likely the result of suppressed damage despite usually adequate stressors and of resilience, the suppression of neurological dysfunction despite usually adequate degeneration. Factors during life may predict the clinico-pathologic state of resilience: cardiovascular health and mental health, more so than educational attainment, are predictive of a continuous measure of resilience to Alzheimer's disease (AD) and AD-related dementias (ADRDs). In resilience to AD alone (RAD), core features include synaptic and axonal processes, especially in the hippocampus. Future focus on larger and more diverse cohorts and additional regions offers emerging opportunities to understand this counterforce to neurodegeneration. The focus of this review is the molecular basis of resilience to AD.

    View details for DOI 10.3389/fnins.2023.1311157

    View details for PubMedID 38192507

    View details for PubMedCentralID PMC10773681

  • Unveiling Resilience to Alzheimer's Disease: Insights From Brain Regional Proteomic Markers. Neuroscience insights Huang, Z., Merrihew, G. E., Larson, E. B., Park, J., Plubell, D., Fox, E. J., Montine, K. S., Keene, C. D., Latimer, C. S., Zou, J. Y., MacCoss, M. J., Montine, T. J. 2023; 18: 26331055231201600


    Studying proteomics data of the human brain could offer numerous insights into unraveling the signature of resilience to Alzheimer's disease. In our previous study with rigorous cohort selection criteria that excluded 4 common comorbidities, we harnessed multiple brain regions from 43 research participants with 12 of them displaying cognitive resilience to Alzheimer's disease. Based on the previous findings, this work focuses on 6 proteins out of the 33 differentially expressed proteins associated with resilience to Alzheimer's disease. These proteins are used to construct a decision tree classifier, enabling the differentiation of 3 groups: (i) healthy control, (ii) resilience to Alzheimer's disease, and (iii) Alzheimer's disease with dementia. Our analysis unveiled 2 important regional proteomic markers: Aβ peptides in the hippocampus and PA1B3 in the inferior parietal lobule. These findings underscore the potential of using distinct regional proteomic markers as signatures in characterizing the resilience to Alzheimer's disease.

    View details for DOI 10.1177/26331055231201600

    View details for PubMedID 37810186

    View details for PubMedCentralID PMC10557413

  • A visual-language foundation model for pathology image analysis using medical Twitter. Nature medicine Huang, Z., Bianchi, F., Yuksekgonul, M., Montine, T. J., Zou, J. 2023


    The lack of annotated publicly available medical images is a major barrier for computational research and education innovations. At the same time, many de-identified images and much knowledge are shared by clinicians on public forums such as medical Twitter. Here we harness these crowd platforms to curate OpenPath, a large dataset of 208,414 pathology images paired with natural language descriptions. We demonstrate the value of this resource by developing pathology language-image pretraining (PLIP), a multimodal artificial intelligence with both image and text understanding, which is trained on OpenPath. PLIP achieves state-of-the-art performances for classifying new pathology images across four external datasets: for zero-shot classification, PLIP achieves F1 scores of 0.565-0.832 compared to F1 scores of 0.030-0.481 for the previous contrastive language-image pretrained model. Training a simple supervised classifier on top of PLIP embeddings also achieves a 2.5% improvement in F1 scores compared to using other supervised model embeddings. Moreover, PLIP enables users to retrieve similar cases by either image or natural language search, greatly facilitating knowledge sharing. Our approach demonstrates that publicly shared medical information is a tremendous resource that can be harnessed to develop medical artificial intelligence for enhancing diagnosis, knowledge sharing and education.

    View details for DOI 10.1038/s41591-023-02504-3

    View details for PubMedID 37592105

    View details for PubMedCentralID 9883475

  • Brain proteomic analysis implicates actin filament processes and injury response in resilience to Alzheimer's disease. Nature communications Huang, Z., Merrihew, G. E., Larson, E. B., Park, J., Plubell, D., Fox, E. J., Montine, K. S., Latimer, C. S., Dirk Keene, C., Zou, J. Y., MacCoss, M. J., Montine, T. J. 2023; 14 (1): 2747


    Resilience to Alzheimer's disease is an uncommon combination of high disease burden without dementia that offers valuable insights into limiting clinical impact. Here we assessed 43 research participants meeting stringent criteria, 11 healthy controls, 12 resilience to Alzheimer's disease and 20 Alzheimer's disease with dementia and analyzed matched isocortical regions, hippocampus, and caudate nucleus by mass spectrometry-based proteomics. Of 7115 differentially expressed soluble proteins, lower isocortical and hippocampal soluble Aβ levels is a significant feature of resilience when compared to healthy control and Alzheimer's disease dementia groups. Protein co-expression analysis reveals 181 densely-interacting proteins significantly associated with resilience that were enriched for actin filament-based processes, cellular detoxification, and wound healing in isocortex and hippocampus, further supported by four validation cohorts. Our results suggest that lowering soluble Aβ concentration may suppress severe cognitive impairment along the Alzheimer's disease continuum. The molecular basis of resilience likely holds important therapeutic insights.

    View details for DOI 10.1038/s41467-023-38376-x

    View details for PubMedID 37173305

    View details for PubMedCentralID 3266529

  • Artificial intelligence reveals features associated with breast cancer neoadjuvant chemotherapy responses from multi-stain histopathologic images. NPJ precision oncology Huang, Z., Shao, W., Han, Z., Alkashash, A. M., De la Sancha, C., Parwani, A. V., Nitta, H., Hou, Y., Wang, T., Salama, P., Rizkalla, M., Zhang, J., Huang, K., Li, Z. 2023; 7 (1): 14


    Advances in computational algorithms and tools have made the prediction of cancer patient outcomes using computational pathology feasible. However, predicting clinical outcomes from pre-treatment histopathologic images remains a challenging task, limited by the poor understanding of tumor immune micro-environments. In this study, an automatic, accurate, comprehensive, interpretable, and reproducible whole slide image (WSI) feature extraction pipeline, known as IMage-based Pathological REgistration and Segmentation Statistics (IMPRESS), is described. Using both H&E and multiplex IHC (PD-L1, CD8+, and CD163+) images, we investigated whether artificial intelligence (AI)-based algorithms using automatic feature extraction methods can predict neoadjuvant chemotherapy (NAC) outcomes in HER2-positive (HER2+) and triple-negative breast cancer (TNBC) patients. Features are derived from tumor immune micro-environment and clinical data and used to train machine learning models to accurately predict the response to NAC in breast cancer patients (HER2+ AUC=0.8975; TNBC AUC=0.7674). The results demonstrate that this method outperforms the results trained from features that were manually generated by pathologists. The developed image features and algorithms were further externally validated by independent cohorts, yielding encouraging results, especially for the HER2+ subtype.

    View details for DOI 10.1038/s41698-023-00352-5

    View details for PubMedID 36707660

  • Systematic pan-cancer analysis of mutation-treatment interactions using large real-world clinicogenomics data. Nature medicine Liu, R., Rizzo, S., Waliany, S., Garmhausen, M. R., Pal, N., Huang, Z., Chaudhary, N., Wang, L., Harbron, C., Neal, J., Copping, R., Zou, J. 2022


    Quantifying the effectiveness of different cancer therapies in patients with specific tumor mutations is critical for improving patient outcomes and advancing precision medicine. Here we perform a large-scale computational analysis of 40,903 US patients with cancer who have detailed mutation profiles, treatment sequences and outcomes derived from electronic health records. We systematically identify 458 mutations that predict the survival of patients on specific immunotherapies, chemotherapy agents or targeted therapies across eight common cancer types. We further characterize mutation-mutation interactions that impact the outcomes of targeted therapies. This work demonstrates how computational analysis of large real-world data generates insights, hypotheses and resources to enable precision oncology.

    View details for DOI 10.1038/s41591-022-01873-5

    View details for PubMedID 35773542

  • TSUNAMI: Translational Bioinformatics Tool Suite for Network Analysis and Mining. Genomics, proteomics & bioinformatics Huang, Z., Han, Z., Wang, T., Shao, W., Xiang, S., Salama, P., Rizkalla, M., Huang, K., Zhang, J. 2021; 19 (6): 1023-1031


    Gene co-expression network (GCN) mining identifies gene modules with highly correlated expression profiles across samples/conditions. It enables researchers to discover latent gene/molecule interactions, identify novel gene functions, and extract molecular features from certain disease/condition groups, thus helping to identify disease biomarkers. However, an easy-to-use tool package is lacking for users to mine GCN modules that are relatively small in size with tightly connected genes that can be convenient for downstream gene set enrichment analysis, as well as modules that may share common members. To address this need, we developed an online GCN mining tool package: TSUNAMI (Tools SUite for Network Analysis and MIning). TSUNAMI incorporates our state-of-the-art lmQCM algorithm to mine GCN modules for both public and user-input data (microarray, RNA-seq, or any other numerical omics data), and then performs downstream gene set enrichment analysis for the identified modules. It has several features and advantages: 1) a user-friendly interface and real-time co-expression network mining through a web server; 2) direct access and search of NCBI Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) databases, as well as user-input gene expression matrices for GCN module mining; 3) multiple co-expression analysis tools to choose from, all of which are highly flexible with regard to parameter selection options; 4) identified GCN modules are summarized to eigengenes, which are convenient for users to check their correlation with other clinical traits; 5) integrated downstream Enrichr enrichment analysis and links to other gene set enrichment tools; and 6) visualization of gene loci by Circos plot in any step of the process. The web service is freely accessible online, and the source code is publicly available.

    View details for DOI 10.1016/j.gpb.2019.05.006

    View details for Web of Science ID 000847852700013

    View details for PubMedID 33705981

    View details for PubMedCentralID PMC9403021