Stanford Advisors


All Publications


  • Evaluating Vision and Pathology Foundation Models for Computational Pathology: A Comprehensive Benchmark Study. Research square Gevaert, O., Bareja, R., Carrillo-Perez, F., Zheng, Y., Pizurica, M., Nandi, T., Shen, J., Madduri, R. 2025

    Abstract

    To advance precision medicine in pathology, robust AI-driven foundation models are increasingly needed to uncover complex patterns in large-scale pathology datasets, enabling more accurate disease detection, classification, and prognostic insights. However, despite substantial progress in deep learning and computer vision, the comparative performance and generalizability of these pathology foundation models across diverse histopathological datasets and tasks remain largely unexamined. In this study, we conduct a comprehensive benchmarking of 31 AI foundation models for computational pathology, including general vision models (VM), general vision-language models (VLM), pathology-specific vision models (Path-VM), and pathology-specific vision-language models (Path-VLM), evaluated over 41 tasks sourced from TCGA, CPTAC, external benchmarking datasets, and out-of-domain datasets. Our study demonstrates that Virchow2, a pathology foundation model, delivered the highest performance across TCGA, CPTAC, and external tasks, highlighting its effectiveness in diverse histopathological evaluations. We also show that Path-VM outperformed both Path-VLM and VM, securing top rankings across tasks despite lacking a statistically significant edge over vision models. Our findings reveal that model size and data size did not consistently correlate with improved performance in pathology foundation models, challenging assumptions about scaling in histopathological applications. Lastly, our study demonstrates that a fusion model, integrating top-performing foundation models, achieved superior generalization across external tasks and diverse tissues in histopathological analysis. These findings emphasize the need for further research to understand the underlying factors influencing model performance and to develop strategies that enhance the generalizability and robustness of pathology-specific vision foundation models across different tissue types and datasets. PathBench: https://pathbench.stanford.edu/.

    View details for DOI 10.21203/rs.3.rs-6823810/v1

    View details for PubMedID 40630532

    View details for PubMedCentralID PMC12236927
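
    The abstract above reports that a fusion of top-performing foundation models generalized best across external tasks, but the fusion strategy is not spelled out in this listing. Below is a minimal, hypothetical late-fusion sketch in which slide-level embeddings from several frozen encoders are concatenated and a linear probe is fit on a downstream task; all dimensions, the number of encoders, and the data are illustrative placeholders, not the PathBench implementation.

    ```python
    # Hypothetical late-fusion sketch: concatenate slide-level embeddings from three
    # frozen foundation-model encoders and fit a linear probe. Random data stands in
    # for real embeddings.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n_slides = 500

    # Pretend these came from three different frozen encoders (1024/1536/768-dim).
    emb_a = rng.normal(size=(n_slides, 1024))
    emb_b = rng.normal(size=(n_slides, 1536))
    emb_c = rng.normal(size=(n_slides, 768))
    labels = rng.integers(0, 2, size=n_slides)

    # Late fusion: feature concatenation followed by a linear probe.
    fused = np.concatenate([emb_a, emb_b, emb_c], axis=1)
    X_tr, X_te, y_tr, y_te = train_test_split(fused, labels, test_size=0.2, random_state=0)

    probe = LogisticRegression(max_iter=1000)
    probe.fit(X_tr, y_tr)
    print("AUC:", roc_auc_score(y_te, probe.predict_proba(X_te)[:, 1]))
    ```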

  • Towards a more inductive world for drug repurposing approaches. Nature machine intelligence de la Fuente, J., Serrano, G., Veleiro, U., Casals, M., Vera, L., Pizurica, M., Gomez-Cebrian, N., Puchades-Carrasco, L., Pineda-Lucena, A., Ochoa, I., Vicent, S., Gevaert, O., Hernaez, M. 2025
  • Synthetic multimodal data modelling for data imputation. Nature biomedical engineering Carrillo-Perez, F., Pizurica, M., Marchal, K., Gevaert, O. 2024

    View details for DOI 10.1038/s41551-024-01324-1

    View details for PubMedID 39715898

    View details for PubMedCentralID PMC8971486

  • Digital profiling of gene expression from histology images with linearized attention. Nature communications Pizurica, M., Zheng, Y., Carrillo-Perez, F., Noor, H., Yao, W., Wohlfart, C., Vladimirova, A., Marchal, K., Gevaert, O. 2024; 15 (1): 9886

    Abstract

    Cancer is a heterogeneous disease requiring costly genetic profiling for better understanding and management. Recent advances in deep learning have enabled cost-effective predictions of genetic alterations from whole slide images (WSIs). While transformers have driven significant progress in non-medical domains, their application to WSIs lags behind due to high model complexity and limited dataset sizes. Here, we introduce SEQUOIA, a linearized transformer model that predicts cancer transcriptomic profiles from WSIs. SEQUOIA is developed using 7584 tumor samples across 16 cancer types, with its generalization capacity validated on two independent cohorts comprising 1368 tumors. Accurately predicted genes are associated with key cancer processes, including inflammatory response, cell cycles and metabolism. Further, we demonstrate the value of SEQUOIA in stratifying the risk of breast cancer recurrence and in resolving spatial gene expression at loco-regional levels. SEQUOIA hence deciphers clinically relevant information from WSIs, opening avenues for personalized cancer management.

    View details for DOI 10.1038/s41467-024-54182-5

    View details for PubMedID 39543087
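
    SEQUOIA, described above, uses a linearized transformer so that attention over thousands of slide tiles stays tractable. As a rough illustration of that idea only (not the published architecture), the sketch below pools tile embeddings with kernelized linear attention, whose cost grows linearly in the number of tiles, and regresses a gene-expression vector; the feature map, dimensions, and single learnable query are assumptions.

    ```python
    # Minimal linear-attention pooling over WSI tile embeddings with a regression head.
    # Illustrative only; not the SEQUOIA implementation.
    import torch
    import torch.nn as nn

    class LinearAttentionRegressor(nn.Module):
        def __init__(self, dim=768, n_genes=1000):
            super().__init__()
            self.q = nn.Parameter(torch.randn(1, dim))          # single learnable query
            self.k = nn.Linear(dim, dim)
            self.v = nn.Linear(dim, dim)
            self.head = nn.Linear(dim, n_genes)

        def forward(self, tiles):                               # tiles: (n_tiles, dim)
            phi = lambda t: nn.functional.elu(t) + 1            # positive feature map
            q, k, v = phi(self.q), phi(self.k(tiles)), self.v(tiles)
            kv = k.transpose(0, 1) @ v                          # (dim, dim): linear in n_tiles
            z = q @ k.sum(dim=0, keepdim=True).transpose(0, 1)  # normalizer
            pooled = (q @ kv) / z                               # (1, dim) slide embedding
            return self.head(pooled).squeeze(0)                 # (n_genes,) predicted expression

    model = LinearAttentionRegressor()
    print(model(torch.randn(4096, 768)).shape)                  # torch.Size([1000])
    ```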

  • Towards Digital Quantification of Ploidy from Pan-Cancer Digital Pathology Slides using Deep Learning. bioRxiv : the preprint server for biology Carrillo-Perez, F., Cramer, E. M., Pizurica, M., Andor, N., Gevaert, O. 2024

    Abstract

    Abnormal DNA ploidy, found in numerous cancers, is increasingly being recognized as a contributor to chromosomal instability, genome evolution, and the heterogeneity that fuels cancer cell progression. Furthermore, it has been linked with poor prognosis in cancer patients. While next-generation sequencing can be used to approximate tumor ploidy, it has a high error rate for near-euploid states, is costly, and is time consuming, motivating alternative rapid quantification methods. We introduce PloiViT, a transformer-based model for tumor ploidy quantification that outperforms traditional machine learning models, enabling rapid and cost-effective quantification directly from pathology slides. We trained PloiViT on a dataset of fifteen cancer types from The Cancer Genome Atlas and validated its performance in multiple independent cohorts. Additionally, we explored the impact of self-supervised feature extraction on performance. PloiViT, using self-supervised features, achieved the lowest prediction error in multiple independent cohorts, exhibiting better generalization capabilities. Our findings demonstrate that PloiViT predicts higher ploidy values in aggressive cancer groups and patients with specific mutations, validating its potential as a complement to next-generation sequencing for ploidy assessment. To further promote its use, we release our models as a user-friendly inference application and a Python package for easy adoption.

    View details for DOI 10.1101/2024.08.19.608555

    View details for PubMedID 39229200

    View details for PubMedCentralID PMC11370345
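
    PloiViT, per the abstract above, is a transformer-based regressor of a continuous ploidy value from pathology slides. The sketch below shows one plausible shape for such a model: a standard transformer encoder over pre-extracted tile features with a [CLS]-style token feeding a regression head. Layer sizes and the pooling scheme are assumptions, not the published PloiViT architecture.

    ```python
    # Illustrative transformer-over-tiles regressor for a scalar ploidy value.
    import torch
    import torch.nn as nn

    class PloidyRegressor(nn.Module):
        def __init__(self, dim=384, n_layers=2, n_heads=6):
            super().__init__()
            self.cls = nn.Parameter(torch.zeros(1, 1, dim))     # [CLS]-style summary token
            layer = nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
            self.head = nn.Linear(dim, 1)

        def forward(self, tiles):                               # tiles: (batch, n_tiles, dim)
            cls = self.cls.expand(tiles.size(0), -1, -1)
            x = self.encoder(torch.cat([cls, tiles], dim=1))
            return self.head(x[:, 0]).squeeze(-1)               # one ploidy value per slide

    model = PloidyRegressor()
    print(model(torch.randn(2, 256, 384)).shape)                # torch.Size([2])
    ```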

  • Generation of synthetic whole-slide image tiles of tumours from RNA-sequencing data via cascaded diffusion models. Nature biomedical engineering Carrillo-Perez, F., Pizurica, M., Zheng, Y., Nandi, T. N., Madduri, R., Shen, J., Gevaert, O. 2024

    Abstract

    Training machine-learning models with synthetically generated data can alleviate the problem of data scarcity when acquiring diverse and sufficiently large datasets is costly and challenging. Here we show that cascaded diffusion models can be used to synthesize realistic whole-slide image tiles from latent representations of RNA-sequencing data from human tumours. Alterations in gene expression affected the composition of cell types in the generated synthetic image tiles, which accurately preserved the distribution of cell types and maintained the cell fraction observed in bulk RNA-sequencing data, as we show for lung adenocarcinoma, kidney renal papillary cell carcinoma, cervical squamous cell carcinoma, colon adenocarcinoma and glioblastoma. Machine-learning models pretrained with the generated synthetic data performed better than models trained from scratch. Synthetic data may accelerate the development of machine-learning models in scarce-data settings and allow for the imputation of missing data modalities.

    View details for DOI 10.1038/s41551-024-01193-8

    View details for PubMedID 38514775

  • GeNNius: An ultrafast drug-target interaction inference method based on graph neural networks. Bioinformatics (Oxford, England) Veleiro, U., de la Fuente, J., Serrano, G., Pizurica, M., Casals, M., Pineda-Lucena, A., Vicent, S., Ochoa, I., Gevaert, O., Hernaez, M. 2023

    Abstract

    Drug-target interaction (DTI) prediction is a relevant but challenging task in the drug repurposing field. In-silico approaches have drawn particular attention as they can reduce the costs and time commitment of traditional methodologies. Yet, current state-of-the-art methods present several limitations: existing DTI prediction approaches are computationally expensive, hindering the ability to use large networks and exploit available datasets, and the generalization of DTI prediction methods to unseen datasets remains unexplored, even though exploring it could improve the development of DTI inference approaches in terms of accuracy and robustness. In this work, we introduce GeNNius (Graph Embedding Neural Network Interaction Uncovering System), a Graph Neural Network (GNN)-based method that outperforms state-of-the-art models in terms of both accuracy and time efficiency across a variety of datasets. We also demonstrated its prediction power to uncover new interactions by evaluating previously unknown DTIs for each dataset. We further assessed the generalization capability of GeNNius by training and testing it on different datasets, showing that this framework can potentially improve the DTI prediction task by training on large datasets and testing on smaller ones. Finally, we investigated qualitatively the embeddings generated by GeNNius, revealing that the GNN encoder maintains biological information after the graph convolutions while diffusing this information through nodes, eventually distinguishing protein families in the node embedding space. GeNNius code is available at https://github.com/ubioinformat/GeNNius.

    View details for DOI 10.1093/bioinformatics/btad774

    View details for PubMedID 38134424
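
    GeNNius, as summarized above, encodes drugs and proteins with a GNN over the interaction graph and scores candidate pairs from the resulting embeddings. The toy sketch below conveys that general pattern with a deliberately simple mean-aggregation layer on a dense adjacency matrix and a dot-product edge score; it is not the GeNNius implementation, which is available at the repository linked above.

    ```python
    # Toy GNN link-prediction sketch for drug-target pairs (illustrative only).
    import torch
    import torch.nn as nn

    class MeanConv(nn.Module):
        def __init__(self, in_dim, out_dim):
            super().__init__()
            self.lin = nn.Linear(in_dim, out_dim)

        def forward(self, x, adj):                      # adj: dense (n, n) adjacency
            deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
            return torch.relu(self.lin(adj @ x / deg))  # mean over neighbor features

    torch.manual_seed(0)
    n_drugs, n_prots, dim = 30, 50, 16
    n = n_drugs + n_prots
    x = torch.randn(n, dim)                             # drug and protein node features
    adj = (torch.rand(n, n) < 0.05).float()
    adj = ((adj + adj.t()) > 0).float()                 # symmetrized known interactions

    conv1, conv2 = MeanConv(dim, dim), MeanConv(dim, dim)
    h = conv2(conv1(x, adj), adj)                       # two rounds of message passing

    # Score a candidate drug-protein pair by the dot product of its node embeddings.
    drug, prot = 3, n_drugs + 7
    print(float(torch.sigmoid((h[drug] * h[prot]).sum())))
    ```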

  • Digital profiling of cancer transcriptomes from histology images with grouped vision attention. bioRxiv : the preprint server for biology Zheng, Y., Pizurica, M., Carrillo-Perez, F., Noor, H., Yao, W., Wohlfart, C., Marchal, K., Vladimirova, A., Gevaert, O. 2023

    Abstract

    Cancer is a heterogeneous disease that demands precise molecular profiling for better understanding and management. RNA-sequencing has emerged as a potent tool to unravel the transcriptional heterogeneity. However, large-scale characterization of cancer transcriptomes is hindered by the limitations of costs and tissue accessibility. Here, we develop SEQUOIA, a deep learning model employing a transformer architecture to predict cancer transcriptomes from whole-slide histology images. We pre-train the model using data from 2,242 normal tissues, and the model is fine-tuned and evaluated in 4,218 tumor samples across nine cancer types. The results are further validated across two independent cohorts comprising 1,305 tumors. The highest performance was observed in cancers from breast, kidney and lung, where SEQUOIA accurately predicted 13,798, 10,922 and 9,735 genes, respectively. The well-predicted genes are associated with the regulation of inflammatory response, cell cycles and hypoxia-related metabolic pathways. Leveraging the well-predicted genes, we develop a digital signature to predict the risk of recurrence in breast cancer. While the model is trained at the tissue level, we showcase its potential in predicting spatial gene expression patterns using spatial transcriptomics datasets. SEQUOIA deciphers clinically relevant gene expression patterns from histology images, opening avenues for improved cancer management and personalized therapies.

    View details for DOI 10.1101/2023.09.28.560068

    View details for PubMedID 37808782

  • Synthetic whole-slide image tile generation with gene expression profile-infused deep generative models. Cell reports methods Carrillo-Perez, F., Pizurica, M., Ozawa, M. G., Vogel, H., West, R. B., Kong, C. S., Herrera, L. J., Shen, J., Gevaert, O. 2023; 3 (8): 100534

    Abstract

    In this work, we propose an approach to generate whole-slide image (WSI) tiles by using deep generative models infused with matched gene expression profiles. First, we train a variational autoencoder (VAE) that learns a latent, lower-dimensional representation of multi-tissue gene expression profiles. Then, we use this representation to infuse generative adversarial networks (GANs) that generate lung and brain cortex tissue tiles, resulting in a new model that we call RNA-GAN. Tiles generated by RNA-GAN were preferred by expert pathologists over tiles generated using traditional GANs, and in addition, RNA-GAN needs fewer training epochs to generate high-quality tiles. Finally, RNA-GAN was able to generalize to gene expression profiles outside of the training set, showing imputation capabilities. A web-based quiz is available for users to play a game distinguishing real and synthetic tiles (https://rna-gan.stanford.edu/), and the code for RNA-GAN is available at https://github.com/gevaertlab/RNA-GAN.

    View details for DOI 10.1016/j.crmeth.2023.100534

    View details for PubMedID 37671024
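
    RNA-GAN, described above, conditions tile synthesis on a VAE-learned latent representation of gene expression. As a rough illustration only (layer sizes and shapes are invented, not the RNA-GAN code), the sketch below compresses an expression profile into a latent code and concatenates it with noise to drive a conditional generator that emits an RGB tile.

    ```python
    # Minimal conditional-generation sketch: RNA latent code + noise -> image tile.
    import torch
    import torch.nn as nn

    class RNAEncoder(nn.Module):                        # VAE-style encoder (mean only, for brevity)
        def __init__(self, n_genes=5000, latent=128):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(n_genes, 512), nn.ReLU(), nn.Linear(512, latent))

        def forward(self, expr):
            return self.net(expr)

    class ConditionalGenerator(nn.Module):
        def __init__(self, noise=100, latent=128, tile=64):
            super().__init__()
            self.tile = tile
            self.net = nn.Sequential(
                nn.Linear(noise + latent, 1024), nn.ReLU(),
                nn.Linear(1024, 3 * tile * tile), nn.Tanh())

        def forward(self, z, rna_code):                 # concatenate noise and RNA latent code
            out = self.net(torch.cat([z, rna_code], dim=1))
            return out.view(-1, 3, self.tile, self.tile)

    enc, gen = RNAEncoder(), ConditionalGenerator()
    code = enc(torch.randn(8, 5000))                    # batch of expression profiles
    print(gen(torch.randn(8, 100), code).shape)         # torch.Size([8, 3, 64, 64])
    ```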

  • Spatial cellular architecture predicts prognosis in glioblastoma. Nature communications Zheng, Y., Carrillo-Perez, F., Pizurica, M., Heiland, D. H., Gevaert, O. 2023; 14 (1): 4122

    Abstract

    Intra-tumoral heterogeneity and cell-state plasticity are key drivers for the therapeutic resistance of glioblastoma. Here, we investigate the association between spatial cellular organization and glioblastoma prognosis. Leveraging single-cell RNA-seq and spatial transcriptomics data, we develop a deep learning model to predict transcriptional subtypes of glioblastoma cells from histology images. Employing this model, we phenotypically analyze 40 million tissue spots from 410 patients and identify consistent associations between tumor architecture and prognosis across two independent cohorts. Patients with poor prognosis exhibit higher proportions of tumor cells expressing a hypoxia-induced transcriptional program. Furthermore, a clustering pattern of astrocyte-like tumor cells is associated with worse prognosis, while dispersion and connection of the astrocytes with other transcriptional subtypes correlate with decreased risk. To validate these results, we develop a separate deep learning model that utilizes histology images to predict prognosis. Applying this model to spatial transcriptomics data reveals survival-associated regional gene expression programs. Overall, our study presents a scalable approach to unravel the transcriptional heterogeneity of glioblastoma and establishes a critical connection between spatial cellular architecture and clinical outcomes.

    View details for DOI 10.1038/s41467-023-39933-0

    View details for PubMedID 37433817

    View details for PubMedCentralID PMC10336135

  • Whole slide imaging-based prediction of TP53 mutations identifies an aggressive disease phenotype in prostate cancer. Cancer research Pizurica, M., Larmuseau, M., Van der Eecken, K., de Schaetzen van Brienen, L., Carrillo-Perez, F., Isphording, S., Lumen, N., Van Dorpe, J., Ost, P., Verbeke, S., Gevaert, O., Marchal, K. 2023

    Abstract

    In prostate cancer, there is an urgent need for objective prognostic biomarkers that identify the metastatic potential of a tumor at an early stage. While recent analyses indicated TP53 mutations as candidate biomarkers, molecular profiling in a clinical setting is complicated by tumor heterogeneity. Deep learning models that predict the spatial presence of TP53 mutations in whole slide images (WSIs) offer the potential to mitigate this issue. To assess the potential of WSIs as proxies for spatially resolved profiling and as biomarkers for aggressive disease, we developed TiDo, a deep learning model that achieves state-of-the-art performance in predicting TP53 mutations from WSIs of primary prostate tumors. In an independent multi-focal cohort, the model showed successful generalization at both the patient and lesion level. Analysis of model predictions revealed that false positive (FP) predictions could at least partially be explained by TP53 deletions, suggesting that some FPs carry an alteration that leads to the same histological phenotype as TP53 mutations. Comparative expression and histological cell type analyses identified a TP53-like cellular phenotype triggered by expression of pathways affecting stromal composition. Together, these findings indicate that WSI-based models might not be able to perfectly predict the spatial presence of individual TP53 mutations, but they have the potential to elucidate the prognosis of a tumor by depicting a downstream phenotype associated with aggressive disease biomarkers.

    View details for DOI 10.1158/0008-5472.CAN-22-3113

    View details for PubMedID 37352385

  • Multimodal data fusion for cancer biomarker discovery with deep learning. Nature machine intelligence Steyaert, S., Pizurica, M., Nagaraj, D., Khandelwal, P., Hernandez-Boussard, T., Gentles, A. J., Gevaert, O. 2023; 5 (4): 351-362

    Abstract

    Technological advances now make it possible to study a patient from multiple angles with high-dimensional, high-throughput multi-scale biomedical data. In oncology, massive amounts of data are being generated, ranging from molecular and histopathology data to radiology and clinical records. The introduction of deep learning has significantly advanced the analysis of biomedical data. However, most approaches focus on single data modalities, leading to slow progress in methods to integrate complementary data types. Development of effective multimodal fusion approaches is becoming increasingly important, as a single modality may not be consistent or sufficient to capture the heterogeneity of complex diseases, tailor medical care and improve personalised medicine. Many initiatives now focus on integrating these disparate modalities to unravel the biological processes involved in multifactorial diseases such as cancer. However, many obstacles remain, including lack of usable data as well as methods for clinical validation and interpretation. Here, we cover these current challenges and reflect on opportunities through deep learning to tackle data sparsity and scarcity, multimodal interpretability, and standardisation of datasets.

    View details for DOI 10.1038/s42256-023-00633-5

    View details for PubMedID 37693852

    View details for PubMedCentralID PMC10484010