Bio


My research lies at the intersection of computer vision and deep learning, focusing on the development of intelligent systems to advance large-scale image analytics in computational pathology, emphasizing histopathology-based survival analysis, whole slide image classification, and multi-modal representation learning. Through this research, I aim to bridge perceptual understanding and medical reasoning, enabling models to achieve high predictive accuracy and offer clinically interpretable insights for precision medicine.

Stanford Advisors


All Publications


  • A deep learning-based automated pipeline for colorectal cancer detection in contrast-enhanced CT images. Computerized medical imaging and graphics : the official journal of the Computerized Medical Imaging Society Qiu, C., Miller, S., Subramanian, B., Ryu, A., Zhang, H., Fisher, G. A., Shah, N. H., Mongan, J., Langlotz, C., Poullos, P., Shen, J. 2026; 128: 102717

    Abstract

    Colorectal cancer (CRC) is the third most commonly diagnosed malignancy worldwide and a leading cause of cancer-related mortality. This study aims to investigate an automatic detection pipeline for identification and localization of the primary CRC in portal venous phase contrast-enhanced CT scans, which is a crucial first step for downstream CRC staging, prognostication, and treatment planning. We propose a deep learning-based automated detection pipeline using YOLOv11 as the baseline architecture. A ResNet50 module was incorporated into the YOLOv11 backbone to enhance image feature extraction. Additionally, a scale-adaptive loss function, which introduces an adaptive coefficient and a scaling factor to adaptively measure the Intersection over Union (IoU) and center point distance for improving box regression performance, was designed to further improve detection performance. The proposed pipeline achieved a recall of 0.8092, precision of 0.8187, and F-1 score of 0.8139 for CRC detection on our in-house dataset at the patient level (inter-patient evaluation) and a recall of 0.9949, precision of 0.9894, and F-1 score of 0.9921 at the slice level (intra-patient evaluation). Validation on an external public dataset demonstrated that our pipeline, when trained on a patient-level in-house dataset, obtained a recall of 0.8283, precision of 0.8414, and F-1 score of 0.8348 and, when trained on a slice-level in-house dataset, achieved a recall of 0.6897, precision of 0.7888, and F-1 score of 0.7358, outperforming existing representative detection methods. The superior CRC detection performance on the in-house CT dataset and state-of-the-art generalization performance on the public dataset (with a 31.97 %age point improvement in detection sensitivity (recall) over the next closest state-of-the-art method), highlight the potential translational value of our pipeline for CRC clinical decision support, conditional upon validation in larger cohorts.

    View details for DOI 10.1016/j.compmedimag.2026.102717

    View details for PubMedID 41633187

  • STARC-9: A Large-scale Dataset for Multi-Class Tissue Classification for CRC Histopathology. ArXiv Subramanian, B., Jeyaraj, R., Peterson, M. N., Guo, T., Shah, N., Langlotz, C., Ng, A. Y., Shen, J. 2025

    Abstract

    Multi-class tissue-type classification of colorectal cancer (CRC) histopathologic images is a significant step in the development of downstream machine learning models for diagnosis and treatment planning. However, existing public CRC datasets often lack morphologic diversity, suffer from class imbalance, and contain low-quality image tiles, limiting model performance and generalizability. To address these issues, we introduce STARC-9 (STAnford coloRectal Cancer), a large-scale dataset for multi-class tissue classification. STARC-9 contains 630,000 hematoxylin and eosin-stained image tiles uniformly sampled across nine clinically relevant tissue classes (70,000 tiles per class) from 200 CRC patients at the Stanford University School of Medicine. The dataset was built using a novel framework, DeepCluster++, designed to ensure intra-class diversity and reduce manual curation. First, an encoder from a histopathology-specific autoencoder extracts feature vectors from tiles within each whole-slide image. Then, K-means clustering groups morphologically similar tiles, followed by equal-frequency binning to sample diverse morphologic patterns within each class. The selected tiles are subsequently verified by expert gastrointestinal pathologists to ensure accuracy. This semi-automated process significantly reduces manual effort while producing high-quality, diverse tiles. To evaluate STARC-9, we benchmarked convolutional neural networks, transformers, and pathology-specific foundation models on multi-class CRC tissue classification and segmentation tasks, showing superior generalizability compared to models trained on existing datasets. Although we demonstrate the utility of DeepCluster++ on CRC as a pilot use-case, it is a flexible framework that can be used for constructing high-quality datasets from large WSI repositories across a wide range of cancer and non-cancer applications.

    View details for PubMedID 41953104

    View details for PubMedCentralID PMC13054540