Current Role at Stanford

Research and Development Scientist and Engineer at the Urologic Cancer Innovation Lab, Urology Department.

Honors & Awards

  • Fourth Place in the PI-CAI 2023 Grand Challenge (the only team from the US), PI-CAI (2023)
  • Second Place in the Learn2Reg Grand Challenge, MICCAI (2021)
  • Second Place in the Multi-sequence Cardiac MR Segmentation Challenge (MS-CMRSeg), MICCAI-STACOM (2019)
  • Second Place in the Atrial Segmentation Challenge, MICCAI-STACOM (2018)
  • Graduate School Scholarship Programme (GSSP) Recipient, German Academic Exchange Service (DAAD) (2016)
  • SAARC India Silver Jubilee Scholarships Recipient, SAARC (2011)

Education & Certifications

  • PhD, Friedrich-Alexander-Universität Erlangen-Nürnberg, Medical Engineering (2021)
  • M.Sc., South Asian University, Computer Science (2013)
  • B.Sc., Kabul University, Computer Science (2010)

Work Experience

  • Research Data Scientist, University of California San Francisco (UCSF) (8/1/2020 - 2/28/2021)


    San Francisco, CA

All Publications

  • Learn2Reg: Comprehensive Multi-Task Medical Image Registration Challenge, Dataset and Evaluation in the Era of Deep Learning IEEE TRANSACTIONS ON MEDICAL IMAGING Hering, A., Hansen, L., Mok, T. W., Chung, A. S., Siebert, H., Hager, S., Lange, A., Kuckertz, S., Heldmann, S., Shao, W., Vesal, S., Rusu, M., Sonn, G., Estienne, T., Vakalopoulou, M., Han, L., Huang, Y., Yap, P., Brudfors, M., Balbastre, Y., Joutard, S., Modat, M., Lifshitz, G., Raviv, D., Lv, J., Li, Q., Jaouen, V., Visvikis, D., Fourcade, C., Rubeaux, M., Pan, W., Xu, Z., Jian, B., De Benetti, F., Wodzinski, M., Gunnarsson, N., Sjolund, J., Grzech, D., Qiu, H., Li, Z., Thorley, A., Duan, J., Grossbroehmer, C., Hoopes, A., Reinertsen, I., Xiao, Y., Landman, B., Huo, Y., Murphy, K., Lessmann, N., van Ginneken, B., Dalca, A. V., Heinrich, M. P. 2023; 42 (3): 697-712


    Image registration is a fundamental medical image analysis task, and a wide variety of approaches have been proposed. However, only a few studies have comprehensively compared medical image registration approaches on a wide range of clinically relevant tasks. This limits the development of registration methods, the adoption of research advances into practice, and a fair benchmark across competing approaches. The Learn2Reg challenge addresses these limitations by providing a multi-task medical image registration data set for comprehensive characterisation of deformable registration algorithms. A continuous evaluation will be possible. Learn2Reg covers a wide range of anatomies (brain, abdomen, and thorax), modalities (ultrasound, CT, MR), availability of annotations, as well as intra- and inter-patient registration evaluation. We established an easily accessible framework for training and validation of 3D registration methods, which enabled the compilation of results of over 65 individual method submissions from more than 20 unique teams. We used a complementary set of metrics, including robustness, accuracy, plausibility, and runtime, enabling unique insight into the current state-of-the-art of medical image registration. This paper describes datasets, tasks, evaluation methods and results of the challenge, as well as results of further analysis of transferability to new datasets, the importance of label supervision, and resulting bias. While no single approach worked best across all tasks, many methodological aspects could be identified that push the performance of medical image registration to new state-of-the-art performance. Furthermore, we demystified the common belief that conventional registration methods have to be much slower than deep-learning-based methods.

    View details for DOI 10.1109/TMI.2022.3213983

    View details for Web of Science ID 000971629600011

    View details for PubMedID 36264729

  • The Association of Tissue Change and Treatment Success During High-intensity Focused Ultrasound Focal Therapy for Prostate Cancer. European urology focus Khandwala, Y. S., Soerensen, S. J., Morisetty, S., Ghanouni, P., Fan, R. E., Vesal, S., Rusu, M., Sonn, G. A. 2022


    BACKGROUND: Tissue preservation strategies have been increasingly used for the management of localized prostate cancer. Focal ablation using ultrasound-guided high-intensity focused ultrasound (HIFU) has demonstrated promising short and medium-term oncological outcomes. Advancements in HIFU therapy such as the introduction of tissue change monitoring (TCM) aim to further improve treatment efficacy. OBJECTIVE: To evaluate the association between intraoperative TCM during HIFU focal therapy for localized prostate cancer and oncological outcomes 12 mo afterward. DESIGN, SETTING, AND PARTICIPANTS: Seventy consecutive men at a single institution with prostate cancer were prospectively enrolled. Men with prior treatment, metastases, or pelvic radiation were excluded to obtain a final cohort of 55 men. INTERVENTION: All men underwent HIFU focal therapy followed by magnetic resonance (MR)-fusion biopsy 12 mo later. Tissue change was quantified intraoperatively by measuring the backscatter of ultrasound waves during ablation. OUTCOME MEASUREMENTS AND STATISTICAL ANALYSIS: Gleason grade group (GG) ≥2 cancer on postablation biopsy was the primary outcome. Secondary outcomes included GG ≥1 cancer, Prostate Imaging Reporting and Data System (PI-RADS) scores ≥3, and evidence of tissue destruction on post-treatment magnetic resonance imaging (MRI). A Student's t-test analysis was performed to evaluate the mean TCM scores and efficacy of ablation measured by histopathology. Multivariate logistic regression was also performed to identify the odds of residual cancer for each unit increase in the TCM score. RESULTS AND LIMITATIONS: A lower mean TCM score within the region of the tumor (0.70 vs 0.97, p=0.02) was associated with the presence of persistent GG ≥2 cancer after HIFU treatment. Adjusting for initial prostate-specific antigen, PI-RADS score, Gleason GG, positive cores, and age, each incremental increase of TCM was associated with an 89% reduction in the odds (odds ratio: 0.11, confidence interval: 0.01-0.97) of having residual GG ≥2 cancer on postablation biopsy. Men with higher mean TCM scores (0.99 vs 0.72, p=0.02) at the time of treatment were less likely to have abnormal MRI (PI-RADS ≥3) at 12 mo postoperatively. Cases with high TCM scores also had greater tissue destruction measured on MRI and fewer visible lesions on postablation MRI. CONCLUSIONS: Tissue change measured using TCM values during focal HIFU of the prostate was associated with histopathology and radiological outcomes 12 mo after the procedure. PATIENT SUMMARY: In this report, we looked at how well ultrasound changes of the prostate during focal high-intensity focused ultrasound (HIFU) therapy for the treatment of prostate cancer predict patient outcomes. We found that greater tissue change measured by the HIFU device was associated with less residual cancer at 1 yr. This tool should be used to ensure optimal ablation of the cancer and may improve focal therapy outcomes in the future.

    View details for DOI 10.1016/j.euf.2022.10.010

    View details for PubMedID 36372735

  • A review of artificial intelligence in prostate cancer detection on imaging. Therapeutic advances in urology Bhattacharya, I., Khandwala, Y. S., Vesal, S., Shao, W., Yang, Q., Soerensen, S. J., Fan, R. E., Ghanouni, P., Kunder, C. A., Brooks, J. D., Hu, Y., Rusu, M., Sonn, G. A. 2022; 14: 17562872221128791


    A multitude of studies have explored the role of artificial intelligence (AI) in providing diagnostic support to radiologists, pathologists, and urologists in prostate cancer detection, risk-stratification, and management. This review provides a comprehensive overview of relevant literature regarding the use of AI models in (1) detecting prostate cancer on radiology images (magnetic resonance and ultrasound imaging), (2) detecting prostate cancer on histopathology images of prostate biopsy tissue, and (3) assisting in supporting tasks for prostate cancer detection (prostate gland segmentation, MRI-histopathology registration, MRI-ultrasound registration). We discuss both the potential of these AI models to assist in the clinical workflow of prostate cancer diagnosis, as well as the current limitations including variability in training data sets, algorithms, and evaluation criteria. We also discuss ongoing challenges and what is needed to bridge the gap between academic research on AI for prostate cancer and commercial solutions that improve routine clinical care.

    View details for DOI 10.1177/17562872221128791

    View details for PubMedID 36249889

    View details for PubMedCentralID PMC9554123

  • Domain generalization for prostate segmentation in transrectal ultrasound images: A multi-center study. Medical image analysis Vesal, S., Gayo, I., Bhattacharya, I., Natarajan, S., Marks, L. S., Barratt, D. C., Fan, R. E., Hu, Y., Sonn, G. A., Rusu, M. 2022; 82: 102620


    Prostate biopsy and image-guided treatment procedures are often performed under the guidance of ultrasound fused with magnetic resonance images (MRI). Accurate image fusion relies on accurate segmentation of the prostate on ultrasound images. Yet, the reduced signal-to-noise ratio and artifacts (e.g., speckle and shadowing) in ultrasound images limit the performance of automated prostate segmentation techniques and generalizing these methods to new image domains is inherently difficult. In this study, we address these challenges by introducing a novel 2.5D deep neural network for prostate segmentation on ultrasound images. Our approach addresses the limitations of transfer learning and finetuning methods (i.e., drop in performance on the original training data when the model weights are updated) by combining a supervised domain adaptation technique and a knowledge distillation loss. The knowledge distillation loss allows the preservation of previously learned knowledge and reduces the performance drop after model finetuning on new datasets. Furthermore, our approach relies on an attention module that considers model feature positioning information to improve the segmentation accuracy. We trained our model on 764 subjects from one institution and finetuned our model using only ten subjects from subsequent institutions. We analyzed the performance of our method on three large datasets encompassing 2067 subjects from three different institutions. Our method achieved an average Dice Similarity Coefficient (Dice) of 94.0±0.03 and Hausdorff Distance (HD95) of 2.28mm in an independent set of subjects from the first institution. Moreover, our model generalized well in the studies from the other two institutions (Dice: 91.0±0.03; HD95: 3.7mm and Dice: 82.0±0.03; HD95: 7.1mm). 
    We introduced an approach that successfully segmented the prostate on ultrasound images in a multi-center study, suggesting its clinical potential to facilitate the accurate fusion of ultrasound and MRI images to drive biopsy and image-guided treatments.

    View details for DOI 10.1016/

    View details for PubMedID 36148705

  • Cardiac segmentation on late gadolinium enhancement MRI: A benchmark study from multi-sequence cardiac MR segmentation challenge. Medical image analysis Zhuang, X., Xu, J., Luo, X., Chen, C., Ouyang, C., Rueckert, D., Campello, V. M., Lekadir, K., Vesal, S., RaviKumar, N., Liu, Y., Luo, G., Chen, J., Li, H., Ly, B., Sermesant, M., Roth, H., Zhu, W., Wang, J., Ding, X., Wang, X., Yang, S., Li, L. 2022; 81: 102528


    Accurate computing, analysis and modeling of the ventricles and myocardium from medical images are important, especially in the diagnosis and treatment management for patients suffering from myocardial infarction (MI). Late gadolinium enhancement (LGE) cardiac magnetic resonance (CMR) provides an important protocol to visualize MI. However, compared with the other sequences LGE CMR images with gold standard labels are particularly limited. This paper presents the selective results from the Multi-Sequence Cardiac MR (MS-CMR) Segmentation challenge, in conjunction with MICCAI 2019. The challenge offered a data set of paired MS-CMR images, including auxiliary CMR sequences as well as LGE CMR, from 45 patients who underwent cardiomyopathy. It was aimed to develop new algorithms, as well as benchmark existing ones for LGE CMR segmentation focusing on myocardial wall of the left ventricle and blood cavity of the two ventricles. In addition, the paired MS-CMR images could enable algorithms to combine the complementary information from the other sequences for the ventricle segmentation of LGE CMR. Nine representative works were selected for evaluation and comparisons, among which three methods are unsupervised domain adaptation (UDA) methods and the other six are supervised. The results showed that the average performance of the nine methods was comparable to the inter-observer variations. Particularly, the top-ranking algorithms from both the supervised and UDA methods could generate reliable and robust segmentation results. The success of these methods was mainly attributed to the inclusion of the auxiliary sequences from the MS-CMR images, which provide important label information for the training of deep neural networks. The challenge continues as an ongoing resource, and the gold standard segmentation as well as the MS-CMR images of both the training and test data are available upon registration via its homepage (

    View details for DOI 10.1016/

    View details for PubMedID 35834896

  • Deep learning based denoising of mammographic x-ray images: An investigation of loss functions and their detail-preserving properties Eckert, D., Ritschl, L., Herbst, M., Wicklein, J., Vesal, S., Kappler, S., Maier, A., Stober, S., Zhao, W., Yu, L. SPIE-INT SOC OPTICAL ENGINEERING. 2022

    View details for DOI 10.1117/12.2612403

    View details for Web of Science ID 000836294000064

  • Adapt Everywhere: Unsupervised Adaptation of Point-Clouds and Entropy Minimisation for Multi-modal Cardiac Image Segmentation. IEEE transactions on medical imaging Vesal, S., Gu, M., Kosti, R., Maier, A., Ravikumar, N. 2021; PP


    Deep learning models are sensitive to domain shift phenomena. A model trained on images from one domain cannot generalise well when tested on images from a different domain, despite capturing similar anatomical structures. It is mainly because the data distribution between the two domains is different. Moreover, creating annotation for every new modality is a tedious and time-consuming task, which also suffers from high inter- and intra-observer variability. Unsupervised domain adaptation (UDA) methods intend to reduce the gap between source and target domains by leveraging source domain labelled data to generate labels for the target domain. However, current state-of-the-art (SOTA) UDA methods demonstrate degraded performance when there is insufficient data in source and target domains. In this paper, we present a novel UDA method for multi-modal cardiac image segmentation. The proposed method is based on adversarial learning and adapts network features between source and target domain in different spaces. The paper introduces an end-to-end framework that integrates: a) entropy minimisation, b) output feature space alignment and c) a novel point-cloud shape adaptation based on the latent features learned by the segmentation model. We validated our method on two cardiac datasets by adapting from the annotated source domain, bSSFP-MRI (balanced Steady-State Free Precession-MRI), to the unannotated target domain, LGE-MRI (Late Gadolinium Enhancement-MRI), for the multi-sequence dataset; and from MRI (source) to CT (target) for the cross-modality dataset. The results highlighted that by enforcing adversarial learning in different parts of the network, the proposed method delivered promising performance, compared to other SOTA methods.

    View details for DOI 10.1109/TMI.2021.3066683

    View details for PubMedID 33729930

  • A global benchmark of algorithms for segmenting the left atrium from late gadolinium-enhanced cardiac magnetic resonance imaging. Medical image analysis Xiong, Z., Xia, Q., Hu, Z., Huang, N., Bian, C., Zheng, Y., Vesal, S., Ravikumar, N., Maier, A., Yang, X., Heng, P. A., Ni, D., Li, C., Tong, Q., Si, W., Puybareau, E., Khoudli, Y., Géraud, T., Chen, C., Bai, W., Rueckert, D., Xu, L., Zhuang, X., Luo, X., Jia, S., Sermesant, M., Liu, Y., Wang, K., Borra, D., Masci, A., Corsi, C., de Vente, C., Veta, M., Karim, R., Preetha, C. J., Engelhardt, S., Qiao, M., Wang, Y., Tao, Q., Nuñez-Garcia, M., Camara, O., Savioli, N., Lamata, P., Zhao, J. 2021; 67: 101832


    Segmentation of medical images, particularly late gadolinium-enhanced magnetic resonance imaging (LGE-MRI) used for visualizing diseased atrial structures, is a crucial first step for ablation treatment of atrial fibrillation. However, direct segmentation of LGE-MRIs is challenging due to the varying intensities caused by contrast agents. Since most clinical studies have relied on manual, labor-intensive approaches, automatic methods are of high interest, particularly optimized machine learning approaches. To address this, we organized the 2018 Left Atrium Segmentation Challenge using 154 3D LGE-MRIs, currently the world's largest atrial LGE-MRI dataset, and associated labels of the left atrium segmented by three medical experts, ultimately attracting the participation of 27 international teams. In this paper, extensive analysis of the submitted algorithms using technical and biological metrics was performed by undergoing subgroup analysis and conducting hyper-parameter analysis, offering an overall picture of the major design choices of convolutional neural networks (CNNs) and practical considerations for achieving state-of-the-art left atrium segmentation. Results show that the top method achieved a Dice score of 93.2% and a mean surface to surface distance of 0.7 mm, significantly outperforming prior state-of-the-art. Particularly, our analysis demonstrated that double sequentially used CNNs, in which a first CNN is used for automatic region-of-interest localization and a subsequent CNN is used for refined regional segmentation, achieved superior results than traditional methods and machine learning approaches containing single CNNs. This large-scale benchmarking study makes a significant step towards much-improved segmentation methods for atrial LGE-MRIs, and will serve as an important benchmark for evaluating and comparing the future works in the field. 
    Furthermore, the findings from this study can potentially be extended to other imaging datasets and modalities, having an impact on the wider medical imaging community.

    View details for DOI 10.1016/

    View details for PubMedID 33166776

  • Spatio-temporal Multi-task Learning for Cardiac MRI Left Ventricle Quantification. IEEE journal of biomedical and health informatics Vesal, S., Gu, M., Maier, A., Ravikumar, N. 2020; PP


    Quantitative assessment of cardiac left ventricle (LV) morphology is essential to assess cardiac function and improve the diagnosis of different cardiovascular diseases. In current clinical practice, LV quantification depends on the measurement of myocardial shape indices, which is usually achieved by manual delineation. However, this process is time-consuming and subject to inter- and intra-observer variability. In this paper, we propose a spatio-temporal multi-task learning approach to obtain a complete set of measurements quantifying cardiac LV morphology, regional-wall thickness (RWT), and additionally detecting the cardiac phase cycle (systole and diastole) for a given 3D Cine-magnetic resonance (MR) image sequence. We first segment cardiac LVs using an encoder-decoder network and then introduce a multitask framework to regress 11 LV indices and classify the cardiac phase, as parallel tasks during model optimization. The proposed deep learning model is based on the 3D spatio-temporal convolutions, which extract spatial and temporal features from MR images. We demonstrate the efficacy of the proposed method using cine-MR sequences of 145 subjects and comparing the performance with other state-of-the-art quantification methods. The proposed method achieved high prediction accuracy, with an average mean absolute error (MAE) of 129 mm², 1.23 mm, 1.76 mm, Pearson correlation coefficient (PCC) of 96.4%, 87.2%, and 97.5% for LV and myocardium (Myo) cavity regions, 6 RWTs, 3 LV dimensions, and an error rate of 9.0% for phase classification. The experimental results highlight the robustness of the proposed method, despite varying degrees of cardiac morphology, image appearance, and low contrast in the cardiac MR sequences.

    View details for DOI 10.1109/JBHI.2020.3046449

    View details for PubMedID 33351771

  • Fully Automated 3D Cardiac MRI Localisation and Segmentation Using Deep Neural Networks JOURNAL OF IMAGING Vesal, S., Maier, A., Ravikumar, N. 2020; 6 (7)
  • Implementation of machine learning into clinical breast MRI: Potential for objective and accurate decision-making in suspicious breast masses. PloS one Ellmann, S., Wenkel, E., Dietzel, M., Bielowski, C., Vesal, S., Maier, A., Hammon, M., Janka, R., Fasching, P. A., Beckmann, M. W., Schulz Wendtland, R., Uder, M., Bäuerle, T. 2020; 15 (1): e0228446


    We investigated whether the integration of machine learning (ML) into MRI interpretation can provide accurate decision rules for the management of suspicious breast masses. A total of 173 consecutive patients with suspicious breast masses upon complementary assessment (BI-RADS IV/V: n = 100/76) received standardized breast MRI prior to histological verification. MRI findings were independently assessed by two observers (R1/R2: 5 years of experience/no experience in breast MRI) using six (semi-)quantitative imaging parameters. Interobserver variability was studied by ICC (intraclass correlation coefficient). A polynomial kernel function support vector machine was trained to differentiate between benign and malignant lesions based on the six imaging parameters and patient age. Ten-fold cross-validation was applied to prevent overfitting. Overall diagnostic accuracy and decision rules (rule-out criteria) to accurately exclude malignancy were evaluated. Results were integrated into a web application and published online. Malignant lesions were present in 107 patients (60.8%). Imaging features showed excellent interobserver variability (ICC: 0.81-0.98) with variable diagnostic accuracy (AUC: 0.65-0.82). Overall performance of the ML algorithm was high (AUC = 90.1%; BI-RADS IV: AUC = 91.6%). The ML algorithm provided decision rules to accurately rule-out malignancy with a false negative rate <1% in 31.3% of the BI-RADS IV cases. Thus, integration of ML into MRI interpretation can provide objective and accurate decision rules for the management of suspicious breast masses, and could help to reduce the number of potentially unnecessary biopsies.

    View details for DOI 10.1371/journal.pone.0228446

    View details for PubMedID 31999755

    View details for PubMedCentralID PMC6992224

  • Classification of Breast Cancer Histology Images Using Transfer Learning Vesal, S., Ravikumar, N., Davari, A., Ellmann, S., Maier, A., Campilho, A., Karray, F., Romeny, B. T. SPRINGER INTERNATIONAL PUBLISHING AG. 2018: 812–19
  • A Multi-task Framework for Skin Lesion Detection and Segmentation Vesal, S., Patil, S., Ravikumar, N., Maier, A. K., Stoyanov, D., Taylor, Z., Sarikaya, D., McLeod, J., Ballester, M. A., Codella, N. C., Martel, A., Maier-Hein, L., Malpani, A., Zenati, M. A., De Ribaupierre, S., Xiongbiao, L., Collins, T., Reichl, T., Drechsler, K., Erdt, M., Linguraru, M. G., Laura, C. O., Shekhar, R., Wesarg, S., Celebi, M. E., Dana, K., Halpern, A. SPRINGER INTERNATIONAL PUBLISHING AG. 2018: 285–93