Bio


Dr. Yongkai Liu is a postdoctoral scholar at Stanford's Center for Advanced Functional Neuroimaging, led by Drs. Greg Zaharchuk and Michael Moseley. His interests lie in developing and evaluating advanced techniques for improving treatment decision-making and prognostics in brain diseases, especially stroke, using imaging and deep learning.

Before joining Stanford, he earned a Ph.D. from UCLA, majoring in Physics and Biology in Medicine, under the supervision of Prof. Kyung Sung. This gave him a solid foundation in medicine, deep learning, and physics. His Ph.D. thesis, titled "Advancing Segmentation and Classification Methods in Magnetic Resonance Imaging via Artificial Intelligence," focused on developing advanced deep learning and machine learning techniques for MRI-based clinical applications. During his master's degree, he studied CT Virtual Colonoscopy under the supervision of Prof. Jerome Liang. In addition, he has served as a reviewing editor for Frontiers in Oncology and as a peer reviewer for several leading journals in medical imaging, such as Medical Physics, Scientific Reports, British Journal of Radiology, BJR|Artificial Intelligence, Annals of Clinical and Translational Neurology, IEEE Transactions on Medical Imaging, IEEE Journal of Biomedical and Health Informatics, IEEE Transactions on Radiation and Plasma Medical Sciences, IEEE Transactions on Biomedical Engineering, and IEEE Transactions on Neural Networks and Learning Systems.

Dr. Liu is an emerging leader in neuroimaging, stroke, and AI whose work has earned wide recognition. He received the 2024 AJNR Lucien Levy Award and the David M. Yousem Research Fellow Award, and he was a semi-finalist for the 2024 Cornelius G. Dyke Award, honors that underscore his potential to make significant future contributions. (https://med.stanford.edu/rsl/news/yongkai-liu-receives-research-fellow-award.html)

Honors & Awards


  • UCLA PhD Fellowship, UCLA (2018)
  • Lucien Levy Award, American Journal of Neuroradiology (2024)
  • David M. Yousem Research Fellow Award, American Society of Neuroradiology (2024)
  • Semi-finalist for the 2024 Cornelius G. Dyke Award, American Society of Neuroradiology (2024)

Professional Education


  • Master of Engineering, Tsinghua University (2017)
  • Doctor of Philosophy, University of California Los Angeles (2022)

All Publications


  • A Clinical and Imaging Fused Deep Learning Model Matches Expert Clinician Prediction of 90-Day Stroke Outcomes. AJNR. American journal of neuroradiology Liu, Y., Shah, P., Yu, Y., Horsey, J., Ouyang, J., Jiang, B., Yang, G., Heit, J. J., McCullough-Hicks, M. E., Hugdal, S. M., Wintermark, M., Michel, P., Liebeskind, D. S., Lansberg, M. G., Albers, G. W., Zaharchuk, G. 2024

    Abstract

    Predicting long-term clinical outcome in acute ischemic stroke is beneficial for prognosis, clinical trial design, resource management, and patient expectations. This study used a deep learning-based predictive model (DLPD) to predict 90-day mRS outcomes and compared its predictions with those made by physicians. A previously developed DLPD that incorporated DWI and clinical data from the acute period was used to predict 90-day mRS outcomes in 80 consecutive patients with acute ischemic stroke from a single-center registry. We assessed the predictions of the model alongside those of 5 physicians (2 stroke neurologists and 3 neuroradiologists provided with the same imaging and clinical information). The primary analysis was the agreement between the ordinal mRS predictions of the model or physician and the ground truth using the Gwet Agreement Coefficient. We also evaluated the ability to identify unfavorable outcomes (mRS >2) using the area under the curve, sensitivity, and specificity. Noninferiority analyses were undertaken using limits of 0.1 for the Gwet Agreement Coefficient and 0.05 for the area under the curve analysis. The accuracy of prediction was also assessed using the mean absolute error for prediction, percentage of predictions ±1 categories away from the ground truth (±1 accuracy [ACC]), and percentage of exact predictions (ACC). To predict the specific mRS score, the DLPD yielded a Gwet Agreement Coefficient score of 0.79 (95% CI, 0.71-0.86), surpassing the physicians' score of 0.76 (95% CI, 0.67-0.84), and was noninferior to the readers (P < .001). For identifying unfavorable outcome, the model achieved an area under the curve of 0.81 (95% CI, 0.72-0.89), again noninferior to the readers' area under the curve of 0.79 (95% CI, 0.69-0.87) (P < .005). The mean absolute error, ±1 ACC, and ACC were 0.89, 81%, and 36% for the DLPD. A deep learning method using acute clinical and imaging data for long-term functional outcome prediction in patients with acute ischemic stroke, the DLPD, was noninferior to that of clinical readers.

    View details for DOI 10.3174/ajnr.A8140

    View details for PubMedID 38331959

  • Random expert sampling for deep learning segmentation of acute ischemic stroke on non-contrast CT. Journal of neurointerventional surgery Ostmeier, S., Axelrod, B., Liu, Y., Yu, Y., Jiang, B., Yuen, N., Pulli, B., Verhaaren, B. F., Kaka, H., Wintermark, M., Michel, P., Mahammedi, A., Federau, C., Lansberg, M. G., Albers, G. W., Moseley, M. E., Zaharchuk, G., Heit, J. J. 2024

    Abstract

    Outlining acutely infarcted tissue on non-contrast CT is a challenging task for which human inter-reader agreement is limited. We explored two different methods for training a supervised deep learning algorithm: one that used a segmentation defined by majority vote among experts and another that trained randomly on separate individual expert segmentations. The data set consisted of 260 non-contrast CT studies in 233 patients with acute ischemic stroke recruited from the multicenter DEFUSE 3 (Endovascular Therapy Following Imaging Evaluation for Ischemic Stroke 3) trial. Additional external validation was performed using 33 patients with matched stroke onset times from the University Hospital Lausanne. A benchmark U-Net was trained on the reference annotations of three experienced neuroradiologists to segment ischemic brain tissue using majority vote and random expert sampling training schemes. The median of volume, overlap, and distance segmentation metrics were determined for agreement in lesion segmentations between (1) three experts, (2) the majority model and each expert, and (3) the random model and each expert. The two-sided Wilcoxon signed rank test was used to compare performances (1) to (2) and (1) to (3). We further compared volumes with the 24 hour follow-up diffusion weighted imaging (DWI, final infarct core) and correlations with clinical outcome (modified Rankin Scale (mRS) at 90 days) with the Spearman method. The random model outperformed the inter-expert agreement ((1) to (2)) and the majority model ((1) to (3)) (Dice 0.51±0.04 vs 0.36±0.05 (P<0.0001) vs 0.45±0.05 (P<0.0001)). The volume predicted by the random model correlated with clinical outcome (0.19, P<0.05), whereas the median expert volume and majority model volume did not. There was no significant difference when comparing the volume correlations between random model, median expert volume, and majority model to 24 hour follow-up DWI volume (P>0.05, n=51). The random model for ischemic injury delineation on non-contrast CT surpassed the inter-expert agreement ((1) to (2)) and the performance of the majority model ((1) to (3)). We showed that the random model's volumetric measures were consistent with 24 hour follow-up DWI.

    View details for DOI 10.1136/jnis-2023-021283

    View details for PubMedID 38302420

  • Non-inferiority of deep learning ischemic stroke segmentation on non-contrast CT within 16-hours compared to expert neuroradiologists. Scientific reports Ostmeier, S., Axelrod, B., Verhaaren, B. F., Christensen, S., Mahammedi, A., Liu, Y., Pulli, B., Li, L., Zaharchuk, G., Heit, J. J. 2023; 13 (1): 16153

    Abstract

    We determined if a convolutional neural network (CNN) deep learning model can accurately segment acute ischemic changes on non-contrast CT compared to neuroradiologists. Non-contrast CT (NCCT) examinations from 232 acute ischemic stroke patients who were enrolled in the DEFUSE 3 trial were included in this study. Three experienced neuroradiologists independently segmented hypodensity that reflected the ischemic core on each scan. The neuroradiologist with the most experience (expert A) served as the ground truth for deep learning model training. Two additional neuroradiologists' (experts B and C) segmentations were used for data testing. The 232 studies were randomly split into training and test sets. The training set was further randomly divided into 5 folds with training and validation sets. A 3-dimensional CNN architecture was trained and optimized to predict the segmentations of expert A from NCCT. The performance of the model was assessed using a set of volume, overlap, and distance metrics using non-inferiority thresholds of 20%, 3 ml, and 3 mm, respectively. The optimized model trained on expert A was compared to test experts B and C. We used a one-sided Wilcoxon signed-rank test to test for the non-inferiority of the model-expert compared to the inter-expert agreement. The final model performance for the ischemic core segmentation task reached a performance of 0.46 ± 0.09 Surface Dice at Tolerance 5 mm and 0.47 ± 0.13 Dice when trained on expert A. Compared to the two test neuroradiologists, the model-expert agreement was non-inferior to the inter-expert agreement, [Formula: see text]. The CNN accurately delineates the hypodense ischemic core on NCCT in acute ischemic stroke patients with an accuracy comparable to neuroradiologists.

    View details for DOI 10.1038/s41598-023-42961-x

    View details for PubMedID 37752162

  • Functional Outcome Prediction in Acute Ischemic Stroke Using a Fused Imaging and Clinical Deep Learning Model. Stroke Liu, Y., Yu, Y., Ouyang, J., Jiang, B., Yang, G., Ostmeier, S., Wintermark, M., Michel, P., Liebeskind, D. S., Lansberg, M. G., Albers, G. W., Zaharchuk, G. 2023

    Abstract

    Predicting long-term clinical outcome based on the early acute ischemic stroke information is valuable for prognostication, resource management, clinical trials, and patient expectations. Current methods require subjective decisions about which imaging features to assess and may require time-consuming postprocessing. This study's goal was to predict ordinal 90-day modified Rankin Scale (mRS) score in acute ischemic stroke patients by fusing a Deep Learning model of diffusion-weighted imaging images and clinical information from the acute period. A total of 640 acute ischemic stroke patients who underwent magnetic resonance imaging within 1 to 7 days poststroke and had 90-day mRS follow-up data were randomly divided into 70% (n=448) for model training, 15% (n=96) for validation, and 15% (n=96) for internal testing. Additionally, external testing on a cohort from Lausanne University Hospital (n=280) was performed to further evaluate model generalization. Accuracy for ordinal mRS, accuracy within ±1 mRS category, mean absolute prediction error, and determination of unfavorable outcome (mRS score >2) were evaluated for clinical only, imaging only, and 2 fused clinical-imaging models. The fused models demonstrated superior performance in predicting ordinal mRS score and unfavorable outcome in both internal and external test cohorts when compared with the clinical and imaging models. For the internal test cohort, the top fused model had the highest area under the curve of 0.92 for unfavorable outcome prediction and the lowest mean absolute error (0.96 [95% CI, 0.77-1.16]), with the highest proportion of mRS score predictions within ±1 category (79% [95% CI, 71%-88%]). On the external Lausanne University Hospital cohort, the best fused model had an area under the curve of 0.90 for unfavorable outcome prediction and outperformed other models with a mean absolute error of 0.90 (95% CI, 0.79-1.01), and the highest percentage of mRS score predictions within ±1 category (83% [95% CI, 78%-87%]). A Deep Learning-based imaging model fused with clinical variables can be used to predict 90-day stroke outcome with reduced subjectivity and user burden.

    View details for DOI 10.1161/STROKEAHA.123.044072

    View details for PubMedID 37485663

  • Evaluation of Spatial Attentive Deep Learning for Automatic Placental Segmentation on Longitudinal MRI JOURNAL OF MAGNETIC RESONANCE IMAGING Liu, Y., Zabihollahy, F., Yan, R., Lee, B., Janzen, C., Devaskar, S., Sung, K. 2022: 1533-1540

    Abstract

    Automated segmentation of the placenta by MRI in early pregnancy may help predict normal and aberrant placenta function, which could improve the efficiency of placental assessment and the prediction of pregnancy outcomes. An automated segmentation method that works at one gestational age may not transfer effectively to other gestational ages. To evaluate a spatial attentive deep learning method (SADL) for automated placental segmentation on longitudinal placental MRI scans. Prospective, single-center. A total of 154 pregnant women who underwent MRI scans at both 14-18 weeks of gestation and at 19-24 weeks of gestation, divided into training (N = 108), validation (N = 15), and independent testing datasets (N = 31). A 3 T, T2-weighted half Fourier single-shot turbo spin-echo (T2-HASTE) sequence. The reference standard of placental segmentation was manual delineation on T2-HASTE by a third-year neonatology clinical fellow (B.L.) under the supervision of an experienced maternal-fetal medicine specialist (C.J. with 20 years of experience) and an MRI scientist (K.S. with 19 years of experience). The three-dimensional Dice similarity coefficient (DSC) was used to measure the automated segmentation performance compared to the manual placental segmentation. A paired t-test was used to compare the DSCs between SADL and U-Net methods. A Bland-Altman plot was used to analyze the agreement between manual and automated placental volume measurements. A P value < 0.05 was considered statistically significant. In the testing dataset, SADL achieved average DSCs of 0.83 ± 0.06 and 0.84 ± 0.05 in the first and second MRI, which were significantly higher than those achieved by U-Net (0.77 ± 0.08 and 0.76 ± 0.10, respectively). A total of 6 out of 62 MRI scans (9.6%) had volume measurement differences between the SADL-based automated and manual volume measurements that were out of 95% limits of agreement. SADL can automatically detect and segment the placenta with high performance in MRI at two different gestational ages. Level of Evidence: 4. Technical Efficacy: Stage 2.

    View details for DOI 10.1002/jmri.28403

    View details for PubMedCentralID PMC10080136

  • Multiparametric MRI-based radiomics model to predict pelvic lymph node invasion for patients with prostate cancer EUROPEAN RADIOLOGY Zheng, H., Miao, Q., Liu, Y., Mirak, S., Hosseiny, M., Scalzo, F., Raman, S. S., Sung, K. 2022

    Abstract

    To identify which patients with prostate cancer (PCa) could safely avoid extended pelvic lymph node dissection (ePLND) by predicting lymph node invasion (LNI), via a radiomics-based machine learning approach. An integrative radiomics model (IRM) was proposed to predict LNI, confirmed by the histopathologic examination, integrating radiomics features, extracted from prostatic index lesion regions on MRI images, and clinical features via SVM. The study cohort comprised 244 PCa patients with MRI followed by radical prostatectomy (RP) and ePLND within 6 months between 2010 and 2019. The proposed IRM was trained in a training/validation set and evaluated in an internal independent testing set. The model's performance was measured by area under the curve (AUC), sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV). AUCs were compared via the DeLong test with 95% confidence interval (CI), and the remaining measurements were compared via chi-squared test or Fisher's exact test. Overall, 17 (10.6%) and 14 (16.7%) patients with LNI were included in the training/validation set and testing set, respectively. Shape and first-order radiomics features showed usefulness in building the IRM. The proposed IRM achieved an AUC of 0.915 (95% CI: 0.846-0.984) in the testing set, superior to pre-existing nomograms whose AUCs were from 0.698 to 0.724 (p < 0.05). The proposed IRM could be potentially feasible to predict the risk of having LNI for patients with PCa. With the improved predictability, it could be utilized to assess which patients with PCa could safely avoid ePLND, thus reducing the number of unnecessary ePLND.

    • The combination of MRI-based radiomics features with clinical information improved the prediction of lymph node invasion, compared with the model using only radiomics features or clinical features.
    • With improved prediction performance on predicting lymph node invasion, the number of extended pelvic lymph node dissection (ePLND) could be reduced by the proposed integrative radiomics model (IRM), compared with the existing nomograms.

    View details for DOI 10.1007/s00330-022-08625-6

    View details for Web of Science ID 000763863700003

    View details for PubMedID 35238971

  • Deep Learning Enables Prostate MRI Segmentation: A Large Cohort Evaluation With Inter-Rater Variability Analysis FRONTIERS IN ONCOLOGY Liu, Y., Miao, Q., Surawech, C., Zheng, H., Nguyen, D., Yang, G., Raman, S. S., Sung, K. 2021; 11: 801876

    Abstract

    Whole-prostate gland (WPG) segmentation plays a significant role in prostate volume measurement, treatment, and biopsy planning. This study evaluated a previously developed automatic WPG segmentation, deep attentive neural network (DANN), on a large, continuous patient cohort to test its feasibility in a clinical setting. With IRB approval and HIPAA compliance, the study cohort included 3,698 3T MRI scans acquired between 2016 and 2020. In total, 335 MRI scans were used to train the model, and 3,210 and 100 were used to conduct the qualitative and quantitative evaluation of the model. In addition, the DANN-enabled prostate volume estimation was evaluated by using 50 MRI scans in comparison with manual prostate volume estimation. For qualitative evaluation, visual grading was used to evaluate the performance of WPG segmentation by two abdominal radiologists, and DANN demonstrated either acceptable or excellent performance in over 96% of the testing cohort on the WPG or each prostate sub-portion (apex, midgland, or base). Two radiologists reached a substantial agreement on WPG and midgland segmentation (κ = 0.75 and 0.63) and moderate agreement on apex and base segmentation (κ = 0.56 and 0.60). For quantitative evaluation, DANN demonstrated a dice similarity coefficient of 0.93 ± 0.02, significantly higher than other baseline methods, such as DeepLab v3+ and UNet (both p values < 0.05). For the volume measurement, 96% of the evaluation cohort achieved differences between the DANN-enabled and manual volume measurement within 95% limits of agreement. In conclusion, the study showed that the DANN achieved sufficient and consistent WPG segmentation on a large, continuous study cohort, demonstrating its great potential to serve as a tool to measure prostate volume.

    View details for DOI 10.3389/fonc.2021.801876

    View details for Web of Science ID 000739069500001

    View details for PubMedID 34993152

    View details for PubMedCentralID PMC8724207

  • Textured-Based Deep Learning in Prostate Cancer Classification with 3T Multiparametric MRI: Comparison with PI-RADS-Based Classification DIAGNOSTICS Liu, Y., Zheng, H., Liang, Z., Miao, Q., Brisbane, W. G., Marks, L. S., Raman, S. S., Reiter, R. E., Yang, G., Sung, K. 2021; 11 (10)

    Abstract

    The current standardized scheme for interpreting MRI requires a high level of expertise and exhibits a significant degree of inter-reader and intra-reader variability. An automated prostate cancer (PCa) classification can improve the ability of MRI to assess the spectrum of PCa. The purpose of the study was to evaluate the performance of a texture-based deep learning model (Textured-DL) for differentiating between clinically significant PCa (csPCa) and non-csPCa and to compare the Textured-DL with Prostate Imaging Reporting and Data System (PI-RADS)-based classification (PI-RADS-CLA), where a threshold of PI-RADS ≥ 4, representing highly suspicious lesions for csPCa, was applied. The study cohort included 402 patients (60% (n = 239) of patients for training, 10% (n = 42) for validation, and 30% (n = 121) for testing) with 3T multiparametric MRI matched with whole-mount histopathology after radical prostatectomy. For a given suspicious prostate lesion, the volumetric patches of T2-Weighted MRI and apparent diffusion coefficient images were cropped and used as the input to Textured-DL, consisting of a 3D gray-level co-occurrence matrix extractor and a CNN. PI-RADS-CLA by an expert reader served as a baseline to compare classification performance with Textured-DL in differentiating csPCa from non-csPCa. Sensitivity and specificity comparisons were performed using McNemar's test. Bootstrapping with 1000 samples was performed to estimate the 95% confidence interval (CI) for AUC. CIs of sensitivity and specificity were calculated by the Wald method. The Textured-DL model achieved an AUC of 0.85 (CI [0.79, 0.91]), which was significantly higher than the PI-RADS-CLA (AUC of 0.73 (CI [0.65, 0.80]); p < 0.05) for PCa classification, and the specificity was significantly different between Textured-DL and PI-RADS-CLA (0.70 (CI [0.59, 0.82]) vs. 0.47 (CI [0.35, 0.59]); p < 0.05). In sub-analyses, Textured-DL demonstrated significantly higher specificities in the peripheral zone (PZ) and solitary tumor lesions compared to the PI-RADS-CLA (0.78 (CI [0.66, 0.90]) vs. 0.42 (CI [0.28, 0.57]); 0.75 (CI [0.54, 0.96]) vs. 0.38 (CI [0.14, 0.61]); all p values < 0.05). Moreover, Textured-DL demonstrated a high negative predictive value of 92% while maintaining a high positive predictive value of 58% among the lesions with a PI-RADS score of 3. In conclusion, the Textured-DL model was superior to the PI-RADS-CLA in the classification of PCa. In addition, Textured-DL demonstrated superior performance in the specificities for the peripheral zone and solitary tumors compared with PI-RADS-based risk assessment.

    View details for DOI 10.3390/diagnostics11101785

    View details for Web of Science ID 000712235200001

    View details for PubMedID 34679484

    View details for PubMedCentralID PMC8535024

  • Integrative Machine Learning Prediction of Prostate Biopsy Results From Negative Multiparametric MRI JOURNAL OF MAGNETIC RESONANCE IMAGING Zheng, H., Miao, Q., Liu, Y., Raman, S. S., Scalzo, F., Sung, K. 2022; 55 (1): 100-110

    Abstract

    Multiparametric MRI (mpMRI) is commonly recommended as a triage test prior to any prostate biopsy. However, there exists limited consensus on which patients with a negative prostate mpMRI could avoid prostate biopsy. To identify which patients could safely avoid prostate biopsy when the prostate mpMRI is negative, via a radiomics-based machine learning approach. Retrospective. Three hundred thirty patients with negative prostate 3T mpMRI between January 2016 and December 2018 were included. A 3.0 T/T2-weighted turbo spin echo (TSE) imaging (T2 WI) and diffusion-weighted imaging (DWI). The integrative machine learning (iML) model was trained to predict negative prostate biopsy results, utilizing both radiomics and clinical features. The final study cohort comprised 330 consecutive patients with negative mpMRI (PI-RADS < 3) who underwent systematic transrectal ultrasound-guided (TRUS) or MR-ultrasound fusion (MRUS) biopsy within 6 months. A secondary analysis of the biopsy naïve subcohort (n = 227) was also conducted. The Mann-Whitney U test and Chi-Squared test were utilized to evaluate the significance of difference of clinical features between prostate biopsy positive and negative groups. The model performance was validated using leave-one-out cross-validation (LOOCV) and measured by AUC, sensitivity, specificity, and negative predictive value (NPV). Overall, 306/330 (NPV 92.7%) of the final study cohort patients had negative biopsies, and 207/227 (NPV 91.2%) of the biopsy naïve subcohort patients had negative biopsies. Our iML model achieved NPVs of 98.3% and 98.0% for the study cohort and subcohort, respectively, superior to prostate-specific antigen density (PSAD)-based risk assessment with NPVs of 94.9% and 93.9%, respectively. The proposed iML model achieved high performance in predicting negative prostate biopsy results for patients with negative mpMRI. With improved NPVs, the proposed model can be used to stratify patients in whom we might obviate biopsies, thus reducing the number of unnecessary biopsies. Level of Evidence: 3. Technical Efficacy: Stage 2.

    View details for DOI 10.1002/jmri.27793

    View details for Web of Science ID 000664546200001

    View details for PubMedID 34160114

    View details for PubMedCentralID PMC8678175

  • ME-Net: Multi-encoder net framework for brain tumor segmentation INTERNATIONAL JOURNAL OF IMAGING SYSTEMS AND TECHNOLOGY Zhang, W., Yang, G., Huang, H., Yang, W., Xu, X., Liu, Y., Lai, X. 2021; 31 (4): 1834-1848

    View details for DOI 10.1002/ima.22571

    View details for Web of Science ID 000625884900001

  • 3D PBV-Net: An automated prostate MRI data segmentation method COMPUTERS IN BIOLOGY AND MEDICINE Jin, Y., Yang, G., Fang, Y., Li, R., Xu, X., Liu, Y., Lai, X. 2021; 128: 104160

    Abstract

    Prostate cancer is one of the most common deadly diseases in men worldwide, seriously affecting people's life and health. Reliable and automated segmentation of the prostate gland in MRI data is exceptionally critical for diagnosis and treatment planning of prostate cancer. Although many automated segmentation methods have emerged, including deep learning based approaches, segmentation performance is still poor due to the large variability of image appearance, anisotropic spatial resolution, and imaging interference. This study proposes an automated prostate MRI data segmentation approach using bicubic interpolation with improved 3D V-Net (dubbed 3D PBV-Net). Considering the low-frequency components in the prostate gland, the bicubic interpolation is applied to preprocess the MRI data. On this basis, a 3D PBV-Net is developed to perform prostate MRI data segmentation. To illustrate the effectiveness of our approach, we evaluate the proposed 3D PBV-Net on two clinical prostate MRI datasets, i.e., PROMISE 12 and TPHOH, with the manual delineations available as the ground truth. Our approach generates promising segmentation results, achieving 97.65% and 98.29% average accuracy, Dice metrics of 0.9613 and 0.9765, Hausdorff distances of 3.120 mm and 0.9382 mm, and average boundary distances of 1.708 and 0.7950 on the PROMISE 12 and TPHOH datasets, respectively. Our method has effectively improved the accuracy of automated segmentation of the prostate MRI data and is promising to meet the accuracy requirements for telehealth applications.

    View details for DOI 10.1016/j.compbiomed.2020.104160

    View details for Web of Science ID 000604568300002

    View details for PubMedID 33310694

  • Exploring Uncertainty Measures in Bayesian Deep Attentive Neural Networks for Prostate Zonal Segmentation IEEE ACCESS Liu, Y., Yang, G., Hosseiny, M., Azadikhah, A., Mirak, S., Miao, Q., Raman, S. S., Sung, K. 2020; 8: 151817-151828

    Abstract

    Automatic segmentation of prostatic zones on multiparametric MRI (mpMRI) can improve the diagnostic workflow of prostate cancer. We designed a spatial attentive Bayesian deep learning network for the automatic segmentation of the peripheral zone (PZ) and transition zone (TZ) of the prostate with uncertainty estimation. The proposed method was evaluated by using internal and external independent testing datasets, and overall uncertainties of the proposed model were calculated at different prostate locations (apex, middle, and base). The study cohort included 351 MRI scans, of which 304 scans were retrieved from a de-identified publicly available dataset (PROSTATEX) and 47 scans were extracted from a large U.S. tertiary referral center (external testing dataset; ETD). All the PZ and TZ contours were drawn by research fellows under the supervision of expert genitourinary radiologists. Within the PROSTATEX dataset, 259 and 45 patients (internal testing dataset; ITD) were used to develop and validate the model. Then, the model was tested independently using the ETD only. The segmentation performance was evaluated using the Dice Similarity Coefficient (DSC). For PZ and TZ segmentation, the proposed method achieved mean DSCs of 0.80±0.05 and 0.89±0.04 on ITD, as well as 0.79±0.06 and 0.87±0.07 on ETD. For both PZ and TZ, there was no significant difference between ITD and ETD for the proposed method. This DL-based method enabled accurate PZ and TZ segmentation and outperformed state-of-the-art methods (Deeplab V3+, Attention U-Net, R2U-Net, USE-Net and U-Net). We observed that segmentation uncertainty peaked at the junction between PZ, TZ and the anterior fibromuscular stroma (AFS). Also, the overall uncertainties were highly consistent with the actual model performance between PZ and TZ at three clinically relevant locations of the prostate.

    View details for DOI 10.1109/ACCESS.2020.3017168

    View details for Web of Science ID 000564244600001

    View details for PubMedID 33564563

    View details for PubMedCentralID PMC7869831

  • Automatic Prostate Zonal Segmentation Using Fully Convolutional Network With Feature Pyramid Attention IEEE ACCESS Liu, Y., Yang, G., Afshari Mirak, S., Hosseiny, M., Azadikhah, A., Zhong, X., Reiter, R. E., Lee, Y., Raman, S. S., Sung, K. 2019; 7: 163626-163632

  • Haustral loop extraction for CT colonography using geodesics INTERNATIONAL JOURNAL OF COMPUTER ASSISTED RADIOLOGY AND SURGERY Liu, Y., Duan, C., Liang, J., Hu, J., Lu, H., Luo, M. 2017; 12 (3): 379-388

    Abstract

    The human colon has complex geometric structures because of its haustral folds, which are thin flat protrusions on the colon wall. The haustral loop is the curve (approximately triangular in shape) that encircles the highly convex region of the haustral fold, and is regarded as the natural landmark of the colon, intersecting the longitude of the colon in the middle. Haustral loop extraction can assist in reducing the structural complexity of the colon, and the loops can also serve as anatomic markers for computed tomographic colonography (CTC). Moreover, haustral loop sectioning of the colon can help with the performance of precise prone-supine registration. We propose an accurate approach of extracting haustral loops for CT virtual colonoscopy based on geodesics. First, the longitudinal geodesic (LG) connecting the start and end points is tracked by the geodesic method and the colon is cut along the LG. Second, key points are extracted from the LG, after which paired points that are used for seeking the potential haustral loops are calculated according to the key points. Next, for each pair of points, the shortest path (geodesic line) between the paired points is calculated twice, once on the original surface and once on the cut surface. Then, the two geodesics are combined to form a potential haustral loop. Finally, erroneous and nonstandard potential loops are removed. To evaluate the haustral loop extraction algorithm, we first utilized the algorithm to extract the haustral loops. Then, we let the clinicians determine whether the haustral loops were correct and then identify the missing haustral loops. The extraction algorithm successfully detected 91.87% of all of the haustral loops with a very low false positive rate. We believe that haustral loop extraction may benefit many post-procedures in CTC, such as supine-prone registration, computer-aided diagnosis, and taenia coli extraction.

    View details for DOI 10.1007/s11548-016-1497-x

    View details for Web of Science ID 000394539600003

    View details for PubMedID 27854032

    View details for PubMedCentralID PMC5313587