Heather Selby, PhD
Basic Life Research Scientist, Stanford-Surgery Policy Improvement Research and Education Center
Bio
I am interested in developing medical imaging-based AI models to identify patients with locally advanced rectal cancer who achieve a clinical complete response to neoadjuvant chemoradiotherapy, with the goal of sparing them from surgery and its associated risks.
All Publications
-
AI-ready rectal cancer MR imaging: a workflow for tumor detection and segmentation.
BMC medical imaging
2025; 25 (1): 88
Abstract
Magnetic Resonance (MR) imaging is the preferred modality for staging in rectal cancer; however, despite its exceptional soft tissue contrast, segmenting rectal tumors on MR images remains challenging due to the overlapping appearance of tumor and normal tissues, variability in imaging parameters, and the inherent subjectivity of reader interpretation. For studies requiring accurate segmentation, reviews by multiple independent radiologists remain the gold standard, albeit at a substantial cost. The emergence of Artificial Intelligence (AI) offers promising solutions to semi- or fully-automatic segmentation, but the lack of publicly available, high-quality MR imaging datasets for rectal cancer remains a significant barrier to developing robust AI models. This study aimed to foster collaboration between a radiologist and two data scientists in the detection and segmentation of rectal tumors on T2- and diffusion-weighted MR images. By combining the radiologist's clinical expertise with the data scientists' imaging analysis skills, we sought to establish a foundation for future AI-driven approaches that streamline rectal tumor detection and segmentation, and optimize workflow efficiency. A total of 37 patients with rectal cancer were included in this study. Through radiologist-led training, attendance at Stanford's weekly Colorectal Cancer Multidisciplinary Tumor Board (CRC MDTB), and the use of radiologist annotations and clinical notes in Epic Electronic Health Records (EHR), data scientists learned how to detect and manually segment tumors on T2- and diffusion-weighted pre-treatment MR images. These segmentations were then reviewed and edited by a radiologist. The accuracy of the segmentations was evaluated using the Dice Similarity Coefficient (DSC) and Jaccard Index (JI), quantifying the overlap between the segmentations delineated by the data scientists and those edited by the radiologist. With the help of radiologist annotations and radiology notes in Epic EHR, the data scientists successfully identified rectal tumors in Slicer v5.7.0 across all evaluated T2- and diffusion-weighted MR images. Through radiologist-led training and participation at Stanford's weekly CRC MDTB, the data scientists' rectal tumor segmentations exhibited strong agreement with the radiologist's edits, achieving a mean DSC [95% CI] of 0.965 [0.939, 0.992] and a mean JI [95% CI] of 0.943 [0.900, 0.985]. Discrepancies in segmentations were attributed to over- or under-segmentation, often incorporating surrounding structures such as the rectal wall and lumen. This study demonstrates the feasibility of generating high-quality labeled MR datasets through collaboration between a radiologist and two data scientists, which is essential for training AI models to automate tumor detection and segmentation in rectal cancer. By integrating expertise from radiology and data science, this approach has the potential to enhance AI model performance and transform clinical workflows in the future.
View details for DOI 10.1186/s12880-025-01614-3
View details for PubMedID 40087634
View details for PubMedCentralID PMC11909848
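The abstract above reports agreement between two segmentations as a Dice Similarity Coefficient and a Jaccard Index. As a minimal sketch of how these two overlap measures are commonly computed on binary masks (the function name `dice_and_jaccard` and the empty-mask convention are assumptions for illustration, not taken from the paper):

```python
import numpy as np


def dice_and_jaccard(mask_a, mask_b):
    """Return (DSC, JI) for two binary masks of the same shape.

    Hypothetical helper for illustration; not the authors' code.
    """
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    if a.shape != b.shape:
        raise ValueError("masks must have the same shape")

    intersection = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    total = a.sum() + b.sum()

    # Assumed convention: two empty masks count as perfect agreement.
    if total == 0:
        return 1.0, 1.0

    dice = 2.0 * intersection / total   # DSC = 2|A ∩ B| / (|A| + |B|)
    jaccard = intersection / union      # JI  = |A ∩ B| / |A ∪ B|
    return float(dice), float(jaccard)


# Toy example: a scientist's mask versus a radiologist-edited mask.
scientist = [[1, 1, 0],
             [0, 1, 0]]
radiologist = [[1, 0, 0],
               [0, 1, 1]]
dsc, ji = dice_and_jaccard(scientist, radiologist)
```

For a single pair of masks the two measures are tied by JI = DSC / (2 - DSC), so they rank segmentation pairs identically; means taken over many cases need not satisfy that identity exactly.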
-
Performance of alternative manual and automated deep learning segmentation techniques for the prediction of benign and malignant lung nodules.
Journal of medical imaging (Bellingham, Wash.)
2023; 10 (4): 044006
Abstract
We aim to evaluate the performance of radiomic biopsy (RB), best-fit bounding box (BB), and a deep-learning-based segmentation method called no-new-U-Net (nnU-Net), compared to the standard full manual (FM) segmentation method for predicting benign and malignant lung nodules using a computed tomography (CT) radiomic machine learning model. A total of 188 CT scans of lung nodules from 2 institutions were used for our study. One radiologist identified and delineated all 188 lung nodules, whereas a second radiologist segmented a subset (n=20) of these nodules. Both radiologists employed FM and RB segmentation methods. BB segmentations were generated computationally from the FM segmentations. The nnU-Net, a deep-learning-based segmentation method, performed automatic nodule detection and segmentation. The time radiologists took to perform segmentations was recorded. Radiomic features were extracted from each segmentation method, and models to predict benign and malignant lung nodules were developed. The Kruskal-Wallis and DeLong tests were used to compare segmentation times and areas under the curve (AUC), respectively. For the delineation of the FM, RB, and BB segmentations, the two radiologists required a median time (IQR) of 113 (54 to 251.5), 21 (9.25 to 38), and 16 (12 to 64.25) s, respectively (p=0.04). In dataset 1, the mean AUCs (95% CI) of the FM, RB, BB, and nnU-Net models were 0.964 (0.96 to 0.968), 0.985 (0.983 to 0.987), 0.961 (0.956 to 0.965), and 0.878 (0.869 to 0.888), respectively. In dataset 2, the mean AUCs (95% CI) of the FM, RB, BB, and nnU-Net models were 0.717 (0.705 to 0.729), 0.919 (0.913 to 0.924), 0.699 (0.687 to 0.711), and 0.644 (0.632 to 0.657), respectively. Radiomic biopsy-based models outperformed FM and BB models in prediction of benign and malignant lung nodules in two independent datasets while deep-learning segmentation-based models performed similarly to FM and BB. RB could be a more efficient segmentation method, but further validation is needed.
View details for DOI 10.1117/1.JMI.10.4.044006
View details for PubMedID 37564098
View details for PubMedCentralID PMC10411216
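The abstract above notes that BB segmentations were generated computationally from the FM segmentations. One plausible way to do this, sketched here under the assumption that "best-fit" means the tightest axis-aligned box enclosing the manual mask (the paper may define it differently, and `bounding_box_mask` is a hypothetical name):

```python
import numpy as np


def bounding_box_mask(mask):
    """Return a mask filling the tightest axis-aligned box around `mask`.

    Works for 2D or 3D binary masks. Illustrative sketch only; the
    axis-aligned definition is an assumption, not the authors' method.
    """
    m = np.asarray(mask, dtype=bool)
    box = np.zeros_like(m)
    if not m.any():
        return box  # nothing segmented, so the box is empty too

    coords = np.argwhere(m)        # one row of indices per foreground voxel
    lo = coords.min(axis=0)        # smallest index along each axis
    hi = coords.max(axis=0) + 1    # one past the largest index (slice end)

    box[tuple(slice(start, stop) for start, stop in zip(lo, hi))] = True
    return box


# Toy example: two foreground pixels at opposite corners of a 2 x 3 region.
manual = np.zeros((4, 4), dtype=bool)
manual[1, 1] = True
manual[2, 3] = True
bb = bounding_box_mask(manual)
```

The box always contains the original mask, so any difference in extracted features comes from the extra surrounding tissue it includes.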
-
Predicting treatment response for the safe non-operative management of patients with rectal cancer using an MRI-based deep-learning model
LIPPINCOTT WILLIAMS & WILKINS. 2023
View details for Web of Science ID 001053772002055
-
A 3D lung lesion variational autoencoder.
Cell reports methods
2024: 100695
Abstract
In this study, we develop a 3D beta variational autoencoder (beta-VAE) to advance lung cancer imaging analysis, countering the constraints of conventional radiomics methods. The autoencoder extracts information from public lung computed tomography (CT) datasets without additional labels. It reconstructs 3D lung nodule images with high quality (structural similarity: 0.774, peak signal-to-noise ratio: 26.1, and mean-squared error: 0.0008). The model effectively encodes lesion sizes in its latent embeddings, with a significant correlation with lesion size found after applying uniform manifold approximation and projection (UMAP) for dimensionality reduction. Additionally, the beta-VAE can synthesize new lesions of varying sizes by manipulating the latent features. The model can predict multiple clinical endpoints, including pathological N stage or KRAS mutation status, on the Stanford radiogenomics lung cancer dataset. Comparisons with other methods show that the beta-VAE performs equally well in these tasks, suggesting its potential as a pretrained model for predicting patient outcomes in medical imaging.
View details for DOI 10.1016/j.crmeth.2024.100695
View details for PubMedID 38278157
-
Topological data analysis of thoracic radiographic images shows improved radiomics-based lung tumor histology prediction.
Patterns (New York, N.Y.)
2023; 4 (1): 100657
Abstract
Topological data analysis provides tools to capture wide-scale structural shape information in data. Its main method, persistent homology, has found successful applications to various machine-learning problems. Despite its recent gain in popularity, much of its potential for medical image analysis remains undiscovered. We explore the prominent learning problems on thoracic radiographic images of lung tumors for which persistent homology improves radiomic-based learning. It turns out that our topological features well capture complementary information important for benign versus malignant and adenocarcinoma versus squamous cell carcinoma tumor prediction while contributing less consistently to small cell versus non-small cell, an interesting result in its own right. Furthermore, while radiomic features are better for predicting malignancy scores assigned by expert radiologists through visual inspection, we find that topological features are better for predicting more accurate histology assessed through long-term radiology review, biopsy, surgical resection, progression, or response.
View details for DOI 10.1016/j.patter.2022.100657
View details for PubMedID 36699734
-
RADIOMICS-BASED MULTI-MODAL PREDICTION OF TREATMENT RESPONSE TO PD-1/PD-L1 IMMUNE CHECKPOINT INHIBITOR (ICI) THERAPY IN STAGE IV NON-SMALL CELL LUNG CARCINOMA (MNSCLC)
BMJ PUBLISHING GROUP. 2022: A1346
View details for DOI 10.1136/jitc-2022-SITC2022.1296
View details for Web of Science ID 000919423401402
-
Machine Learning Radiomics Model for Early Identification of Small-Cell Lung Cancer on Computed Tomography Scans.
JCO clinical cancer informatics
2021; 5: 746-757
Abstract
PURPOSE: Small-cell lung cancer (SCLC) is the deadliest form of lung cancer, partly because of its short doubling time. Delays in imaging identification and diagnosis of nodules create a risk for stage migration. The purpose of our study was to determine if a machine learning radiomics model can detect SCLC on computed tomography (CT) among all nodules at least 1 cm in size. MATERIALS AND METHODS: Computed tomography scans from a single institution were selected and resampled to 1 × 1 × 1 mm. Studies were divided into SCLC and other scans comprising benign, adenocarcinoma, and squamous cell carcinoma that were segregated into group A (noncontrast scans) and group B (contrast-enhanced scans). Four machine learning classification models, support vector classifier, random forest (RF), XGBoost, and logistic regression, were used to generate radiomic models using 59 quantitative first-order and texture Imaging Biomarker Standardization Initiative compliant PyRadiomics features, which were found to be robust between two segmenters with minimum Redundancy Maximum Relevance feature selection within each leave-one-out-cross-validation to avoid overfitting. The performance was evaluated using a receiver operating characteristic curve. A final model was created using the RF classifier and aggregate minimum Redundancy Maximum Relevance to determine feature importance. RESULTS: A total of 103 studies were included in the analysis. The area under the receiver operating characteristic curve for RF, support vector classifier, XGBoost, and logistic regression was 0.81, 0.77, 0.84, and 0.84 in group A, and 0.88, 0.87, 0.85, and 0.81 in group B, respectively. Nine radiomic features in group A and 14 radiomic features in group B were predictive of SCLC. Six radiomic features overlapped between groups A and B. CONCLUSION: A machine learning radiomics model may help differentiate SCLC from other lung lesions.
View details for DOI 10.1200/CCI.21.00021
View details for PubMedID 34264747
-
A meta-learning approach for genomic survival analysis.
Nature communications
2020; 11 (1): 6350
Abstract
RNA sequencing has emerged as a promising approach in cancer prognosis as sequencing data becomes more easily and affordably accessible. However, it remains challenging to build good predictive models especially when the sample size is limited and the number of features is high, which is a common situation in biomedical settings. To address these limitations, we propose a meta-learning framework based on neural networks for survival analysis and evaluate it in a genomic cancer research setting. We demonstrate that, compared to regular transfer-learning, meta-learning is a significantly more effective paradigm to leverage high-dimensional data that is relevant but not directly related to the problem of interest. Specifically, meta-learning explicitly constructs a model, from abundant data of relevant tasks, to learn a new task with few samples effectively. For the application of predicting cancer survival outcome, we also show that the meta-learning framework with a few samples is able to achieve competitive performance with learning from scratch with a significantly larger number of samples. Finally, we demonstrate that the meta-learning model implicitly prioritizes genes based on their contribution to survival prediction and allows us to identify important pathways in cancer.
View details for DOI 10.1038/s41467-020-20167-3
View details for PubMedID 33311484