A. Solomon Henry's Profile | Stanford Profiles

Education & Certifications

Professional ML Engineer, Google, Machine Learning, Artificial Intelligence (2023)
Professional Data Engineer, Google, Google Cloud Platform (GCP) (2022)
Professional Certificate, Stanford University, Stanford, USA, Advanced Project Management (2006)
Master's Degree, BITS/India, Computer Science (1993)
Bachelor's Degree, TCE/University of Madras/India, Computer Science and Engineering (1990)

Projects

Reynolds Cardiovascular Clinical Research, School of Medicine, Stanford University (March 1, 2002 - December 31, 2007)

The aim of the Donald W. Reynolds Cardiovascular Clinical Research Center at Stanford University was to provide better care for patients with heart disease through the application of modern genetic approaches. Projects used the techniques of modern molecular biology to identify genes whose abnormalities predisposed patients to heart disease in a specific way. These genes were then examined for unique mutations that could serve as markers to track disease in larger populations. The project was large in scope and consisted of several subprojects. I was involved in planning, designing, and implementing systems for recruitment, clinic visit scheduling, report and result letter generation, clinical visit data collection, barcode generation, and sample tracking. I was also involved in planning, designing, and implementing the Reynolds analysis dataset, which served as the backbone for all data analysis.

Location

Palo Alto, California, USA

Stanford Cancer Institute Research Database, School of Medicine, Stanford University (1/1/2012 - Present)

The Stanford Cancer Institute Research Database (SCIRDB) holds a rich set of data integrated from many sources, including EPIC/Clarity, STRIDE, specialized databases in surgical pathology and radiation oncology, the Stanford Cancer Registry, and the Social Security Death Index (SSDI). SCIRDB also provides extensive tools for browsing and searching the existing data.

Location

Palo Alto, California, USA

Bone Marrow Transplant research database (3/1/2010 - Present)

Developed a web-enabled Bone Marrow Transplant (BMT) research database. The BMT database is used for three primary purposes: (1) Research, (2) Mandatory Outcome Reporting, and (3) the Annual Request for Information (RFI). The Research component entails, but is not limited to, assuring the quality of the data collected, managing computer systems for data storage and access, developing system improvements, and assembling, summarizing, and analyzing data for scientific reports and audits. In turn, the BMT database supplies the vital information needed for the other two components: Mandatory Outcome Reporting and the Annual RFI. Outcome reporting to the CIBMTR for all transplant patients, who must be followed until death, was mandated by federal law in 2005. The RFI reports the survival and mortality statistics of our BMT program in relation to comparable transplant programs around the country.

Location

Stanford, CA

Lymphoma Program Project (LPP), Stanford University (1/1/2007 - Present)

Developed a web-enabled Lymphoma Program Project (LPP) research database to track patient demographics, diagnoses, courses of treatment, long-term follow-up, and clinical responses for Stanford's non-Hodgkin and Hodgkin lymphoma cases dating back to the late 1960s.

Location

Stanford, CA

All Publications

Leveraging large language models to extract smoking history from clinical notes for lung cancer surveillance. NPJ digital medicine Luo, I., Graber-Naidich, A., Zhang, M., Kaushik, R., Nieda, G. M., Chen, T., Gu, B., Choi, E., Ding, V. Y., Gunturkun, F., Satoyoshi, M., Bhat, A., Lee, T. Y., Su, C. C., Ellis-Caleo, T. J., Henry, A. S., Desai, M., Backhus, L. M., Lui, N. S., Leung, A., Neal, J. W., Kurian, A. W., Langlotz, C. P., Wakelee, H. A., Liang, S. Y., Khan, A., Han, S. S. 2025; 8 (1): 731

Abstract

Accurate smoking documentation in electronic health records (EHRs) is crucial for risk assessment and patient monitoring. However, key information is often missing or inaccurately recorded. Large language models (LLMs) present a promising solution for interpreting clinical narratives to extract comprehensive smoking data. We developed a framework utilizing LLMs combined with rule-based longitudinal smoothing techniques to enhance data quality. We compared generative LLMs (Gemini-1.5-Flash, PaLM-2-Text-Bison, GPT-4) against BERT-based models using 1683 manually annotated clinical notes from 518 patients across Stanford and Sutter Health systems. Generative LLMs achieved superior performance (>96% accuracy) across seven smoking variables, with external validation showing robust generalizability (97.5-98.8% accuracy). We deployed Gemini-1.5-Flash to 79,408 notes from 4792 lung cancer patients, demonstrating that risk model-based surveillance incorporating smoking factors outperformed NCCN Guidelines in identifying second malignancies. Our study highlights the potential of generative LLMs to improve smoking history documentation quality, enhancing lung cancer surveillance and broader clinical applications.

View details for DOI 10.1038/s41746-025-02009-y

View details for PubMedID 41315854

View details for PubMedCentralID 11745215

Automatic Abstraction of Computed Tomography Imaging Indication Using Natural Language Processing for Evaluation of Surveillance Patterns in Long-Term Lung Cancer Survivors. JCO clinical cancer informatics Khan, A., Choi, E., Su, C., Graber-Naidich, A., Henry, S., Satoyoshi, M. L., Bhat, A., Kurian, A. W., Liang, S. Y., Neal, J., Gould, M., Leung, A., Wakelee, H. A., Backhus, L. M., Langlotz, C., Wu, J., Han, S. S. 2025; 9: e2400279

Abstract

Despite its routine use to monitor patients with lung cancer (LC), real-world evaluations of the impact of computed tomography (CT) surveillance on overall survival (OS) have been inconsistent. A major confounder is the absence of imaging indications because patients undergo CT scans for purposes beyond surveillance, like symptom evaluations (eg, cough) linked to poor survival. We propose a novel natural language processing model to predict CT imaging indications (surveillance v others). We used electronic health records of 585 long-term LC survivors (≥5 years) at Stanford, followed for up to 22 years. Their 3,362 post-5-year CT reports (including 1,672 manually annotated) were used for modeling by integrating structured variables (eg, CT intervals) with key-phrase analysis of radiology reports. Naïve analysis compared OS in patients with CT for any indications (including symptoms) versus those without post-5-year CT, as in previous studies. Using model-predicted indications, we conducted exploratory analyses to compare OS between those with post-5-year surveillance CT and those without. The model showed high discrimination (AUC, 0.86), with key predictors including a longer interval (≥6-month) from the previous CT (odds ratio [OR], 5.50; P < .001) and surveillance-related key phrases (OR, 1.37; P = .03). Propensity-adjusted survival analysis indicated better OS for patients with any post-5-year surveillance CT versus those without (adjusted hazard ratio, 0.60; P = .016). By contrast, no significant survival difference was found (P = .53) between patients with any CT versus those without post-5-year CT. Our model abstracted CT indications from real-world data with high discrimination. Exploratory analyses revealed the obscured imaging-OS association when considering indications, highlighting the model's potential for future real-world studies.

View details for DOI 10.1200/CCI-24-00279

View details for PubMedID 40700679

The improved prognosis of FLT3-internal tandem duplication but not tyrosine kinase domain mutations in acute myeloid leukemia in the era of targeted therapy: a real-world study using large-scale electronic health record data. Haematologica Schwede, M., Rodriguez, G., Kennedy, V. E., Henry, S., Wood, D., Mannis, G. N., Majeti, R., Chen, J. H., Bendavid, E., Zhang, T. Y. 2025

Abstract

Not available.

View details for DOI 10.3324/haematol.2024.286695

View details for PubMedID 39911116

Harnessing Artificial Intelligence for Risk Stratification in Acute Myeloid Leukemia (AML): Evaluating the Utility of Longitudinal Electronic Health Record (EHR) Data Via Graph Neural Networks Sinha, R., Schwede, M., Viggiano, B., Kuo, D., Henry, S., Wood, D., Mannis, G., Majeti, R., Chen, J., Zhang, T. Y. AMER SOC HEMATOLOGY. 2023

View details for DOI 10.1182/blood-2023-190151

View details for Web of Science ID 001159306703229

The Shifting Prognosis of FLT3 Mutations in Acute Myeloid Leukemia in the Era of Targeted Therapy: A Real-World Study Using Large-Scale Electronic Health Record Data Schwede, M., Rodriguez, G., Henry, S., Wood, D., Mannis, G., Majeti, R., Chen, J., Bendavid, E., Zhang, T. Y. AMER SOC HEMATOLOGY. 2023

View details for DOI 10.1182/blood-2023-187725

View details for Web of Science ID 001159306703227

Overall Survival Among Patients With De Novo Stage IV Metastatic and Distant Metastatic Recurrent Non-Small Cell Lung Cancer. JAMA network open Su, C. C., Wu, J. T., Choi, E., Myall, N. J., Neal, J. W., Kurian, A. W., Stehr, H., Wood, D., Henry, S. M., Backhus, L. M., Leung, A. N., Wakelee, H. A., Han, S. S. 2023; 6 (9): e2335813

Abstract

Importance: Despite recent breakthroughs in therapy, advanced lung cancer still poses a therapeutic challenge. The survival profile of patients with metastatic lung cancer remains poorly understood by metastatic disease type (ie, de novo stage IV vs distant recurrence). Objective: To evaluate the association of metastatic disease type with overall survival (OS) among patients with non-small cell lung cancer (NSCLC) and to identify potential mechanisms underlying any survival difference. Design, Setting, and Participants: Cohort study of a national US population based at a tertiary referral center in the San Francisco Bay Area using participant data from the National Lung Screening Trial (NLST) who were enrolled between 2002 and 2004 and followed up for up to 7 years as the primary cohort and patient data from Stanford Healthcare (SHC) for diagnoses between 2009 and 2019 and followed up for up to 13 years as the validation cohort. Participants from NLST with de novo metastatic or distant recurrent NSCLC diagnoses were included. Data were analyzed from January 2021 to March 2023. Exposures: De novo stage IV vs distant recurrent metastatic disease. Main Outcomes and Measures: OS after diagnosis of metastatic disease. Results: The NLST and SHC cohorts consisted of 660 and 180 participants, respectively (411 men [62.3%] vs 109 men [60.6%], 602 White participants [91.2%] vs 111 White participants [61.7%], and mean [SD] age of 66.8 [5.5] vs 71.4 [7.9] years at metastasis, respectively). Patients with distant recurrence showed significantly better OS than patients with de novo metastasis (adjusted hazard ratio [aHR], 0.72; 95% CI, 0.60-0.87; P < .001) in NLST, which was replicated in SHC (aHR, 0.64; 95% CI, 0.43-0.96; P = .03). In SHC, patients with de novo metastasis more frequently progressed to the bone (63 patients with de novo metastasis [52.5%] vs 19 patients with distant recurrence [31.7%]) or pleura (40 patients with de novo metastasis [33.3%] vs 8 patients with distant recurrence [13.3%]) than patients with distant recurrence and were primarily detected through symptoms (102 patients [85.0%]) as compared with posttreatment surveillance (47 patients [78.3%]) in the latter. The main finding remained consistent after further adjusting for metastasis sites and detection methods. Conclusions and Relevance: In this cohort study, patients with distant recurrent NSCLC had significantly better OS than those with de novo disease, and the latter group was associated with characteristics that may affect overall survival. This finding can help inform future clinical trial designs to ensure a balance for baseline patient characteristics.

View details for DOI 10.1001/jamanetworkopen.2023.35813

View details for PubMedID 37751203

A hybrid modelling approach for abstracting CT imaging indications by integrating natural language processing from radiology reports with structured data from electronic health records. Khan, A., Wu, J., Choi, E., Graber-Naidich, A., Henry, S., Wakelee, H. A., Kurian, A. W., Liang, S., Leung, A., Langlotz, C., Backhus, L. M., Han, S. S. AMER ASSOC CANCER RESEARCH. 2023

View details for Web of Science ID 001057852300077

Predictive Model to Guide Brain Magnetic Resonance Imaging Surveillance in Patients With Metastatic Lung Cancer: Impact on Real-World Outcomes. JCO precision oncology Wu, J., Ding, V., Luo, S., Choi, E., Hellyer, J., Myall, N., Henry, S., Wood, D., Stehr, H., Ji, H., Nagpal, S., Hayden Gephart, M., Wakelee, H., Neal, J., Han, S. S. 2022; 6: e2200220

Abstract

Brain metastasis is common in lung cancer, and treatment of brain metastasis can lead to significant morbidity. Although early detection of brain metastasis may improve outcomes, there are no prediction models to identify high-risk patients for brain magnetic resonance imaging (MRI) surveillance. Our goal is to develop a machine learning-based clinicogenomic prediction model to estimate patient-level brain metastasis risk. A penalized regression competing risk model was developed using 330 patients diagnosed with lung cancer between January 2014 and June 2019 and followed through June 2021 at Stanford HealthCare. The main outcome was time from the diagnosis of distant metastatic disease to the development of brain metastasis, death, or censoring. Among the 330 patients, 84 (25%) developed brain metastasis over 627 person-years, with a 1-year cumulative brain metastasis incidence of 10.2% (95% CI, 6.8 to 13.6). Features selected for model inclusion were histology, cancer stage, age at diagnosis, primary site, and RB1 and ALK alterations. The prediction model yielded high discrimination (area under the curve 0.75). When the cohort was stratified by risk using a 1-year risk threshold of >14.2% (85th percentile), the high-risk group had increased 1-year cumulative incidence of brain metastasis versus the low-risk group (30.8% v 6.1%, P < .01). Of 48 high-risk patients, 24 developed brain metastasis, and of these, 12 patients had brain metastasis detected more than 7 months after last brain MRI. Patients who missed this 7-month window had larger brain metastases (58% v 33% largest diameter >10 mm; odds ratio, 2.80, CI, 0.51 to 13) versus those who had MRIs more frequently. The proposed model can identify high-risk patients, who may benefit from more intensive brain MRI surveillance to reduce morbidity of subsequent treatment through early detection.

View details for DOI 10.1200/PO.22.00220

View details for PubMedID 36201713

Accuracy of Electronic Medical Record Follow-Up Data for Estimating the Survival Time of Patients With Cancer. JCO clinical cancer informatics Gensheimer, M. F., Narasimhan, B., Henry, A. S., Wood, D. J., Rubin, D. L. 2022; 6: e2200019

Abstract

For real-world evidence, it is convenient to use routinely collected data from the electronic medical record (EMR) to measure survival outcomes. However, patients can become lost to follow-up, causing incomplete data and biased survival time estimates. We quantified this issue for patients with metastatic cancer seen in an academic health system by comparing survival estimates from EMR data only and from EMR data combined with high-quality cancer registry data. Patients diagnosed with metastatic cancer from 2008 to 2014 were included in this retrospective study. Patients who were diagnosed with cancer or received their initial treatment within our system were included in the institutional cancer registry and this study. Overall survival was calculated using the Kaplan-Meier method. Survival curves were generated in two ways: using EMR follow-up data alone and using EMR data supplemented with data from the Stanford Cancer Registry/California Cancer Registry. Four thousand seventy-seven patients were included. The median follow-up using EMR + Cancer Registry data was 19.9 months, and the median follow-up in surviving patients was 67.6 months. There were 1,301 deaths recorded in the EMR and 3,140 deaths recorded in the Cancer Registry. The median overall survival from the date of cancer diagnosis using EMR data was 58.7 months (95% CI, 54.2 to 63.2); using EMR + Cancer Registry data, it was 20.8 months (95% CI, 19.6 to 22.3). A similar pattern was seen using the date of first systemic therapy or date of first hospital admission as the baseline date. Using EMR data alone, survival time was overestimated compared with EMR + Cancer Registry data.
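
To illustrate why unrecorded deaths can inflate a survival estimate, here is a minimal sketch of the Kaplan-Meier product-limit estimator in plain Python. The toy follow-up times, the event flags, and the helper names `kaplan_meier` and `median_survival` are hypothetical and are not taken from the study; the sketch shows only the general mechanism, not the authors' analysis.

```python
from collections import Counter
from fractions import Fraction


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    times  -- follow-up time for each patient (e.g., months)
    events -- 1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    curve = []
    survival = Fraction(1)  # exact arithmetic avoids float rounding at 0.5
    for t in sorted(deaths):
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        survival *= Fraction(at_risk - deaths[t], at_risk)
        curve.append((t, survival))
    return curve


def median_survival(curve):
    """First time at which estimated survival is 0.5 or lower, else None."""
    for t, s in curve:
        if s <= Fraction(1, 2):
            return t
    return None


# Hypothetical toy cohort (months of follow-up); not data from the study.
times = [3, 6, 9, 12, 15, 18, 24, 30]

# "Registry-like" view: every death is captured.
events_complete = [1, 1, 1, 1, 1, 1, 0, 0]

# "EMR-only-like" view: some deaths are unrecorded, so those patients
# appear censored (lost to follow-up) at their last contact.
events_partial = [1, 0, 1, 0, 0, 1, 0, 0]

median_complete = median_survival(kaplan_meier(times, events_complete))
median_partial = median_survival(kaplan_meier(times, events_partial))
print(median_complete, median_partial)  # prints: 12 18
```

In this toy cohort, treating unrecorded deaths as censored observations moves the estimated median survival from 12 to 18 months, the same direction of bias the abstract reports for EMR-only data.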

View details for DOI 10.1200/CCI.22.00019

View details for PubMedID 35802836

Patterns in cancer management changes for patients with COVID-19 in northern California. Glover, M., Wu, J., Kwon, D. H., Zhang, S., Henry, S., Wood, D., Rubin, D., Borno, H., Small, E., Schapira, L., Koshkin, V. S., Shah, S. LIPPINCOTT WILLIAMS & WILKINS. 2021

View details for DOI 10.1200/JCO.2021.39.15_suppl.1535

View details for Web of Science ID 000708120600235

Reply to R. Kebudi et al. JCO oncology practice Tsu-Yu Wu, J., Kwon, D. H., Glover, M., Henry, S., Wood, D., Rubin, D., Koshkin, V., Schapira, L., Shah, S. A. 2021: OP2100105

View details for DOI 10.1200/OP.21.00105

View details for PubMedID 33881937

Natural Language Processing to Identify Cancer Treatments With Electronic Medical Records. JCO clinical cancer informatics Zeng, J., Banerjee, I., Henry, A. S., Wood, D. J., Shachter, R. D., Gensheimer, M. F., Rubin, D. L. 2021; 5: 379–93

Abstract

PURPOSE: Knowing the treatments administered to patients with cancer is important for treatment planning and correlating treatment patterns with outcomes for personalized medicine study. However, existing methods to identify treatments are often lacking. We develop a natural language processing approach with structured electronic medical records and unstructured clinical notes to identify the initial treatment administered to patients with cancer. METHODS: We used a total of 4,412 patients with 483,782 clinical notes from the Stanford Cancer Institute Research Database containing patients with nonmetastatic prostate, oropharynx, and esophagus cancer. We trained treatment identification models for each cancer type separately and compared performance of using only structured, only unstructured (bag-of-words, doc2vec, fasttext), and combinations of both (structured + bow, structured + doc2vec, structured + fasttext). We optimized the identification model among five machine learning methods (logistic regression, multilayer perceptrons, random forest, support vector machines, and stochastic gradient boosting). The treatment information recorded in the cancer registry served as the gold standard, and we compared our methods with an identification baseline based on billing codes. RESULTS: For prostate cancer, we achieved an f1-score of 0.99 (95% CI, 0.97 to 1.00) for radiation and 1.00 (95% CI, 0.99 to 1.00) for surgery using structured + doc2vec. For oropharynx cancer, we achieved an f1-score of 0.78 (95% CI, 0.58 to 0.93) for chemoradiation and 0.83 (95% CI, 0.69 to 0.95) for surgery using doc2vec. For esophagus cancer, we achieved an f1-score of 1.0 (95% CI, 1.0 to 1.0) for both chemoradiation and surgery using all combinations of structured and unstructured data. We found that employing the free-text clinical notes outperforms using the billing codes or only structured data for all three cancer types. CONCLUSION: Our results show that treatment identification using free-text clinical notes greatly improves upon the performance using billing codes and simple structured data. The approach can be used for treatment cohort identification and adapted for longitudinal cancer treatment identification.

View details for DOI 10.1200/CCI.20.00173

View details for PubMedID 33822653

Evaluation of Absolute Lymphocyte Count at Diagnosis and Mortality Among Patients With Localized Bone or Soft Tissue Sarcoma. JAMA network open Brewster, R., Purington, N., Henry, S., Wood, D., Ganjoo, K., Bui, N. 2021; 4 (3): e210845

Abstract

Importance: Host-related immune factors have been implicated in the development and progression of diverse malignant neoplasms. Identifying associations between immunologic laboratory parameters and overall survival may inform novel prognostic biomarkers and mechanisms of antitumor immunity in localized bone and soft tissue sarcoma. Objective: To assess whether lymphopenia at diagnosis is associated with overall survival among patients with localized bone and soft tissue sarcoma. Design, Setting, and Participants: This retrospective cohort study analyzed patients from the Stanford Cancer Institute with localized bone and soft tissue sarcoma between September 1, 1998, and November 1, 2018. Patients were included if laboratory values were available within 60 days of diagnosis and, if applicable, prior to the initiation of chemotherapy and/or radiotherapy. Statistical analysis was performed from January 1, 2019, to November 1, 2020. Exposures: Absolute lymphocyte count within 60 days of diagnosis and antimicrobial exposure, defined by the number of antimicrobial agent prescriptions and the cumulative duration of antimicrobial administration within 60 days of diagnosis. Main Outcomes and Measures: The association between minimum absolute lymphocyte count at diagnosis and 5-year overall survival probability was characterized with the Kaplan-Meier method and multivariate Cox proportional hazards regression models. Multivariable logistic regressions were fitted to evaluate whether patients with lymphopenia were at greater risk of increased antimicrobial exposure. Results: Among 634 patients, the median age at diagnosis was 53.7 years (interquartile range, 37.5-66.8 years), and 290 patients (45.7%) were women, with a 5-year survival probability of 67.9%. There was a significant inverse association between lymphopenia at diagnosis and overall survival (hazard ratio [HR], 1.82; 95% CI, 1.39-2.40), resulting in a 13.5% 5-year survival probability difference compared with patients who did not have lymphopenia at diagnosis (60.2% vs 73.7% for those who never had lymphopenia). In addition, poorer survival was observed with higher-grade lymphopenia (grades 3 and 4: HR, 2.44; 95% CI, 1.68-3.55; grades 1 and 2: HR, 1.60; 95% CI, 1.18-2.18). In an exploratory analysis, patients with increased antibiotic exposure were more likely to have lymphopenia (odds ratio, 1.96; 95% CI, 1.26-3.07 for total number of antimicrobial agents; odds ratio, 1.70; 95% CI, 1.10-2.57 for antimicrobial duration) than antimicrobial-naive patients. Conclusions and Relevance: This study suggests that an abnormally low absolute lymphocyte count at diagnosis is associated with higher mortality among patients with localized bone and soft tissue sarcoma; therefore, lymphopenia may serve as a reliable prognostic biomarker. Potential mechanisms associated with host immunity and overall survival include a suppressed antitumor response and increased infectious complications, which merit future investigation.

View details for DOI 10.1001/jamanetworkopen.2021.0845

View details for PubMedID 33666664

Impact of COVID-19 on breast cancer care at a Bay Area academic center Wu, J., Bobo, S., Henry, S., Mills, M., Kurian, A., Dirbas, F. AMER ASSOC CANCER RESEARCH. 2021

View details for Web of Science ID 000618737701065

Utility of Routine Surveillance Laboratory Testing in Detecting Relapse in Patients With Classic Hodgkin Lymphoma in First Remission: Results From a Large Single-Institution Study. JCO oncology practice Lynch, R. C., Sundaram, V., Desai, M., Henry, S., Wood, D., Daadi, S., Hoppe, R. T., Advani, R. 2020: JOP1900733

Abstract

PURPOSE: Classic Hodgkin lymphoma is highly curable with contemporary therapy. Although the limited role of surveillance imaging to detect early relapse for patients in complete remission at the end of therapy is well established, there is a paucity of data regarding the role of laboratory testing in this setting. METHODS: Patients with newly diagnosed classic Hodgkin lymphoma uniformly treated with the Stanford V regimen from 1998-2014 and in complete remission for at least 3 months were identified in a single-center institutional database. Laboratory tests categorized by Common Terminology Criteria for Adverse Events v4.03 as grade 2 or higher were considered abnormal. Primary analysis included sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of surveillance laboratory tests for predicting relapse in the first 3 years after end of treatment. RESULTS: Among 235 eligible patients, 24 (10.2%) patients ultimately relapsed. In the first 3 years after end of therapy, the mean number of surveillance blood draws per patient was 7.1 (range, 1-13). These 1,661 surveillance blood draws included 4,684 individual laboratory tests, comprising 1,609 CBCs, 1,578 metabolic panels, and 1,497 erythrocyte sedimentation rates. None of the biopsies confirming relapses were prompted by any abnormal laboratory finding. The sensitivity of any surveillance laboratory test for detecting relapse within 3 years of end of treatment was 72.7% (95% CI, 49.8% to 89.3%), specificity 22.6% (95% CI, 17.2% to 28.9%), yielding a PPV of 8.9% (95% CI, 7.0% to 11.3%) and NPV of 88.9% (95% CI, 79% to 94%). CONCLUSION: Our study found limited clinically meaningful utility for routine surveillance laboratory testing in detecting relapse in patients with complete remission at end of treatment. Our results warrant consideration of modifications to current practice guidelines.
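
For readers less familiar with the four measures named in this abstract, the following minimal Python sketch shows how sensitivity, specificity, PPV, and NPV are derived from a 2x2 table of test results versus outcomes. The function name `diagnostic_metrics` and the counts are hypothetical, chosen only to give round numbers; they are not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute standard diagnostic accuracy measures from a 2x2 table.

    tp -- abnormal test, patient relapsed      (true positive)
    fp -- abnormal test, patient did not       (false positive)
    fn -- normal test, patient relapsed        (false negative)
    tn -- normal test, patient did not relapse (true negative)
    """
    return {
        "sensitivity": tp / (tp + fn),  # relapses flagged by the test
        "specificity": tn / (tn + fp),  # non-relapses with a normal test
        "ppv": tp / (tp + fp),          # abnormal tests that were relapses
        "npv": tn / (tn + fn),          # normal tests that were non-relapses
    }


# Hypothetical counts for 100 patients, 10 of whom relapse.
metrics = diagnostic_metrics(tp=8, fp=72, fn=2, tn=18)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these illustrative counts, a test can flag most relapses (80% sensitivity) and still have a PPV of only 10%, because abnormal results are common among the many patients who never relapse.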

View details for DOI 10.1200/JOP.19.00733

View details for PubMedID 32369413

Computing the Cost of Care Per Day for Patients With Metastatic NETs Gupta, D., Kapphahn, K., Qin, F., Hornbacker, K., Henry, S., Wood, D., Blayney, D., Kunz, P. LIPPINCOTT WILLIAMS & WILKINS. 2020: 470

View details for Web of Science ID 000526823600061

Automated model versus treating physician for predicting survival time of patients with metastatic cancer. Journal of the American Medical Informatics Association : JAMIA Gensheimer, M. F., Aggarwal, S., Benson, K. R., Carter, J. N., Henry, A. S., Wood, D. J., Soltys, S. G., Hancock, S., Pollom, E., Shah, N. H., Chang, D. T. 2020

Abstract

Being able to predict a patient's life expectancy can help doctors and patients prioritize treatments and supportive care. For predicting life expectancy, physicians have been shown to outperform traditional models that use only a few predictor variables. It is possible that a machine learning model that uses many predictor variables and diverse data sources from the electronic medical record can improve on physicians' performance. For patients with metastatic cancer, we compared accuracy of life expectancy predictions by the treating physician, a machine learning model, and a traditional model. A machine learning model was trained using 14 600 metastatic cancer patients' data to predict each patient's distribution of survival time. Data sources included note text, laboratory values, and vital signs. From 2015-2016, 899 patients receiving radiotherapy for metastatic cancer were enrolled in a study in which their radiation oncologist estimated life expectancy. Survival predictions were also made by the machine learning model and a traditional model using only performance status. Performance was assessed with area under the curve for 1-year survival and calibration plots. The radiotherapy study included 1190 treatment courses in 899 patients. A total of 879 treatment courses in 685 patients were included in this analysis. Median overall survival was 11.7 months. Physicians, machine learning model, and traditional model had area under the curve for 1-year survival of 0.72 (95% CI 0.63-0.81), 0.77 (0.73-0.81), and 0.68 (0.65-0.71), respectively. The machine learning model's predictions were more accurate than those of the treating physician or a traditional model.
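
The area under the curve used here to compare predictors can be read as the probability that a randomly chosen patient who survived 1 year was given a higher predicted survival probability than a randomly chosen patient who did not. A minimal Python sketch of that pairwise definition follows; the function name `auc` and the scores and labels are hypothetical examples, not values from the study, and the study's own computation may have differed.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores -- predicted probability of the positive outcome per patient
    labels -- 1 if the outcome occurred (e.g., alive at 1 year), else 0
    Ties between a positive and a negative case count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted 1-year survival probabilities and outcomes.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
example_auc = auc(scores, labels)
print(round(example_auc, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the abstract's values of 0.72, 0.77, and 0.68 should be read.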

View details for DOI 10.1093/jamia/ocaa290

View details for PubMedID 33313792

Changes in Cancer Management due to COVID-19 Illness in Patients with Cancer in Northern California. JCO oncology practice Wu, J. T., Kwon, D. H., Glover, M. J., Henry, S., Wood, D., Rubin, D. L., Koshkin, V. S., Schapira, L., Shah, S. A. 2020: OP2000790

Abstract

The response to the COVID-19 pandemic has affected the management of patients with cancer. In this pooled retrospective analysis, we describe changes in management patterns for patients with cancer diagnosed with COVID-19 in two academic institutions in the San Francisco Bay Area. Adult and pediatric patients diagnosed with COVID-19 with a current or historical diagnosis of malignancy were identified from the electronic medical record at the University of California, San Francisco, and Stanford University. The proportion of patients undergoing active cancer management whose care was affected was quantified and analyzed for significant differences with regard to management type, treatment intent, and the time of COVID-19 diagnosis. The duration and characteristics of such changes were compared across subgroups. A total of 131 patients were included, of whom 55 were undergoing active cancer management. Of these, 35 of 55 (64%) had significant changes in management that consisted primarily of delays. An additional three patients not undergoing active cancer management experienced a delay in management after being diagnosed with COVID-19. The decision to change management was correlated with the time of COVID-19 diagnosis, with more delays identified in patients treated with palliative intent earlier in the course of the pandemic (March/April 2020) compared with later (May/June 2020) (OR, 4.2; 95% CI, 1.03 to 17.3; P = .0497). This difference was not seen among patients treated with curative intent during the same timeframe. We found significant changes in the management of cancer patients with COVID-19 treated with curative and palliative intent that evolved over time. Future studies are needed to determine the impact of changes in management and treatment on cancer outcomes for patients with cancer and COVID-19.

View details for DOI 10.1200/OP.20.00790

View details for PubMedID 33332170

Automated Survival Prediction in Metastatic Cancer Patients Using High-Dimensional Electronic Medical Record Data JNCI-JOURNAL OF THE NATIONAL CANCER INSTITUTE Gensheimer, M. F., Henry, A., Wood, D. J., Hastie, T. J., Aggarwal, S., Dudley, S. A., Pradhan, P., Banerjee, I., Cho, E., Ramchandran, K., Pollom, E., Koong, A. C., Rubin, D. L., Chang, D. T. 2019; 111 (6): 568–74

View details for DOI 10.1093/jnci/djy178

View details for Web of Science ID 000474267400007

Natural Disease History, Outcomes, and Co-mutations in a Series of Patients With BRAF-Mutated Non-small-cell Lung Cancer CLINICAL LUNG CANCER Myall, N. J., Henry, S., Wood, D., Neal, J. W., Han, S. S., Padda, S. K., Wakelee, H. A. 2019; 20 (2): E208–E217

View details for DOI 10.1016/j.cllc.2018.10.003

View details for Web of Science ID 000459793100012

Dynamin impacts homology-directed repair and breast cancer response to chemotherapy JOURNAL OF CLINICAL INVESTIGATION Chernikova, S. B., Nguyen, R. B., Truong, J. T., Mello, S. S., Stafford, J. H., Hay, M. P., Olson, A., Solow-Cordero, D. E., Wood, D. J., Henry, S., von Eyben, R., Deng, L., Gephart, M., Aroumougame, A., Wiese, C., Game, J. C., Gyorffy, B., Brown, J. 2018; 128 (12): 5307–21

View details for DOI 10.1172/JCI87191

View details for Web of Science ID 000452071300018

No Utility of Routine Laboratory Testing during Surveillance in Detecting Relapse in Patients with Classic Hodgkin Lymphoma in First Remission Lynch, R. C., Sundaram, V., Desai, M., Henry, S., Wood, D., Daadi, S., Corbelli, K. S., Rosenberg, S., Hoppe, R. T., Advani, R. AMER SOC HEMATOLOGY. 2018

View details for DOI 10.1182/blood-2018-99-113156

View details for Web of Science ID 000454837601324
Natural Disease History, Outcomes, and Co-mutations in a Series of Patients With BRAF-Mutated Non-small-cell Lung Cancer. Clinical lung cancer Myall, N. J., Henry, S., Wood, D., Neal, J. W., Han, S. S., Padda, S. K., Wakelee, H. A. 2018

Abstract

BACKGROUND: BRAF mutations occur in 1% to 4% of non-small-cell lung cancer (NSCLC) cases. Previous retrospective studies have reported similar outcomes for BRAF-mutated NSCLC as compared with wild-type tumors without a known driver mutation or tumors harboring other mutations. However, select cases of prolonged survival have also been described, and thus, the natural history of BRAF-mutated NSCLC remains an area of ongoing study. The aim of this series was to describe the natural history, clinical outcomes, and occurrence of co-mutations in patients with BRAF-mutated NSCLC. PATIENTS AND METHODS: Patients with BRAF-mutated NSCLC seen at Stanford University Medical Center from January 1, 2006 through July 31, 2015 were reviewed. The Kaplan-Meier method was used to calculate median overall survival, and the generalized Wilcoxon test was used to compare median survivals across subgroups of patients. RESULTS: Within a cohort of 18 patients with BRAF-mutated NSCLC, V600E mutations were most common (72%; 13/18). Clinicopathologic features were similar between patients with V600E versus non-V600E mutations, although there was a trend toward more patients with non-V600E mutations being heavy smokers (80% vs. 31%; P = .12). Co-occurring mutations in TP53 were identified most commonly (28%; 5/18). The median overall survival for the entire cohort was 40.1 months, and the median survival from the onset of metastases (n = 16) was 28.1 months. Survival rates at 2 and 5 years from the onset of metastases were 56% and 13%, respectively. CONCLUSION: The clinical behavior of BRAF-mutated NSCLC is variable, but favorable outcomes can be seen in a subset of patients.

View details for PubMedID 30442523
Automated Survival Prediction in Metastatic Cancer Patients Using High-Dimensional Electronic Medical Record Data. Journal of the National Cancer Institute Gensheimer, M. F., Henry, A. S., Wood, D. J., Hastie, T. J., Aggarwal, S., Dudley, S. A., Pradhan, P., Banerjee, I., Cho, E., Ramchandran, K., Pollom, E., Koong, A. C., Rubin, D. L., Chang, D. T. 2018

Abstract

Background: Oncologists use patients' life expectancy to guide decisions and may benefit from a tool that accurately predicts prognosis. Existing prognostic models generally use only a few predictor variables. We used an electronic medical record dataset to train a prognostic model for patients with metastatic cancer. Methods: The model was trained and tested using 12588 patients treated for metastatic cancer in the Stanford Health Care system from 2008 to 2017. Data sources included provider note text, labs, vital signs, procedures, medication orders, and diagnosis codes. Patients were divided randomly into a training set used to fit the model coefficients and a test set used to evaluate model performance (80%/20% split). A regularized Cox model with 4126 predictor variables was used. A landmarking approach was used due to the multiple observations per patient, with t0 set to the time of metastatic cancer diagnosis. Performance was also evaluated using 399 palliative radiation courses in test set patients. Results: The C-index for overall survival was 0.786 in the test set (averaged across landmark times). For palliative radiation courses, the C-index was 0.745 (95% confidence interval [CI] = 0.715 to 0.775) compared with 0.635 (95% CI = 0.601 to 0.669) for a published model using performance status, primary tumor site, and treated site (two-sided P < .001). Our model's predictions were well-calibrated. Conclusions: The model showed high predictive performance, which will need to be validated using external data. Because it is fully automated, the model can be used to examine providers' practice patterns and could be deployed in a decision support tool to help improve quality of care.

View details for PubMedID 30346554
Retrospective comparison of the clinical effects of programmed death protein 1 inhibitors to treat melanoma versus nonmelanoma skin cancer Jin, M., Li, S., Duy Tran, Henry, S., Wood, D., Chang, A. MOSBY-ELSEVIER. 2018: AB246

View details for Web of Science ID 000440565901464
Probabilistic Prognostic Estimates of Survival in Metastatic Cancer Patients (PPES-Met) Utilizing Free-Text Clinical Narratives. Scientific reports Banerjee, I., Gensheimer, M. F., Wood, D. J., Henry, S., Aggarwal, S., Chang, D. T., Rubin, D. L. 2018; 8 (1): 10037

Abstract

We propose a deep learning model, Probabilistic Prognostic Estimates of Survival in Metastatic Cancer Patients (PPES-Met), for estimating short-term life expectancy (>3 months) of the patients by analyzing free-text clinical notes in the electronic medical record, while maintaining the temporal visit sequence. In a single framework, we integrated semantic data mapping and a neural embedding technique to produce a text processing method that extracts relevant information from heterogeneous types of clinical notes in an unsupervised manner, and we designed a recurrent neural network to model the temporal dependency of the patient visits. The model was trained on a large dataset (10,293 patients) and validated on a separate dataset (1818 patients). Our method achieved an area under the ROC curve (AUC) of 0.89. To provide explainability, we developed an interactive graphical tool that may improve physician understanding of the basis for the model's predictions. The high accuracy and explainability of the PPES-Met model may enable our model to be used as a decision support tool to personalize metastatic cancer treatment and provide valuable assistance to the physicians.

View details for PubMedID 29968730
Probabilistic Prognostic Estimates of Survival in Metastatic Cancer Patients (PPES-Met) Utilizing Free-Text Clinical Narratives SCIENTIFIC REPORTS Banerjee, I., Gensheimer, M., Wood, D. J., Henry, S., Aggarwal, S., Chang, D. T., Rubin, D. L. 2018; 8

View details for DOI 10.1038/s41598-018-27946-5

View details for Web of Science ID 000437097800006
Decreased lymphocyte count at diagnosis as a marker for worse outcome in localized sarcoma. Bui, N., Henry, S., Ganjoo, K. N. AMER SOC CLINICAL ONCOLOGY. 2018

View details for DOI 10.1200/JCO.2018.36.15_suppl.e23537

View details for Web of Science ID 000442916007238
Dynamin impacts homology-directed repair and breast cancer response to chemotherapy. The Journal of clinical investigation Chernikova, S. B., Nguyen, R. B., Truong, J. T., Mello, S. S., Stafford, J. H., Hay, M. P., Olson, A., Solow-Cordero, D. E., Wood, D. J., Henry, S., von Eyben, R., Deng, L., Gephart, M. H., Aroumougame, A., Wiese, C., Game, J. C., Győrffy, B., Brown, J. M. 2018

Abstract

After the initial responsiveness of triple-negative breast cancers (TNBCs) to chemotherapy, they often recur as chemotherapy-resistant tumors, and this has been associated with upregulated homology-directed repair (HDR). Thus, inhibitors of HDR could be a useful adjunct to chemotherapy treatment of these cancers. We performed a high-throughput chemical screen for inhibitors of HDR from which we obtained a number of hits that disrupted microtubule dynamics. We postulated that high levels of the target molecules of our screen in tumors would correlate with poor chemotherapy response. We found that inhibition or knockdown of dynamin 2 (DNM2), known for its role in endocytic cell trafficking and microtubule dynamics, impaired HDR and improved response to chemotherapy of cells and of tumors in mice. In a retrospective analysis, levels of DNM2 at the time of treatment strongly predicted chemotherapy outcome for estrogen receptor-negative and especially for TNBC patients. We propose that DNM2-associated DNA repair enzyme trafficking is important for HDR efficiency and is a powerful predictor of sensitivity to breast cancer chemotherapy and an important target for therapy.

View details for PubMedID 30371505
An 18-year retrospective study on the outcomes of keratoacanthomas with different treatment modalities at a single academic center Duy Tran, Li, S., Henry, S., Wood, D., Chang, A. MOSBY-ELSEVIER. 2017: AB39

View details for Web of Science ID 000403369300150
A natural language processing algorithm to measure quality prostate cancer care. Hernandez-Boussard, T., Kourdis, P., Dulal, R., Ferrari, M., Henry, S., Seto, T., McDonald, K., Blayney, D. W., Brooks, J. D. AMER SOC CLINICAL ONCOLOGY. 2017

View details for Web of Science ID 000443301600231
Chart Review Versus an Automated Bioinformatic Approach to Assess Real-World Crizotinib Effectiveness in Anaplastic Lymphoma Kinase–Positive Non–Small-Cell Lung Cancer JCO: Clinical Cancer Informatics Bui, N., Henry, S., Wood, D., Wakelee, H. A., Neal, J. W. 2017

View details for DOI 10.1200/CCI.16.00055
Third party assessment of resection margin status in head and neck cancer ORAL ONCOLOGY Ransohoff, A., Wood, D., Henry, A. S., Divi, V., Colevas, A. 2016; 57: 27-31

Abstract

Definitive assessment of primary site margin status following resection of head and neck cancer is necessary for prognostication, treatment determination and qualification for clinical trials. This retrospective analysis determined how often an independent reviewer can assess primary tumor margin status of head and neck cancer resections based on review of the pathology report, surgical operative report, and first follow-up note alone. We extracted from the electronic medical record pathology reports, operative reports, and follow-up notes from head and neck cancer resections performed at Stanford Hospital. We classified margin status as definitive or not. We labeled any pathology report clearly indicating a positive, negative, or close (<5 mm) margin as definitive. For each non-definitive pathology report, we reviewed the operative report and then the first follow-up note in an attempt to clarify margin status. We also looked for associations between non-definitive status and surgeon, year, and primary site. 743 unique cases of head and neck cancer resection were extracted. We discarded 255 as non-head and neck cancer cases, or cases that did not involve a definitive resection of a primary tumor site. We could not definitively establish margin status in 20% of resections by independent review of the medical record. There was no correlation between margin determination and surgeon, site, or year of surgery. A substantial fraction (20%) of primary site surgical margins could not be definitively determined via independent EMR review. This could have implications for subsequent patient care decisions and clinical trial options.

View details for DOI 10.1016/j.oraloncology.2016.03.009

View details for Web of Science ID 000376084500010

View details for PubMedID 27208841
Increased Risk of Cutaneous Squamous Cell Carcinoma After Vismodegib Therapy for Basal Cell Carcinoma JAMA DERMATOLOGY Mohan, S. V., Chang, J., Li, S., Henry, A. S., Wood, D. J., Chang, A. L. 2016; 152 (5): 527-532

Abstract

Smoothened inhibitors (SIs) are a new type of targeted therapy for advanced basal cell carcinoma (BCC), and their long-term effects, such as increased risk of subsequent malignancy, are still being explored. To evaluate the risk of developing a non-BCC malignancy after SI exposure in patients with BCC. A case-control study at Stanford Medical Center, an academic hospital. Participants were higher-risk patients with BCC diagnosed from January 1, 1998, to December 31, 2014. The dates of the analysis were January 1 to November 1, 2015. The exposed participants (cases) comprised patients who had confirmed prior vismodegib treatment, and the nonexposed participants (controls) comprised patients who had never received any SI. Because vismodegib was the first approved SI, only patients exposed to this SI were included. Hazard ratio for non-BCC malignancies after vismodegib exposure, adjusting for covariates. The study cohort comprised 180 participants. Their mean (SD) age at BCC diagnosis was 56 (16) years, and 68.9% (n = 124) were male. Fifty-five cases were compared with 125 controls, accounting for age, sex, prior radiation therapy or cisplatin treatment, Charlson Comorbidity Index, clinical follow-up time, immunosuppression, and basal cell nevus syndrome status. Patients exposed to vismodegib had a hazard ratio of 6.37 (95% CI, 3.39-11.96; P < .001), indicating increased risk of developing a non-BCC malignancy. Most non-BCC malignancies were cutaneous squamous cell carcinomas, with a hazard ratio of 8.12 (95% CI, 3.89-16.97; P < .001), accounting for age and basal cell nevus syndrome status. There was no significant increase in other cancers. Increased risk for cutaneous squamous cell carcinomas after vismodegib therapy highlights the importance of continued skin surveillance after initiation of this therapy.

View details for DOI 10.1001/jamadermatol.2015.4330

View details for PubMedID 26914338
Third party assessment of resection margin status in head and neck cancer. Ransohoff, A., Wood, D., Henry, S., Divi, V., Colevas, A. AMER SOC CLINICAL ONCOLOGY. 2015

View details for DOI 10.1200/jco.2015.33.15_suppl.e17011

View details for Web of Science ID 000358036903396
Third party assessment of resection margin status in head and neck cancer. Journal of Clinical Oncology, 2015 ASCO Annual Meeting (May 29 - June 2, 2015) Ransohoff, A., Wood, D., Henry, S., Divi, V., Colevas, A. 2015; Vol 33
Referral trends for reproductive-age patients with breast cancer to a reproductive endocrinology clinic for fertility preservation counseling between 2004 and 2012. Kort, J., Seiger, K., Henry, S., Westphal, L. AMER SOC CLINICAL ONCOLOGY. 2013

View details for DOI 10.1200/jco.2013.31.26_suppl.129

View details for Web of Science ID 000335564700100
New models and online calculator for predicting non-sentinel lymph node status in sentinel lymph node positive breast cancer patients BMC CANCER Kohrt, H. E., Olshen, R. A., Bermas, H. R., Goodson, W. H., Wood, D. J., Henry, S., Rouse, R. V., Bailey, L., Philben, V. J., Dirbas, F. M., Dunn, J. J., Johnson, D. L., Wapnir, I. L., Carlson, R. W., Stockdale, F. E., Hansen, N. M., Jeffrey, S. S. 2008; 8

Abstract

Current practice is to perform a completion axillary lymph node dissection (ALND) for breast cancer patients with tumor-involved sentinel lymph nodes (SLNs), although fewer than half will have non-sentinel node (NSLN) metastasis. Our goal was to develop new models to quantify the risk of NSLN metastasis in SLN-positive patients and to compare predictive capabilities to another widely used model. We constructed three models to predict NSLN status: recursive partitioning with receiver operating characteristic curves (RP-ROC), boosted Classification and Regression Trees (CART), and multivariate logistic regression (MLR) informed by CART. Data were compiled from a multicenter Northern California and Oregon database of 784 patients who prospectively underwent SLN biopsy and completion ALND. We compared the predictive abilities of our best model and the Memorial Sloan-Kettering Breast Cancer Nomogram (Nomogram) in our dataset and an independent dataset from Northwestern University. 285 patients had positive SLNs, of which 213 had known angiolymphatic invasion status and 171 had complete pathologic data including hormone receptor status. 264 (93%) patients had limited SLN disease (micrometastasis, 70%, or isolated tumor cells, 23%). 101 (35%) of all SLN-positive patients had tumor-involved NSLNs. Three variables (tumor size, angiolymphatic invasion, and SLN metastasis size) predicted risk in all our models. RP-ROC and boosted CART stratified patients into four risk levels. MLR informed by CART was most accurate. Using two composite predictors calculated from three variables, MLR informed by CART was more accurate than the Nomogram computed using eight predictors. In our dataset, area under ROC curve (AUC) was 0.83/0.85 for MLR (n = 213/n = 171) and 0.77 for Nomogram (n = 171). When applied to an independent dataset (n = 77), AUC was 0.74 for our model and 0.62 for Nomogram.
The composite predictors in our model were the product of angiolymphatic invasion and size of SLN metastasis, and the product of tumor size and square of SLN metastasis size. We present a new model developed from a community-based SLN database that uses only three rather than eight variables to achieve higher accuracy than the Nomogram for predicting NSLN status in two different datasets.

View details for DOI 10.1186/1471-2407-8-66

View details for PubMedID 18315887

A. Solomon Henry

Affiliate, Technology & Digital Solutions
