All Publications

  • Understanding the molecular basis of resilience to Alzheimer's disease. Frontiers in neuroscience Montine, K. S., Berson, E., Phongpreecha, T., Huang, Z., Aghaeepour, N., Zou, J. Y., MacCoss, M. J., Montine, T. J. 2023; 17: 1311157


    The cellular and molecular distinction between brain aging and neurodegenerative disease begins to blur in the oldest old. Approximately 15-25% of observations in humans do not fit predicted clinical manifestations, likely the result of suppressed damage despite usually adequate stressors and of resilience, the suppression of neurological dysfunction despite usually adequate degeneration. Factors during life may predict the clinico-pathologic state of resilience: cardiovascular health and mental health, more so than educational attainment, are predictive of a continuous measure of resilience to Alzheimer's disease (AD) and AD-related dementias (ADRDs). In resilience to AD alone (RAD), core features include synaptic and axonal processes, especially in the hippocampus. Future focus on larger and more diverse cohorts and additional regions offers emerging opportunities to understand this counterforce to neurodegeneration. The focus of this review is the molecular basis of resilience to AD.

    View details for DOI 10.3389/fnins.2023.1311157

    View details for PubMedID 38192507

    View details for PubMedCentralID PMC10773681

  • Quantitative estimate of cognitive resilience and its medical and genetic associations. Alzheimer's research & therapy Phongpreecha, T., Godrich, D., Berson, E., Espinosa, C., Kim, Y., Cholerton, B., Chang, A. L., Mataraso, S., Bukhari, S. A., Perna, A., Yakabi, K., Montine, K. S., Poston, K. L., Mormino, E., White, L., Beecham, G., Aghaeepour, N., Montine, T. J. 2023; 15 (1): 192


    We have proposed that cognitive resilience (CR) counteracts brain damage from Alzheimer's disease (AD) or AD-related dementias such that older individuals who harbor neurodegenerative disease burden sufficient to cause dementia remain cognitively normal. However, CR traditionally is considered a binary trait, capturing only the most extreme examples, and is often inconsistently defined. This study addressed existing discrepancies and shortcomings of the current CR definition by proposing a framework for defining CR as a continuous variable for each neuropsychological test. The linear equations clarified CR's relationship to closely related terms, including cognitive function, reserve, compensation, and damage. Primarily, resilience is defined as a function of cognitive performance and neuropathologic damage. As such, the study utilized data from 844 individuals (age = 79 ± 12, 44% female) in the National Alzheimer's Coordinating Center cohort that met our inclusion criteria of comprehensive lesion rankings for 17 neuropathologic features and complete neuropsychological test results. Machine learning models and GWAS then were used to identify medical and genetic factors that are associated with CR. CR varied across five cognitive assessments and was greater in female participants, associated with longer survival, and weakly associated with educational attainment or APOE ε4 allele. In contrast, damage was strongly associated with APOE ε4 allele (P value < 0.0001). Major predictors of CR were cardiovascular health and social interactions, as well as the absence of behavioral symptoms. Our framework explicitly decoupled the effects of CR from neuropathologic damage. Characterizations and genetic association study of these two components suggest that the underlying CR mechanism has minimal overlap with the disease mechanism. Moreover, the identified medical features associated with CR suggest modifiable features to counteract clinical expression of damage and maintain cognitive function in older individuals.

    View details for DOI 10.1186/s13195-023-01329-z

    View details for PubMedID 37926851

    View details for PubMedCentralID 6410486

  • Deep representation learning identifies associations between physical activity and sleep patterns during pregnancy and prematurity. NPJ digital medicine Ravindra, N. G., Espinosa, C., Berson, E., Phongpreecha, T., Zhao, P., Becker, M., Chang, A. L., Shome, S., Marić, I., De Francesco, D., Mataraso, S., Saarunya, G., Thuraiappah, M., Xue, L., Gaudillière, B., Angst, M. S., Shaw, G. M., Herzog, E. D., Stevenson, D. K., England, S. K., Aghaeepour, N. 2023; 6 (1): 171


    Preterm birth (PTB) is the leading cause of infant mortality globally. Research has focused on developing predictive models for PTB without prioritizing cost-effective interventions. Physical activity and sleep present unique opportunities for interventions in low- and middle-income populations (LMICs). However, objective measurement of physical activity and sleep remains challenging and self-reported metrics suffer from low resolution and accuracy. In this study, we use physical activity data collected using a wearable device comprising over 181,944 h of data across N = 1083 patients. Using a new state-of-the-art deep learning time-series classification architecture, we develop a 'clock' of healthy dynamics during pregnancy by using gestational age (GA) as a surrogate for progression of pregnancy. We also develop novel interpretability algorithms that integrate unsupervised clustering, model error analysis, feature attribution, and automated actigraphy analysis, allowing for model interpretation with respect to sleep, activity, and clinical variables. Our model performs significantly better than 7 other machine learning and AI methods for modeling the progression of pregnancy. We found that deviations from a normal 'clock' of physical activity and sleep changes during pregnancy are strongly associated with pregnancy outcomes. When our model underestimates GA, there are 0.52 fewer preterm births than expected (P = 1.01e-67, permutation test) and when our model overestimates GA, there are 1.44 times (P = 2.82e-39, permutation test) more preterm births than expected. Model error is negatively correlated with interdaily stability (P = 0.043, Spearman's), indicating that our model assigns a more advanced GA when an individual's daily rhythms are less precise. Supporting this, our model attributes higher importance to sleep periods in predicting higher-than-actual GA, relative to lower-than-actual GA (P = 1.01e-21, Mann-Whitney U). Combining prediction and interpretability allows us to signal when activity behaviors alter the likelihood of preterm birth and advocates for the development of clinical decision support through passive monitoring and exercise habit and sleep recommendations, which can be easily implemented in LMICs.

    View details for DOI 10.1038/s41746-023-00911-x

    View details for PubMedID 37770643

    View details for PubMedCentralID 3796350

  • Cross-species comparative analysis of single presynapses. Scientific reports Berson, E., Gajera, C. R., Phongpreecha, T., Perna, A., Bukhari, S. A., Becker, M., Chang, A. L., De Francesco, D., Espinosa, C., Ravindra, N. G., Postupna, N., Latimer, C. S., Shively, C. A., Register, T. C., Craft, S., Montine, K. S., Fox, E. J., Keene, C. D., Bendall, S. C., Aghaeepour, N., Montine, T. J. 2023; 13 (1): 13849


    Comparing brain structure across species and regions enables key functional insights. Leveraging publicly available data from a novel mass cytometry-based method, synaptometry by time of flight (SynTOF), we applied an unsupervised machine learning approach to conduct a comparative study of presynapse molecular abundance across three species and three brain regions. We used neural networks and their attractive properties to model complex relationships among high dimensional data to develop a unified, unsupervised framework for comparing the profile of more than 4.5 million single presynapses among normal human, macaque, and mouse samples. An extensive validation showed the feasibility of performing cross-species comparison using SynTOF profiling. Integrative analysis of the abundance of 20 presynaptic proteins revealed near-complete separation between primates and mice involving synaptic pruning, cellular energy, lipid metabolism, and neurotransmission. In addition, our analysis revealed a strong overlap between the presynaptic composition of human and macaque in the cerebral cortex and neostriatum. Our unique approach illuminates species- and region-specific variation in presynapse molecular composition.

    View details for DOI 10.1038/s41598-023-40683-8

    View details for PubMedID 37620363

    View details for PubMedCentralID 3365257

  • Whole genome deconvolution unveils Alzheimer's resilient epigenetic signature. Nature communications Berson, E., Sreenivas, A., Phongpreecha, T., Perna, A., Grandi, F. C., Xue, L., Ravindra, N. G., Payrovnaziri, N., Mataraso, S., Kim, Y., Espinosa, C., Chang, A. L., Becker, M., Montine, K. S., Fox, E. J., Chang, H. Y., Corces, M. R., Aghaeepour, N., Montine, T. J. 2023; 14 (1): 4947


    Assay for Transposase Accessible Chromatin by sequencing (ATAC-seq) accurately depicts the chromatin regulatory state and altered mechanisms guiding gene expression in disease. However, bulk sequencing entangles information from different cell types and obscures cellular heterogeneity. To address this, we developed Cellformer, a deep learning method that deconvolutes bulk ATAC-seq into cell type-specific expression across the whole genome. Cellformer enables cost-effective cell type-specific open chromatin profiling in large cohorts. Applied to 191 bulk samples from 3 brain regions, Cellformer identifies cell type-specific gene regulatory mechanisms involved in resilience to Alzheimer's disease, an uncommon group of cognitively healthy individuals that harbor a high pathological load of Alzheimer's disease. Cell type-resolved chromatin profiling unveils cell type-specific pathways and nominates potential epigenetic mediators underlying resilience that may illuminate therapeutic opportunities to limit the cognitive impact of the disease. Cellformer is freely available to facilitate future investigations using high-throughput bulk ATAC-seq data.

    View details for DOI 10.1038/s41467-023-40611-4

    View details for PubMedID 37587197

    View details for PubMedCentralID 6071637

  • Multiomic signals associated with maternal epidemiological factors contributing to preterm birth in low- and middle-income countries. Science advances Espinosa, C. A., Khan, W., Khanam, R., Das, S., Khalid, J., Pervin, J., Kasaro, M. P., Contrepois, K., Chang, A. L., Phongpreecha, T., Michael, B., Ellenberger, M., Mehmood, U., Hotwani, A., Nizar, A., Kabir, F., Wong, R. J., Becker, M., Berson, E., Culos, A., De Francesco, D., Mataraso, S., Ravindra, N., Thuraiappah, M., Xenochristou, M., Stelzer, I. A., Marić, I., Dutta, A., Raqib, R., Ahmed, S., Rahman, S., Hasan, A. S., Ali, S. M., Juma, M. H., Rahman, M., Aktar, S., Deb, S., Price, J. T., Wise, P. H., Winn, V. D., Druzin, M. L., Gibbs, R. S., Darmstadt, G. L., Murray, J. C., Stringer, J. S., Gaudilliere, B., Snyder, M. P., Angst, M. S., Rahman, A., Baqui, A. H., Jehan, F., Nisar, M. I., Vwalika, B., Sazawal, S., Shaw, G. M., Stevenson, D. K., Aghaeepour, N. 2023; 9 (21): eade7692


    Preterm birth (PTB) is the leading cause of death in children under five, yet comprehensive studies are hindered by its multiple complex etiologies. Epidemiological associations between PTB and maternal characteristics have been previously described. This work used multiomic profiling and multivariate modeling to investigate the biological signatures of these characteristics. Maternal covariates were collected during pregnancy from 13,841 pregnant women across five sites. Plasma samples from 231 participants were analyzed to generate proteomic, metabolomic, and lipidomic datasets. Machine learning models showed robust performance for the prediction of PTB (AUROC = 0.70), time-to-delivery (r = 0.65), maternal age (r = 0.59), gravidity (r = 0.56), and BMI (r = 0.81). Time-to-delivery biological correlates included fetal-associated proteins (e.g., ALPP, AFP, and PGF) and immune proteins (e.g., PD-L1, CCL28, and LIFR). Maternal age negatively correlated with collagen COL9A1, gravidity with endothelial NOS and inflammatory chemokine CXCL13, and BMI with leptin and structural protein FABP4. These results provide an integrated view of epidemiological factors associated with PTB and identify biological signatures of clinical covariates affecting this disease.

    View details for DOI 10.1126/sciadv.ade7692

    View details for PubMedID 37224249

  • Large-scale correlation network construction for unraveling the coordination of complex biological systems. Nature computational science Becker, M., Nassar, H., Espinosa, C., Stelzer, I. A., Feyaerts, D., Berson, E., Bidoki, N. H., Chang, A. L., Saarunya, G., Culos, A., De Francesco, D., Fallahzadeh, R., Liu, Q., Kim, Y., Marić, I., Mataraso, S. J., Payrovnaziri, S. N., Phongpreecha, T., Ravindra, N. G., Stanley, N., Shome, S., Tan, Y., Thuraiappah, M., Xenochristou, M., Xue, L., Shaw, G., Stevenson, D., Angst, M. S., Gaudilliere, B., Aghaeepour, N. 2023; 3 (4): 346-359


    Advanced measurement and data storage technologies have enabled high-dimensional profiling of complex biological systems. For this, modern multiomics studies regularly produce datasets with hundreds of thousands of measurements per sample, enabling a new era of precision medicine. Correlation analysis is an important first step to gain deeper insights into the coordination and underlying processes of such complex systems. However, the construction of large correlation networks in modern high-dimensional datasets remains a major computational challenge owing to rapidly growing runtime and memory requirements. Here we address this challenge by introducing CorALS (Correlation Analysis of Large-scale (biological) Systems), an open-source framework for the construction and analysis of large-scale parametric as well as non-parametric correlation networks for high-dimensional biological data. It features off-the-shelf algorithms suitable for both personal and high-performance computers, enabling workflows and downstream analysis approaches. We illustrate the broad scope and potential of CorALS by exploring perspectives on complex biological processes in large-scale multiomics and single-cell studies.

    View details for DOI 10.1038/s43588-023-00429-y

    View details for PubMedID 38116462

    View details for PubMedCentralID PMC10727505

  • Data-driven longitudinal characterization of neonatal health and morbidity. Science translational medicine De Francesco, D., Reiss, J. D., Roger, J., Tang, A. S., Chang, A. L., Becker, M., Phongpreecha, T., Espinosa, C., Morin, S., Berson, E., Thuraiappah, M., Le, B. L., Ravindra, N. G., Payrovnaziri, S. N., Mataraso, S., Kim, Y., Xue, L., Rosenstein, M. G., Oskotsky, T., Marić, I., Gaudilliere, B., Carvalho, B., Bateman, B. T., Angst, M. S., Prince, L. S., Blumenfeld, Y. J., Benitz, W. E., Fuerch, J. H., Shaw, G. M., Sylvester, K. G., Stevenson, D. K., Sirota, M., Aghaeepour, N. 2023; 15 (683): eadc9854


    Although prematurity is the single largest cause of death in children under 5 years of age, the current definition of prematurity, based on gestational age, lacks the precision needed for guiding care decisions. Here, we propose a longitudinal risk assessment for adverse neonatal outcomes in newborns based on a deep learning model that uses electronic health records (EHRs) to predict a wide range of outcomes over a period starting shortly before conception and ending months after birth. By linking the EHRs of the Lucile Packard Children's Hospital and the Stanford Healthcare Adult Hospital, we developed a cohort of 22,104 mother-newborn dyads delivered between 2014 and 2018. Maternal and newborn EHRs were extracted and used to train a multi-input multitask deep learning model, featuring a long short-term memory neural network, to predict 24 different neonatal outcomes. An additional cohort of 10,250 mother-newborn dyads delivered at the same Stanford Hospitals from 2019 to September 2020 was used to validate the model. Areas under the receiver operating characteristic curve at delivery exceeded 0.9 for 10 of the 24 neonatal outcomes considered and were between 0.8 and 0.9 for 7 additional outcomes. Moreover, comprehensive association analysis identified multiple known associations between various maternal and neonatal features and specific neonatal outcomes. This study used linked EHRs from more than 30,000 mother-newborn dyads and would serve as a resource for the investigation and prediction of neonatal outcomes. An interactive website is available for independent investigators to leverage this unique dataset:

    View details for DOI 10.1126/scitranslmed.adc9854

    View details for PubMedID 36791208

  • Prediction of neuropathologic lesions from clinical data. Alzheimer's & dementia : the journal of the Alzheimer's Association Phongpreecha, T., Cholerton, B., Bhukari, S., Chang, A. L., De Francesco, D., Thuraiappah, M., Godrich, D., Perna, A., Becker, M. G., Ravindra, N. G., Espinosa, C., Kim, Y., Berson, E., Mataraso, S., Sha, S. J., Fox, E. J., Montine, K. S., Baker, L. D., Craft, S., White, L., Poston, K. L., Beecham, G., Aghaeepour, N., Montine, T. J. 2023


    Post-mortem analysis provides definitive diagnoses of neurodegenerative diseases; however, only a few can be diagnosed during life. This study employed statistical tools and machine learning to predict 17 neuropathologic lesions from a cohort of 6518 individuals using 381 clinical features (Table S1). The multisite data allowed validation of the model's robustness by splitting train/test sets by clinical sites. A similar study was performed for predicting Alzheimer's disease (AD) neuropathologic change without specific comorbidities. Prediction results show high performance for certain lesions that match or exceed that of research annotation. Neurodegenerative comorbidities in addition to AD neuropathologic change resulted in compounded, but disproportionate, effects across cognitive domains as the comorbidity number increased. Certain clinical features could be strongly associated with multiple neurodegenerative diseases, others were lesion-specific, and some were divergent between lesions. Our approach could benefit clinical research, and genetic and biomarker research by enriching cohorts for desired lesions.

    View details for DOI 10.1002/alz.12921

    View details for PubMedID 36681388

  • In-Silico Generation of High-Dimensional Immune Response Data in Patients using a Deep Neural Network. Cytometry. Part A : the journal of the International Society for Analytical Cytology Fallahzadeh, R., Bidoki, N. H., Stelzer, I. A., Becker, M., Marić, I., Chang, A. L., Culos, A., Phongpreecha, T., Xenochristou, M., De Francesco, D., Espinosa, C., Berson, E., Verdonk, F., Angst, M. S., Gaudilliere, B., Aghaeepour, N. 2022


    Technologies for single-cell profiling of the immune system have enabled researchers to extract rich interconnected networks of cellular abundance, phenotypical and functional cellular parameters. These studies can power machine learning approaches to understand the role of the immune system in various diseases. However, the performance of these approaches and the generalizability of the findings have been hindered by limited cohort sizes in translational studies, partially due to logistical demands and costs associated with longitudinal data collection in sufficiently large patient cohorts. An evolving challenge is the requirement for ever-increasing cohort sizes as the dimensionality of datasets grows. We propose a deep learning model derived from a novel pipeline of optimal temporal cell matching and overcomplete autoencoders that uses data from a small subset of patients to learn to forecast an entire patient's immune response in a high dimensional space from one timepoint to another. In our analysis of 1.08 million cells from patients pre- and post-surgical intervention, we demonstrate that the generated patient-specific data are qualitatively and quantitatively similar to real patient data by demonstrating fidelity, diversity, and usefulness.

    View details for DOI 10.1002/cyto.a.24709

    View details for PubMedID 36507780

  • Revealing the impact of lifestyle stressors on the risk of adverse pregnancy outcomes with multitask machine learning. Frontiers in pediatrics Becker, M., Dai, J., Chang, A. L., Feyaerts, D., Stelzer, I. A., Zhang, M., Berson, E., Saarunya, G., De Francesco, D., Espinosa, C., Kim, Y., Maric, I., Mataraso, S., Payrovnaziri, S. N., Phongpreecha, T., Ravindra, N. G., Shome, S., Tan, Y., Thuraiappah, M., Xue, L., Mayo, J. A., Quaintance, C. C., Laborde, A., King, L. S., Dhabhar, F. S., Gotlib, I. H., Wong, R. J., Angst, M. S., Shaw, G. M., Stevenson, D. K., Gaudilliere, B., Aghaeepour, N. 2022; 10: 933266


    Psychosocial and stress-related factors (PSFs), defined as internal or external stimuli that induce biological changes, are potentially modifiable factors and accessible targets for interventions that are associated with adverse pregnancy outcomes (APOs). Although individual APOs have been shown to be connected to PSFs, they are biologically interconnected, relatively infrequent, and therefore challenging to model. In this context, multi-task machine learning (MML) is an ideal tool for exploring the interconnectedness of APOs on the one hand and building on joint combinatorial outcomes to increase predictive power on the other hand. Additionally, by integrating single cell immunological profiling of underlying biological processes, the effects of stress-based therapeutics may be measurable, facilitating the development of precision medicine approaches. Objectives: The primary objectives were to jointly model multiple APOs and their connection to stress early in pregnancy, and to explore the underlying biology to guide development of accessible and measurable interventions. Materials and Methods: In a prospective cohort study, PSFs were assessed during the first trimester with an extensive self-filled questionnaire for 200 women. We used MML to simultaneously model and predict APOs (severe preeclampsia, superimposed preeclampsia, gestational diabetes and early gestational age) as well as several risk factors (BMI, diabetes, hypertension) for these patients based on PSFs. Strongly interrelated stressors were categorized to identify potential therapeutic targets. Furthermore, for a subset of 14 women, we modeled the connection of PSFs to the maternal immune system to APOs by building corresponding ML models based on an extensive single cell immune dataset generated by mass cytometry time of flight (CyTOF). Results: Jointly modeling APOs in a MML setting significantly increased modeling capabilities and yielded a highly predictive integrated model of APOs underscoring their interconnectedness. Most APOs were associated with mental health, life stress, and perceived health risks. Biologically, stressors were associated with specific immune characteristics revolving around CD4/CD8 T cells. Immune characteristics predicted based on stress were in turn found to be associated with APOs. Conclusions: Elucidating connections among stress, multiple APOs simultaneously, and immune characteristics has the potential to facilitate the implementation of ML-based, individualized, integrative models of pregnancy in clinical decision making. The modifiable nature of stressors may enable the development of accessible interventions, with success tracked through immune characteristics.

    View details for DOI 10.3389/fped.2022.933266

    View details for PubMedID 36582513