Serena Yeung, Postdoctoral Faculty Sponsor
Assessing the Impact of Climate Change on Future Water Demand using Weather Data
WATER RESOURCES MANAGEMENT
View details for DOI 10.1007/s11269-021-02789-4
View details for Web of Science ID 000636914800001
Short-Term Forecasting of Household Water Demand in the UK Using an Interpretable Machine Learning Approach
JOURNAL OF WATER RESOURCES PLANNING AND MANAGEMENT
2021; 147 (4)
View details for DOI 10.1061/(ASCE)WR.1943-5452.0001325
View details for Web of Science ID 000672261000002
Data-Driven Modeling of Pregnancy-Related Complications.
Trends in molecular medicine
A healthy pregnancy depends on complex interrelated biological adaptations involving placentation, maternal immune responses, and hormonal homeostasis. Recent advances in high-throughput technologies have provided access to multiomics biological data that, combined with clinical and social data, can provide a deeper understanding of normal and abnormal pregnancies. Integration of these heterogeneous datasets using state-of-the-art machine-learning methods can enable the prediction of short- and long-term health trajectories for a mother and offspring and the development of treatments to prevent or minimize complications. We review advanced machine-learning methods that could: provide deeper biological insights into a pregnancy not yet unveiled by current methodologies; clarify the etiologies and heterogeneity of pathologies that affect a pregnancy; and suggest the best approaches to address disparities in outcomes affecting vulnerable populations.
View details for DOI 10.1016/j.molmed.2021.01.007
View details for PubMedID 33573911
Objective Activity Parameters Track Patient-Specific Physical Recovery Trajectories After Surgery and Link With Individual Preoperative Immune States.
Annals of surgery
The longitudinal assessment of physical function with high temporal resolution at a scalable and objective level in patients recovering from surgery is highly desirable to understand the biological and clinical factors that drive the clinical outcome. However, physical recovery from surgery itself remains poorly defined and the utility of wearable technologies to study recovery after surgery has not been established. Prolonged postoperative recovery is often associated with long-lasting impairment of physical, mental, and social functions. While phenotypical and clinical patient characteristics account for some variation of individual recovery trajectories, biological differences likely play a major role. Specifically, patient-specific immune states have been linked to prolonged physical impairment after surgery. However, current methods of quantifying physical recovery lack patient specificity and objectivity. Here, a combined high-fidelity accelerometry and state-of-the-art deep immune profiling approach was studied in patients undergoing major joint replacement surgery. The aim was to determine whether objective physical parameters derived from accelerometry data can accurately track patient-specific physical recovery profiles (suggestive of a 'clock of postoperative recovery'), compare the performance of derived parameters with benchmark metrics including step count, and link individual recovery profiles with patients' preoperative immune state. The results of our models indicate that patient-specific temporal patterns of physical function can be derived with a precision superior to benchmark metrics.
Notably, six distinct domains of physical function and sleep are identified to represent the objective temporal patterns: "activity capacity" and "moderate and overall activity" (declined immediately after surgery); "sleep disruption and sedentary activity" (increased after surgery); "overall sleep", "sleep onset", and "light activity" (no clear changes were observed after surgery). These patterns can be linked to individual patients' preoperative immune state using cross-validated canonical-correlation analysis. Importantly, the pSTAT3 signal activity in M-MDSCs predicted a slower recovery. Accelerometry-based recovery trajectories are scalable and objective outcomes to study patient-specific factors that drive physical recovery.
View details for DOI 10.1097/SLA.0000000000005250
View details for PubMedID 35129529
Integration of mechanistic immunological knowledge into a machine learning pipeline improves predictions
NATURE MACHINE INTELLIGENCE
View details for DOI 10.1038/s42256-020-00232-8
View details for Web of Science ID 000579336000001
Integration of mechanistic immunological knowledge into a machine learning pipeline improves predictions.
Nature machine intelligence
2020; 2 (10): 619-628
The dense network of interconnected cellular signalling responses that are quantifiable in peripheral immune cells provides a wealth of actionable immunological insights. Although high-throughput single-cell profiling techniques, including polychromatic flow and mass cytometry, have matured to a point that enables detailed immune profiling of patients in numerous clinical settings, the limited cohort size and high dimensionality of data increase the possibility of false-positive discoveries and model overfitting. We introduce a generalizable machine learning platform, the immunological Elastic-Net (iEN), which incorporates immunological knowledge directly into the predictive models. Importantly, the algorithm maintains the exploratory nature of the high-dimensional dataset, allowing for the inclusion of immune features with strong predictive capabilities even if not consistent with prior knowledge. In three independent studies our method demonstrates improved predictions for clinically relevant outcomes from mass cytometry data generated from whole blood, as well as a large simulated dataset. The iEN is available under an open-source licence.
View details for DOI 10.1038/s42256-020-00232-8
View details for PubMedID 33294774
View details for PubMedCentralID PMC7720904
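The idea in the abstract above, letting prior knowledge relax the penalty on selected features while every feature stays eligible for selection, can be illustrated with a small sketch. This is not the published iEN implementation. It assumes one common device, rescaling each feature by a prior weight before fitting an ordinary elastic net, and the names `elastic_net`, `prior_weighted_elastic_net`, and `prior` are hypothetical.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t (the proximal step for the L1 penalty)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net(X, y, lam, alpha, n_iter=200):
    """Coordinate descent for
    (1/2n)*||y - Xb||^2 + lam*(alpha*|b|_1 + (1 - alpha)/2*|b|_2^2).
    No intercept is fitted; X is a list of rows."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - Xb, with b = 0 at the start
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def prior_weighted_elastic_net(X, y, prior, lam, alpha):
    """Assumed scheme: scale feature j by prior[j] (larger = more consistent
    with prior knowledge), fit, then map coefficients back to the original scale."""
    X_scaled = [[x * w for x, w in zip(row, prior)] for row in X]
    beta_scaled = elastic_net(X_scaled, y, lam, alpha)
    return [b * w for b, w in zip(beta_scaled, prior)]


# Toy data: two identical features, so only the prior can separate them.
X = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
y = [1.0, 2.0, 3.0, 4.0]
plain = elastic_net(X, y, lam=1.0, alpha=0.5)
weighted = prior_weighted_elastic_net(X, y, prior=[2.0, 1.0], lam=1.0, alpha=0.5)
print(plain, weighted)
```

In this toy case the unweighted fit spreads weight over both duplicated features, while the prior-weighted fit keeps only the favored one. A feature with a low prior weight is penalized more heavily but can still enter the model if its signal is strong enough, which mirrors the exploratory property the abstract describes.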
VoPo leverages cellular heterogeneity for predictive modeling of single-cell data.
Nature communications
2020; 11 (1): 3738
High-throughput single-cell analysis technologies produce an abundance of data that is critical for profiling the heterogeneity of cellular systems. We introduce VoPo (https://github.com/stanleyn/VoPo), a machine learning algorithm for predictive modeling and comprehensive visualization of the heterogeneity captured in large single-cell datasets. In three mass cytometry datasets, with the largest measuring hundreds of millions of cells over hundreds of samples, VoPo defines phenotypically and functionally homogeneous cell populations. VoPo further outperforms state-of-the-art machine learning algorithms in classification tasks and identifies immune correlates of clinically relevant parameters.
View details for DOI 10.1038/s41467-020-17569-8
View details for PubMedID 32719375
Multiomics Characterization of Preterm Birth in Low- and Middle-Income Countries.
JAMA network open
2020; 3 (12): e2029655
Worldwide, preterm birth (PTB) is the single largest cause of deaths in the perinatal and neonatal period and is associated with increased morbidity in young children. The cause of PTB is multifactorial, and the development of generalizable biological models may enable early detection and guide therapeutic studies. To investigate the ability of transcriptomics and proteomics profiling of plasma and metabolomics analysis of urine to identify early biological measurements associated with PTB. This diagnostic/prognostic study analyzed plasma and urine samples collected from May 2014 to June 2017 from pregnant women in 5 biorepository cohorts in low- and middle-income countries (LMICs; ie, Matlab, Bangladesh; Lusaka, Zambia; Sylhet, Bangladesh; Karachi, Pakistan; and Pemba, Tanzania). These cohorts were established to study maternal and fetal outcomes and were supported by the Alliance for Maternal and Newborn Health Improvement and the Global Alliance to Prevent Prematurity and Stillbirth biorepositories. Data were analyzed from December 2018 to July 2019. Blood and urine specimens that were collected early during pregnancy (median sampling time of 13.6 weeks of gestation, according to ultrasonography) were processed, stored, and shipped to the laboratories under uniform protocols. Plasma samples were assayed for targeted measurement of proteins and untargeted cell-free ribonucleic acid profiling; urine samples were assayed for metabolites. The PTB phenotype was defined as the delivery of a live infant before completing 37 weeks of gestation. Of the 81 pregnant women included in this study, 39 had PTBs (48.1%) and 42 had term pregnancies (51.9%) (mean [SD] age of 24.8 [5.3] years). Univariate analysis demonstrated functional biological differences across the 5 cohorts. A cohort-adjusted machine learning algorithm was applied to each biological data set, and then a higher-level machine learning model combined the results into a final integrative model.
The integrated model was more accurate, with an area under the receiver operating characteristic curve (AUROC) of 0.83 (95% CI, 0.72-0.91) compared with the models derived for each independent biological modality (transcriptomics AUROC, 0.73 [95% CI, 0.61-0.83]; metabolomics AUROC, 0.59 [95% CI, 0.47-0.72]; and proteomics AUROC, 0.75 [95% CI, 0.64-0.85]). Primary features associated with PTB included an inflammatory module as well as a metabolomic module measured in urine associated with the glutamine and glutamate metabolism and valine, leucine, and isoleucine biosynthesis pathways. This study found that, in LMICs and high PTB settings, major biological adaptations during term pregnancy follow a generalizable model and the predictive accuracy for PTB was augmented by combining various omics data sets, suggesting that PTB is a condition that manifests within multiple biological systems. These data sets, with machine learning partnerships, may be a key step in developing valuable predictive tests and intervention candidates for preventing PTB.
View details for DOI 10.1001/jamanetworkopen.2020.29655
View details for PubMedID 33337494
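The two-layer design in the abstract above (one model per omics data set, then a higher-level model over their outputs, compared by AUROC) can be sketched as follows. This is an illustration only, not the study's pipeline: the second layer here is a plain average of per-modality scores standing in for the trained higher-level model, and the function names and toy scores are hypothetical.

```python
def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive sample scores
    higher than a randomly chosen negative one (ties count as one half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def integrate(modality_scores):
    """Stand-in second layer (an assumption): average the per-sample scores
    produced by each modality-specific model."""
    return [sum(vals) / len(vals) for vals in zip(*modality_scores)]


# Made-up scores from two modality-specific models on six samples (1 = PTB).
labels = [1, 1, 1, 0, 0, 0]
modality_a = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1]
modality_b = [0.4, 0.9, 0.8, 0.1, 0.5, 0.2]
combined = integrate([modality_a, modality_b])

print(auroc(labels, modality_a), auroc(labels, modality_b), auroc(labels, combined))
```

In this made-up example each modality misranks one positive-negative pair and the averaged score ranks every pair correctly, which shows how combining modalities whose errors differ can raise the AUROC. In practice the second layer would be fitted on out-of-fold first-layer predictions to avoid leaking labels into the integration step.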