James Zou, Postdoctoral Faculty Sponsor
- Advancing precision oncology with large, real-world genomics and treatment outcomes data NATURE MEDICINE 2022
Systematic pan-cancer analysis of mutation-treatment interactions using large real-world clinicogenomics data.
Quantifying the effectiveness of different cancer therapies in patients with specific tumor mutations is critical for improving patient outcomes and advancing precision medicine. Here we perform a large-scale computational analysis of 40,903 US patients with cancer who have detailed mutation profiles, treatment sequences and outcomes derived from electronic health records. We systematically identify 458 mutations that predict the survival of patients on specific immunotherapies, chemotherapy agents or targeted therapies across eight common cancer types. We further characterize mutation-mutation interactions that impact the outcomes of targeted therapies. This work demonstrates how computational analysis of large real-world data generates insights, hypotheses and resources to enable precision oncology.
View details for DOI 10.1038/s41591-022-01873-5
View details for PubMedID 35773542
Dynamical Systems Model of RNA Velocity Improves Inference of Single-cell Trajectory, Pseudo-time and Gene Regulation.
Journal of molecular biology
Recent developments in inferring RNA velocity from single-cell RNA-seq open up exciting new vistas into developmental lineage and cellular dynamics. However, the estimated velocity only gives a snapshot of how the transcriptome instantaneously changes in individual cells, and it does not provide quantitative predictions and insights about the whole system. In this work, we develop RNA-ODE, a principled computational framework that extends RNA velocity to quantify systems-level dynamics and improve single-cell data analysis. We model the gene expression dynamics by an ordinary differential equation (ODE) based formalism. Given a snapshot of gene expression at one time, RNA-ODE is able to predict and extrapolate the expression trajectory of each cell by solving the dynamic equations. Systematic experiments on simulations and on new data from developing brain demonstrate that RNA-ODE substantially improves many aspects of standard single-cell analysis. By leveraging temporal dynamics, RNA-ODE more accurately estimates cell state lineage and pseudo-time compared to previous state-of-the-art methods. It also infers gene regulatory networks and identifies influential genes whose expression changes can decide cell fate. We expect RNA-ODE to be a Swiss army knife that aids many facets of single-cell RNA-seq analysis.
View details for DOI 10.1016/j.jmb.2022.167606
View details for PubMedID 35489382
Machine Learning Prediction of Clinical Trial Operational Efficiency.
The AAPS journal
2022; 24 (3): 57
Clinical trials are the gatekeepers and bottlenecks of progress in medicine. In recent years, they have become increasingly complex and expensive, driven by a growing number of stakeholders requiring more endpoints, more diverse patient populations, and a stringent regulatory environment. Trial designers have historically relied on investigator expertise and legacy norms established within sponsor companies to improve operational efficiency while achieving study goals. As such, data-driven forecasts of operational metrics can be a useful resource for trial design and planning. We develop a machine learning model to predict clinical trial operational efficiency using a novel dataset from Roche containing over 2,000 clinical trials across 20 years and multiple disease areas. The data includes important operational metrics related to patient recruitment and trial duration, as well as a variety of trial features such as the number of procedures, eligibility criteria, and endpoints. Our results demonstrate that operational efficiency can be predicted robustly using trial features, which can provide useful insights to trial designers on the potential impact of their decisions on patient recruitment success and trial duration.
View details for DOI 10.1208/s12248-022-00703-3
View details for PubMedID 35449371
Evaluating eligibility criteria of oncology trials using real-world data and AI.
There is a growing focus on making clinical trials more inclusive, but the design of trial eligibility criteria remains challenging. Here we systematically evaluate the effect of different eligibility criteria on cancer trial populations and outcomes with real-world data using the computational framework of Trial Pathfinder. We apply Trial Pathfinder to emulate completed trials of advanced non-small-cell lung cancer using data from a nationwide database of electronic health records comprising 61,094 patients with advanced non-small-cell lung cancer. Our analyses reveal that many common criteria, including exclusions based on several laboratory values, had a minimal effect on the trial hazard ratios. When we used a data-driven approach to broaden restrictive criteria, the pool of eligible patients more than doubled on average and the hazard ratio of the overall survival decreased by an average of 0.05. This suggests that many patients who were not eligible under the original trial criteria could potentially benefit from the treatments. We further support our findings through analyses of other types of cancer and patient-safety data from diverse clinical trials. Our data-driven methodology for evaluating eligibility criteria can facilitate the design of more-inclusive trials while maintaining safeguards for patient safety.
View details for DOI 10.1038/s41586-021-03430-5
View details for PubMedID 33828294
Modeling Spatial Correlation of Transcripts with Application to Developing Pancreas.
Scientific Reports
2019; 9 (1): 5592
Recently, high-throughput image-based transcriptomic methods were developed and enabled researchers to spatially resolve gene expression variation at the molecular level for the first time. In this work, we develop a general analysis tool to quantitatively study the spatial correlations of gene expression in fixed tissue sections. As an illustration, we analyze the spatial distribution of single mRNA molecules measured by in situ sequencing on human fetal pancreas at three developmental time points: 80, 87 and 117 days post-fertilization. We develop a density profile-based method to capture the spatial relationship between gene expression and other morphological features of the tissue sample such as position of nuclei and endocrine cells of the pancreas. In addition, we build a statistical model to characterize correlations in the spatial distribution of the expression level among different genes. This model enables us to infer the inhibitory and clustering effects throughout different time points. Our analysis framework is applicable to a wide variety of spatially-resolved transcriptomic data to derive biological insights.
View details for PubMedID 30944357
The Effects of Memory Replay in Reinforcement Learning
IEEE. 2018: 478–85
View details for Web of Science ID 000461021200067
Wide-Field Optical Microscopy of Microwave Fields Using Nitrogen-Vacancy Centers in Diamonds
Advanced Optical Materials
2016; 4 (7): 1075–1080
View details for DOI 10.1002/201600039
- Enhanced Raman scattering of single nanoparticles in a high-Q whispering-gallery microresonator PHYSICAL REVIEW A 2015; 91 (4)
- Cooling mechanical resonators to the quantum ground state from room temperature PHYSICAL REVIEW A 2015; 91 (1)