All Publications


  • Leveraging cell-type specificity and similarity improves single-cell eQTL fine-mapping. Nature communications Lin, C., Lin, Y., Li, W., Xu, L., Zhang, X., Zhao, H. 2026

    Abstract

    Identifying cell-type-specific eQTL is important to understand the genetic regulation of gene expressions at the cell-type level and its relevance to complex traits. However, existing eQTL fine-mapping methods are limited in power and accuracy when cell types are analyzed separately. To improve eQTL mapping, we present CASE, a Bayesian framework to perform cell-type-specific and shared eQTL fine-mapping that simultaneously analyzes multiple cell types. CASE can effectively capture effect-sharing patterns across cell types while disentangling the confounding effects of linkage disequilibrium. We demonstrate that CASE outperforms the existing single-trait (SuSiE) and multi-trait (mvSuSiE) eQTL methods through comprehensive simulations. When applied to the OneK1K data, CASE identified more genetic regulations of gene expressions, better capturing cell type specificity and functionally enriched and disease-associated eQTL. The CASE framework for single-cell eQTL fine-mapping can be broadly applied to multi-tissue and multi-trait genetic studies.

    View details for DOI 10.1038/s41467-026-72176-3

    View details for PubMedID 42020412

  • Improving polygenic risk prediction performance by integrating electronic health records through phenotype embedding. American journal of human genetics Xu, L., Zheng, W., Hu, J., Lin, Y., Zhao, J., Wang, G., Liu, T., Zhao, H. 2025; 112 (12): 3030-3045

    Abstract

    Large-scale biobanks provide comprehensive electronic health records (EHRs) that capture detailed clinical phenotypes, potentially enhancing disease prediction. However, traditional polygenic risk score (PRS) methods rely on simplified phenotype definitions or predefined trait sets, limiting their ability to represent the complex structures embedded within EHRs. To address this gap, we introduce EHR-embedding-enhanced PRS (EEPRS), leveraging phenotype embeddings derived from EHRs to improve PRSs using only genome-wide association study (GWAS) summary statistics. Employing embedding methods such as Word2Vec and GPT, we conducted EHR-embedding-based GWASs and identified a cardiovascular cluster via hierarchical clustering of genetic correlations. Across 41 traits in the UK Biobank, EEPRS consistently outperformed single-trait PRSs, particularly within this cluster. PRS-based phenome-wide association studies further demonstrated robust associations between EHR-embedding-based PRS and circulatory system diseases. We then developed EEPRS_optimal, a data-adaptive method that uses cross-validation to select the best embedding, yielding additional improvements. We also developed MTAG_EEPRS for multi-trait PRSs, which further improved prediction accuracy compared to single-trait PRSs and MTAG_PRS. Finally, we validated the benefits of EEPRS in the All of Us cohort for seven selected diseases. Overall, EEPRS represents a robust and interpretable framework, enhancing single-trait and multi-trait PRSs by integrating EHR embeddings.

    View details for DOI 10.1016/j.ajhg.2025.11.006

    View details for Web of Science ID 001636458400001

    View details for PubMedID 41349513

    View details for PubMedCentralID PMC12695032

  • Differential results of genetic risk scoring for multiple sclerosis in European and African American populations. Multiple sclerosis (Houndmills, Basingstoke, England) Rivier, C. A., Xu, L., Clocchiatti-Tuozzo, S., Zhao, H., Ohno-Machado, L., Hafler, D. A., Falcone, G. J., Longbrake, E. E. 2025; 31 (11): 1304-1313

    Abstract

    Genetic risk scores (GRSs) for multiple sclerosis (MS) help identify high-risk individuals and stratify populations for clinical trials, but most are derived from European populations, raising questions about GRS accuracy in other ancestries. We aimed to determine whether an MS GRS can stratify individuals of African and Latino/admixed ancestry and assess whether the JointPRS tool enhances GRS portability in African ancestry. In this cross-sectional study using All of Us (2018-2022), we derived a GRS from 232 variants for 32,428 European, 32,428 African, and 32,428 Latino/admixed participants, each divided into quintiles. The outcome was MS ascertained through International Classification of Diseases (ICD)-9/10 and Systematized Nomenclature of Medicine (SNOMED) codes. JointPRS was used to improve GRS portability in the African-ancestry group. MS prevalence was 1.0% in European, 0.56% in African, and 0.46% in Latino/admixed participants. The GRS stratified MS risk effectively in European (odds ratio (OR) = 2.30 (1.60-3.36); p-trend < 0.001) and Latino/admixed (OR = 2.53 (1.43-4.85); p-trend < 0.001) ancestry groups but did not significantly partition African participants (OR = 1.30 (0.88-1.95); p-trend = 0.17). After applying JointPRS, stratification in African ancestry improved (OR = 3.02 (1.00-8.95); p-trend = 0.007). A GRS for MS stratified European and Latino/admixed individuals but not African ancestry. Incorporating African-specific data enhanced performance, underscoring the need for more ancestry-tailored GRS.

    View details for DOI 10.1177/13524585251377607

    View details for PubMedID 40991630

    View details for PubMedCentralID PMC12633797

  • JointPRS: A data-adaptive framework for multi-population genetic risk prediction incorporating genetic correlation. Nature communications Xu, L., Zhou, G., Jiang, W., Zhang, H., Dong, Y., Guan, L., Zhao, H. 2025; 16 (1): 3841

    Abstract

    Genetic risk prediction for non-European populations is hindered by limited Genome-Wide Association Study (GWAS) sample sizes and small tuning datasets. We propose JointPRS, a data-adaptive framework that leverages genetic correlations across multiple populations using GWAS summary statistics. It achieves accurate predictions without individual-level tuning data and remains effective in the presence of a small tuning set thanks to its data-adaptive approach. Through extensive simulations and real data applications to 22 quantitative and four binary traits in five continental populations evaluated using the UK Biobank (UKBB) and All of Us (AoU), JointPRS consistently outperforms six state-of-the-art methods across three data scenarios: no tuning data, same-cohort tuning and testing, and cross-cohort tuning and testing. Notably, in the Admixed American population, JointPRS improves lipid trait prediction in AoU by 6.46%-172.00% compared to the other existing methods.

    View details for DOI 10.1038/s41467-025-59243-x

    View details for PubMedID 40268942

    View details for PubMedCentralID PMC12019179

  • Integrated longitudinal multiomics study identifies immune programs associated with acute COVID-19 severity and mortality. The Journal of clinical investigation Gygi, J. P., Maguire, C., Patel, R. K., Shinde, P., Konstorum, A., Shannon, C. P., Xu, L., Hoch, A., Jayavelu, N. D., Haddad, E. K., Reed, E. F., Kraft, M., McComsey, G. A., Metcalf, J. P., Ozonoff, A., Esserman, D., Cairns, C. B., Rouphael, N., Bosinger, S. E., Kim-Schulze, S., Krammer, F., Rosen, L. B., van Bakel, H., Wilson, M., Eckalbar, W. L., Maecker, H. T., Langelier, C. R., Steen, H., Altman, M. C., Montgomery, R. R., Levy, O., Melamed, E., Pulendran, B., Diray-Arce, J., Smolen, K. K., Fragiadakis, G. K., Becker, P. M., Sekaly, R. P., Ehrlich, L. I., Fourati, S., Peters, B., Kleinstein, S. H., Guan, L. 2024; 134 (9)

    Abstract

    BACKGROUND: Patients hospitalized for COVID-19 exhibit diverse clinical outcomes, with outcomes for some individuals diverging over time even though their initial disease severity appears similar to that of other patients. A systematic evaluation of molecular and cellular profiles over the full disease course can link immune programs and their coordination with progression heterogeneity. METHODS: We performed deep immunophenotyping and conducted longitudinal multiomics modeling, integrating 10 assays for 1,152 Immunophenotyping Assessment in a COVID-19 Cohort (IMPACC) study participants and identifying several immune cascades that were significant drivers of differential clinical outcomes. RESULTS: Increasing disease severity was driven by a temporal pattern that began with the early upregulation of immunosuppressive metabolites and then elevated levels of inflammatory cytokines, signatures of coagulation, formation of neutrophil extracellular traps, and T cell functional dysregulation. A second immune cascade, predictive of 28-day mortality among critically ill patients, was characterized by reduced total plasma Igs and B cells and dysregulated IFN responsiveness. We demonstrated that the balance disruption between IFN-stimulated genes and IFN inhibitors is a crucial biomarker of COVID-19 mortality, potentially contributing to failure of viral clearance in patients with fatal illness. CONCLUSION: Our longitudinal multiomics profiling study revealed temporal coordination across diverse omics that potentially explain the disease progression, providing insights that can inform the targeted development of therapies for patients hospitalized with COVID-19, especially those who are critically ill. TRIAL REGISTRATION: ClinicalTrials.gov NCT04378777. FUNDING: NIH (5R01AI135803-03, 5U19AI118608-04, 5U19AI128910-04, 4U19AI090023-11, 4U19AI118610-06, R01AI145835-01A1S1, 5U19AI062629-17, 5U19AI057229-17, 5U19AI125357-05, 5U19AI128913-03, 3U19AI077439-13, 5U54AI142766-03, 5R01AI104870-07, 3U19AI089992-09, 3U19AI128913-03, and 5T32DA018926-18); NIAID, NIH (3U19AI1289130, U19AI128913-04S1, and R01AI122220); and National Science Foundation (DMS2310836).

    View details for DOI 10.1172/JCI176640

    View details for PubMedID 38690733