All Publications

  • Prioritizing disease-related rare variants by integrating gene expression data. bioRxiv: the preprint server for biology. Guo, H., Urban, A. E., Wong, W. H. 2024


    Rare variants, which make up the vast majority of human genetic variation, are likely to have more deleterious effects on human disease than common variants. Here we present the carrier statistic, a statistical framework to prioritize disease-related rare variants by integrating gene expression data. By quantifying the impact of rare variants on gene expression, the carrier statistic can prioritize those rare variants that have large functional consequences in diseased patients. Through simulation studies and analysis of a real multi-omics dataset, we demonstrate that the carrier statistic is applicable in studies with limited sample sizes (a few hundred) and achieves substantially higher sensitivity than existing rare variant association methods. Application to Alzheimer's disease reveals 16 rare variants within 15 genes with extreme carrier statistics. The carrier statistic method can be applied to various rare variant types and is adaptable to other omics data modalities, offering a powerful tool for investigating the molecular mechanisms underlying complex diseases.

    DOI: 10.1101/2024.03.19.585836

    PubMed ID: 38562756

  • Quantifying portable genetic effects and improving cross-ancestry genetic prediction with GWAS summary statistics. Nature Communications. Miao, J., Guo, H., Song, G., Zhao, Z., Hou, L., Lu, Q. 2023; 14 (1): 832


    Polygenic risk scores (PRS) calculated from genome-wide association studies (GWAS) of Europeans are known to have substantially reduced predictive accuracy in non-European populations, limiting their clinical utility and raising concerns about health disparities across ancestral populations. Here, we introduce a statistical framework named X-Wing to improve predictive performance in ancestrally diverse populations. X-Wing quantifies local genetic correlations for complex traits between populations, employs an annotation-dependent estimation procedure to amplify correlated genetic effects between populations, and combines multiple population-specific PRS into a unified score with GWAS summary statistics alone as input. Through extensive benchmarking, we demonstrate that X-Wing pinpoints portable genetic effects and substantially improves PRS performance in non-European populations, showing 14.1%-119.1% relative gain in predictive R² compared to state-of-the-art methods based on GWAS summary statistics. Overall, X-Wing addresses critical limitations in existing approaches and may have broad applications in cross-population polygenic risk prediction.

    DOI: 10.1038/s41467-023-36544-7

    PubMed ID: 36788230

  • Quantifying concordant genetic effects of de novo mutations on multiple disorders. eLife. Guo, H., Hou, L., Shi, Y., Jin, S., Zeng, X., Li, B., Lifton, R. P., Brueckner, M., Zhao, H., Lu, Q. 2022; 11


    Exome sequencing on tens of thousands of parent-proband trios has identified numerous deleterious de novo mutations (DNMs) and implicated risk genes for many disorders. Recent studies have suggested shared genes and pathways are enriched for DNMs across multiple disorders. However, existing analytic strategies only focus on genes that reach statistical significance for multiple disorders and require large trio samples in each study. As a result, these methods are not able to characterize the full landscape of genetic sharing due to polygenicity and incomplete penetrance. In this work, we introduce EncoreDNM, a novel statistical framework to quantify shared genetic effects between two disorders characterized by concordant enrichment of DNMs in the exome. EncoreDNM makes use of exome-wide, summary-level DNM data, including genes that do not reach statistical significance in single-disorder analysis, to evaluate the overall and annotation-partitioned genetic sharing between two disorders. Applying EncoreDNM to DNM data of nine disorders, we identified abundant pairwise enrichment correlations, especially in genes intolerant to pathogenic mutations and genes highly expressed in fetal tissues. These results suggest that EncoreDNM improves current analytic approaches and may have broad applications in DNM studies.

    DOI: 10.7554/eLife.75551

    Web of Science ID: 000867699200001

    PubMed ID: 35666111

    PubMed Central ID: PMC9217133

  • Minimal sigma-field for flexible sufficient dimension reduction. Electronic Journal of Statistics. Guo, H., Hou, L., Zhu, Y. 2022; 16 (1): 1997-2032

    DOI: 10.1214/22-EJS1999

    Web of Science ID: 000825293500038

  • Detecting local genetic correlations with scan statistics. Nature Communications. Guo, H., Li, J. J., Lu, Q., Hou, L. 2021; 12 (1): 2033


    Genetic correlation analysis has quickly gained popularity in the past few years and provided insights into the genetic etiology of numerous complex diseases. However, existing approaches oversimplify the shared genetic architecture between different phenotypes and cannot effectively identify precise genetic regions contributing to the genetic correlation. In this work, we introduce LOGODetect, a powerful and efficient statistical method to identify small genome segments harboring local genetic correlation signals. LOGODetect automatically identifies genetic regions showing consistent associations with multiple phenotypes through a scan statistic approach. It uses summary association statistics from genome-wide association studies (GWAS) as input and is robust to sample overlap between studies. Applying LOGODetect to seven phenotypically distinct but genetically correlated neuropsychiatric traits, we identify 227 non-overlapping genome regions associated with multiple traits, including multiple hub regions showing concordant effects on five or more traits. Our method addresses critical limitations in existing analytic strategies and may have wide applications in post-GWAS analysis.

    DOI: 10.1038/s41467-021-22334-6

    Web of Science ID: 000636772600020

    PubMed ID: 33795679

    PubMed Central ID: PMC8016883