All Publications

  • Convergence of coronary artery disease genes onto endothelial cell programs. Nature Schnitzler, G. R., Kang, H., Fang, S., Angom, R. S., Lee-Kim, V. S., Ma, X. R., Zhou, R., Zeng, T., Guo, K., Taylor, M. S., Vellarikkal, S. K., Barry, A. E., Sias-Garcia, O., Bloemendal, A., Munson, G., Guckelberger, P., Nguyen, T. H., Bergman, D. T., Hinshaw, S., Cheng, N., Cleary, B., Aragam, K., Lander, E. S., Finucane, H. K., Mukhopadhyay, D., Gupta, R. M., Engreitz, J. M. 2024


    Linking variants from genome-wide association studies (GWAS) to underlying mechanisms of disease remains a challenge1-3. For some diseases, a successful strategy has been to look for cases in which multiple GWAS loci contain genes that act in the same biological pathway1-6. However, our knowledge of which genes act in which pathways is incomplete, particularly for cell-type-specific pathways or understudied genes. Here we introduce a method to connect GWAS variants to functions. This method links variants to genes using epigenomics data, links genes to pathways de novo using Perturb-seq and integrates these data to identify convergence of GWAS loci onto pathways. We apply this approach to study the role of endothelial cells in genetic risk for coronary artery disease (CAD), and discover 43 CAD GWAS signals that converge on the cerebral cavernous malformation (CCM) signalling pathway. Two regulators of this pathway, CCM2 and TLNRD1, are each linked to a CAD risk variant, regulate other CAD risk genes and affect atheroprotective processes in endothelial cells. These results suggest a model whereby CAD risk is driven in part by the convergence of causal genes onto a particular transcriptional pathway in endothelial cells. They highlight shared genes between common and rare vascular diseases (CAD and CCM), and identify TLNRD1 as a new, previously uncharacterized member of the CCM signalling pathway. This approach will be widely useful for linking variants to functions for other common polygenic diseases.

    View details for DOI 10.1038/s41586-024-07022-x

    View details for PubMedID 38326615

    View details for PubMedCentralID 5501872

  • An encyclopedia of enhancer-gene regulatory interactions in the human genome. bioRxiv : the preprint server for biology Gschwind, A. R., Mualim, K. S., Karbalayghareh, A., Sheth, M. U., Dey, K. K., Jagoda, E., Nurtdinov, R. N., Xi, W., Tan, A. S., Jones, H., Ma, X. R., Yao, D., Nasser, J., Avsec, Ž., James, B. T., Shamim, M. S., Durand, N. C., Rao, S. S., Mahajan, R., Doughty, B. R., Andreeva, K., Ulirsch, J. C., Fan, K., Perez, E. M., Nguyen, T. C., Kelley, D. R., Finucane, H. K., Moore, J. E., Weng, Z., Kellis, M., Bassik, M. C., Price, A. L., Beer, M. A., Guigó, R., Stamatoyannopoulos, J. A., Lieberman Aiden, E., Greenleaf, W. J., Leslie, C. S., Steinmetz, L. M., Kundaje, A., Engreitz, J. M. 2023


    Identifying transcriptional enhancers and their target genes is essential for understanding gene regulation and the impact of human genetic variation on disease1-6. Here we create and evaluate a resource of >13 million enhancer-gene regulatory interactions across 352 cell types and tissues, by integrating predictive models, measurements of chromatin state and 3D contacts, and large-scale genetic perturbations generated by the ENCODE Consortium7. We first create a systematic benchmarking pipeline to compare predictive models, assembling a dataset of 10,411 element-gene pairs measured in CRISPR perturbation experiments, >30,000 fine-mapped eQTLs, and 569 fine-mapped GWAS variants linked to a likely causal gene. Using this framework, we develop a new predictive model, ENCODE-rE2G, that achieves state-of-the-art performance across multiple prediction tasks, demonstrating a strategy involving iterative perturbations and supervised machine learning to build increasingly accurate predictive models of enhancer regulation. Using the ENCODE-rE2G model, we build an encyclopedia of enhancer-gene regulatory interactions in the human genome, which reveals global properties of enhancer networks, identifies differences in the functions of genes that have more or less complex regulatory landscapes, and improves analyses to link noncoding variants to target genes and cell types for common, complex diseases. By interpreting the model, we find evidence that, beyond enhancer activity and 3D enhancer-promoter contacts, additional features guide enhancer-promoter communication including promoter class and enhancer-enhancer synergy. Altogether, these genome-wide maps of enhancer-gene regulatory interactions, benchmarking software, predictive models, and insights about enhancer function provide a valuable resource for future studies of gene regulation and human genetics.

    View details for DOI 10.1101/2023.11.09.563812

    View details for PubMedID 38014075

    View details for PubMedCentralID PMC10680627

  • Oligogenic Architecture of Rare Noncoding Variants Distinguishes 4 Congenital Heart Disease Phenotypes. Circulation. Genomic and precision medicine Yu, M., Aguirre, M., Jia, M., Gjoni, K., Cordova-Palomera, A., Munger, C., Amgalan, D., Rosa Ma, X., Pereira, A., Tcheandjieu, C., Seidman, C., Seidman, J., Tristani-Firouzi, M., Chung, W., Goldmuntz, E., Srivastava, D., Loos, R. J., Chami, N., Cordell, H., Dreßen, M., Mueller-Myhsok, B., Lahm, H., Krane, M., Pollard, K. S., Engreitz, J. M., Gagliano Taliun, S. A., Gelb, B. D., Priest, J. R. 2023: e003968


    BACKGROUND: Congenital heart disease (CHD) is highly heritable, but the power to identify inherited risk has been limited to analyses of common variants in small cohorts. METHODS: We performed reimputation of 4 CHD cohorts (n=55342) to the TOPMed reference panel (freeze 5), permitting meta-analysis of 14784017 variants including 6035962 rare variants of high imputation quality as validated by whole genome sequencing. RESULTS: Meta-analysis identified 16 novel loci, including 12 rare variants, which displayed moderate or large effect sizes (median odds ratio, 3.02) for 4 separate CHD categories. Analyses of chromatin structure link 13 of the genome-wide significant loci to key genes in cardiac development; rs373447426 (minor allele frequency, 0.003 [odds ratio, 3.37 for Conotruncal heart disease]; P=1.49×10^-8) is predicted to disrupt chromatin structure for 2 nearby genes BDH1 and DLG1 involved in Conotruncal development. A lead variant rs189203952 (minor allele frequency, 0.01 [odds ratio, 2.4 for left ventricular outflow tract obstruction]; P=1.46×10^-8) is predicted to disrupt the binding sites of 4 transcription factors known to participate in cardiac development in the promoter of SPAG9. A tissue-specific model of chromatin conformation suggests that common variant rs78256848 (minor allele frequency, 0.11 [odds ratio, 1.4 for Conotruncal heart disease]; P=2.6×10^-8) physically interacts with NCAM1 (P_FDR=1.86×10^-27), a neural adhesion molecule acting in cardiac development. Importantly, while each individual malformation displayed substantial heritability (observed h^2 ranging from 0.26 for complex malformations to 0.37 for left ventricular outflow tract obstructive disease) the risk for different CHD malformations appeared to be separate, without genetic correlation measured by linkage disequilibrium score regression or regional colocalization. CONCLUSIONS: We describe a set of rare noncoding variants conferring significant risk for individual heart malformations which are linked to genes governing cardiac development. These results illustrate that the oligogenic basis of CHD and significant heritability may be linked to rare variants outside protein-coding regions conferring substantial risk for individual categories of cardiac malformation.

    View details for DOI 10.1161/CIRCGEN.122.003968

    View details for PubMedID 37026454

  • Aberrant phase separation is a common killing strategy of positively charged peptides in biology and human disease. bioRxiv : the preprint server for biology Boeynaems, S., Ma, X. R., Yeong, V., Ginell, G. M., Chen, J. H., Blum, J. A., Nakayama, L., Sanyal, A., Briner, A., Haver, D. V., Pauwels, J., Ekman, A., Schmidt, H. B., Sundararajan, K., Porta, L., Lasker, K., Larabell, C., Hayashi, M. A., Kundaje, A., Impens, F., Obermeyer, A., Holehouse, A. S., Gitler, A. D. 2023


    Positively charged repeat peptides are emerging as key players in neurodegenerative diseases. These peptides can perturb diverse cellular pathways but a unifying framework for how such promiscuous toxicity arises has remained elusive. We used mass-spectrometry-based proteomics to define the protein targets of these neurotoxic peptides and found that they all share similar sequence features that drive their aberrant condensation with these positively charged peptides. We trained a machine learning algorithm to detect such sequence features and unexpectedly discovered that this mode of toxicity is not limited to human repeat expansion disorders but has evolved countless times across the tree of life in the form of cationic antimicrobial and venom peptides. We demonstrate that an excess in positive charge is necessary and sufficient for this killer activity, which we name 'polycation poisoning'. These findings reveal an ancient and conserved mechanism and inform ways to leverage its design rules for new generations of bioactive peptides.

    View details for DOI 10.1101/2023.03.09.531820

    View details for PubMedID 36945394

    View details for PubMedCentralID PMC10028949

  • TDP-43 represses cryptic exon inclusion in the FTD-ALS gene UNC13A. Nature Ma, X. R., Prudencio, M., Koike, Y., Vatsavayai, S. C., Kim, G., Harbinski, F., Briner, A., Rodriguez, C. M., Guo, C., Akiyama, T., Schmidt, H. B., Cummings, B. B., Wyatt, D. W., Kurylo, K., Miller, G., Mekhoubad, S., Sallee, N., Mekonnen, G., Ganser, L., Rubien, J. D., Jansen-West, K., Cook, C. N., Pickles, S., Oskarsson, B., Graff-Radford, N. R., Boeve, B. F., Knopman, D. S., Petersen, R. C., Dickson, D. W., Shorter, J., Myong, S., Green, E. M., Seeley, W. W., Petrucelli, L., Gitler, A. D. 2022


    A hallmark pathological feature of the neurodegenerative diseases amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) is the depletion of RNA-binding protein TDP-43 from the nucleus of neurons in the brain and spinal cord1. A major function of TDP-43 is as a repressor of cryptic exon inclusion during RNA splicing2-4. Single nucleotide polymorphisms in UNC13A are among the strongest hits associated with FTD and ALS in human genome-wide association studies5,6, but how those variants increase risk for disease is unknown. Here we show that TDP-43 represses a cryptic exon-splicing event in UNC13A. Loss of TDP-43 from the nucleus in human brain, neuronal cell lines and motor neurons derived from induced pluripotent stem cells resulted in the inclusion of a cryptic exon in UNC13A mRNA and reduced UNC13A protein expression. The top variants associated with FTD or ALS risk in humans are located in the intron harbouring the cryptic exon, and we show that they increase UNC13A cryptic exon splicing in the face of TDP-43 dysfunction. Together, our data provide a direct functional link between one of the strongest genetic risk factors for FTD and ALS (UNC13A genetic variants), and loss of TDP-43 function.

    View details for DOI 10.1038/s41586-022-04424-7

    View details for PubMedID 35197626

  • ALS Genetics: Gains, Losses, and Implications for Future Therapies. Neuron Kim, G., Gautier, O., Tassoni-Tsuchida, E., Ma, X. R., Gitler, A. D. 2020


    Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disorder caused by the loss of motor neurons from the brain and spinal cord. The ALS community has made remarkable strides over three decades by identifying novel familial mutations, generating animal models, elucidating molecular mechanisms, and ultimately developing promising new therapeutic approaches. Some of these approaches reduce the expression of mutant genes and are in human clinical trials, highlighting the need to carefully consider the normal functions of these genes and potential contribution of gene loss-of-function to ALS. Here, we highlight known loss-of-function mechanisms underlying ALS, potential consequences of lowering levels of gene products, and the need to consider both gain and loss of function to develop safe and effective therapeutic strategies.

    View details for DOI 10.1016/j.neuron.2020.08.022

    View details for PubMedID 32931756