All Publications


  • Molecular convergence of risk variants for congenital heart defects leveraging a regulatory map of the human fetal heart. medRxiv : the preprint server for health sciences Ma, X. R., Conley, S. D., Kosicki, M., Bredikhin, D., Cui, R., Tran, S., Sheth, M. U., Qiu, W. L., Chen, S., Kundu, S., Kang, H. Y., Amgalan, D., Munger, C. J., Duan, L., Dang, K., Rubio, O. M., Kany, S., Zamirpour, S., DePaolo, J., Padmanabhan, A., Olgin, J., Damrauer, S., Andersson, R., Gu, M., Priest, J. R., Quertermous, T., Qiu, X., Rabinovitch, M., Visel, A., Pennacchio, L., Kundaje, A., Glass, I. A., Gifford, C. A., Pirruccello, J. P., Goodyer, W. R., Engreitz, J. M. 2024

    Abstract

    Congenital heart defects (CHD) arise in part due to inherited genetic variants that alter genes and noncoding regulatory elements in the human genome. These variants are thought to act during fetal development to influence the formation of different heart structures. However, identifying the genes, pathways, and cell types that mediate these effects has been challenging due to the immense diversity of cell types involved in heart development as well as the superimposed complexities of interpreting noncoding sequences. As such, understanding the molecular functions of both noncoding and coding variants remains paramount to our fundamental understanding of cardiac development and CHD. Here, we created a gene regulation map of the healthy human fetal heart across developmental time, and applied it to interpret the functions of variants associated with CHD and quantitative cardiac traits. We collected single-cell multiomic data from 734,000 single cells sampled from 41 fetal hearts spanning post-conception weeks 6 to 22, enabling the construction of gene regulation maps in 90 cardiac cell types and states, including rare populations of cardiac conduction cells. Through an unbiased analysis of all 90 cell types, we find that both rare coding variants associated with CHD and common noncoding variants associated with valve traits converge to affect valvular interstitial cells (VICs). VICs are enriched for high expression of known CHD genes previously identified through mapping of rare coding variants. Eight CHD genes, as well as other genes in similar molecular pathways, are linked to common noncoding variants associated with other valve diseases or traits via enhancers in VICs. In addition, certain common noncoding variants impact enhancers with activities highly specific to particular subanatomic structures in the heart, illuminating how such variants can impact specific aspects of heart structure and function. 
    Together, these results implicate new enhancers, genes, and cell types in the genetic etiology of CHD, identify molecular convergence of common noncoding and rare coding variants on VICs, and suggest a more expansive view of the cell types instrumental in genetic risk for CHD, beyond the working cardiomyocyte. This regulatory map of the human fetal heart will provide a foundational resource for understanding cardiac development, interpreting genetic variants associated with heart disease, and discovering targets for cell-type-specific therapies.

    View details for DOI 10.1101/2024.11.20.24317557

    View details for PubMedID 39606363

    View details for PubMedCentralID PMC11601760

  • TDP-43 represses cryptic exon inclusion in the FTD-ALS gene UNC13A. Nature Ma, X. R., Prudencio, M., Koike, Y., Vatsavayai, S. C., Kim, G., Harbinski, F., Briner, A., Rodriguez, C. M., Guo, C., Akiyama, T., Schmidt, H. B., Cummings, B. B., Wyatt, D. W., Kurylo, K., Miller, G., Mekhoubad, S., Sallee, N., Mekonnen, G., Ganser, L., Rubien, J. D., Jansen-West, K., Cook, C. N., Pickles, S., Oskarsson, B., Graff-Radford, N. R., Boeve, B. F., Knopman, D. S., Petersen, R. C., Dickson, D. W., Shorter, J., Myong, S., Green, E. M., Seeley, W. W., Petrucelli, L., Gitler, A. D. 2022

    Abstract

    A hallmark pathological feature of the neurodegenerative diseases amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) is the depletion of RNA-binding protein TDP-43 from the nucleus of neurons in the brain and spinal cord¹. A major function of TDP-43 is as a repressor of cryptic exon inclusion during RNA splicing²⁻⁴. Single nucleotide polymorphisms in UNC13A are among the strongest hits associated with FTD and ALS in human genome-wide association studies⁵,⁶, but how those variants increase risk for disease is unknown. Here we show that TDP-43 represses a cryptic exon-splicing event in UNC13A. Loss of TDP-43 from the nucleus in human brain, neuronal cell lines and motor neurons derived from induced pluripotent stem cells resulted in the inclusion of a cryptic exon in UNC13A mRNA and reduced UNC13A protein expression. The top variants associated with FTD or ALS risk in humans are located in the intron harbouring the cryptic exon, and we show that they increase UNC13A cryptic exon splicing in the face of TDP-43 dysfunction. Together, our data provide a direct functional link between one of the strongest genetic risk factors for FTD and ALS (UNC13A genetic variants), and loss of TDP-43 function.

    View details for DOI 10.1038/s41586-022-04424-7

    View details for PubMedID 35197626

  • Endothelial cell-related genetic variants identify LDL cholesterol-sensitive individuals who derive greater benefit from aggressive lipid lowering. Nature medicine Marston, N. A., Kamanu, F. K., Melloni, G. E., Schnitzler, G., Hakim, A., Ma, R. X., Kang, H., Chasman, D. I., Giugliano, R. P., Ellinor, P. T., Ridker, P. M., Engreitz, J. M., Sabatine, M. S., Ruff, C. T., Gupta, R. M. 2025

    Abstract

    The role of endothelial cell (EC) dysfunction in contributing to an individual's susceptibility to coronary atherosclerosis and how low-density lipoprotein cholesterol (LDL-C) concentrations might modify this relationship have not been previously studied. Here, from an examination of genome-wide significant single nucleotide polymorphisms associated with coronary artery disease (CAD), we identified variants with effects on EC function and constructed a 35 single nucleotide polymorphism polygenic risk score comprising these EC-specific variants (EC PRS). The association of the EC PRS with the risk of incident cardiovascular disease was tested in 3 cohorts: a primary prevention population in the UK Biobank (UKBB; n = 348,967); a primary prevention cohort from a trial that tested a statin (JUPITER, n = 8,749); and a secondary prevention cohort that tested a PCSK9 inhibitor (FOURIER, n = 14,298). In the UKBB, the EC PRS was independently associated with the risk of incident CAD (adjusted hazard ratio (aHR) per 1 s.d. of 1.24 (95% CI 1.21-1.26), P < 2 × 10⁻¹⁶). Moreover, LDL-C concentration significantly modified this risk: the aHR per 1 s.d. was 1.26 (1.22-1.30) when LDL-C was 150 mg dl⁻¹ but 1.00 (0.85-1.16) when LDL-C was 50 mg dl⁻¹ (P_interaction = 0.004). The clinical benefit of LDL-C lowering was significantly greater in individuals with a high EC PRS than in individuals with low or intermediate EC PRS, with relative risk reductions of 68% (HR 0.32 (0.18-0.59)) versus 29% (HR 0.71 (0.52-0.95)) in the primary prevention cohort (P_interaction = 0.02) and 33% (HR 0.67 (0.53-0.83)) versus 8% (HR 0.92 (0.82-1.03)) in the secondary prevention cohort (P_interaction = 0.01). We conclude that EC PRS quantifies an independent axis of CAD risk that is not currently captured in medical practice and identifies individuals who are more sensitive to the atherogenic effects of LDL-C and who would potentially derive substantially greater benefit from aggressive LDL-C lowering.

    View details for DOI 10.1038/s41591-025-03533-w

    View details for PubMedID 40011692

    View details for PubMedCentralID 7755038

  • Mapping enhancer-gene regulatory interactions from single-cell data. bioRxiv : the preprint server for biology Sheth, M. U., Qiu, W., Rosa Ma, X., Gschwind, A. R., Jagoda, E., Tan, A. S., Einarsson, H., Gorissen, B. L., Dubocanin, D., McGinnis, C. S., Amgalan, D., Satpathy, A. T., Jones, T. R., Steinmetz, L. M., Kundaje, A., Ustun, B., Engreitz, J. M., Andersson, R. 2024

    Abstract

    Mapping enhancers and their target genes in specific cell types is crucial for understanding gene regulation and human disease genetics. However, accurately predicting enhancer-gene regulatory interactions from single-cell datasets has been challenging. Here, we introduce a new family of classification models, scE2G, to predict enhancer-gene regulation. These models use features from single-cell ATAC-seq or multiomic RNA and ATAC-seq data and are trained on a CRISPR perturbation dataset including >10,000 evaluated element-gene pairs. We benchmark scE2G models against CRISPR perturbations, fine-mapped eQTLs, and GWAS variant-gene associations and demonstrate state-of-the-art performance at prediction tasks across multiple cell types and categories of perturbations. We apply scE2G to build maps of enhancer-gene regulatory interactions in heterogeneous tissues and interpret noncoding variants associated with complex traits, nominating regulatory interactions linking INPP4B and IL15 to lymphocyte counts. The scE2G models will enable accurate mapping of enhancer-gene regulatory interactions across thousands of diverse human cell types.

    View details for DOI 10.1101/2024.11.23.624931

    View details for PubMedID 39605382

  • CRISPRi-Perturb-seq in endothelial cells links genetic variation in endothelin-1 to risk of coronary artery disease and hypertension. Clinical Science Gupta, R., Schnitzler, G., Lee-Kim, V., Fang, S., Kang, H., Ma, R., Finucane, H., Engreitz, J. 2024; 138
  • CRISPRi-Perturb-seq in endothelial cells links genetic variation in endothelin-1 to risk of coronary artery disease and hypertension. Gupta, R., Schnitzler, G., Lee-Kim, V., Fang, S., Kang, H., Ma, R., Finucane, H., Engreitz, J. Portland Press Ltd. 2024: A55-A56
  • Convergence of coronary artery disease genes onto endothelial cell programs. Nature Schnitzler, G. R., Kang, H., Fang, S., Angom, R. S., Lee-Kim, V. S., Ma, X. R., Zhou, R., Zeng, T., Guo, K., Taylor, M. S., Vellarikkal, S. K., Barry, A. E., Sias-Garcia, O., Bloemendal, A., Munson, G., Guckelberger, P., Nguyen, T. H., Bergman, D. T., Hinshaw, S., Cheng, N., Cleary, B., Aragam, K., Lander, E. S., Finucane, H. K., Mukhopadhyay, D., Gupta, R. M., Engreitz, J. M. 2024

    Abstract

    Linking variants from genome-wide association studies (GWAS) to underlying mechanisms of disease remains a challenge¹⁻³. For some diseases, a successful strategy has been to look for cases in which multiple GWAS loci contain genes that act in the same biological pathway¹⁻⁶. However, our knowledge of which genes act in which pathways is incomplete, particularly for cell-type-specific pathways or understudied genes. Here we introduce a method to connect GWAS variants to functions. This method links variants to genes using epigenomics data, links genes to pathways de novo using Perturb-seq and integrates these data to identify convergence of GWAS loci onto pathways. We apply this approach to study the role of endothelial cells in genetic risk for coronary artery disease (CAD), and discover 43 CAD GWAS signals that converge on the cerebral cavernous malformation (CCM) signalling pathway. Two regulators of this pathway, CCM2 and TLNRD1, are each linked to a CAD risk variant, regulate other CAD risk genes and affect atheroprotective processes in endothelial cells. These results suggest a model whereby CAD risk is driven in part by the convergence of causal genes onto a particular transcriptional pathway in endothelial cells. They highlight shared genes between common and rare vascular diseases (CAD and CCM), and identify TLNRD1 as a new, previously uncharacterized member of the CCM signalling pathway. This approach will be widely useful for linking variants to functions for other common polygenic diseases.

    View details for DOI 10.1038/s41586-024-07022-x

    View details for PubMedID 38326615

    View details for PubMedCentralID 5501872

  • An encyclopedia of enhancer-gene regulatory interactions in the human genome. bioRxiv : the preprint server for biology Gschwind, A. R., Mualim, K. S., Karbalayghareh, A., Sheth, M. U., Dey, K. K., Jagoda, E., Nurtdinov, R. N., Xi, W., Tan, A. S., Jones, H., Ma, X. R., Yao, D., Nasser, J., Avsec, Ž., James, B. T., Shamim, M. S., Durand, N. C., Rao, S. S., Mahajan, R., Doughty, B. R., Andreeva, K., Ulirsch, J. C., Fan, K., Perez, E. M., Nguyen, T. C., Kelley, D. R., Finucane, H. K., Moore, J. E., Weng, Z., Kellis, M., Bassik, M. C., Price, A. L., Beer, M. A., Guigó, R., Stamatoyannopoulos, J. A., Lieberman Aiden, E., Greenleaf, W. J., Leslie, C. S., Steinmetz, L. M., Kundaje, A., Engreitz, J. M. 2023

    Abstract

    Identifying transcriptional enhancers and their target genes is essential for understanding gene regulation and the impact of human genetic variation on disease¹⁻⁶. Here we create and evaluate a resource of >13 million enhancer-gene regulatory interactions across 352 cell types and tissues, by integrating predictive models, measurements of chromatin state and 3D contacts, and large-scale genetic perturbations generated by the ENCODE Consortium⁷. We first create a systematic benchmarking pipeline to compare predictive models, assembling a dataset of 10,411 element-gene pairs measured in CRISPR perturbation experiments, >30,000 fine-mapped eQTLs, and 569 fine-mapped GWAS variants linked to a likely causal gene. Using this framework, we develop a new predictive model, ENCODE-rE2G, that achieves state-of-the-art performance across multiple prediction tasks, demonstrating a strategy involving iterative perturbations and supervised machine learning to build increasingly accurate predictive models of enhancer regulation. Using the ENCODE-rE2G model, we build an encyclopedia of enhancer-gene regulatory interactions in the human genome, which reveals global properties of enhancer networks, identifies differences in the functions of genes that have more or less complex regulatory landscapes, and improves analyses to link noncoding variants to target genes and cell types for common, complex diseases. By interpreting the model, we find evidence that, beyond enhancer activity and 3D enhancer-promoter contacts, additional features guide enhancer-promoter communication including promoter class and enhancer-enhancer synergy. Altogether, these genome-wide maps of enhancer-gene regulatory interactions, benchmarking software, predictive models, and insights about enhancer function provide a valuable resource for future studies of gene regulation and human genetics.

    View details for DOI 10.1101/2023.11.09.563812

    View details for PubMedID 38014075

    View details for PubMedCentralID PMC10680627

  • Oligogenic Architecture of Rare Noncoding Variants Distinguishes 4 Congenital Heart Disease Phenotypes. Circulation. Genomic and precision medicine Yu, M., Aguirre, M., Jia, M., Gjoni, K., Cordova-Palomera, A., Munger, C., Amgalan, D., Rosa Ma, X., Pereira, A., Tcheandjieu, C., Seidman, C., Seidman, J., Tristani-Firouzi, M., Chung, W., Goldmuntz, E., Srivastava, D., Loos, R. J., Chami, N., Cordell, H., Dreßen, M., Mueller-Myhsok, B., Lahm, H., Krane, M., Pollard, K. S., Engreitz, J. M., Gagliano Taliun, S. A., Gelb, B. D., Priest, J. R. 2023: e003968

    Abstract

    BACKGROUND: Congenital heart disease (CHD) is highly heritable, but the power to identify inherited risk has been limited to analyses of common variants in small cohorts. METHODS: We performed reimputation of 4 CHD cohorts (n=55,342) to the TOPMed reference panel (freeze 5), permitting meta-analysis of 14,784,017 variants including 6,035,962 rare variants of high imputation quality as validated by whole genome sequencing. RESULTS: Meta-analysis identified 16 novel loci, including 12 rare variants, which displayed moderate or large effect sizes (median odds ratio, 3.02) for 4 separate CHD categories. Analyses of chromatin structure link 13 of the genome-wide significant loci to key genes in cardiac development; rs373447426 (minor allele frequency, 0.003 [odds ratio, 3.37 for conotruncal heart disease]; P=1.49×10⁻⁸) is predicted to disrupt chromatin structure for 2 nearby genes, BDH1 and DLG1, involved in conotruncal development. A lead variant rs189203952 (minor allele frequency, 0.01 [odds ratio, 2.4 for left ventricular outflow tract obstruction]; P=1.46×10⁻⁸) is predicted to disrupt the binding sites of 4 transcription factors known to participate in cardiac development in the promoter of SPAG9. A tissue-specific model of chromatin conformation suggests that common variant rs78256848 (minor allele frequency, 0.11 [odds ratio, 1.4 for conotruncal heart disease]; P=2.6×10⁻⁸) physically interacts with NCAM1 (P_FDR=1.86×10⁻²⁷), a neural adhesion molecule acting in cardiac development. Importantly, while each individual malformation displayed substantial heritability (observed h² ranging from 0.26 for complex malformations to 0.37 for left ventricular outflow tract obstructive disease), the risk for different CHD malformations appeared to be separate, without genetic correlation measured by linkage disequilibrium score regression or regional colocalization. CONCLUSIONS: We describe a set of rare noncoding variants conferring significant risk for individual heart malformations which are linked to genes governing cardiac development. These results illustrate that the oligogenic basis of CHD and significant heritability may be linked to rare variants outside protein-coding regions conferring substantial risk for individual categories of cardiac malformation.

    View details for DOI 10.1161/CIRCGEN.122.003968

    View details for PubMedID 37026454

  • Aberrant phase separation is a common killing strategy of positively charged peptides in biology and human disease. bioRxiv : the preprint server for biology Boeynaems, S., Ma, X. R., Yeong, V., Ginell, G. M., Chen, J. H., Blum, J. A., Nakayama, L., Sanyal, A., Briner, A., Haver, D. V., Pauwels, J., Ekman, A., Schmidt, H. B., Sundararajan, K., Porta, L., Lasker, K., Larabell, C., Hayashi, M. A., Kundaje, A., Impens, F., Obermeyer, A., Holehouse, A. S., Gitler, A. D. 2023

    Abstract

    Positively charged repeat peptides are emerging as key players in neurodegenerative diseases. These peptides can perturb diverse cellular pathways but a unifying framework for how such promiscuous toxicity arises has remained elusive. We used mass-spectrometry-based proteomics to define the protein targets of these neurotoxic peptides and found that they all share similar sequence features that drive their aberrant condensation with these positively charged peptides. We trained a machine learning algorithm to detect such sequence features and unexpectedly discovered that this mode of toxicity is not limited to human repeat expansion disorders but has evolved countless times across the tree of life in the form of cationic antimicrobial and venom peptides. We demonstrate that an excess in positive charge is necessary and sufficient for this killer activity, which we name 'polycation poisoning'. These findings reveal an ancient and conserved mechanism and inform ways to leverage its design rules for new generations of bioactive peptides.

    View details for DOI 10.1101/2023.03.09.531820

    View details for PubMedID 36945394

    View details for PubMedCentralID PMC10028949

  • ALS Genetics: Gains, Losses, and Implications for Future Therapies. Neuron Kim, G., Gautier, O., Tassoni-Tsuchida, E., Ma, X. R., Gitler, A. D. 2020

    Abstract

    Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disorder caused by the loss of motor neurons from the brain and spinal cord. The ALS community has made remarkable strides over three decades by identifying novel familial mutations, generating animal models, elucidating molecular mechanisms, and ultimately developing promising new therapeutic approaches. Some of these approaches reduce the expression of mutant genes and are in human clinical trials, highlighting the need to carefully consider the normal functions of these genes and potential contribution of gene loss-of-function to ALS. Here, we highlight known loss-of-function mechanisms underlying ALS, potential consequences of lowering levels of gene products, and the need to consider both gain and loss of function to develop safe and effective therapeutic strategies.

    View details for DOI 10.1016/j.neuron.2020.08.022

    View details for PubMedID 32931756