Jesse Engreitz's Profile | Stanford Profiles

Bio

Jesse is currently an Assistant Professor at Stanford University in the Department of Genetics and the Children’s Heart Center Basic Sciences and Engineering (BASE) Initiative, and is a recipient of the NHGRI Genomic Innovator Award. He co-leads a Functional Characterization Center at Stanford for the Impact of Genomic Variation on Function (IGVF) Consortium, and is an Associate Director of the Novo Nordisk Foundation Center for Genomic Mechanisms of Disease at the Broad Institute.

Previously, Jesse was a Junior Fellow at the Harvard Society of Fellows and led a research group at the Broad Institute of MIT and Harvard. During his postdoctoral fellowship at the Broad Institute, Jesse developed large-scale CRISPR tools to map enhancer-gene regulation with Eric Lander and Nir Hacohen, and launched the Variants-to-Function (V2F) Initiative to connect genetic disease variants to their molecular and cellular functions. Jesse previously attended Stanford University, where he developed computational algorithms for analyzing gene expression with Russ Altman, and completed his PhD in the Harvard-MIT Division of Health Sciences and Technology, where he studied genome regulation by long noncoding RNAs with Eric Lander and Mitch Guttman. His research has been supported by the National Human Genome Research Institute, National Heart, Lung, and Blood Institute, Additional Ventures, Foundation for the National Institutes of Health, Harvard Society of Fellows, Fannie and John Hertz Foundation, and Department of Defense. Outside the lab, Jesse enjoys playing jazz/rock/funk, testing Chinese recipes, and surfing.

Academic Appointments

Assistant Professor, Genetics
Member, Bio-X
Member, Cardiovascular Institute
Member, Maternal & Child Health Research Institute (MCHRI)

Program Affiliations

Betty Irene Moore Children's Heart Center

Professional Education

PhD, MIT, Harvard-MIT Division of Health Sciences and Technology (2016)
MS, Stanford University, Bioengineering (2010)
BS, Stanford University, Biomedical Computation (2010)

Current Research and Scholarly Interests

We are mapping the regulatory wiring of the genome to understand the genetic basis of heart diseases.

The human genome encodes 2 million enhancers, which act in combination to regulate nearby genes. Each of the thousands of cell types in the human body has its own precise wiring that is difficult to resolve. Enhancers contain tens of thousands of DNA variants that affect human diseases — and therefore hold the key to understanding molecular mechanisms that control genetic risk for disease. We aim to map enhancer-gene connections in every cell type in the human body to connect disease variants to the genes, cell types, and pathways they control.

Our approach:
• We invent new single-cell methods combining genomics, biochemistry, and molecular biology.
• We dissect molecular mechanisms of enhancer-gene communication.
• We build computational models to map genome regulation.
• We connect human genetic variants to biological mechanisms of disease by applying these tools in cellular and animal models.

2025-26 Courses

Current Issues in Genetics
GENE 219 (Aut, Win, Spr, Sum)
Genetics and Developmental Biology Training Camp
DBIO 200, GENE 200 (Aut)
Independent Studies (6)
- Directed Investigation
  BIOE 392 (Aut, Win, Spr, Sum)
- Directed Reading in Genetics
  GENE 299 (Aut, Win, Spr, Sum)
- Directed Study
  BIOE 391 (Aut, Win, Spr, Sum)
- Graduate Research
  GENE 399 (Aut, Win, Spr, Sum)
- Supervised Study
  GENE 260 (Aut, Win, Spr, Sum)
- Undergraduate Research
  GENE 199 (Aut, Win, Spr, Sum)
Prior Year Courses
2024-25 Courses
- Current Issues in Genetics
  GENE 219 (Aut, Win, Spr, Sum)
- Genetics and Developmental Biology Training Camp
  DBIO 200, GENE 200 (Aut)
2023-24 Courses
- Current Issues in Genetics
  GENE 219 (Aut, Win, Spr, Sum)
- Genetics and Developmental Biology Training Camp
  DBIO 200, GENE 200 (Aut)
2022-23 Courses
- Biology and Applications of CRISPR/Cas9: Genome Editing and Epigenome Modifications
  BIOS 268, GENE 268 (Spr)
- Current Issues in Genetics
  GENE 219 (Spr)

Stanford Advisees

Doctoral Dissertation Reader (AC)
Julia Bauman, Benjamin Doughty, Danilo Dubocanin, Gina Duronio, Simon Gaudin, Tami Gjorgjieva, Julie Lake, Kate Lawrence, Thao Nguyen, Valeh Valiollah Pour Amiri
Postdoctoral Faculty Sponsor
Danila Bredikhin, Gal Keshet, Olga Pushkarev
Doctoral Dissertation Advisor (AC)
Yannick Lee-Yow, Maya Sheth, Ronghao Zhou
Doctoral Dissertation Co-Advisor (AC)
Shawn Cai, Tony Zeng
Doctoral (Program)
Jason Tan

Graduate and Fellowship Programs

Genetics (PhD Program)

All Publications

Compatibility rules of human enhancer and promoter sequences. Nature Bergman, D. T., Jones, T. R., Liu, V., Ray, J., Jagoda, E., Siraj, L., Kang, H. Y., Nasser, J., Kane, M., Rios, A., Nguyen, T. H., Grossman, S. R., Fulco, C. P., Lander, E. S., Engreitz, J. M. 2022

Abstract

Gene regulation in the human genome is controlled by distal enhancers that activate specific nearby promoters1. One model for this specificity is that promoters might have sequence-encoded preferences for certain enhancers, for example mediated by interacting sets of transcription factors or cofactors2. This "biochemical compatibility" model has been supported by observations at individual human promoters and by genome-wide measurements in Drosophila3-9. However, the degree to which human enhancers and promoters are intrinsically compatible has not been systematically measured, and how their activities combine to control RNA expression remains unclear. Here we designed a high-throughput reporter assay called ExP STARR-seq (enhancer x promoter self-transcribing active regulatory region sequencing) and applied it to examine the combinatorial compatibilities of 1,000 enhancer and 1,000 promoter sequences in human K562 cells. We identify simple rules for enhancer-promoter compatibility: most enhancers activated all promoters by similar amounts, and intrinsic enhancer and promoter activities combine multiplicatively to determine RNA output (R² = 0.82). In addition, two classes of enhancers and promoters showed subtle preferential effects. Promoters of housekeeping genes contained built-in activating motifs for factors such as GABPA and YY1, which decreased the responsiveness of promoters to distal enhancers. Promoters of variably expressed genes lacked these motifs and showed stronger responsiveness to enhancers. Together, this systematic assessment of enhancer-promoter compatibility suggests a multiplicative model tuned by enhancer and promoter class to control gene transcription in the human genome.
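
As an illustration of what "combine multiplicatively" means in the abstract above, here is a minimal sketch, not the authors' analysis code. It assumes RNA output for enhancer i and promoter j is approximately enh[i] * prom[j], which becomes an additive model in log space that can be fit from row and column means. The function names and the toy matrix are hypothetical.

```python
import math

def fit_multiplicative(expr):
    """Fit expr[i][j] ~ enh[i] * prom[j] using an additive fit in log space."""
    logs = [[math.log(v) for v in row] for row in expr]
    n_e, n_p = len(logs), len(logs[0])
    grand = sum(sum(r) for r in logs) / (n_e * n_p)
    row_mean = [sum(r) / n_p for r in logs]
    col_mean = [sum(logs[i][j] for i in range(n_e)) / n_e for j in range(n_p)]
    # Fitted log value is row_mean[i] + col_mean[j] - grand; the grand mean
    # is folded into the enhancer term so that enh[i] * prom[j] reproduces it.
    enh = [math.exp(m - grand) for m in row_mean]
    prom = [math.exp(m) for m in col_mean]
    return enh, prom

def r_squared_log(expr, enh, prom):
    """Coefficient of determination between observed and fitted log outputs."""
    obs = [math.log(v) for row in expr for v in row]
    pred = [math.log(e * p) for e in enh for p in prom]
    mean = sum(obs) / len(obs)
    ss_res = sum((o - q) ** 2 for o, q in zip(obs, pred))
    ss_tot = sum((o - mean) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

# Hypothetical toy data: 3 enhancers x 2 promoters, exactly multiplicative.
enh_true = [1.0, 2.0, 4.0]
prom_true = [0.5, 3.0]
expr = [[e * p for p in prom_true] for e in enh_true]

enh_fit, prom_fit = fit_multiplicative(expr)
r2 = r_squared_log(expr, enh_fit, prom_fit)
print(round(r2, 6))
```

On real measurements the fit would not be exact, and the residuals are where class-specific preferences such as those described for housekeeping promoters would appear.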

View details for DOI 10.1038/s41586-022-04877-w

View details for PubMedID 35594906
Genome-wide enhancer maps link risk variants to disease genes. Nature Nasser, J., Bergman, D. T., Fulco, C. P., Guckelberger, P., Doughty, B. R., Patwardhan, T. A., Jones, T. R., Nguyen, T. H., Ulirsch, J. C., Lekschas, F., Mualim, K., Natri, H. M., Weeks, E. M., Munson, G., Kane, M., Kang, H. Y., Cui, A., Ray, J. P., Eisenhaure, T. M., Collins, R. L., Dey, K., Pfister, H., Price, A. L., Epstein, C. B., Kundaje, A., Xavier, R. J., Daly, M. J., Huang, H., Finucane, H. K., Hacohen, N., Lander, E. S., Engreitz, J. M. 2021

Abstract

Genome-wide association studies (GWAS) have identified thousands of noncoding loci that are associated with human diseases and complex traits, each of which could reveal insights into the mechanisms of disease1. Many of the underlying causal variants may affect enhancers2,3, but we lack accurate maps of enhancers and their target genes to interpret such variants. We recently developed the activity-by-contact (ABC) model to predict which enhancers regulate which genes and validated the model using CRISPR perturbations in several cell types4. Here we apply this ABC model to create enhancer-gene maps in 131 human cell types and tissues, and use these maps to interpret the functions of GWAS variants. Across 72 diseases and complex traits, ABC links 5,036 GWAS signals to 2,249 unique genes, including a class of 577 genes that appear to influence multiple phenotypes through variants in enhancers that act in different cell types. In inflammatory bowel disease (IBD), causal variants are enriched in predicted enhancers by more than 20-fold in particular cell types such as dendritic cells, and ABC achieves higher precision than other regulatory methods at connecting noncoding variants to target genes. These variant-to-function maps reveal an enhancer that contains an IBD risk variant and that regulates the expression of PPIF to alter the membrane potential of mitochondria in macrophages. Our study reveals principles of genome regulation, identifies genes that affect IBD and provides a resource and generalizable strategy to connect risk variants of common diseases to their molecular and cellular functions.

View details for DOI 10.1038/s41586-021-03446-x

View details for PubMedID 33828297
HyPR-seq: Single-cell quantification of chosen RNAs via hybridization and sequencing of DNA probes. Proceedings of the National Academy of Sciences of the United States of America Marshall, J. L., Doughty, B. R., Subramanian, V., Guckelberger, P., Wang, Q., Chen, L. M., Rodriques, S. G., Zhang, K., Fulco, C. P., Nasser, J., Grinkevich, E. J., Noel, T., Mangiameli, S., Bergman, D. T., Greka, A., Lander, E. S., Chen, F., Engreitz, J. M. 2020; 117 (52): 33404–13

Abstract

Single-cell quantification of RNAs is important for understanding cellular heterogeneity and gene regulation, yet current approaches suffer from low sensitivity for individual transcripts, limiting their utility for many applications. Here we present Hybridization of Probes to RNA for sequencing (HyPR-seq), a method to sensitively quantify the expression of hundreds of chosen genes in single cells. HyPR-seq involves hybridizing DNA probes to RNA, distributing cells into nanoliter droplets, amplifying the probes with PCR, and sequencing the amplicons to quantify the expression of chosen genes. HyPR-seq achieves high sensitivity for individual transcripts, detects nonpolyadenylated and low-abundance transcripts, and can profile more than 100,000 single cells. We demonstrate how HyPR-seq can profile the effects of CRISPR perturbations in pooled screens, detect time-resolved changes in gene expression via measurements of gene introns, and detect rare transcripts and quantify cell-type frequencies in tissue using low-abundance marker genes. By directing sequencing power to genes of interest and sensitively quantifying individual transcripts, HyPR-seq reduces costs by up to 100-fold compared to whole-transcriptome single-cell RNA-sequencing, making HyPR-seq a powerful method for targeted RNA profiling in single cells.

View details for DOI 10.1073/pnas.2010738117

View details for PubMedID 33376219
Activity-by-contact model of enhancer-promoter regulation from thousands of CRISPR perturbations NATURE GENETICS Fulco, C. P., Nasser, J., Jones, T. R., Munson, G., Bergman, D. T., Subramanian, V., Grossman, S. R., Anyoha, R., Doughty, B. R., Patwardhan, T. A., Nguyen, T. H., Kane, M., Perez, E. M., Durand, N. C., Lareau, C. A., Stamenova, E. K., Aiden, E., Lander, E. S., Engreitz, J. M. 2019; 51 (12): 1664-+

Abstract

Enhancer elements in the human genome control how genes are expressed in specific cell types and harbor thousands of genetic variants that influence risk for common diseases1-4. Yet, we still do not know how enhancers regulate specific genes, and we lack general rules to predict enhancer-gene connections across cell types5,6. We developed an experimental approach, CRISPRi-FlowFISH, to perturb enhancers in the genome, and we applied it to test >3,500 potential enhancer-gene connections for 30 genes. We found that a simple activity-by-contact model substantially outperformed previous methods at predicting the complex connections in our CRISPR dataset. This activity-by-contact model allows us to construct genome-wide maps of enhancer-gene connections in a given cell type, on the basis of chromatin state measurements. Together, CRISPRi-FlowFISH and the activity-by-contact model provide a systematic approach to map and predict which enhancers regulate which genes, and will help to interpret the functions of the thousands of disease risk variants in the noncoding genome.
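
The abstract above does not spell out the activity-by-contact formula. As it is commonly described, the score for one element and one gene is that element's activity times its contact frequency with the gene's promoter, divided by the same product summed over all candidate elements near the gene. The sketch below follows that description; the function name and all numbers are hypothetical, and it is not the published implementation.

```python
def abc_scores(activity, contact):
    """Return one ABC-style score per candidate element for a single gene.

    activity[k]: enhancer activity of element k (for example from chromatin
                 accessibility and histone acetylation signals)
    contact[k]:  contact frequency between element k and the gene's promoter
    """
    weights = [a * c for a, c in zip(activity, contact)]
    total = sum(weights)
    if total == 0:
        return [0.0] * len(weights)
    # Each score is the element's share of the total activity-by-contact.
    return [w / total for w in weights]

# Hypothetical values for three candidate elements around one gene.
activity = [10.0, 5.0, 1.0]
contact = [0.1, 0.4, 0.05]
scores = abc_scores(activity, contact)
print(scores)
```

In this toy case the second element scores highest even though it is less active than the first, because it contacts the promoter more often.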

View details for DOI 10.1038/s41588-019-0538-0

View details for Web of Science ID 000499696700003

View details for PubMedID 31784727

View details for PubMedCentralID PMC6886585
Local regulation of gene expression by lncRNA promoters, transcription and splicing NATURE Engreitz, J. M., Haines, J. E., Perez, E. M., Munson, G., Chen, J., Kane, M., McDonel, P. E., Guttman, M., Lander, E. S. 2016; 539 (7629): 452–55

Abstract

Mammalian genomes are pervasively transcribed to produce thousands of long non-coding RNAs (lncRNAs). A few of these lncRNAs have been shown to recruit regulatory complexes through RNA-protein interactions to influence the expression of nearby genes, and it has been suggested that many other lncRNAs can also act as local regulators. Such local functions could explain the observation that lncRNA expression is often correlated with the expression of nearby genes. However, these correlations have been challenging to dissect and could alternatively result from processes that are not mediated by the lncRNA transcripts themselves. For example, some gene promoters have been proposed to have dual functions as enhancers, and the process of transcription itself may contribute to gene regulation by recruiting activating factors or remodelling nucleosomes. Here we use genetic manipulation in mouse cell lines to dissect 12 genomic loci that produce lncRNAs and find that 5 of these loci influence the expression of a neighbouring gene in cis. Notably, none of these effects requires the specific lncRNA transcripts themselves and instead involves general processes associated with their production, including enhancer-like activity of gene promoters, the process of transcription, and the splicing of the transcript. Furthermore, such effects are not limited to lncRNA loci: we find that four out of six protein-coding loci also influence the expression of a neighbour. These results demonstrate that cross-talk among neighbouring genes is a prevalent phenomenon that can involve multiple mechanisms and cis-regulatory signals, including a role for RNA splice sites. These mechanisms may explain the function and evolution of some genomic loci that produce lncRNAs and broadly contribute to the regulation of both coding and non-coding genes.

View details for DOI 10.1038/nature20149

View details for Web of Science ID 000388161700059

View details for PubMedID 27783602

View details for PubMedCentralID PMC6853796
Systematic mapping of functional enhancer-promoter connections with CRISPR interference SCIENCE Fulco, C. P., Munschauer, M., Anyoha, R., Munson, G., Grossman, S. R., Perez, E. M., Kane, M., Cleary, B., Lander, E. S., Engreitz, J. M. 2016; 354 (6313): 769–73

Abstract

Gene expression in mammals is regulated by noncoding elements that can affect physiology and disease, yet the functions and target genes of most noncoding elements remain unknown. We present a high-throughput approach that uses clustered regularly interspaced short palindromic repeats (CRISPR) interference (CRISPRi) to discover regulatory elements and identify their target genes. We assess >1 megabase of sequence in the vicinity of two essential transcription factors, MYC and GATA1, and identify nine distal enhancers that control gene expression and cellular proliferation. Quantitative features of chromatin state and chromosome conformation distinguish the seven enhancers that regulate MYC from other elements that do not, suggesting a strategy for predicting enhancer-promoter connectivity. This CRISPRi-based approach can be applied to dissect transcriptional networks and interpret the contributions of noncoding genetic variation to human disease.

View details for DOI 10.1126/science.aag2445

View details for Web of Science ID 000387326300042

View details for PubMedID 27708057

View details for PubMedCentralID PMC5438575
The Xist lncRNA Exploits Three-Dimensional Genome Architecture to Spread Across the X Chromosome SCIENCE Engreitz, J. M., Pandya-Jones, A., McDonel, P., Shishkin, A., Sirokman, K., Surka, C., Kadri, S., Xing, J., Goren, A., Lander, E. S., Plath, K., Guttman, M. 2013; 341 (6147): 767-+

Abstract

Many large noncoding RNAs (lncRNAs) regulate chromatin, but the mechanisms by which they localize to genomic targets remain unexplored. We investigated the localization mechanisms of the Xist lncRNA during X-chromosome inactivation (XCI), a paradigm of lncRNA-mediated chromatin regulation. During the maintenance of XCI, Xist binds broadly across the X chromosome. During initiation of XCI, Xist initially transfers to distal regions across the X chromosome that are not defined by specific sequences. Instead, Xist identifies these regions by exploiting the three-dimensional conformation of the X chromosome. Xist requires its silencing domain to spread across actively transcribed regions and thereby access the entire chromosome. These findings suggest a model in which Xist coats the X chromosome by searching in three dimensions, modifying chromosome structure, and spreading to newly accessible locations.

View details for DOI 10.1126/science.1237973

View details for Web of Science ID 000323122200041

View details for PubMedID 23828888

View details for PubMedCentralID PMC3778663
An encyclopedia of human enhancer-gene regulatory interactions. Nature Gschwind, A. R., Mualim, K. S., Karbalayghareh, A., Sheth, M. U., Dey, K. K., Jagoda, E., Nurtdinov, R. N., Xi, W., Tan, A. S., Galante, J., Jones, H., Ma, X. R., Yao, D., Amgalan, D., Ray, J., Munger, C. J., Nasser, J., Avsec, Ž., James, B. T., Shamim, M. S., Durand, N. C., Rao, S. S., Mahajan, R., Doughty, B. R., Andreeva, K., Ulirsch, J. C., Fan, K., Perez, E. M., Nguyen, T. C., Kelley, D. R., Finucane, H. K., Moore, J. E., Weng, Z., Kellis, M., Bassik, M. C., Ustun, B., Price, A. L., Beer, M. A., Guigó, R., Stamatoyannopoulos, J. A., Lieberman Aiden, E., Greenleaf, W. J., Leslie, C. S., Steinmetz, L. M., Kundaje, A., Engreitz, J. M. 2026

Abstract

Identifying transcriptional enhancers and their target genes is essential for understanding gene regulation and the effect of human genetic variation on disease1-6. Here we create and evaluate a resource of more than 92 million enhancer-gene regulatory interactions across 1,458 biosamples covering 369 cell types and tissues, by integrating predictive models, chromatin states, three-dimensional contacts and large-scale genetic perturbations generated by the ENCODE Consortium7. We first create a systematic benchmarking pipeline to compare predictive models, assembling a dataset of 10,356 element-gene pairs measured in CRISPR perturbation experiments, more than 30,000 fine-mapped expression quantitative trait loci and 569 fine-mapped genome-wide association study (GWAS) variants linked to a probable causal gene. Using this framework, we develop ENCODE-rE2G, a predictive model achieving state-of-the-art performance across several prediction tasks, demonstrating that iterative perturbations and supervised machine learning can build increasingly accurate predictive models of enhancer regulation. Using ENCODE-rE2G, we build an encyclopedia of enhancer-gene regulatory interactions in the human genome, revealing global properties of enhancer networks, identifying differences in regulatory complexity across genes and improving analyses linking noncoding variants to target genes and cell types for common complex diseases. By interpreting the model, we find that beyond enhancer activity and three-dimensional enhancer-promoter contacts, additional features that guide enhancer-promoter communication include promoter class and enhancer-enhancer synergy. These genome-wide maps of enhancer-gene regulatory interactions, benchmarking software, predictive models and insights about enhancer function provide a valuable resource for future studies of gene regulation and human genetics.

View details for DOI 10.1038/s41586-026-10781-4

View details for PubMedID 42457959

View details for PubMedCentralID 7845138
Intrinsic promoter responsiveness dictates sensitivity to transcriptional activation by enhancers. bioRxiv : the preprint server for biology Tan, Y., Ray, J., Sheth, M. U., Doughty, B. R., Greenleaf, W. J., Engreitz, J. M. 2026

Abstract

Enhancers activate specific target promoters, but whether intrinsic enhancer-promoter compatibility contributes to this specificity is debated. Recent studies using different reporter assays have reached contradictory conclusions. We compare six reporter assay designs, identify confounders that bias compatibility measurements, and apply improved assays to test 25,000 enhancer-promoter pairs. Promoters differ in their responsiveness to enhancers (>100-fold versus 1.1-fold activation) while enhancers activate all promoters in a similar rank order. Promoter output scales with enhancer activity following a power law, with the exponent varying across promoters. Incorporating this exponent into the Activity-by-Contact model improves prediction of endogenous enhancer effects, explaining why certain active genes are insensitive to distal perturbations and "skipped" by enhancers. Responsiveness is modulated by transcription factor motifs in the core promoter. This work establishes responsiveness as an intrinsic promoter property that enables specific promoters to be highly activated in a landscape of broadly compatible enhancers, while others remain unaffected.
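
To make the power-law statement in the abstract above concrete, here is a minimal sketch that assumes promoter output = k * (enhancer activity) ** beta and estimates beta, the responsiveness exponent, by least squares on log-transformed values. The function name and the toy numbers are hypothetical, and this is not the preprint's fitting procedure.

```python
import math

def fit_power_law(activity, output):
    """Estimate (k, beta) in output = k * activity ** beta by log-log regression."""
    xs = [math.log(a) for a in activity]
    ys = [math.log(o) for o in output]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    beta = sxy / sxx               # slope in log-log space is the exponent
    k = math.exp(my - beta * mx)   # intercept gives the scale factor
    return k, beta

# Hypothetical toy data: the same four enhancers tested on two promoters.
activity = [1.0, 2.0, 4.0, 8.0]
responsive = [3.0 * a ** 1.5 for a in activity]    # strongly responsive promoter
unresponsive = [3.0 * a ** 0.1 for a in activity]  # weakly responsive promoter

k_resp, beta_resp = fit_power_law(activity, responsive)
k_unresp, beta_unresp = fit_power_law(activity, unresponsive)
print(beta_resp, beta_unresp)
```

Both toy promoters rank the enhancers in the same order, but the exponent determines how much each promoter's output changes across that range.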

View details for DOI 10.64898/2026.06.25.734173

View details for PubMedID 42395355

View details for PubMedCentralID PMC13320843
PerturbPlan: An analytical framework for designing Perturb-seq experiments. bioRxiv : the preprint server for biology Niu, Z., He, Y., Galante, J., Gschwind, A. R., Ray, J., Steinmetz, L. M., Engreitz, J. M., Katsevich, E. 2026

Abstract

CRISPR screens with single-cell RNA-seq readouts provide a powerful tool for characterizing the functions of noncoding elements and genes. However, designing these experiments to balance statistical power and cost is challenging, given the large number of design parameters. The only available tool for this purpose is a simulation-based power calculator, but it is computationally costly and requires high-performance computing to run. We derive a novel analytical formula for the power to detect perturbation-expression associations, recapitulating power estimates from the simulation-based tool while reducing runtime by up to seven orders of magnitude. This acceleration unlocks the possibility of interactive single-cell CRISPR screen design. Accordingly, we develop PerturbPlan, an interactive web application built on the analytical power formula. PerturbPlan helps users address 11 design questions for two types of single-cell CRISPR screens, Perturb-seq and targeted Perturb-seq (TAP-seq). We apply PerturbPlan to carry out a comparative analysis of three recent Perturb-seq designs, demonstrating how optimal design varies across experiments of different scales. We also use PerturbPlan to quantify the cost savings of a recent TAP-seq study relative to a hypothetical Perturb-seq study assaying the same perturbations, illustrating how the tool can inform decisions about targeted versus whole-transcriptome readouts. In sum, PerturbPlan is the first tool to facilitate flexible and interactive design of well-powered single-cell CRISPR screen experiments.

View details for DOI 10.64898/2026.05.22.727199

View details for PubMedID 42239142

View details for PubMedCentralID PMC13228452
A Perturb-seq screen guided by species divergence uncovers pathways for collateral artery formation. bioRxiv : the preprint server for biology Fan, X., Zhou, R., Raftrey, B. C., Coronado, P. E., Trimm, E., Clancy, E., Chen, X., Bozeman, J., Chen, M. S., Alimukhamedov, S., Alcocer, J., Bonham, I., Agarwal, S., Isakova, A., de Jesus Perez, V. A., Park, C. Y., Shay, T. F., Gradinaru, V., Quertermous, T., Engreitz, J. M., Red-Horse, K. 2026

Abstract

Collateral arteries are natural bypasses that can reroute blood flow around arterial blockages, limiting tissue injury during stroke and coronary artery disease. Despite their clinical effectiveness, therapeutic strategies to stimulate collateral artery growth remain unavailable due to our limited understanding of their developmental mechanisms. Remarkably, guinea pigs display exceptionally dense collateral artery networks across various organs, resulting in complete resistance to ischemic damage in the brain and heart. In this study, we compared single-cell RNA sequencing (scRNA-seq) from guinea pig and mouse tissues to identify endothelial cell (EC) gene expression patterns associated with extensive collateral artery development. We then developed an in vivo Perturb-seq platform in mice to test whether genes differentially expressed in guinea pigs influence artery EC specification. This pipeline identified artery repressors that were downregulated in guinea pigs and increased pial collateral abundance when inhibited in mice. Downstream analysis suggests that artery repressors, including WNT and hypoxia response genes, function in two capillary EC subsets: Esm1+ pre-artery and Apln+ angiogenic tip cells. Reduced activity of these repressors allows more ECs to acquire arterial identity, potentiating collateral artery formation. Collectively, our study establishes a strategy for discovering the genes underlying species-specific traits, suggests that guinea pigs have collaterals due to decreased activity of artery inhibitor pathways and hypoxia responses, and identifies novel targets for stimulating collateral artery formation.

View details for DOI 10.64898/2026.04.29.721711

View details for PubMedID 42146601

View details for PubMedCentralID PMC13174509
Common Coronary Artery Disease Risk Variants in Endothelial Regulatory Elements Modulate Tetraspanin 14 Expression and Notch Signaling. Arteriosclerosis, thrombosis, and vascular biology Lee-Kim, V. S., Schnitzler, G. R., Fang, S., Mardani-Kamali, N., Cai, X. S., Cui, R., Barry, A. E., Zandavi, M., Yoo, H. J., Kant, S., Mahajan, R., Rao, S. S., Aiden, E. L., Engreitz, J. M., Gupta, R. M. 2026

Abstract

Coronary artery disease (CAD) is a complex condition and remains the leading cause of mortality worldwide. Genome-wide association studies have identified a CAD risk locus on chromosome 10q23 that is independent of traditional risk factors, providing an opportunity to uncover novel molecular mechanisms contributing to CAD pathogenesis. Improved fine-mapping approaches were used to prioritize noncoding variants at the 10q23 locus within the intronic region of TSPAN14 (tetraspanin 14). Regulatory elements harboring lead variants were functionally interrogated using chromatin accessibility, 3-dimensional chromatin organization, and clustered regularly interspaced short palindromic repeats-mediated deletion approaches in endothelial cells (ECs). Subsequently, TSPAN14 function in ECs was assessed using transcriptomic profiling, Notch signaling activation assays, and EC functional assay using gene knockout (KO) cell lines. Fine-mapping identified 2 lead variants, rs17680741 and rs12260962, located within regulatory elements predicted to affect TSPAN14 expression in monocytes and ECs, respectively. Chromatin accessibility, organization, and enhancer deletion assays demonstrated EC-specific function for the rs12260962-harboring regulatory element in TSPAN14 expression regulation. Loss of TSPAN14 resulted in significant transcriptomic changes related to Notch signaling, heart morphogenesis, cell adhesion, and wound healing. Functionally, TSPAN14 KO ECs exhibited impaired cell-cell junction integrity, reduced repair capacity, and diminished mechanosensitive responses. Together, these data identify a regulatory element harboring CAD-associated variants at the 10q23 locus that modulates TSPAN14 expression and downstream Notch signaling in ECs, thereby linking genetic risk to endothelial dysfunction relevant to CAD pathogenesis.

View details for DOI 10.1161/ATVBAHA.125.323330

View details for PubMedID 42021732
Genome-scale mapping of variant, enhancer and gene function in primary human CD4+ T cells. bioRxiv : the preprint server for biology Moonen, D. P., Claringbould, A., Gschwind, A. R., Schrod, S., Braunger, J., Feng, C., Rauscher, B., Yi, J., Bi, S. Z., Matthess, Y., Kaulich, M., Acob, R. A., Ayer, A., Engreitz, J. M., Velten, B., Stegle, O., Trynka, G., Zaugg, J. B., Schraivogel, D., Steinmetz, L. M. 2026

Abstract

CD4+ T cells harbor a disproportionate enrichment of immune disease risk loci and represent the primary cellular context for immune disease biology, yet the genes and regulatory programs these variants affect remain largely unknown. We combined targeted Perturb-seq of 1,032 cis-regulatory elements (CREs) overlapping 4,724 variants across 14 immune diseases with genome-wide Perturb-seq of all expressed genes in primary human CD4+ T cells, spanning 4.1 million cells. We identified 626 CRE-gene pairs, and connected CRE targets to downstream regulatory cascades. At the TYK2 and DEXI/CLEC16A loci, we resolved target genes and linked noncoding variants to inflammatory and metabolic programs. Across diseases, we revealed that dispersed variants converged on shared and disease-specific programs. Our work provides a framework for tracing variant-to-CRE-to-gene-to-network in disease-relevant primary cells.

View details for DOI 10.64898/2026.03.09.710372

View details for PubMedID 41959477

View details for PubMedCentralID PMC13060838
An expanded registry of candidate cis-regulatory elements. Nature Moore, J. E., Pratt, H. E., Fan, K., Phalke, N., Fisher, J., Elhajjajy, S. I., Andrews, G., Gao, M., Shedd, N., Fu, Y., Lacadie, M. C., Meza, J., Khandpekar, M., Ganna, M., Choudhury, E., Swofford, R., Phan, H., Ramirez, C. C., Campbell, M., Likhite, M., Farrell, N. P., Weimer, A. K., Pampari, A., Ramalingam, V., Reese, F., Borsari, B., Yu, X., Wattenberg, E., Ruiz-Romero, M., Razavi-Mohseni, M., Xu, J., Galeev, T., Colubri, A., Beer, M. A., Guigó, R., Gerstein, M. B., Engreitz, J. M., Ljungman, M., Reddy, T. E., Snyder, M. P., Epstein, C. B., Gaskell, E., Bernstein, B. E., Dickel, D. E., Visel, A., Pennacchio, L. A., Mortazavi, A., Kundaje, A., Weng, Z. 2026

Abstract

Mammalian genomes contain millions of regulatory elements that control the complex patterns of gene expression1. Previously, the ENCODE consortium mapped biochemical signals across hundreds of cell types and tissues and integrated these data to develop a registry containing 0.9 million human and 300,000 mouse candidate cis-regulatory elements (cCREs) annotated with potential functions2. Here we have expanded the registry to include 2.37 million human and 967,000 mouse cCREs, leveraging new ENCODE datasets and enhanced computational methods. This expanded registry covers hundreds of unique cell and tissue types, providing a comprehensive understanding of gene regulation. Functional characterization data from assays such as STARR-seq3, massively parallel reporter assay4, CRISPR perturbation5,6 and transgenic mouse assays7 have profiled more than 90% of human cCREs, revealing complex regulatory functions. We identified thousands of novel silencer cCREs and demonstrated their dual enhancer and silencer roles in different cellular contexts. Integrating the registry with other ENCODE annotations facilitates genetic variation interpretation and trait-associated gene identification, exemplified by the identification of KLF1 as a novel causal gene for red blood cell traits. This expanded registry is a valuable resource for studying the regulatory genome and its impact on health and disease.

View details for DOI 10.1038/s41586-025-09909-9

View details for PubMedID 41501460

View details for PubMedCentralID 11903340
Junction-targeting designs limit the application of CRISPR-Cas13d in circular RNA perturbation studies. Nucleic acids research Lee-Yow, Y. C., Valbuena, R. C., Richter, C. S., Chang, H. Y., Engreitz, J. M. 2026; 54 (1)

Abstract

Circular RNAs (circRNAs) are RNA molecules formed through the backsplicing of linear exons. Several thousand have been identified, yet relatively few are functionally characterized due to challenges in distinguishing effects of circular from linear RNA targets. Recently, CRISPR-Cas13 systems have been utilized to directly target unique junctions formed through backsplicing, potentially allowing for selective degradation of circular isoforms. Applying this approach in pooled screens has indeed identified circRNAs proposed to affect viability in several cancer cell lines. However, the design limitations of applying Cas13d to study circRNAs are not fully characterized. Here, we assessed the limitations of Cas13d-mediated circRNA knockdowns by performing essentiality screens on 900 highly expressed circRNAs in K562, an ENCODE tier 1 cell line. We observed consistent off-target knockdown of linear isoforms by certain circRNA-targeting single-guide RNAs (sgRNAs). Re-analysis of existing Cas13d screens in other cell types revealed similar off-target effects. Using machine learning models that predict Cas13d sgRNA efficacy, we further found that most circRNA-targeting sgRNAs are unlikely to induce strong knockdown. After accounting for these design constraints, 0 of 346 circRNAs testable in our screens had detectable effects on proliferation. Our findings highlight key limitations of junction-targeting strategies, with implications for future circRNA perturbation studies.

View details for DOI 10.1093/nar/gkaf1447

View details for PubMedID 41505088
The IGVF catalog-from genetic variation to function. Nucleic acids research Li, D., Liu, S., Assis, P. R., Li, M., Dong, S., Whaling, I., Jolanki, O., Kagda, M., Zhang, W., Macias-Velasco, J. F., Liu, T., Cody, S., Antonacci-Fulton, L., Huang, Y., Liu, J., Montgomery, M. T., Zeiberg, D., Jain, S., Pejaver, V., Bergquist, T., Chen, Y., Radivojac, P., Gersbach, C. A., Sherpa, R. N., Castro, C. P., Boyle, A. P., Starita, L. M., Fowler, D. M., Ahituv, N., Dey, K. K., Majoros, W. H., Reddy, T. E., Craven, M., Sinha, R., Sverchkov, Y., Cai, X., Nzima, M. Z., Calderwood, M. A., Rozowsky, J., Gerstein, M., Ma, J., Yue, F., Cherry, J. M., Love, M. I., Engreitz, J. M., Hitz, B. C., Wang, T. 2025

Abstract

Genomic variation between individuals is essential for understanding how differences in the genome sequence affect molecular and cellular processes. The Impact of Genomic Variation on Function (IGVF) Consortium aims to uncover the relationships among genomic variation, genome function, and phenotypes by combining experimental techniques, such as single-cell mapping and genomic perturbation assays, with computational approaches such as machine learning-based predictive modeling. The IGVF Data and Administrative Coordinating Centers collect, analyze, and disseminate data and results from across the consortium through an open-source platform called the IGVF Catalog. This resource includes, but is not limited to, data on the effects of coding variants on protein abundance and function, noncoding variants on enhancer activity (measured by MPRA or predicted computationally), and associations between variants and quantitative traits. All data are organized within a graph database comprising over 50 types of data collections with nearly 3 billion nodes and over 7.5 billion edges. The Catalog offers public API endpoints (https://api.catalogkg.igvf.org/) and a user-friendly interface for exploring, querying, and visualizing the data at https://catalog.igvf.org. We expect that this open-access platform will support the broader scientific community to advance our understanding of how genomic variation influences biology and disease.

View details for DOI 10.1093/nar/gkaf1341

View details for PubMedID 41359121
The epigenomic landscape of single vascular cells reflects developmental origin and identifies disease risk loci Weldy, C., Kundu, S., Monteiro, J., Gu, W., Pedroza, A., Dalal, A., Worssam, M., Li, D., Palmisano, B., Zhao, Q., Sharma, D., Nguyen, T., Kundu, R., Fischbein, M., Engreitz, J., Kundaje, A., Cheng, P., Quertermous, T. LIPPINCOTT WILLIAMS & WILKINS. 2025

View details for DOI 10.1161/circ.152.suppl_3.4358524

View details for Web of Science ID 001613911000034
Deep learning the dynamic regulatory sequence code of cardiac organoid differentiation. bioRxiv : the preprint server for biology Metzl-Raz, E., Zhao, R., Deshpande, S., Powell, J., Porter, E. G., Zouaghi, Y., Liu, B. B., Kim, S. H., Abdi, I., Evergreen, I., Agarwal, M., Sheth, M. U., Rico, J., Miyamoto, M., Sanchez, J. M., Engreitz, J. M., Kundaje, A., Greenleaf, W. J., Gifford, C. A. 2025

Abstract

Defining the temporal gene regulatory programs that drive human organogenesis is essential for understanding the origins of congenital disease. We combined a time-resolved, single-cell multi-omic atlas of human iPSC-derived cardiac organoids with deep learning models that predict chromatin accessibility from DNA sequence, enabling the discovery of the regulatory syntax underlying early heart development. This framework uncovered cell-state-specific rules of cardiogenesis, including context-dependent activities of TEAD, HAND, and TBX transcription factor families, and linked these motifs to their target genes. We identified distinct programs guiding lineage divergence, such as ventricular versus pacemaker cardiomyocytes, and validated predictions by perturbing Myocardin (MYOCD), establishing its essential role in ventricular specification. Integration of chromatin, transcriptional, and genetic data further highlighted regulatory regions and disease-associated variants that perturb differentiation state transitions, supporting evidence that suggests congenital heart disease emerges early in development. This work bridges developmental gene regulation with disease genetics, providing a foundation for mechanistic and therapeutic insights into congenital diseases.

View details for DOI 10.1101/2025.10.15.680997

View details for PubMedID 41279701

View details for PubMedCentralID PMC12632746
Disease-linked regulatory DNA variants and homeostatic transcription factors in epidermis. Nature communications Porter, D. F., Meyers, R. M., Miao, W., Reynolds, D. L., Hong, A. W., Yang, X., Srinivasan, S., Mondal, S., Siprashvili, Z., Fabo, T., Zhou, R., Nguyen, T., Ducoli, L., Meyers, J. M., Nguyen, D. T., Ko, L. A., Kellman, L. N., Elfaki, I., Guo, M., Winge, M. C., Jackrazi, L. V., Lopez-Pajares, V., Liu, B. B., Qu, Y., Porter, I. E., Kim, S. H., Kim, G., Tao, S., Engreitz, J. M., Khavari, P. A. 2025; 16 (1): 8387

Abstract

Identifying noncoding single nucleotide variants (SNVs) in regulatory DNA linked to polygenic disease risk, the transcription factors (TFs) they bind, and the genes they dysregulate is a goal in polygenic disease research. Here, we use massively parallel reporter analysis of 3451 SNVs linked to risk for polygenic skin diseases with disrupted epidermal homeostasis to identify 355 differentially active SNVs (daSNVs). daSNV target gene analysis, combined with daSNV editing, underscored dysregulated epidermal differentiation as a shared pathomechanism. CRISPR knockout screens of 1772 human TFs revealed 123 TFs essential for epidermal homeostasis, highlighting ZNF217 and CXXC1. Population sampling CUT&RUN of 27 homeostatic TFs identified allele-specific DNA binding (ASB) differences at daSNVs enriched near epidermal homeostasis and monogenic skin disease genes, with notable representation of SP/KLF and AP-1/2 TFs. High TF-occupancy promoters were "buffered" against ASB. This resource implicates dysregulated binding of specific homeostatic TF families in risk for diverse polygenic skin diseases.

View details for DOI 10.1038/s41467-025-63070-5

View details for PubMedID 40998781

View details for PubMedCentralID 5715812
Epigenomic landscape of single vascular cells reflects developmental origin and disease risk loci. Molecular systems biology Weldy, C. S., Kundu, S., Monteiro, J., Gu, W., Pedroza, A. J., Dalal, A. R., Worssam, M. D., Li, D., Palmisano, B., Zhao, Q., Sharma, D., Nguyen, T., Kundu, R., Fischbein, M. P., Engreitz, J., Kundaje, A. B., Cheng, P. P., Quertermous, T. 2025

Abstract

Vascular sites have distinct susceptibility to atherosclerosis and aneurysm, yet the epigenomic and transcriptomic underpinning of vascular site-specific disease risk is largely unknown. Here, we performed single-cell chromatin accessibility (scATACseq) and gene expression profiling (scRNAseq) of mouse vascular tissue from three vascular sites. Through interrogation of epigenomic enhancers and gene regulatory networks, we discovered key regulatory enhancers to be not only cell type-specific but also vascular site-specific. We identified epigenetic markers of embryonic origin including developmental transcription factors such as Tbx20, Hand2, Gata4, and Hoxb family members and discovered transcription factor motif accessibility to be vascular site-specific for smooth muscle, fibroblasts, and endothelial cells. We further integrated genome-wide association data for aortic dimension, and using a deep learning model to predict variant effect on chromatin accessibility, ChromBPNet, we predicted variant effects across cell type and vascular site of origin, revealing genomic regions enriched for specific TF motif footprints, including MEF2A, SMAD3, and HAND2. This work supports a paradigm that cell type and vascular site-specific enhancers govern complex genetic drivers of disease risk.

View details for DOI 10.1038/s44320-025-00140-2

View details for PubMedID 40931195

View details for PubMedCentralID 3357908
Enhancer-targeting CRISPR screens at coronary artery disease loci suggest shared mechanisms of disease risk. medRxiv : the preprint server for health sciences Ramste, M., Weldy, C., Kundu, S., Zhao, Q., Li, D., Brand, K., Sharma, D., Ramste, A., Jagoda, E., Ray, J., Caceres, R. D., Galante, J., Gschwind, A. R., Lahtinen, N., Nguyen, T., Amrute, J. M., Park, C. Y., Kim, J. B., Kaikkonen, M. U., Stitziel, N. O., Steinmetz, L., Kundaje, A., Engreitz, J. M., Quertermous, T. 2025

Abstract

To systematically identify causal genetic mechanisms that confer risk for coronary artery disease (CAD) in GWAS loci, we mapped genome-wide variant-to-enhancer-to-gene (V2E2G) links in vascular smooth muscle cells (SMC). Enhancers identified by active chromatin features, and further prioritized by base-resolution deep learning models of chromatin accessibility in 108 CAD loci, were studied with CRISPRi targeting and Direct-Capture Targeted Perturb-seq (DC-TAP-seq) evaluation of 470 genes. Seventy-six V2E2G links were identified for 59 candidate CAD genes representing gene programs including epithelial-mesenchymal transformation, ubiquitination, and protein folding as well as BMP and TGFB signaling. Similar methods employed with an independent focused screen targeting one candidate locus at 9p21.3 identified 10 enhancers regulating expression of multiple genes at this location. Detailed molecular studies revealed that two enhancers mediating transcription factor binding and transcriptional regulation contribute to ancestry-specific and sex-specific risk for CAD and the surrogate biomarker vascular calcification. Together, these studies advance our identification of GWAS CAD V2E2G links across the genome, and specific mechanisms of risk at the complex 9p21.3 locus.

View details for DOI 10.1101/2025.08.28.25334684

View details for PubMedID 40950476

View details for PubMedCentralID PMC12424881
The epigenomic landscape of single vascular cells reflects developmental origin and identifies disease risk loci. bioRxiv : the preprint server for biology Weldy, C. S., Kundu, S., Monteiro, J., Gu, W., Pedroza, A. J., Dalal, A. R., Worssam, M. D., Li, D., Palmisano, B., Zhao, Q., Sharma, D., Nguyen, T., Kundu, R., Fischbein, M. P., Engreitz, J., Kundaje, A. B., Cheng, P. P., Quertermous, T. 2025

Abstract

Vascular sites have distinct susceptibility to atherosclerosis and aneurysm, yet the biological underpinning of vascular site-specific disease risk is largely unknown. Vascular tissues have different developmental origins that may influence global chromatin accessibility, and understanding differential chromatin accessibility, gene expression profiles, and gene regulatory networks (GRN) on single cell resolution may give key insight into vascular site-specific disease risk. Here, we performed single cell chromatin accessibility (scATACseq) and gene expression profiling (scRNAseq) of healthy adult mouse vascular tissue from three vascular sites, 1) aortic root and ascending aorta, 2) brachiocephalic and carotid artery, and 3) descending thoracic aorta. Through a comprehensive analysis at single cell resolution, we discovered key regulatory enhancers to be not only cell type specific but also vascular site specific in vascular smooth muscle (SMC), fibroblasts, and endothelial cells. We identified epigenetic markers of embryonic origin with differential chromatin accessibility of key developmental transcription factors such as Tbx20, Hand2, Gata4, and Hoxb family members and discovered transcription factor motif accessibility to be cell type and vascular site specific. Notably, we found ascending fibroblasts to have distinct epigenomic patterns, highlighting SMAD2/3 function to suggest a differential susceptibility to TGFβ, a finding we confirmed through in vitro culture of primary adventitial fibroblasts. Finally, to understand how vascular site-specific enhancers may regulate human genetic risk for disease, we integrated genome wide association study (GWAS) data for ascending and descending aortic dimension, and used a distinct base resolution deep learning model of chromatin accessibility, ChromBPNet, to predict variant effects in SMC, fibroblasts, and endothelial cells within ascending aorta, carotid, and descending aorta sites of origin.
We reveal that although cell type remains a primary influence on variant effects, vascular site modifies cell type transcription and highlights genomic regions that are enriched for specific TF motif footprints - including MEF2A, SMAD3, and HAND2. This work supports a paradigm that the epigenomic and transcriptomic landscape of vascular cells are cell type and vascular site-specific and that site-specific enhancers govern complex genetic drivers of disease risk.

View details for DOI 10.1101/2022.05.18.492517

View details for PubMedID 40655014

View details for PubMedCentralID PMC12247710
Rewriting regulatory DNA to dissect and reprogram gene expression. Cell Martyn, G. E., Montgomery, M. T., Jones, H., Guo, K., Doughty, B. R., Linder, J., Bisht, D., Xia, F., Cai, X. S., Chen, Z., Cochran, K., Lawrence, K. A., Munson, G., Pampari, A., Fulco, C. P., Sahni, N., Kelley, D. R., Lander, E. S., Kundaje, A., Engreitz, J. M. 2025

Abstract

Regulatory DNA provides a platform for transcription factor binding to encode cell-type-specific patterns of gene expression. However, the effects and programmability of regulatory DNA sequences remain difficult to map or predict. Here, we develop variant effects from flow-sorting experiments with CRISPR targeting screens (Variant-EFFECTS) to introduce hundreds of designed edits to endogenous regulatory DNA and quantify their effects on gene expression. We systematically dissect and reprogram 3 regulatory elements for 2 genes in 2 cell types. These data reveal endogenous binding sites with effects specific to genomic context, transcription factor motifs with cell-type-specific activities, and limitations of computational models for predicting the effect sizes of variants. We identify small edits that can tune gene expression over a large dynamic range, suggesting new possibilities for prime-editing-based therapeutics targeting regulatory DNA. Variant-EFFECTS provides a generalizable tool to dissect regulatory DNA and to identify genome editing reagents that tune gene expression in an endogenous context.

View details for DOI 10.1016/j.cell.2025.03.034

View details for PubMedID 40245860
Selective Enhancer Dependencies in MYC-Intact and MYC-Rearranged Germinal Center B-cell Diffuse Large B-cell Lymphoma. Blood cancer discovery Iyer, A. R., Gurumurthy, A., Chu, S. A., Kodgule, R., Aguilar, A. R., Saari, T., Ramzan, A., Rosa, J., Gupta, J., Emmanuel, A., Hall, C. N., Runge, J. S., Owczarczyk, A. B., Cho, J. W., Weiss, M. B., Anyoha, R., Sikkink, K., Gemus, S., Fulco, C. P., Perry, A. M., Schmitt, A. D., Engreitz, J. M., Brown, N. A., Cieslik, M. P., Ryan, R. J. 2025

Abstract

High expression of MYC and its target genes identifies germinal center B-cell diffuse large B-cell lymphomas (GCB-DLBCL) associated with poor outcomes. We used CRISPR-interference profiling of human lymphoma cell lines to define essential enhancers in the MYC locus and non-immunoglobulin rearrangement partner loci, including a recurrent rearrangement between MYC and the BCL6 locus control region. GCB-DLBCL cell lines without MYC rearrangement are dependent on an evolutionarily-conserved enhancer we name "germinal center MYC enhancer 1" (GME-1), which is activated by the transcription factor complex of OCT2, OCA-B, and MEF2B, shows an active chromatin state in normal human and mouse germinal center B cells, and demonstrates selective acetylation and MYC promoter topological interactions in MYC-intact GCB-DLBCL biopsies. Whole-genome sequencing identified tandem copy gains of the GME-1 enhancer as a rare but recurrent event in DLBCL. Our findings shed new light on mechanisms that dysregulate MYC, a key driver of B cell malignancy.

View details for DOI 10.1158/2643-3230.BCD-24-0126

View details for PubMedID 40067173
Endothelial cell-related genetic variants identify LDL cholesterol-sensitive individuals who derive greater benefit from aggressive lipid lowering. Nature medicine Marston, N. A., Kamanu, F. K., Melloni, G. E., Schnitzler, G., Hakim, A., Ma, R. X., Kang, H., Chasman, D. I., Giugliano, R. P., Ellinor, P. T., Ridker, P. M., Engreitz, J. M., Sabatine, M. S., Ruff, C. T., Gupta, R. M. 2025

Abstract

The role of endothelial cell (EC) dysfunction in contributing to an individual's susceptibility to coronary atherosclerosis and how low-density lipoprotein cholesterol (LDL-C) concentrations might modify this relationship have not been previously studied. Here, from an examination of genome-wide significant single nucleotide polymorphisms associated with coronary artery disease (CAD), we identified variants with effects on EC function and constructed a 35 single nucleotide polymorphism polygenic risk score comprising these EC-specific variants (EC PRS). The association of the EC PRS with the risk of incident cardiovascular disease was tested in 3 cohorts: a primary prevention population in the UK Biobank (UKBB; n = 348,967); a primary prevention cohort from a trial that tested a statin (JUPITER, n = 8,749); and a secondary prevention cohort that tested a PCSK9 inhibitor (FOURIER, n = 14,298). In the UKBB, the EC PRS was independently associated with the risk of incident CAD (adjusted hazard ratio (aHR) per 1 s.d. of 1.24 (95% CI 1.21-1.26), P < 2 × 10⁻¹⁶). Moreover, LDL-C concentration significantly modified this risk: the aHR per 1 s.d. was 1.26 (1.22-1.30) when LDL-C was 150 mg dl⁻¹ but 1.00 (0.85-1.16) when LDL-C was 50 mg dl⁻¹ (P for interaction = 0.004). The clinical benefit of LDL-C lowering was significantly greater in individuals with a high EC PRS than in individuals with low or intermediate EC PRS, with relative risk reductions of 68% (HR 0.32 (0.18-0.59)) versus 29% (HR 0.71 (0.52-0.95)) in the primary prevention cohort (P for interaction = 0.02) and 33% (HR 0.67 (0.53-0.83)) versus 8% (HR 0.92 (0.82-1.03)) in the secondary prevention cohort (P for interaction = 0.01). We conclude that EC PRS quantifies an independent axis of CAD risk that is not currently captured in medical practice and identifies individuals who are more sensitive to the atherogenic effects of LDL-C and who would potentially derive substantially greater benefit from aggressive LDL-C lowering.

View details for DOI 10.1038/s41591-025-03533-w

View details for PubMedID 40011692

View details for PubMedCentralID 7755038
MorPhiC Consortium: towards functional characterization of all human genes. Nature Adli, M., Przybyla, L., Burdett, T., Burridge, P. W., Cacheiro, P., Chang, H. Y., Engreitz, J. M., Gilbert, L. A., Greenleaf, W. J., Hsu, L., Huangfu, D., Hung, L. H., Kundaje, A., Li, S., Parkinson, H., Qiu, X., Robson, P., Schürer, S. C., Shojaie, A., Skarnes, W. C., Smedley, D., Studer, L., Sun, W., Vidović, D., Vierbuchen, T., White, B. S., Yeung, K. Y., Yue, F., Zhou, T. 2025; 638 (8050): 351-359

Abstract

Recent advances in functional genomics and human cellular models have substantially enhanced our understanding of the structure and regulation of the human genome. However, our grasp of the molecular functions of human genes remains incomplete and biased towards specific gene classes. The Molecular Phenotypes of Null Alleles in Cells (MorPhiC) Consortium aims to address this gap by creating a comprehensive catalogue of the molecular and cellular phenotypes associated with null alleles of all human genes using in vitro multicellular systems. In this Perspective, we present the strategic vision of the MorPhiC Consortium and discuss various strategies for generating null alleles, as well as the challenges involved. We describe the cellular models and scalable phenotypic readouts that will be used in the consortium's initial phase, focusing on 1,000 protein-coding genes. The resulting molecular and cellular data will be compiled into a catalogue of null-allele phenotypes. The methodologies developed in this phase will establish best practices for extending these approaches to all human protein-coding genes. The resources generated (including engineered cell lines, plasmids, phenotypic data, genomic information and computational tools) will be made available to the broader research community to facilitate deeper insights into human gene functions.

View details for DOI 10.1038/s41586-024-08243-w

View details for PubMedID 39939790

View details for PubMedCentralID 9903716
High Shear Stress Reduces ERG Causing Endothelial-Mesenchymal Transition and Pulmonary Arterial Hypertension. Arteriosclerosis, thrombosis, and vascular biology Shinohara, T., Moonen, J. R., Chun, Y. H., Lee-Yow, Y. C., Okamura, K., Szafron, J. M., Kaplan, J., Cao, A., Wang, L., Guntur, D., Taylor, S., Isobe, S., Dong, M., Yang, W., Guo, K., Franco, B. D., Pacharinsak, C., Pisani, L. J., Saitoh, S., Mitani, Y., Marsden, A. L., Engreitz, J. M., Körbelin, J., Rabinovitch, M. 2024

Abstract

Computational modeling indicated that pathological high shear stress (HSS; 100 dyn/cm²) is generated in pulmonary arteries (PAs; 100-500 µm) in congenital heart defects causing PA hypertension (PAH) and in idiopathic PAH with occlusive vascular remodeling. Endothelial-to-mesenchymal transition (EndMT) is a feature of PAH. We hypothesize that HSS induces EndMT, contributing to the initiation and progression of PAH. We used an Ibidi perfusion system to determine whether HSS applied to human PA endothelial cells (ECs) induces EndMT when compared with physiological laminar shear stress (15 dyn/cm²). The mechanism was investigated and targeted to prevent PAH in a mouse with HSS induced by an aortocaval shunt. EndMT, a feature of PAH not previously attributed to HSS, was observed. HSS did not alter the induction of transcription factors KLF (Krüppel-like factor) 2/4, but an ERG (ETS-family transcription factor) was reduced, as were histone H3 lysine 27 acetylation enhancer-promoter peaks containing ERG motifs. Consequently, there was reduced interaction between ERG and KLF2/4, a feature important in tethering KLF and the chromatin remodeling complex to DNA. In PA ECs under laminar shear stress, reducing ERG by siRNA caused EndMT associated with decreased BMPR2 (bone morphogenetic protein receptor 2), CDH5 (cadherin 5), and PECAM1 (platelet and EC adhesion molecule 1) and increased SNAI1/2 (Snail/Slug) and ACTA2 (smooth muscle α2 actin). In PA ECs under HSS, transfection of ERG prevented EndMT. HSS was then induced in mice by an aortocaval shunt, causing progressive PAH over 8 weeks. An adeno-associated viral vector (AAV2-ESGHGYF) was used to replenish ERG selectively in PA ECs. Elevated PA pressure, EndMT, and vascular remodeling (muscularization of peripheral arteries) in the aortocaval shunt mice were markedly reduced by ERG delivery. Pathological HSS reduced lung EC ERG, resulting in EndMT and PAH.
Agents that upregulate ERG could reverse HSS-mediated PAH and occlusive vascular remodeling resulting from high flow or narrowed PAs.

View details for DOI 10.1161/ATVBAHA.124.321092

View details for PubMedID 39723537
An Expanded Registry of Candidate cis-Regulatory Elements for Studying Transcriptional Regulation. bioRxiv : the preprint server for biology Moore, J. E., Pratt, H. E., Fan, K., Phalke, N., Fisher, J., Elhajjajy, S. I., Andrews, G., Gao, M., Shedd, N., Fu, Y., Lacadie, M. C., Meza, J., Ganna, M., Choudhury, E., Swofford, R., Farrell, N. P., Pampari, A., Ramalingam, V., Reese, F., Borsari, B., Yu, M., Wattenberg, E., Ruiz-Romero, M., Razavi-Mohseni, M., Xu, J., Galeev, T., Beer, M. A., Guigo, R., Gerstein, M., Engreitz, J., Ljungman, M., Reddy, T. E., Snyder, M. P., Epstein, C. B., Gaskell, E., Bernstein, B. E., Dickel, D. E., Visel, A., Pennacchio, L. A., Mortazavi, A., Kundaje, A., Weng, Z. 2024

Abstract

Mammalian genomes contain millions of regulatory elements that control the complex patterns of gene expression. Previously, the ENCODE consortium mapped biochemical signals across many cell types and tissues and integrated these data to develop a Registry of 0.9 million human and 300 thousand mouse candidate cis-Regulatory Elements (cCREs) annotated with potential functions¹. We have expanded the Registry to include 2.35 million human and 927 thousand mouse cCREs, leveraging new ENCODE datasets and enhanced computational methods. This expanded Registry covers hundreds of unique cell and tissue types, providing a comprehensive understanding of gene regulation. Functional characterization data from assays like STARR-seq, MPRA, CRISPR perturbation, and transgenic mouse assays now cover over 90% of human cCREs, revealing complex regulatory functions. We identified thousands of novel silencer cCREs and demonstrated their dual enhancer/silencer roles in different cellular contexts. Integrating the Registry with other ENCODE annotations facilitates genetic variation interpretation and trait-associated gene identification, exemplified by discovering KLF1 as a novel causal gene for red blood cell traits. This expanded Registry is a valuable resource for studying the regulatory genome and its impact on health and disease.

View details for DOI 10.1101/2024.12.26.629296

View details for PubMedID 39763870
Molecular convergence of risk variants for congenital heart defects leveraging a regulatory map of the human fetal heart. medRxiv : the preprint server for health sciences Ma, X. R., Conley, S. D., Kosicki, M., Bredikhin, D., Cui, R., Tran, S., Sheth, M. U., Qiu, W. L., Chen, S., Kundu, S., Kang, H. Y., Amgalan, D., Munger, C. J., Duan, L., Dang, K., Rubio, O. M., Kany, S., Zamirpour, S., DePaolo, J., Padmanabhan, A., Olgin, J., Damrauer, S., Andersson, R., Gu, M., Priest, J. R., Quertermous, T., Qiu, X., Rabinovitch, M., Visel, A., Pennacchio, L., Kundaje, A., Glass, I. A., Gifford, C. A., Pirruccello, J. P., Goodyer, W. R., Engreitz, J. M. 2024

Abstract

Congenital heart defects (CHD) arise in part due to inherited genetic variants that alter genes and noncoding regulatory elements in the human genome. These variants are thought to act during fetal development to influence the formation of different heart structures. However, identifying the genes, pathways, and cell types that mediate these effects has been challenging due to the immense diversity of cell types involved in heart development as well as the superimposed complexities of interpreting noncoding sequences. As such, understanding the molecular functions of both noncoding and coding variants remains paramount to our fundamental understanding of cardiac development and CHD. Here, we created a gene regulation map of the healthy human fetal heart across developmental time, and applied it to interpret the functions of variants associated with CHD and quantitative cardiac traits. We collected single-cell multiomic data from 734,000 single cells sampled from 41 fetal hearts spanning post-conception weeks 6 to 22, enabling the construction of gene regulation maps in 90 cardiac cell types and states, including rare populations of cardiac conduction cells. Through an unbiased analysis of all 90 cell types, we find that both rare coding variants associated with CHD and common noncoding variants associated with valve traits converge to affect valvular interstitial cells (VICs). VICs are enriched for high expression of known CHD genes previously identified through mapping of rare coding variants. Eight CHD genes, as well as other genes in similar molecular pathways, are linked to common noncoding variants associated with other valve diseases or traits via enhancers in VICs. In addition, certain common noncoding variants impact enhancers with activities highly specific to particular subanatomic structures in the heart, illuminating how such variants can impact specific aspects of heart structure and function. 
Together, these results implicate new enhancers, genes, and cell types in the genetic etiology of CHD, identify molecular convergence of common noncoding and rare coding variants on VICs, and suggest a more expansive view of the cell types instrumental in genetic risk for CHD, beyond the working cardiomyocyte. This regulatory map of the human fetal heart will provide a foundational resource for understanding cardiac development, interpreting genetic variants associated with heart disease, and discovering targets for cell-type specific therapies.

View details for DOI 10.1101/2024.11.20.24317557

View details for PubMedID 39606363

View details for PubMedCentralID PMC11601760
Single cell variant to enhancer to gene map for coronary artery disease. medRxiv : the preprint server for health sciences Amrute, J. M., Lee, P. C., Eres, I., Lee, C. J., Bredemeyer, A., Sheth, M. U., Yamawaki, T., Gurung, R., Anene-Nzelu, C., Qiu, W. L., Kundu, S., Li, D. Y., Ramste, M., Lu, D., Tan, A., Kang, C. J., Wagoner, R. E., Alisio, A., Cheng, P., Zhao, Q., Miller, C. L., Hall, I. M., Gupta, R. M., Hsu, Y. H., Haldar, S. M., Lavine, K. J., Jackson, S., Andersson, R., Engreitz, J. M., Foo, R. S., Li, C. M., Ason, B., Quertermous, T., Stitziel, N. O. 2024

Abstract

Although genome wide association studies (GWAS) in large populations have identified hundreds of variants associated with common diseases such as coronary artery disease (CAD), most disease-associated variants lie within non-coding regions of the genome, rendering it difficult to determine the downstream causal gene and cell type. Here, we performed paired single nucleus gene expression and chromatin accessibility profiling from 44 human coronary arteries. To link disease variants to molecular traits, we developed a meta-map of 88 samples and discovered 11,182 single-cell chromatin accessibility quantitative trait loci (caQTLs). Heritability enrichment analysis and disease variant mapping demonstrated that smooth muscle cells (SMCs) harbor the greatest genetic risk for CAD. To capture the continuum of SMC cell states in disease, we used dynamic single cell caQTL modeling for the first time in tissue to uncover QTLs whose effects are modified by cell state and expand our insight into genetic regulation of heterogenous cell populations. Notably, we identified a variant in the COL4A1/COL4A2 CAD GWAS locus which becomes a caQTL as SMCs de-differentiate by changing a transcription factor binding site for EGR1/2. To unbiasedly prioritize functional candidate genes, we built a genome-wide single cell variant to enhancer to gene (scV2E2G) map for human CAD to link disease variants to causal genes in cell types. Using this approach, we found several hundred genes predicted to be linked to disease variants in different cell types. Next, we performed genome-wide Hi-C in 16 human coronary arteries to build tissue specific maps of chromatin conformation and link disease variants to integrated chromatin hubs and distal target genes. Using this approach, we show that rs4887091 within the ADAMTS7 CAD GWAS locus modulates function of a super chromatin interactome through a change in a CTCF binding site. 
Finally, we used CRISPR interference to validate a distal gene, AMOTL2, linked to a CAD GWAS locus. Collectively, we provide a disease-agnostic framework to translate human genetic findings to identify pathologic cell states and genes driving disease, producing a comprehensive scV2E2G map with genetic and tissue level convergence for future mechanistic and therapeutic studies.

View details for DOI 10.1101/2024.11.13.24317257

View details for PubMedID 39606421

View details for PubMedCentralID PMC11601770
Selective Enhancer Gain of Function Deregulates MYC Expression in Multiple Myeloma. Cancer research Rahmat, M., Clement, K., Alberge, J. B., Sklavenitis-Pistofidis, R., Kodgule, R., Fulco, C. P., Heilpern-Mallory, D., Nilsson, K., Dorfman, D., Engreitz, J. M., Getz, G., Pinello, L., Ryan, R., Ghobrial, I. M. 2024

Abstract

MYC deregulation occurs in the majority of multiple myeloma (MM) cases and is associated with progression and worse prognosis. Enhanced MYC expression occurs in about 70% of MM patients, but it is known to be driven by translocation or amplification events in only ~40% of myelomas. Here, we used CRISPR interference (CRISPRi) to uncover an epigenetic mechanism of MYC regulation whereby increased accessibility of a plasma cell-type specific enhancer leads to increased MYC expression. This native enhancer activity was not associated with enhancer hijacking events but led to specific binding of c-MAF, IRF4, and SPIB transcription factors that activated MYC expression in the absence of known genetic aberrations. In addition, focal amplification was another mechanism of activation of this enhancer in approximately 3.4% of MM patients. Together, these findings define an epigenetic mechanism of MYC deregulation in MM beyond known translocations or amplifications and point to the importance of non-coding regulatory elements and their associated transcription factor networks as drivers of MM progression.

View details for DOI 10.1158/0008-5472.CAN-24-1440

View details for PubMedID 39312195
Deciphering the impact of genomic variation on function. Nature 2024; 633 (8028): 47-57

Abstract

Our genomes influence nearly every aspect of human biology, from molecular and cellular functions to phenotypes in health and disease. Studying the differences in DNA sequence between individuals (genomic variation) could reveal previously unknown mechanisms of human biology, uncover the basis of genetic predispositions to diseases, and guide the development of new diagnostic tools and therapeutic agents. Yet, understanding how genomic variation alters genome function to influence phenotype has proved challenging. To unlock these insights, we need a systematic and comprehensive catalogue of genome function and the molecular and cellular effects of genomic variants. Towards this goal, the Impact of Genomic Variation on Function (IGVF) Consortium will combine approaches in single-cell mapping, genomic perturbations and predictive modelling to investigate the relationships among genomic variation, genome function and phenotypes. IGVF will create maps across hundreds of cell types and states describing how coding variants alter protein activity, how noncoding variants change the regulation of gene expression, and how such effects connect through gene-regulatory and protein-interaction networks. These experimental data, computational predictions and accompanying standards and pipelines will be integrated into an open resource that will catalyse community efforts to explore how our genomes influence biology and disease across populations.

View details for DOI 10.1038/s41586-024-07510-0

View details for PubMedID 39232149

View details for PubMedCentralID 7405896
Genetic and functional analysis of Raynaud's syndrome implicates loci in vasculature and immunity. Cell genomics Tervi, A., Ramste, M., Abner, E., Cheng, P., Lane, J. M., Maher, M., Valliere, J., Lammi, V., Strausz, S., Riikonen, J., Nguyen, T., Martyn, G. E., Sheth, M. U., Xia, F., Docampo, M. L., Gu, W., Esko, T., Saxena, R., Pirinen, M., Palotie, A., Ripatti, S., Sinnott-Armstrong, N., Daly, M., Engreitz, J. M., Rabinovitch, M., Heckman, C. A., Quertermous, T., Jones, S. E., Ollila, H. M. 2024: 100630

Abstract

Raynaud's syndrome is a dysautonomia where exposure to cold causes vasoconstriction and hypoxia, particularly in the extremities. We performed meta-analysis in four cohorts and discovered eight loci (ADRA2A, IRX1, NOS3, ACVR2A, TMEM51, PCDH10-DT, HLA, and RAB6C) where ADRA2A, ACVR2A, NOS3, TMEM51, and IRX1 co-localized with expression quantitative trait loci (eQTLs), particularly in distal arteries. CRISPR gene editing further showed that ADRA2A and NOS3 loci modified gene expression and in situ RNAscope clarified the specificity of ADRA2A in small vessels and IRX1 around small capillaries in the skin. A functional contraction assay in the cold showed lower contraction in ADRA2A-deficient and higher contraction in ADRA2A-overexpressing smooth muscle cells. Overall, our study highlights the power of genome-wide association testing with functional follow-up as a method to understand complex diseases. The results indicate temperature-dependent adrenergic signaling through ADRA2A, effects at the microvasculature by IRX1, endothelial signaling by NOS3, and immune mechanisms by the HLA locus in Raynaud's syndrome.

View details for DOI 10.1016/j.xgen.2024.100630

View details for PubMedID 39142284
CRISPRi-Perturb-seq in endothelial cells links genetic variation in endothelin-1 to risk of coronary artery disease and hypertension. Clinical science Gupta, R., Schnitzler, G., Lee-Kim, V., Fang, S., Kang, H., Ma, R., Finucane, H., Engreitz, J. 2024; 138

View details for Web of Science ID 001250117300046
CRISPRi-Perturb-seq in endothelial cells links genetic variation in endothelin-1 to risk of coronary artery disease and hypertension Gupta, R., Schnitzler, G., Lee-Kim, V., Fang, S., Kang, H., Ma, R., Finucane, H., Engreitz, J. PORTLAND PRESS LTD. 2024: A55-A56

View details for Web of Science ID 001362332500046
Multicenter integrated analysis of noncoding CRISPRi screens. Nature methods Yao, D., Tycko, J., Oh, J. W., Bounds, L. R., Gosai, S. J., Lataniotis, L., Mackay-Smith, A., Doughty, B. R., Gabdank, I., Schmidt, H., Guerrero-Altamirano, T., Siklenka, K., Guo, K., White, A. D., Youngworth, I., Andreeva, K., Ren, X., Barrera, A., Luo, Y., Yardımcı, G. G., Tewhey, R., Kundaje, A., Greenleaf, W. J., Sabeti, P. C., Leslie, C., Pritykin, Y., Moore, J. E., Beer, M. A., Gersbach, C. A., Reddy, T. E., Shen, Y., Engreitz, J. M., Bassik, M. C., Reilly, S. K. 2024

Abstract

The ENCODE Consortium's efforts to annotate noncoding cis-regulatory elements (CREs) have advanced our understanding of gene regulatory landscapes. Pooled, noncoding CRISPR screens offer a systematic approach to investigate cis-regulatory mechanisms. The ENCODE4 Functional Characterization Centers conducted 108 screens in human cell lines, comprising >540,000 perturbations across 24.85 megabases of the genome. Using 332 functionally confirmed CRE-gene links in K562 cells, we established guidelines for screening endogenous noncoding elements with CRISPR interference (CRISPRi), including accurate detection of CREs that exhibit variable, often low, transcriptional effects. Benchmarking five screen analysis tools, we find that CASA produces the most conservative CRE calls and is robust to artifacts of low-specificity single guide RNAs. We uncover a subtle DNA strand bias for CRISPRi in transcribed regions with implications for screen design and analysis. Together, we provide an accessible data resource, predesigned single guide RNAs for targeting 3,275,697 ENCODE SCREEN candidate CREs with CRISPRi and screening guidelines to accelerate functional characterization of the noncoding genome.

View details for DOI 10.1038/s41592-024-02216-7

View details for PubMedID 38504114

View details for PubMedCentralID 3771521
Convergence of coronary artery disease genes onto endothelial cell programs. Nature Schnitzler, G. R., Kang, H., Fang, S., Angom, R. S., Lee-Kim, V. S., Ma, X. R., Zhou, R., Zeng, T., Guo, K., Taylor, M. S., Vellarikkal, S. K., Barry, A. E., Sias-Garcia, O., Bloemendal, A., Munson, G., Guckelberger, P., Nguyen, T. H., Bergman, D. T., Hinshaw, S., Cheng, N., Cleary, B., Aragam, K., Lander, E. S., Finucane, H. K., Mukhopadhyay, D., Gupta, R. M., Engreitz, J. M. 2024

Abstract

Linking variants from genome-wide association studies (GWAS) to underlying mechanisms of disease remains a challenge [1-3]. For some diseases, a successful strategy has been to look for cases in which multiple GWAS loci contain genes that act in the same biological pathway [1-6]. However, our knowledge of which genes act in which pathways is incomplete, particularly for cell-type-specific pathways or understudied genes. Here we introduce a method to connect GWAS variants to functions. This method links variants to genes using epigenomics data, links genes to pathways de novo using Perturb-seq and integrates these data to identify convergence of GWAS loci onto pathways. We apply this approach to study the role of endothelial cells in genetic risk for coronary artery disease (CAD), and discover 43 CAD GWAS signals that converge on the cerebral cavernous malformation (CCM) signalling pathway. Two regulators of this pathway, CCM2 and TLNRD1, are each linked to a CAD risk variant, regulate other CAD risk genes and affect atheroprotective processes in endothelial cells. These results suggest a model whereby CAD risk is driven in part by the convergence of causal genes onto a particular transcriptional pathway in endothelial cells. They highlight shared genes between common and rare vascular diseases (CAD and CCM), and identify TLNRD1 as a new, previously uncharacterized member of the CCM signalling pathway. This approach will be widely useful for linking variants to functions for other common polygenic diseases.

View details for DOI 10.1038/s41586-024-07022-x

View details for PubMedID 38326615

View details for PubMedCentralID 5501872
Rewriting regulatory DNA to dissect and reprogram gene expression. bioRxiv : the preprint server for biology Martyn, G. E., Montgomery, M. T., Jones, H., Guo, K., Doughty, B. R., Linder, J., Chen, Z., Cochran, K., Lawrence, K. A., Munson, G., Pampari, A., Fulco, C. P., Kelley, D. R., Lander, E. S., Kundaje, A., Engreitz, J. M. 2023

Abstract

Regulatory DNA sequences within enhancers and promoters bind transcription factors to encode cell type-specific patterns of gene expression. However, the regulatory effects and programmability of such DNA sequences remain difficult to map or predict because we have lacked scalable methods to precisely edit regulatory DNA and quantify the effects in an endogenous genomic context. Here we present an approach to measure the quantitative effects of hundreds of designed DNA sequence variants on gene expression, by combining pooled CRISPR prime editing with RNA fluorescence in situ hybridization and cell sorting (Variant-FlowFISH). We apply this method to mutagenize and rewrite regulatory DNA sequences in an enhancer and the promoter of PPIF in two immune cell lines. Of 672 variant-cell type pairs, we identify 497 that affect PPIF expression. These variants appear to act through a variety of mechanisms including disruption or optimization of existing transcription factor binding sites, as well as creation of de novo sites. Disrupting a single endogenous transcription factor binding site often led to large changes in expression (up to -40% in the enhancer, and -50% in the promoter). The same variant often had different effects across cell types and states, demonstrating a highly tunable regulatory landscape. We use these data to benchmark performance of sequence-based predictive models of gene regulation, and find that certain types of variants are not accurately predicted by existing models. Finally, we computationally design 185 small sequence variants (≤10 bp) and optimize them for specific effects on expression in silico. 84% of these rationally designed edits showed the intended direction of effect, and some had dramatic effects on expression (-100% to +202%). 
Variant-FlowFISH thus provides a powerful tool to map the effects of variants and transcription factor binding sites on gene expression, test and improve computational models of gene regulation, and reprogram regulatory DNA.

View details for DOI 10.1101/2023.12.20.572268

View details for PubMedID 38187584

View details for PubMedCentralID PMC10769263
Reduced FOXF1 links unrepaired DNA damage to pulmonary arterial hypertension. Nature communications Isobe, S., Nair, R. V., Kang, H. Y., Wang, L., Moonen, J. R., Shinohara, T., Cao, A., Taylor, S., Otsuki, S., Marciano, D. P., Harper, R. L., Adil, M. S., Zhang, C., Lago-Docampo, M., Körbelin, J., Engreitz, J. M., Snyder, M. P., Rabinovitch, M. 2023; 14 (1): 7578

Abstract

Pulmonary arterial hypertension (PAH) is a progressive disease in which pulmonary arterial (PA) endothelial cell (EC) dysfunction is associated with unrepaired DNA damage. BMPR2 is the most common genetic cause of PAH. We report that human PAEC with reduced BMPR2 have persistent DNA damage in room air after hypoxia (reoxygenation), as do mice with EC-specific deletion of Bmpr2 (EC-Bmpr2-/-) and persistent pulmonary hypertension. Similar findings are observed in PAEC with loss of the DNA damage sensor ATM, and in mice with Atm deleted in EC (EC-Atm-/-). Gene expression analysis of EC-Atm-/- and EC-Bmpr2-/- lung EC reveals reduced Foxf1, a transcription factor with selectivity for lung EC. Reducing FOXF1 in control PAEC induces DNA damage and impaired angiogenesis whereas transfection of FOXF1 in PAH PAEC repairs DNA damage and restores angiogenesis. Lung EC targeted delivery of Foxf1 to reoxygenated EC-Bmpr2-/- mice repairs DNA damage, induces angiogenesis and reverses pulmonary hypertension.

View details for DOI 10.1038/s41467-023-43039-y

View details for PubMedID 37989727

View details for PubMedCentralID 4737700
An encyclopedia of enhancer-gene regulatory interactions in the human genome. bioRxiv : the preprint server for biology Gschwind, A. R., Mualim, K. S., Karbalayghareh, A., Sheth, M. U., Dey, K. K., Jagoda, E., Nurtdinov, R. N., Xi, W., Tan, A. S., Jones, H., Ma, X. R., Yao, D., Nasser, J., Avsec, Ž., James, B. T., Shamim, M. S., Durand, N. C., Rao, S. S., Mahajan, R., Doughty, B. R., Andreeva, K., Ulirsch, J. C., Fan, K., Perez, E. M., Nguyen, T. C., Kelley, D. R., Finucane, H. K., Moore, J. E., Weng, Z., Kellis, M., Bassik, M. C., Price, A. L., Beer, M. A., Guigó, R., Stamatoyannopoulos, J. A., Lieberman Aiden, E., Greenleaf, W. J., Leslie, C. S., Steinmetz, L. M., Kundaje, A., Engreitz, J. M. 2023

Abstract

Identifying transcriptional enhancers and their target genes is essential for understanding gene regulation and the impact of human genetic variation on disease [1-6]. Here we create and evaluate a resource of >13 million enhancer-gene regulatory interactions across 352 cell types and tissues, by integrating predictive models, measurements of chromatin state and 3D contacts, and large-scale genetic perturbations generated by the ENCODE Consortium [7]. We first create a systematic benchmarking pipeline to compare predictive models, assembling a dataset of 10,411 element-gene pairs measured in CRISPR perturbation experiments, >30,000 fine-mapped eQTLs, and 569 fine-mapped GWAS variants linked to a likely causal gene. Using this framework, we develop a new predictive model, ENCODE-rE2G, that achieves state-of-the-art performance across multiple prediction tasks, demonstrating a strategy involving iterative perturbations and supervised machine learning to build increasingly accurate predictive models of enhancer regulation. Using the ENCODE-rE2G model, we build an encyclopedia of enhancer-gene regulatory interactions in the human genome, which reveals global properties of enhancer networks, identifies differences in the functions of genes that have more or less complex regulatory landscapes, and improves analyses to link noncoding variants to target genes and cell types for common, complex diseases. By interpreting the model, we find evidence that, beyond enhancer activity and 3D enhancer-promoter contacts, additional features guide enhancer-promoter communication including promoter class and enhancer-enhancer synergy. Altogether, these genome-wide maps of enhancer-gene regulatory interactions, benchmarking software, predictive models, and insights about enhancer function provide a valuable resource for future studies of gene regulation and human genetics.

View details for DOI 10.1101/2023.11.09.563812

View details for PubMedID 38014075

View details for PubMedCentralID PMC10680627
Leveraging polygenic enrichments of gene features to predict genes underlying complex traits and diseases. Nature genetics Weeks, E. M., Ulirsch, J. C., Cheng, N. Y., Trippe, B. L., Fine, R. S., Miao, J., Patwardhan, T. A., Kanai, M., Nasser, J., Fulco, C. P., Tashman, K. C., Aguet, F., Li, T., Ordovas-Montanes, J., Smillie, C. S., Biton, M., Shalek, A. K., Ananthakrishnan, A. N., Xavier, R. J., Regev, A., Gupta, R. M., Lage, K., Ardlie, K. G., Hirschhorn, J. N., Lander, E. S., Engreitz, J. M., Finucane, H. K. 2023

Abstract

Genome-wide association studies (GWASs) are a valuable tool for understanding the biology of complex human traits and diseases, but associated variants rarely point directly to causal genes. In the present study, we introduce a new method, polygenic priority score (PoPS), that learns trait-relevant gene features, such as cell-type-specific expression, to prioritize genes at GWAS loci. Using a large evaluation set of genes with fine-mapped coding variants, we show that PoPS and the closest gene individually outperform other gene prioritization methods, but observe the best overall performance by combining PoPS with orthogonal methods. Using this combined approach, we prioritize 10,642 unique gene-trait pairs across 113 complex traits and diseases with high precision, finding not only well-established gene-trait relationships but nominating new genes at unresolved loci, such as LGR4 for estimated glomerular filtration rate and CCR7 for deep vein thrombosis. Overall, we demonstrate that PoPS provides a powerful addition to the gene prioritization toolbox.

View details for DOI 10.1038/s41588-023-01443-6

View details for PubMedID 37443254

View details for PubMedCentralID 5501872
An Atlas of Variant Effects to understand the genome at nucleotide resolution. Genome biology Fowler, D. M., Adams, D. J., Gloyn, A. L., Hahn, W. C., Marks, D. S., Muffley, L. A., Neal, J. T., Roth, F. P., Rubin, A. F., Starita, L. M., Hurles, M. E. 2023; 24 (1): 147

Abstract

Sequencing has revealed hundreds of millions of human genetic variants, and continued efforts will only add to this variant avalanche. Insufficient information exists to interpret the effects of most variants, limiting opportunities for precision medicine and comprehension of genome function. A solution lies in experimental assessment of the functional effect of variants, which can reveal their biological and clinical impact. However, variant effect assays have generally been undertaken reactively for individual variants only after and, in most cases long after, their first observation. Now, multiplexed assays of variant effect can characterise massive numbers of variants simultaneously, yielding variant effect maps that reveal the function of every possible single nucleotide change in a gene or regulatory element. Generating maps for every protein encoding gene and regulatory element in the human genome would create an 'Atlas' of variant effect maps and transform our understanding of genetics and usher in a new era of nucleotide-resolution functional knowledge of the genome. An Atlas would reveal the fundamental biology of the human genome, inform human evolution, empower the development and use of therapeutics and maximize the utility of genomics for diagnosing and treating disease. The Atlas of Variant Effects Alliance is an international collaborative group comprising hundreds of researchers, technologists and clinicians dedicated to realising an Atlas of Variant Effects to help deliver on the promise of genomics.

View details for DOI 10.1186/s13059-023-02986-x

View details for PubMedID 37394429

View details for PubMedCentralID PMC10316620
Oligogenic Architecture of Rare Noncoding Variants Distinguishes 4 Congenital Heart Disease Phenotypes. Circulation. Genomic and precision medicine Yu, M., Aguirre, M., Jia, M., Gjoni, K., Cordova-Palomera, A., Munger, C., Amgalan, D., Rosa Ma, X., Pereira, A., Tcheandjieu, C., Seidman, C., Seidman, J., Tristani-Firouzi, M., Chung, W., Goldmuntz, E., Srivastava, D., Loos, R. J., Chami, N., Cordell, H., DreSSen, M., Mueller-Myhsok, B., Lahm, H., Krane, M., Pollard, K. S., Engreitz, J. M., Gagliano Taliun, S. A., Gelb, B. D., Priest, J. R. 2023: e003968

Abstract

BACKGROUND: Congenital heart disease (CHD) is highly heritable, but the power to identify inherited risk has been limited to analyses of common variants in small cohorts. METHODS: We performed reimputation of 4 CHD cohorts (n=55,342) to the TOPMed reference panel (freeze 5), permitting meta-analysis of 14,784,017 variants including 6,035,962 rare variants of high imputation quality as validated by whole genome sequencing. RESULTS: Meta-analysis identified 16 novel loci, including 12 rare variants, which displayed moderate or large effect sizes (median odds ratio, 3.02) for 4 separate CHD categories. Analyses of chromatin structure link 13 of the genome-wide significant loci to key genes in cardiac development; rs373447426 (minor allele frequency, 0.003 [odds ratio, 3.37 for conotruncal heart disease]; P=1.49×10⁻⁸) is predicted to disrupt chromatin structure for 2 nearby genes BDH1 and DLG1 involved in conotruncal development. A lead variant rs189203952 (minor allele frequency, 0.01 [odds ratio, 2.4 for left ventricular outflow tract obstruction]; P=1.46×10⁻⁸) is predicted to disrupt the binding sites of 4 transcription factors known to participate in cardiac development in the promoter of SPAG9. A tissue-specific model of chromatin conformation suggests that common variant rs78256848 (minor allele frequency, 0.11 [odds ratio, 1.4 for conotruncal heart disease]; P=2.6×10⁻⁸) physically interacts with NCAM1 (PFDR=1.86×10⁻²⁷), a neural adhesion molecule acting in cardiac development. Importantly, while each individual malformation displayed substantial heritability (observed h² ranging from 0.26 for complex malformations to 0.37 for left ventricular outflow tract obstructive disease) the risk for different CHD malformations appeared to be separate, without genetic correlation measured by linkage disequilibrium score regression or regional colocalization. CONCLUSIONS: We describe a set of rare noncoding variants conferring significant risk for individual heart malformations which are linked to genes governing cardiac development. These results illustrate that the oligogenic basis of CHD and significant heritability may be linked to rare variants outside protein-coding regions conferring substantial risk for individual categories of cardiac malformation.

View details for DOI 10.1161/CIRCGEN.122.003968

View details for PubMedID 37026454
Genetic Determinants of the Interventricular Septum Are Linked to Ventricular Septal Defects and Hypertrophic Cardiomyopathy. Circulation. Genomic and precision medicine Yu, M., Harper, A. R., Aguirre, M., Pittman, M., Tcheandjieu, C., Amgalan, D., Grace, C., Goel, A., Farrall, M., Xiao, K., Engreitz, J., Pollard, K. S., Watkins, H., Priest, J. R. 2023: e003708

Abstract

A large proportion of genetic risk remains unexplained for structural heart disease involving the interventricular septum (IVS) including hypertrophic cardiomyopathy and ventricular septal defects. This study sought to develop a reproducible proxy of IVS structure from standard medical imaging, discover novel genetic determinants of IVS structure, and relate these loci to diseases of the IVS, hypertrophic cardiomyopathy, and ventricular septal defect. We estimated the cross-sectional area of the IVS from the 4-chamber view of cardiac magnetic resonance imaging in 32 219 individuals from the UK Biobank which was used as the basis of genome wide association studies and Mendelian randomization. Measures of IVS cross-sectional area at diastole were a strong proxy for the 3-dimensional volume of the IVS (Pearson r=0.814, P=0.004), and correlated with anthropometric measures, blood pressure, and diagnostic codes related to cardiovascular physiology. Seven loci with clear genomic consequence and relevance to cardiovascular biology were uncovered by genome wide association studies, most notably a single nucleotide polymorphism in an intron of CDKN1A (rs2376620; β, 7.7 mm² [95% CI, 5.8-11.0]; P=6.0×10⁻¹⁰), and a common inversion incorporating KANSL1 predicted to disrupt local chromatin structure (β, 8.4 mm² [95% CI, 6.3-10.9]; P=4.2×10⁻¹⁴). Mendelian randomization suggested that inheritance of larger IVS cross-sectional area at diastole was strongly associated with hypertrophic cardiomyopathy risk (pIVW=4.6×10⁻¹⁰) while inheritance of smaller IVS cross-sectional area at diastole was associated with risk for ventricular septal defect (pIVW=0.007). Automated estimates of cross-sectional area of the IVS support discovery of novel loci related to cardiac development and Mendelian disease. Inheritance of genetic liability for either small or large IVS appears to confer risk for ventricular septal defect or hypertrophic cardiomyopathy, respectively. 
These data suggest that a proportion of risk for structural and congenital heart disease can be localized to the common genetic determinants of size and shape of cardiovascular anatomy.

View details for DOI 10.1161/CIRCGEN.122.003708

View details for PubMedID 37017090
The Type 2 Diabetes Knowledge Portal: An open access genetic resource dedicated to type 2 diabetes and related traits. Cell metabolism Costanzo, M. C., von Grotthuss, M., Massung, J., Jang, D., Caulkins, L., Koesterer, R., Gilbert, C., Welch, R. P., Kudtarkar, P., Hoang, Q., Boughton, A. P., Singh, P., Sun, Y., Duby, M., Moriondo, A., Nguyen, T., Smadbeck, P., Alexander, B. R., Brandes, M., Carmichael, M., Dornbos, P., Green, T., Huellas-Bruskiewicz, K. C., Ji, Y., Kluge, A., McMahon, A. C., Mercader, J. M., Ruebenacker, O., Sengupta, S., Spalding, D., Taliun, D., AMP-T2D Consortium, Smith, P., Thomas, M. K., Akolkar, B., Brosnan, M. J., Cherkas, A., Chu, A. Y., Fauman, E. B., Fox, C. S., Kamphaus, T. N., Miller, M. R., Nguyen, L., Parsa, A., Reilly, D. F., Ruetten, H., Wholley, D., Zaghloul, N. A., Abecasis, G. R., Altshuler, D., Keane, T. M., McCarthy, M. I., Gaulton, K. J., Florez, J. C., Boehnke, M., Burtt, N. P., Flannick, J., Abecasis, G., Akolkar, B., Alexander, B. R., Allred, N. D., Altshuler, D., Below, J. E., Bergman, R., Beulens, J. W., Blangero, J., Boehnke, M., Bokvist, K., Bottinger, E., Boughton, A. P., Bowden, D., Brosnan, M. J., Brown, C., Bruskiewicz, K., Burtt, N. P., Carmichael, M., Caulkins, L., Cebola, I., Chambers, J., Ida Chen, Y., Cherkas, A., Chu, A. Y., Clark, C., Claussnitzer, M., Costanzo, M. C., Cox, N. J., Hoed, M. d., Dong, D., Duby, M., Duggirala, R., Dupuis, J., Elders, P. J., Engreitz, J. M., Fauman, E., Ferrer, J., Flannick, J., Flicek, P., Flickinger, M., Florez, J. C., Fox, C. S., Frayling, T. M., Frazer, K. A., Gaulton, K. J., Gilbert, C., Gloyn, A. L., Green, T., Hanis, C. L., Hanson, R., Hattersley, A. T., Hoang, Q., Im, H. K., Iqbal, S., Jacobs, S. B., Jang, D., Jordan, T., Kamphaus, T., Karpe, F., Keane, T. M., Kim, S. K., Kluge, A., Koesterer, R., Kudtarkar, P., Lage, K., Lange, L. A., Lazar, M., Lehman, D., Liu, C., Loos, R. J., Ma, R. C., MacDonald, P., Massung, J., Maurano, M. T., McCarthy, M. I., McVean, G., Meigs, J. 
B., Mercader, J. M., Miller, M. R., Mitchell, B., Mohlke, K. L., Morabito, S., Morgan, C., Mullican, S., Narendra, S., Ng, M. C., Nguyen, L., Palmer, C. N., Parker, S. C., Parrado, A., Parsa, A., Pawlyk, A. C., Pearson, E. R., Plump, A., Province, M., Quertermous, T., Redline, S., Reilly, D. F., Ren, B., Rich, S. S., Richards, J. B., Rotter, J. I., Ruebenacker, O., Ruetten, H., Salem, R. M., Sander, M., Sanders, M., Sanghera, D., Scott, L. J., Sengupta, S., Siedzik, D., Sim, X., Singh, P., Sladek, R., Small, K., Smith, P., Stein, P., Spalding, D., Stringham, H. M., Sun, Y., Susztak, K., 't Hart, L. M., Taliun, D., Taylor, K., Thomas, M. K., Todd, J. A., Udler, M. S., Voight, B., von Grotthuss, M., Wan, A., Welch, R. P., Wholley, D., Yuksel, K., Zaghloul, N. A. 2023

Abstract

Associations between human genetic variation and clinical phenotypes have become a foundation of biomedical research. Most repositories of these data seek to be disease-agnostic and therefore lack disease-focused views. The Type 2 Diabetes Knowledge Portal (T2DKP) is a public resource of genetic datasets and genomic annotations dedicated to type 2 diabetes (T2D) and related traits. Here, we seek to make the T2DKP more accessible to prospective users and more useful to existing users. First, we evaluate the T2DKP's comprehensiveness by comparing its datasets with those of other repositories. Second, we describe how researchers unfamiliar with human genetic data can begin using and correctly interpreting them via the T2DKP. Third, we describe how existing users can extend their current workflows to use the full suite of tools offered by the T2DKP. We finally discuss the lessons offered by the T2DKP toward the goal of democratizing access to complex disease genetic results.

View details for DOI 10.1016/j.cmet.2023.03.001

View details for PubMedID 36963395
Integrative single-cell analysis of cardiogenesis identifies developmental trajectories and non-coding mutations in congenital heart disease. Cell Ameen, M., Sundaram, L., Shen, M., Banerjee, A., Kundu, S., Nair, S., Shcherbina, A., Gu, M., Wilson, K. D., Varadarajan, A., Vadgama, N., Balsubramani, A., Wu, J. C., Engreitz, J. M., Farh, K., Karakikes, I., Wang, K. C., Quertermous, T., Greenleaf, W. J., Kundaje, A. 2022; 185 (26): 4937

Abstract

To define the multi-cellular epigenomic and transcriptional landscape of cardiac cellular development, we generated single-cell chromatin accessibility maps of human fetal heart tissues. We identified eight major differentiation trajectories involving primary cardiac cell types, each associated with dynamic transcription factor (TF) activity signatures. We contrasted regulatory landscapes of iPSC-derived cardiac cell types and their in vivo counterparts, which enabled optimization of in vitro differentiation of epicardial cells. Further, we interpreted sequence based deep learning models of cell-type-resolved chromatin accessibility profiles to decipher underlying TF motif lexicons. De novo mutations predicted to affect chromatin accessibility in arterial endothelium were enriched in congenital heart disease (CHD) cases vs. controls. In vitro studies in iPSCs validated the functional impact of identified variation on the predicted developmental cell types. This work thus defines the cell-type-resolved cis-regulatory sequence determinants of heart development and identifies disruption of cell type-specific regulatory elements in CHD.

View details for DOI 10.1016/j.cell.2022.11.028

View details for PubMedID 36563664
Identifying disease-critical cell types and cellular processes by integrating single-cell RNA-sequencing and human genetics. Nature genetics Jagadeesh, K. A., Dey, K. K., Montoro, D. T., Mohan, R., Gazal, S., Engreitz, J. M., Xavier, R. J., Price, A. L., Regev, A. 2022

Abstract

Genome-wide association studies provide a powerful means of identifying loci and genes contributing to disease, but in many cases, the related cell types/states through which genes confer disease risk remain unknown. Deciphering such relationships is important for identifying pathogenic processes and developing therapeutics. In the present study, we introduce sc-linker, a framework for integrating single-cell RNA-sequencing, epigenomic SNP-to-gene maps and genome-wide association study summary statistics to infer the underlying cell types and processes by which genetic variants influence disease. The inferred disease enrichments recapitulated known biology and highlighted notable cell-disease relationships, including γ-aminobutyric acid-ergic neurons in major depressive disorder, a disease-dependent M-cell program in ulcerative colitis and a disease-specific complement cascade process in multiple sclerosis. In autoimmune disease, both healthy and disease-dependent immune cell-type programs were associated, whereas only disease-dependent epithelial cell programs were prominent, suggesting a role in disease response rather than initiation. Our framework provides a powerful approach for identifying the cell types and cellular processes by which genetic variants influence disease.

View details for DOI 10.1038/s41588-022-01187-9

View details for PubMedID 36175791
KLF4 recruits SWI/SNF to increase chromatin accessibility and reprogram the endothelial enhancer landscape under laminar shear stress. Nature communications Moonen, J. R., Chappell, J., Shi, M., Shinohara, T., Li, D., Mumbach, M. R., Zhang, F., Nair, R. V., Nasser, J., Mai, D. H., Taylor, S., Wang, L., Metzger, R. J., Chang, H. Y., Engreitz, J. M., Snyder, M. P., Rabinovitch, M. 2022; 13 (1): 4941

Abstract

Physiologic laminar shear stress (LSS) induces an endothelial gene expression profile that is vasculo-protective. In this report, we delineate how LSS mediates changes in the epigenetic landscape to promote this beneficial response. We show that under LSS, KLF4 interacts with the SWI/SNF nucleosome remodeling complex to increase accessibility at enhancer sites that promote the expression of homeostatic endothelial genes. By combining molecular and computational approaches we discover enhancers that loop to promoters of KLF4- and LSS-responsive genes that stabilize endothelial cells and suppress inflammation, such as BMPR2, SMAD5, and DUSP5. By linking enhancers to genes that they regulate under physiologic LSS, our work establishes a foundation for interpreting how non-coding DNA variants in these regions might disrupt protective gene expression to influence vascular disease.

View details for DOI 10.1038/s41467-022-32566-9

View details for PubMedID 35999210
SNP-to-gene linking strategies reveal contributions of enhancer-related and candidate master-regulator genes to autoimmune disease. Cell genomics Dey, K. K., Gazal, S., van de Geijn, B., Kim, S. S., Nasser, J., Engreitz, J. M., Price, A. L. 2022; 2 (7)

Abstract

We assess contributions to autoimmune disease of genes whose regulation is driven by enhancer regions (enhancer-related) and genes that regulate other genes in trans (candidate master-regulator). We link these genes to SNPs using several SNP-to-gene (S2G) strategies and apply heritability analyses to draw three conclusions about 11 autoimmune/blood-related diseases/traits. First, several characterizations of enhancer-related genes using functional genomics data are informative for autoimmune disease heritability after conditioning on a broad set of regulatory annotations. Second, candidate master-regulator genes defined using trans-eQTL in blood are also conditionally informative for autoimmune disease heritability. Third, integrating enhancer-related and master-regulator gene sets with protein-protein interaction (PPI) network information magnified their disease signal. The resulting PPI-enhancer gene score produced >2-fold stronger heritability signal and >2-fold stronger enrichment for drug targets, compared with the recently proposed enhancer domain score. In each case, functionally informed S2G strategies produced 4.1- to 13-fold stronger disease signals than conventional window-based strategies.

View details for DOI 10.1016/j.xgen.2022.100145

View details for PubMedID 35873673
Combining SNP-to-gene linking strategies to identify disease genes and assess disease omnigenicity. Nature genetics Gazal, S., Weissbrod, O., Hormozdiari, F., Dey, K. K., Nasser, J., Jagadeesh, K. A., Weiner, D. J., Shi, H., Fulco, C. P., O'Connor, L. J., Pasaniuc, B., Engreitz, J. M., Price, A. L. 2022

Abstract

Disease-associated single-nucleotide polymorphisms (SNPs) generally do not implicate target genes, as most disease SNPs are regulatory. Many SNP-to-gene (S2G) linking strategies have been developed to link regulatory SNPs to the genes that they regulate in cis. Here, we developed a heritability-based framework for evaluating and combining different S2G strategies to optimize their informativeness for common disease risk. Our optimal combined S2G strategy (cS2G) included seven constituent S2G strategies and achieved a precision of 0.75 and a recall of 0.33, more than doubling the recall of any individual strategy. We applied cS2G to fine-mapping results for 49 UK Biobank diseases/traits to predict 5,095 causal SNP-gene-disease triplets (with S2G-derived functional interpretation) with high confidence. We further applied cS2G to provide an empirical assessment of disease omnigenicity; we determined that the top 1% of genes explained roughly half of the SNP heritability linked to all genes and that gene-level architectures vary with variant allele frequency.

View details for DOI 10.1038/s41588-022-01087-y

View details for PubMedID 35668300
Computational estimates of annular diameter reveal genetic determinants of mitral valve function and disease. JCI insight Yu, M., Tcheandjieu, C., Georges, A., Xiao, K., Tejeda, H., Dina, C., Le Tourneau, T., Fiterau, M., Judy, R., Tsao, N. L., Amgalan, D., Munger, C. J., Engreitz, J. M., Damrauer, S. M., Bouatia-Naji, N., Priest, J. R. 2022; 7 (3)

Abstract

The fibrous annulus of the mitral valve plays an important role in valvular function and cardiac physiology, while normal variation in the size of cardiovascular anatomy may share a genetic link with common and rare disease. We derived automated estimates of mitral valve annular diameter in the 4-chamber view from 32,220 MRI images from the UK Biobank at ventricular systole and diastole as the basis for GWAS. Mitral annular dimensions corresponded to previously described anatomical norms, and GWAS inclusive of 4 population strata identified 10 loci, including possibly novel loci (GOSR2, ERBB4, MCTP2, MCPH1) and genes related to cardiac contractility (BAG3, TTN, RBFOX1). ATAC-Seq of primary mitral valve tissue localized multiple variants to regions of open chromatin in biologically relevant cell types and rs17608766 to an algorithmically predicted enhancer element in GOSR2. We observed strong genetic correlation with measures of contractility and mitral valve disease and clinical correlations with heart failure, cerebrovascular disease, and ventricular arrhythmias. Polygenic scoring of mitral valve annular diameter in systole was predictive of risk of mitral valve prolapse across 4 cohorts. In summary, genetic and clinical studies of mitral valve annular diameter revealed genetic determinants of mitral valve biology, while highlighting clinical associations. Polygenic determinants of mitral valve annular diameter may represent an independent risk factor for mitral prolapse. Overall, computationally estimated phenotypes derived at scale from medical imaging represent an important substrate for genetic discovery and clinical risk prediction.

View details for DOI 10.1172/jci.insight.146580

View details for PubMedID 35132965
Systematic identification of genomic elements that regulate FCGR2A expression and harbor variants linked with autoimmune disease. Human molecular genetics Dahlqvist, J., Fulco, C. P., Ray, J. P., Liechti, T., de Boer, C. G., Lieb, D. J., Eisenhaure, T. M., Engreitz, J. M., Roederer, M., Hacohen, N. 1800

Abstract

BACKGROUND: FCGR2A binds antibody-antigen complexes to regulate the abundance of circulating and deposited complexes along with downstream immune and autoimmune responses. While the abundance of FCGR2A may be critical in immune-mediated diseases, little is known about whether its surface expression is regulated through cis genomic elements and non-coding variants. In the current study, we aimed to characterize the regulation of FCGR2A expression, the impact of genetic variation and its association with autoimmune disease. METHODS: We applied CRISPR-based interference and editing to scrutinize 1.7Mb of open chromatin surrounding the FCGR2A gene to identify regulatory elements. Relevant transcription factors binding to these regions were defined through public databases. Genetic variants affecting regulation were identified using luciferase reporter assays and were verified in a cohort of 1996 genotyped healthy individuals using flow cytometry. RESULTS: We identified a complex proximal region and five distal enhancers regulating FCGR2A. The proximal region split into subregions upstream and downstream of the transcription start site, was enriched in binding of inflammation-regulated transcription factors, and harbored a variant associated with FCGR2A expression in primary myeloid cells. One distal enhancer region was occupied by CCCTC-binding factor (CTCF) whose binding site was disrupted by a rare genetic variant, altering gene expression. CONCLUSIONS: The FCGR2A gene is regulated by multiple proximal and distal genomic regions, with links to autoimmune disease. These findings may open up novel therapeutic avenues where fine-tuning of FCGR2A levels may constitute a part of treatment strategies for immune-mediated diseases.

View details for DOI 10.1093/hmg/ddab372

View details for PubMedID 34970970
COVID-19 tissue atlases reveal SARS-CoV-2 pathology and cellular targets. Nature Delorey, T. M., Ziegler, C. G., Heimberg, G., Normand, R., Yang, Y., Segerstolpe, A., Abbondanza, D., Fleming, S. J., Subramanian, A., Montoro, D. T., Jagadeesh, K. A., Dey, K. K., Sen, P., Slyper, M., Pita-Juarez, Y. H., Phillips, D., Biermann, J., Bloom-Ackermann, Z., Barkas, N., Ganna, A., Gomez, J., Melms, J. C., Katsyv, I., Normandin, E., Naderi, P., Popov, Y. V., Raju, S. S., Niezen, S., Tsai, L. T., Siddle, K. J., Sud, M., Tran, V. M., Vellarikkal, S. K., Wang, Y., Amir-Zilberstein, L., Atri, D. S., Beechem, J., Brook, O. R., Chen, J., Divakar, P., Dorceus, P., Engreitz, J. M., Essene, A., Fitzgerald, D. M., Fropf, R., Gazal, S., Gould, J., Grzyb, J., Harvey, T., Hecht, J., Hether, T., Jane-Valbuena, J., Leney-Greene, M., Ma, H., McCabe, C., McLoughlin, D. E., Miller, E. M., Muus, C., Niemi, M., Padera, R., Pan, L., Pant, D., Pe'er, C., Pfiffner-Borges, J., Pinto, C. J., Plaisted, J., Reeves, J., Ross, M., Rudy, M., Rueckert, E. H., Siciliano, M., Sturm, A., Todres, E., Waghray, A., Warren, S., Zhang, S., Zollinger, D. R., Cosimi, L., Gupta, R. M., Hacohen, N., Hibshoosh, H., Hide, W., Price, A. L., Rajagopal, J., Tata, P. R., Riedel, S., Szabo, G., Tickle, T. L., Ellinor, P. T., Hung, D., Sabeti, P. C., Novak, R., Rogers, R., Ingber, D. E., Jiang, Z. G., Juric, D., Babadi, M., Farhi, S. L., Izar, B., Stone, J. R., Vlachos, I. S., Solomon, I. H., Ashenberg, O., Porter, C. B., Li, B., Shalek, A. K., Villani, A., Rozenblatt-Rosen, O., Regev, A. 2021

Abstract

COVID-19, caused by SARS-CoV-2, can result in acute respiratory distress syndrome and multiple-organ failure [1-4], but little is known about its pathophysiology. Here, we generated single-cell atlases of 23 lung, 16 kidney, 16 liver and 19 heart COVID-19 autopsy donor tissue samples, and spatial atlases of 14 lung donors. Integrated computational analysis uncovered substantial remodeling in the lung epithelial, immune and stromal compartments, with evidence of multiple paths of failed tissue regeneration, including defective alveolar type 2 differentiation and expansion of fibroblasts and putative TP63+ intrapulmonary basal-like progenitor cells. Viral RNAs were enriched in mononuclear phagocytic and endothelial lung cells which induced specific host programs. Spatial analysis in lung distinguished inflammatory host responses in lung regions with and without viral RNA. Analysis of the other tissue atlases showed transcriptional alterations in multiple cell types in COVID-19 donor heart tissue, and mapped cell types and genes implicated with disease severity based on COVID-19 GWAS. Our foundational dataset elucidates the biological impact of severe SARS-CoV-2 infection across the body, a key step towards new treatments.

View details for DOI 10.1038/s41586-021-03570-8

View details for PubMedID 33915569
Inherited causes of clonal haematopoiesis in 97,691 whole genomes (vol 586 , pg 763, 2020) NATURE Bick, A. G., Weinstock, J. S., Nandakumar, S. K., Fulco, C. P., Bao, E. L., Zekavat, S. M., Szeto, M. D., Liao, X., Leventhal, M. J., Nasser, J., Chang, K., Laurie, C., Burugula, B., Gibson, C. J., Niroula, A., Lin, A. E., Taub, M. A., Aguet, F., Ardlie, K., Mitchell, B. D., Barnes, K. C., Moscati, A., Fornage, M., Redline, S., Psaty, B. M., Silverman, E. K., Weiss, S. T., Palmer, N. D., Vasan, R. S., Burchard, E. G., Kardia, S. L. R., He, J., Kaplan, R. C., Smith, N. L., Arnett, D. K., Schwartz, D. A., Correa, A., de Andrade, M., Guo, X., Konkle, B. A., Custer, B., Peralta, J. M., Gui, H., Meyers, D. A., McGarvey, S. T., Chen, I., Shoemaker, M., Peyser, P. A., Broome, J. G., Gogarten, S. M., Wang, F., Wong, Q., Montasser, M. E., Daya, M., Kenny, E. E., North, K. E., Launer, L. J., Cade, B. E., Bis, J. C., Cho, M. H., Lasky-Su, J., Bowden, D. W., Cupples, L., Mak, A. C. Y., Becker, L. C., Smith, J. A., Kelly, T. N., Aslibekyan, S., Heckbert, S. R., Tiwari, H. K., Yang, I. V., Heit, J. A., Lubitz, S. A., Johnsen, J. M., Curran, J. E., Wenzel, S. E., Weeks, D. E., Rao, D. C., Darbar, D., Moon, J., Tracy, R. P., Buth, E. J., Rafaels, N., Loos, R. J. F., Durda, P., Liu, Y., Hou, L., Lee, J., Kachroo, P., Freedman, B. I., Levy, D., Bielak, L. F., Hixson, J. E., Floyd, J. S., Whitsel, E. A., Ellinor, P. T., Irvin, M. R., Fingerlin, T. E., Raffield, L. M., Armasu, S. M., Wheeler, M. M., Sabino, E. C., Blangero, J., Williams, L., Levy, B. D., Sheu, W., Roden, D. M., Boerwinkle, E., Manson, J. E., Mathias, R. A., Desai, P., Taylor, K. D., Johnson, A. D., Auer, P. L., Kooperberg, C., Laurie, C. C., Blackwell, T. W., Smith, A. V., Zhao, H., Lange, E., Lange, L., Rich, S. S., Rotter, J. I., Wilson, J. G., Scheet, P., Kitzman, J. O., Lander, E. S., Engreitz, J. M., Ebert, B. L., Reiner, A. P., Jaiswal, S., Abecasis, G., Sankaran, V. 
G., Kathiresan, S., Natarajan, P., NHLBI Trans Omi 2021; 591 (7851): E27

View details for DOI 10.1038/s41586-021-03280

View details for Web of Science ID 000632177100002
Author Correction: Inherited causes of clonal haematopoiesis in 97,691 whole genomes. Nature Bick, A. G., Weinstock, J. S., Nandakumar, S. K., Fulco, C. P., Bao, E. L., Zekavat, S. M., Szeto, M. D., Liao, X., Leventhal, M. J., Nasser, J., Chang, K., Laurie, C., Burugula, B. B., Gibson, C. J., Niroula, A., Lin, A. E., Taub, M. A., Aguet, F., Ardlie, K., Mitchell, B. D., Barnes, K. C., Moscati, A., Fornage, M., Redline, S., Psaty, B. M., Silverman, E. K., Weiss, S. T., Palmer, N. D., Vasan, R. S., Burchard, E. G., Kardia, S. L., He, J., Kaplan, R. C., Smith, N. L., Arnett, D. K., Schwartz, D. A., Correa, A., de Andrade, M., Guo, X., Konkle, B. A., Custer, B., Peralta, J. M., Gui, H., Meyers, D. A., McGarvey, S. T., Chen, I. Y., Shoemaker, M. B., Peyser, P. A., Broome, J. G., Gogarten, S. M., Wang, F. F., Wong, Q., Montasser, M. E., Daya, M., Kenny, E. E., North, K. E., Launer, L. J., Cade, B. E., Bis, J. C., Cho, M. H., Lasky-Su, J., Bowden, D. W., Cupples, L. A., Mak, A. C., Becker, L. C., Smith, J. A., Kelly, T. N., Aslibekyan, S., Heckbert, S. R., Tiwari, H. K., Yang, I. V., Heit, J. A., Lubitz, S. A., Johnsen, J. M., Curran, J. E., Wenzel, S. E., Weeks, D. E., Rao, D. C., Darbar, D., Moon, J., Tracy, R. P., Buth, E. J., Rafaels, N., Loos, R. J., Durda, P., Liu, Y., Hou, L., Lee, J., Kachroo, P., Freedman, B. I., Levy, D., Bielak, L. F., Hixson, J. E., Floyd, J. S., Whitsel, E. A., Ellinor, P. T., Irvin, M. R., Fingerlin, T. E., Raffield, L. M., Armasu, S. M., Wheeler, M. M., Sabino, E. C., Blangero, J., Williams, L. K., Levy, B. D., Sheu, W. H., Roden, D. M., Boerwinkle, E., Manson, J. E., Mathias, R. A., Desai, P., Taylor, K. D., Johnson, A. D., NHLBI Trans-Omics for Precision Medicine Consortium, Auer, P. L., Kooperberg, C., Laurie, C. C., Blackwell, T. W., Smith, A. V., Zhao, H., Lange, E., Lange, L., Rich, S. S., Rotter, J. I., Wilson, J. G., Scheet, P., Kitzman, J. O., Lander, E. S., Engreitz, J. M., Ebert, B. L., Reiner, A. 
P., Jaiswal, S., Abecasis, G., Sankaran, V. G., Kathiresan, S., Natarajan, P., Abe, N., Albert, C., Almasy, L., Alonso, A., Ament, S., Anderson, P., Anugu, P., Applebaum-Bowden, D., Arking, D., Ashley-Koch, A., Aslibekyan, S., Assimes, T., Avramopoulos, D., Barnard, J., Barr, R. G., Barron-Casella, E., Barwick, L., Beaty, T., Beck, G., Becker, D., Beer, R., Beitelshees, A., Benjamin, E., Benos, P., Bezerra, M., Bielak, L., Bowler, R., Brody, J., Broeckel, U., Bunting, K., Bustamante, C., Cardwell, J., Carey, V., Carty, C., Casaburi, R., Casella, J., Castaldi, P., Chaffin, M., Chang, C., Chang, Y., Chasman, D., Chavan, S., Chen, B., Chen, W., Choi, S. H., Chuang, L., Chung, M., Chung, R., Clish, C., Comhair, S., Cornell, E., Crandall, C., Crapo, J., Curtis, J., Damcott, C., Das, S., David, S., Davis, C., DeBaun, M., Deka, R., DeMeo, D., Devine, S., Duan, Q., Duggirala, R., Dutcher, S., Eaton, C., Ekunwe, L., El Boueiz, A., Emery, L., Erzurum, S., Farber, C., Flickinger, M., Franceschini, N., Frazar, C., Fu, M., Fullerton, S. M., Fulton, L., Gabriel, S., Gan, W., Gao, S., Gao, Y., Gass, M., Gelb, B., Geng, X. P., Geraci, M., Germer, S., Gerszten, R., Ghosh, A., Gibbs, R., Gignoux, C., Gladwin, M., Glahn, D., Gong, D., Goring, H., Graw, S., Grine, D., Gu, C. C., Guan, Y., Gupta, N., Haessler, J., Hall, M., Harris, D., Hawley, N. L., Heavner, B., Hernandez, R., Herrington, D., Hersh, C., Hidalgo, B., Hobbs, B., Hokanson, J., Hong, E., Hoth, K., Hsiung, C. A., Hung, Y., Huston, H., Hwu, C. M., Jackson, R., Jain, D., Jaquish, C., Jhun, M. A., Johnson, C., Johnston, R., Jones, K., Kang, H. M., Kelly, S., Kessler, M., Khan, A., Kim, W., Kinney, G., Kramer, H., Lange, C., LeBoff, M., Lee, S. S., Lee, W., LeFaive, J., Levine, D., Lewis, J., Li, X., Li, Y., Lin, H., Lin, H., Lin, K. 
H., Lin, X., Liu, S., Liu, Y., Lunetta, K., Luo, J., Mahaney, M., Make, B., Manichaikul, A., Margolin, L., Martin, L., Mathai, S., May, S., McArdle, P., McDonald, M., McFarland, S., McGoldrick, D., McHugh, C., Mei, H., Mestroni, L., Mikulla, J., Min, N., Minear, M., Minster, R. L., Moll, M., Montgomery, C., Musani, S., Mwasongwe, S., Mychaleckyj, J. C., Nadkarni, G., Naik, R., Naseri, T., Nekhai, S., Nelson, S. C., Neltner, B., Nickerson, D., O'Connell, J., O'Connor, T., Ochs-Balcom, H., Paik, D., Pankow, J., Papanicolaou, G., Parsa, A., Perez, M., Perry, J., Peters, U., Peyser, P., Phillips, L. S., Pollin, T., Post, W., Becker, J. P., Boorgula, M. P., Preuss, M., Qasba, P., Qiao, D., Qin, Z., Rasmussen-Torvik, L., Ratan, A., Reed, R., Regan, E., Reupena, M. S., Rice, K., Roselli, C., Ruczinski, I., Russell, P., Ruuska, S., Ryan, K., Saleheen, D., Salimi, S., Salzberg, S., Sandow, K., Scheller, C., Schmidt, E., Schwander, K., Sciurba, F., Seidman, C., Seidman, J., Sheehan, V., Sherman, S. L., Shetty, A., Shetty, A., Silver, B., Smith, J., Smith, T., Smoller, S., Snively, B., Snyder, M., Sofer, T., Sotoodehnia, N., Stilp, A. M., Storm, G., Streeten, E., Su, J. L., Sung, Y. J., Sylvia, J., Szpiro, A., Sztalryd, C., Taliun, D., Tang, H., Taylor, M., Taylor, S., Telen, M., Thornton, T. A., Threlkeld, M., Tinker, L., Tirschwell, D., Tishkoff, S., Tiwari, H., Tong, C., Tsai, M., Vaidya, D., Van Den Berg, D., VandeHaar, P., Vrieze, S., Walker, T., Wallace, R., Walts, A., Wang, H., Watson, K., Weir, B., Weng, L., Wessel, J., Willer, C., Williams, K., Wilson, C., Wu, J., Xu, H., Yanek, L., Yang, R., Zaghloul, N., Zhang, Y., Zhao, S. X., Zhao, W., Zhi, D., Zhou, X., Zhu, X., Zody, M., Zoellner, S. 2021

View details for DOI 10.1038/s41586-021-03280-1

View details for PubMedID 33707633
A single-cell and spatial atlas of autopsy tissues reveals pathology and cellular targets of SARS-CoV-2. bioRxiv : the preprint server for biology Delorey, T. M., Ziegler, C. G., Heimberg, G., Normand, R., Yang, Y., Segerstolpe, A., Abbondanza, D., Fleming, S. J., Subramanian, A., Montoro, D. T., Jagadeesh, K. A., Dey, K. K., Sen, P., Slyper, M., Pita-Juárez, Y. H., Phillips, D., Bloom-Ackerman, Z., Barkas, N., Ganna, A., Gomez, J., Normandin, E., Naderi, P., Popov, Y. V., Raju, S. S., Niezen, S., Tsai, L. T., Siddle, K. J., Sud, M., Tran, V. M., Vellarikkal, S. K., Amir-Zilberstein, L., Atri, D. S., Beechem, J., Brook, O. R., Chen, J., Divakar, P., Dorceus, P., Engreitz, J. M., Essene, A., Fitzgerald, D. M., Fropf, R., Gazal, S., Gould, J., Grzyb, J., Harvey, T., Hecht, J., Hether, T., Jane-Valbuena, J., Leney-Greene, M., Ma, H., McCabe, C., McLoughlin, D. E., Miller, E. M., Muus, C., Niemi, M., Padera, R., Pan, L., Pant, D., Pe'er, C., Pfiffner-Borges, J., Pinto, C. J., Plaisted, J., Reeves, J., Ross, M., Rudy, M., Rueckert, E. H., Siciliano, M., Sturm, A., Todres, E., Waghray, A., Warren, S., Zhang, S., Zollinger, D. R., Cosimi, L., Gupta, R. M., Hacohen, N., Hide, W., Price, A. L., Rajagopal, J., Tata, P. R., Riedel, S., Szabo, G., Tickle, T. L., Hung, D., Sabeti, P. C., Novak, R., Rogers, R., Ingber, D. E., Jiang, Z. G., Juric, D., Babadi, M., Farhi, S. L., Stone, J. R., Vlachos, I. S., Solomon, I. H., Ashenberg, O., Porter, C. B., Li, B., Shalek, A. K., Villani, A. C., Rozenblatt-Rosen, O., Regev, A. 2021

Abstract

The SARS-CoV-2 pandemic has caused over 1 million deaths globally, mostly due to acute lung injury and acute respiratory distress syndrome, or direct complications resulting in multiple-organ failures. Little is known about the host tissue immune and cellular responses associated with COVID-19 infection, symptoms, and lethality. To address this, we collected tissues from 11 organs during the clinical autopsy of 17 individuals who succumbed to COVID-19, resulting in a tissue bank of approximately 420 specimens. We generated comprehensive cellular maps capturing COVID-19 biology related to patients' demise through single-cell and single-nucleus RNA-Seq of lung, kidney, liver and heart tissues, and further contextualized our findings through spatial RNA profiling of distinct lung regions. We developed a computational framework that incorporates removal of ambient RNA and automated cell type annotation to facilitate comparison with other healthy and diseased tissue atlases. In the lung, we uncovered significantly altered transcriptional programs within the epithelial, immune, and stromal compartments and cell intrinsic changes in multiple cell types relative to lung tissue from healthy controls. We observed evidence of: alveolar type 2 (AT2) differentiation replacing depleted alveolar type 1 (AT1) lung epithelial cells, as previously seen in fibrosis; a concomitant increase in myofibroblasts reflective of defective tissue repair; and, putative TP63+ intrapulmonary basal-like progenitor (IPBLP) cells, similar to cells identified in H1N1 influenza, that may serve as an emergency cellular reserve for severely damaged alveoli. Together, these findings suggest the activation and failure of multiple avenues for regeneration of the epithelium in these terminal lungs. SARS-CoV-2 RNA reads were enriched in lung mononuclear phagocytic cells and endothelial cells, and these cells expressed distinct host response transcriptional programs. 
We corroborated the compositional and transcriptional changes in lung tissue through spatial analysis of RNA profiles in situ and distinguished unique tissue host responses between regions with and without viral RNA, and in COVID-19 donor tissues relative to healthy lung. Finally, we analyzed genetic regions implicated in COVID-19 GWAS with transcriptomic data to implicate specific cell types and genes associated with disease severity. Overall, our COVID-19 cell atlas is a foundational dataset to better understand the biological impact of SARS-CoV-2 infection across the human body and empowers the identification of new therapeutic interventions and prevention strategies.

View details for DOI 10.1101/2021.02.25.430130

View details for PubMedID 33655247

View details for PubMedCentralID PMC7924267
Activity-dependent regulome of human GABAergic neurons reveals new patterns of gene regulation and neurological disease heritability. Nature neuroscience Boulting, G. L., Durresi, E., Ataman, B., Sherman, M. A., Mei, K., Harmin, D. A., Carter, A. C., Hochbaum, D. R., Granger, A. J., Engreitz, J. M., Hrvatin, S., Blanchard, M. R., Yang, M. G., Griffith, E. C., Greenberg, M. E. 2021

Abstract

Neuronal activity-dependent gene expression is essential for brain development. Although transcriptional and epigenetic effects of neuronal activity have been explored in mice, such an investigation is lacking in humans. Because alterations in GABAergic neuronal circuits are implicated in neurological disorders, we conducted a comprehensive activity-dependent transcriptional and epigenetic profiling of human induced pluripotent stem cell-derived GABAergic neurons similar to those of the early developing striatum. We identified genes whose expression is inducible after membrane depolarization, some of which have specifically evolved in primates and/or are associated with neurological diseases, including schizophrenia and autism spectrum disorder (ASD). We define the genome-wide profile of human neuronal activity-dependent enhancers, promoters and the transcription factors CREB and CRTC1. We found significant heritability enrichment for ASD in the inducible promoters. Our results suggest that sequence variation within activity-inducible promoters of developing human forebrain GABAergic neurons contributes to ASD risk.

View details for DOI 10.1038/s41593-020-00786-1

View details for PubMedID 33542524
Inherited causes of clonal haematopoiesis in 97,691 whole genomes. Nature Bick, A. G., Weinstock, J. S., Nandakumar, S. K., Fulco, C. P., Bao, E. L., Zekavat, S. M., Szeto, M. D., Liao, X., Leventhal, M. J., Nasser, J., Chang, K., Laurie, C., Burugula, B. B., Gibson, C. J., Lin, A. E., Taub, M. A., Aguet, F., Ardlie, K., Mitchell, B. D., Barnes, K. C., Moscati, A., Fornage, M., Redline, S., Psaty, B. M., Silverman, E. K., Weiss, S. T., Palmer, N. D., Vasan, R. S., Burchard, E. G., Kardia, S. L., He, J., Kaplan, R. C., Smith, N. L., Arnett, D. K., Schwartz, D. A., Correa, A., de Andrade, M., Guo, X., Konkle, B. A., Custer, B., Peralta, J. M., Gui, H., Meyers, D. A., McGarvey, S. T., Chen, I. Y., Shoemaker, M. B., Peyser, P. A., Broome, J. G., Gogarten, S. M., Wang, F. F., Wong, Q., Montasser, M. E., Daya, M., Kenny, E. E., North, K. E., Launer, L. J., Cade, B. E., Bis, J. C., Cho, M. H., Lasky-Su, J., Bowden, D. W., Cupples, L. A., Mak, A. C., Becker, L. C., Smith, J. A., Kelly, T. N., Aslibekyan, S., Heckbert, S. R., Tiwari, H. K., Yang, I. V., Heit, J. A., Lubitz, S. A., Johnsen, J. M., Curran, J. E., Wenzel, S. E., Weeks, D. E., Rao, D. C., Darbar, D., Moon, J., Tracy, R. P., Buth, E. J., Rafaels, N., Loos, R. J., Durda, P., Liu, Y., Hou, L., Lee, J., Kachroo, P., Freedman, B. I., Levy, D., Bielak, L. F., Hixson, J. E., Floyd, J. S., Whitsel, E. A., Ellinor, P. T., Irvin, M. R., Fingerlin, T. E., Raffield, L. M., Armasu, S. M., Wheeler, M. M., Sabino, E. C., Blangero, J., Williams, L. K., Levy, B. D., Sheu, W. H., Roden, D. M., Boerwinkle, E., Manson, J. E., Mathias, R. A., Desai, P., Taylor, K. D., Johnson, A. D., NHLBI Trans-Omics for Precision Medicine Consortium, Auer, P. L., Kooperberg, C., Laurie, C. C., Blackwell, T. W., Smith, A. V., Zhao, H., Lange, E., Lange, L., Rich, S. S., Rotter, J. I., Wilson, J. G., Scheet, P., Kitzman, J. O., Lander, E. S., Engreitz, J. M., Ebert, B. L., Reiner, A. P., Jaiswal, S., Abecasis, G., Sankaran, V. 
G., Kathiresan, S., Natarajan, P., Abe, N., Albert, C., Almasy, L., Alonso, A., Ament, S., Anderson, P., Anugu, P., Applebaum-Bowden, D., Arking, D., Ashley-Koch, A., Aslibekyan, S., Assimes, T., Avramopoulos, D., Barnard, J., Barr, R. G., Barron-Casella, E., Barwick, L., Beaty, T., Beck, G., Becker, D., Beer, R., Beitelshees, A., Benjamin, E., Benos, P., Bezerra, M., Bielak, L., Bowler, R., Brody, J., Broeckel, U., Bunting, K., Bustamante, C., Cardwell, J., Carey, V., Carty, C., Casaburi, R., Casella, J., Castaldi, P., Chaffin, M., Chang, C., Chang, Y., Chasman, D., Chavan, S., Chen, B., Chen, W., Choi, S. H., Chuang, L., Chung, M., Chung, R., Clish, C., Comhair, S., Cornell, E., Crandall, C., Crapo, J., Curtis, J., Damcott, C., Das, S., David, S., Davis, C., DeBaun, M., Deka, R., DeMeo, D., Devine, S., Duan, Q., Duggirala, R., Dutcher, S., Eaton, C., Ekunwe, L., Boueiz, A. E., Emery, L., Erzurum, S., Farber, C., Flickinger, M., Franceschini, N., Frazar, C., Fu, M., Fullerton, S. M., Fulton, L., Gabriel, S., Gan, W., Gao, S., Gao, Y., Gass, M., Gelb, B., Priscilla Geng, X., Geraci, M., Germer, S., Gerszten, R., Ghosh, A., Gibbs, R., Gignoux, C., Gladwin, M., Glahn, D., Gong, D., Goring, H., Graw, S., Grine, D., Gu, C. C., Guan, Y., Gupta, N., Haessler, J., Hall, M., Harris, D., Hawley, N. L., Heavner, B., Hernandez, R., Herrington, D., Hersh, C., Hidalgo, B., Hobbs, B., Hokanson, J., Hong, E., Hoth, K., Agnes Hsiung, C., Hung, Y., Huston, H., Hwu, C. M., Jackson, R., Jain, D., Jaquish, C., Jhun, M. A., Johnson, C., Johnston, R., Jones, K., Kang, H. M., Kelly, S., Kessler, M., Khan, A., Kim, W., Kinney, G., Kramer, H., Lange, C., LeBoff, M., Lee, S. S., Lee, W., LeFaive, J., Levine, D., Lewis, J., Li, X., Li, Y., Lin, H., Lin, H., Lin, K. 
H., Lin, X., Liu, S., Liu, Y., Lunetta, K., Luo, J., Mahaney, M., Make, B., Manichaikul, A., Margolin, L., Martin, L., Mathai, S., May, S., McArdle, P., McDonald, M., McFarland, S., McGoldrick, D., McHugh, C., Mei, H., Mestroni, L., Mikulla, J., Min, N., Minear, M., Minster, R. L., Moll, M., Montgomery, C., Musani, S., Mwasongwe, S., Mychaleckyj, J. C., Nadkarni, G., Naik, R., Naseri, T., Nekhai, S., Nelson, S. C., Neltner, B., Nickerson, D., O'Connell, J., O'Connor, T., Ochs-Balcom, H., Paik, D., Pankow, J., Papanicolaou, G., Parsa, A., Perez, M., Perry, J., Peters, U., Peyser, P., Phillips, L. S., Pollin, T., Post, W., Becker, J. P., Boorgula, M. P., Preuss, M., Qasba, P., Qiao, D., Qin, Z., Rasmussen-Torvik, L., Ratan, A., Reed, R., Regan, E., Sefuiva Reupena, M., Rice, K., Roselli, C., Ruczinski, I., Russell, P., Ruuska, S., Ryan, K., Saleheen, D., Salimi, S., Salzberg, S., Sandow, K., Scheller, C., Schmidt, E., Schwander, K., Sciurba, F., Seidman, C., Seidman, J., Sheehan, V., Sherman, S. L., Shetty, A., Shetty, A., Silver, B., Smith, J., Smith, T., Smoller, S., Snively, B., Snyder, M., Sofer, T., Sotoodehnia, N., Stilp, A. M., Storm, G., Streeten, E., Su, J. L., Sung, Y. J., Sylvia, J., Szpiro, A., Sztalryd, C., Taliun, D., Tang, H., Taylor, M., Taylor, S., Telen, M., Thornton, T. A., Threlkeld, M., Tinker, L., Tirschwell, D., Tishkoff, S., Tiwari, H., Tong, C., Tsai, M., Vaidya, D., Berg, D. V., VandeHaar, P., Vrieze, S., Walker, T., Wallace, R., Walts, A., Wang, H., Watson, K., Weir, B., Weng, L., Wessel, J., Willer, C., Williams, K., Wilson, C., Wu, J., Xu, H., Yanek, L., Yang, R., Zaghloul, N., Zhang, Y., Zhao, S. X., Zhao, W., Zhi, D., Zhou, X., Zhu, X., Zody, M., Zoellner, S. 2020

Abstract

Age is the dominant risk factor for most chronic human diseases, but the mechanisms through which ageing confers this risk are largely unknown [1]. The age-related acquisition of somatic mutations that lead to clonal expansion in regenerating haematopoietic stem cell populations has recently been associated with both haematological cancer [2-4] and coronary heart disease [5]; this phenomenon is termed clonal haematopoiesis of indeterminate potential (CHIP) [6]. Simultaneous analyses of germline and somatic whole-genome sequences provide the opportunity to identify root causes of CHIP. Here we analyse high-coverage whole-genome sequences from 97,691 participants of diverse ancestries in the National Heart, Lung, and Blood Institute Trans-omics for Precision Medicine (TOPMed) programme, and identify 4,229 individuals with CHIP. We identify associations with blood cell, lipid and inflammatory traits that are specific to different CHIP driver genes. Association of a genome-wide set of germline genetic variants enabled the identification of three genetic loci associated with CHIP status, including one locus at TET2 that was specific to individuals of African ancestry. In silico-informed in vitro evaluation of the TET2 germline locus enabled the identification of a causal variant that disrupts a TET2 distal enhancer, resulting in increased self-renewal of haematopoietic stem cells. Overall, we observe that germline genetic variation shapes haematopoietic stem cell function, leading to CHIP through mechanisms that are specific to clonal haematopoiesis as well as shared mechanisms that lead to somatic mutations across tissues.

View details for DOI 10.1038/s41586-020-2819-2

View details for PubMedID 33057201
Publisher Correction: Deep coverage whole genome sequences and plasma lipoprotein(a) in individuals of European and African ancestries. Nature communications Zekavat, S. M., Ruotsalainen, S., Handsaker, R. E., Alver, M., Bloom, J., Poterba, T., Seed, C., Ernst, J., Chaffin, M., Engreitz, J., Peloso, G. M., Manichaikul, A., Yang, C., Ryan, K. A., Fu, M., Johnson, W. C., Tsai, M., Budoff, M., Vasan, R. S., Cupples, L. A., Rotter, J. I., Rich, S. S., Post, W., Mitchell, B. D., Correa, A., Metspalu, A., Wilson, J. G., Salomaa, V., Kellis, M., Daly, M. J., Neale, B. M., McCarroll, S., Surakka, I., Esko, T., Ganna, A., Ripatti, S., Kathiresan, S., Natarajan, P., NHLBI TOPMed Lipids Working Group, Abe, N., Abecasis, G., Albert, C., Allred, N. N., Almasy, L., Alonso, A., Ament, S., Anderson, P., Anugu, P., Applebaum-Bowden, D., Arking, D., Arnett, D. K., Ashley-Koch, A., Aslibekyan, S., Assimes, T., Auer, P., Avramopoulos, D., Barnard, J., Barnes, K., Barr, R. G., Barron-Casella, E., Beaty, T., Becker, D., Becker, L., Beer, R., Begum, F., Beitelshees, A., Benjamin, E., Bezerra, M., Bielak, L., Bis, J., Blackwell, T., Blangero, J., Boerwinkle, E., Borecki, I., Bowler, R., Brody, J., Broeckel, U., Broome, J., Bunting, K., Burchard, E., Cardwell, J., Carty, C., Casaburi, R., Casella, J., Chang, C., Chasman, D., Chavan, S., Chen, B., Chen, W., Chen, Y. I., Cho, M., Choi, S. H., Chuang, L., Chung, M., Cornell, E., Crandall, C., Crapo, J., Curran, J., Curtis, J., Custer, B., Damcott, C., Darbar, D., Das, S., David, S., Davis, C., Daya, M., Andrade, M. d., DeBaun, M., Deka, R., DeMeo, D., Devine, S., Do, R., Duan, Q., Duggirala, R., Durda, P., Dutcher, S., Eaton, C., Ekunwe, L., Ellinor, P., Emery, L., Farber, C., Farnam, L., Fingerlin, T., Flickinger, M., Fornage, M., Franceschini, N., Fullerton, S. M., Fulton, L., Gabriel, S., Gan, W., Gao, Y., Gass, M., Gelb, B., Geng, X. P., Germer, S., Gignoux, C., Gladwin, M., Glahn, D., Gogarten, S., Gong, D., Goring, H., Gu, C. 
C., Guan, Y., Guo, X., Haessler, J., Hall, M., Harris, D., Hawley, N., He, J., Heavner, B., Heckbert, S., Hernandez, R., Herrington, D., Hersh, C., Hidalgo, B., Hixson, J., Hokanson, J., Hong, E., Hoth, K., Hsiung, C. A., Huston, H., Hwu, C. M., Irvin, M. R., Jackson, R., Jain, D., Jaquish, C., Jhun, M. A., Johnsen, J., Johnson, A., Johnston, R., Jones, K., Kang, H. M., Kaplan, R., Kardia, S., Kaufman, L., Kelly, S., Kenny, E., Kessler, M., Khan, A., Kinney, G., Konkle, B., Kooperberg, C., Kramer, H., Krauter, S., Lange, C., Lange, E., Lange, L., Laurie, C., Laurie, C., LeBoff, M., Lee, S. S., Lee, W., LeFaive, J., Levine, D., Levy, D., Lewis, J., Li, Y., Lin, H., Lin, K. H., Liu, S., Liu, Y., Loos, R., Lubitz, S., Lunetta, K., Luo, J., Mahaney, M., Make, B., Manson, J., Margolin, L., Martin, L., Mathai, S., Mathias, R., McArdle, P., McDonald, M., McFarland, S., McGarvey, S., Mei, H., Meyers, D. A., Mikulla, J., Min, N., Minear, M., Minster, R. L., Montasser, M. E., Musani, S., Mwasongwe, S., Mychaleckyj, J. C., Nadkarni, G., Naik, R., Nekhai, S., Nickerson, D., North, K., O'Connell, J., O'Connor, T., Ochs-Balcom, H., Pankow, J., Papanicolaou, G., Parker, M., Parsa, A., Penchev, S., Peralta, J. M., Perez, M., Perry, J., Peters, U., Peyser, P., Phillips, L., Phillips, S., Pollin, T., Becker, J. P., Boorgula, M. P., Preuss, M., Prokopenko, D., Psaty, B., Qasba, P., Qiao, D., Qin, Z., Rafaels, N., Raffield, L., Rao, D. C., Rasmussen-Torvik, L., Ratan, A., Redline, S., Reed, R., Regan, E., Reiner, A., Rice, K., Roden, D., Roselli, C., Ruczinski, I., Russell, P., Ruuska, S., Sakornsakolpat, P., Salimi, S., Salzberg, S., Sandow, K., Sankaran, V., Scheller, C., Schmidt, E., Schwander, K., Schwartz, D., Sciurba, F., Seidman, C., Sheehan, V., Shetty, A., Shetty, A., Sheu, W. H., Shoemaker, M. B., Silver, B., Silverman, E., Smith, J., Smith, J., Smith, N., Smith, T., Smoller, S., Snively, B., Sofer, T., Sotoodehnia, N., Stilp, A., Streeten, E., Sung, Y. 
J., Sylvia, J., Szpiro, A., Sztalryd, C., Taliun, D., Tang, H., Taub, M., Taylor, K., Taylor, S., Telen, M., Thornton, T. A., Tinker, L., Tirschwell, D., Tiwari, H., Tracy, R., Vaidya, D., VandeHaar, P., Vrieze, S., Walker, T., Wallace, R., Walts, A., Wan, E., Wang, F. F., Watson, K., Weeks, D. E., Weir, B., Weiss, S., Weng, L., Willer, C., Williams, K., Williams, L. K., Wilson, C., Wong, Q., Xu, H., Yanek, L., Yang, I., Yang, R., Zaghloul, N., Zhang, Y., Zhao, S. X., Zhao, W., Zheng, X., Zhi, D., Zhou, X., Zody, M., Zoellner, S. 2020; 11 (1): 1715

Abstract

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

View details for DOI 10.1038/s41467-020-15236-6

View details for PubMedID 32238811
Prioritizing disease and trait causal variants at the TNFAIP3 locus using functional and genomic features NATURE COMMUNICATIONS Ray, J. P., de Boer, C. G., Fulco, C. P., Lareau, C. A., Kanai, M., Ulirsch, J. C., Tewhey, R., Ludwig, L. S., Reilly, S. K., Bergman, D. T., Engreitz, J. M., Issner, R., Finucane, H. K., Lander, E. S., Regev, A., Hacohen, N. 2020; 11 (1): 1237

Abstract

Genome-wide association studies have associated thousands of genetic variants with complex traits and diseases, but pinpointing the causal variant(s) among those in tight linkage disequilibrium with each associated variant remains a major challenge. Here, we use seven experimental assays to characterize all common variants at the multiple disease-associated TNFAIP3 locus in five disease-relevant immune cell lines, based on a set of features related to regulatory potential. Trait/disease-associated variants are enriched among SNPs prioritized based on either: (1) residing within CRISPRi-sensitive regulatory regions, or (2) localizing in a chromatin accessible region while displaying allele-specific reporter activity. Of the 15 trait/disease-associated haplotypes at TNFAIP3, 9 have at least one variant meeting one or both of these criteria, 5 of which are further supported by genetic fine-mapping. Our work provides a comprehensive strategy to characterize genetic variation at important disease-associated loci, and aids in the effort to identify trait causal genetic variants.

View details for DOI 10.1038/s41467-020-15022-4

View details for Web of Science ID 000549162600014

View details for PubMedID 32144282

View details for PubMedCentralID PMC7060350
Functional disease architectures reveal unique biological role of transposable elements NATURE COMMUNICATIONS Hormozdiari, F., van de Geijn, B., Nasser, J., Weissbrod, O., Gazal, S., Ju, C., O'Connor, L., Hujoel, M. L. A., Engreitz, J., Hormozdiari, F., Price, A. L. 2019; 10: 4054

Abstract

Transposable elements (TE) comprise roughly half of the human genome. Though initially derided as junk DNA, they have been widely hypothesized to contribute to the evolution of gene regulation. However, the contribution of TE to the genetic architecture of diseases remains unknown. Here, we analyze data from 41 independent diseases and complex traits to draw three conclusions. First, TE are uniquely informative for disease heritability. Despite overall depletion for heritability (54% of SNPs, 39 ± 2% of heritability), TE explain substantially more heritability than expected based on their depletion for known functional annotations. This implies that TE acquire function in ways that differ from known functional annotations. Second, older TE contribute more to disease heritability, consistent with acquiring biological function. Third, Short Interspersed Nuclear Elements (SINE) are far more enriched for blood traits than for other traits. Our results can help elucidate the biological roles that TE play in the genetic architecture of diseases.

View details for DOI 10.1038/s41467-019-11957-5

View details for Web of Science ID 000484599900004

View details for PubMedID 31492842

View details for PubMedCentralID PMC6731302
CRISPR Tools for Systematic Studies of RNA Regulation COLD SPRING HARBOR PERSPECTIVES IN BIOLOGY Engreitz, J., Abudayyeh, O., Gootenberg, J., Zhang, F. 2019; 11 (8)

Abstract

RNA molecules perform diverse functions in mammalian cells, including transferring genetic information from DNA to protein and playing diverse regulatory roles through interactions with other cellular components. Here, we discuss how clustered regularly interspaced short palindromic repeat (CRISPR)-based technologies for directed perturbations of DNA and RNA are revealing new insights into RNA regulation. First, we review the fundamentals of CRISPR-Cas enzymes and functional genomics tools that leverage these systems. Second, we explore how these new perturbation technologies are transforming the study of regulation of and by RNA, focusing on the functions of DNA regulatory elements and long noncoding RNAs (lncRNAs). Third, we highlight an emerging class of RNA-targeting CRISPR-Cas enzymes that have the potential to catalyze studies of RNA biology by providing tools to directly perturb or measure RNA modifications and functions. Together, these tools enable systematic studies of RNA function and regulation in mammalian cells.

View details for DOI 10.1101/cshperspect.a035386

View details for Web of Science ID 000482756900008

View details for PubMedID 31371352

View details for PubMedCentralID PMC6671937
Discovering metabolic disease gene interactions by correlated effects on cellular morphology MOLECULAR METABOLISM Jiao, Y., Ahmed, U., Sim, M., Bejar, A., Zhang, X., Talukder, M., Rice, R., Flannick, J., Podgornaia, A., Reilly, D. F., Engreitz, J. M., Kost-Alimova, M., Hartland, K., Mercader, J., Georges, S., Wagh, V., Tadin-Strapps, M., Doench, J. G., Edwardson, J., Rochford, J. J., Rosen, E. D., Majithia, A. R. 2019; 24: 108–19

Abstract

Impaired expansion of peripheral fat contributes to the pathogenesis of insulin resistance and Type 2 Diabetes (T2D). We aimed to identify novel disease-gene interactions during adipocyte differentiation. Genes in disease-associated loci for T2D, adiposity and insulin resistance were ranked according to expression in human adipocytes. The top 125 genes were ablated in human pre-adipocytes via CRISPR/Cas9 and the resulting cellular phenotypes quantified during adipocyte differentiation with high-content microscopy and automated image analysis. Morphometric measurements were extracted from all images and used to construct morphologic profiles for each gene. Over 107 morphometric measurements were obtained. Clustering of the morphologic profiles across all genes revealed a group of 14 genes characterized by decreased lipid accumulation, and enriched for known lipodystrophy genes. For two lipodystrophy genes, BSCL2 and AGPAT2, sub-clusters with PLIN1 and CEBPA identified by morphological similarity were validated by independent experiments as novel protein-protein and gene regulatory interactions. A morphometric approach in adipocytes can resolve multiple cellular mechanisms for metabolic disease loci; this approach enables mechanistic interrogation of the hundreds of metabolic disease loci whose function still remains unknown.

View details for DOI 10.1016/j.molmet.2019.03.001

View details for Web of Science ID 000468472300008

View details for PubMedID 30940487

View details for PubMedCentralID PMC6531784
Gene-centric functional dissection of human genetic variation uncovers regulators of hematopoiesis ELIFE Nandakumar, S. K., McFarland, S. K., Mateyka, L. M., Lareau, C. A., Ulirsch, J. C., Ludwig, L. S., Agarwal, G., Engreitz, J. M., Przychodzen, B., McConkey, M., Cowley, G. S., Doench, J. G., Maciejewski, J. P., Ebert, B. L., Root, D. E., Sankaran, V. G. 2019; 8

Abstract

Genome-wide association studies (GWAS) have identified thousands of variants associated with human diseases and traits. However, the majority of GWAS-implicated variants are in non-coding regions of the genome and require in-depth follow-up to identify target genes and decipher biological mechanisms. Here, rather than focusing on causal variants, we have undertaken a pooled loss-of-function screen in primary hematopoietic cells to interrogate 389 candidate genes contained in 75 loci associated with red blood cell traits. Using this approach, we identify 77 genes at 38 GWAS loci, with most loci harboring 1-2 candidate genes. Importantly, the hit set was strongly enriched for genes validated through orthogonal genetic approaches. Genes identified by this approach are enriched in specific and relevant biological pathways, allowing regulators of human erythropoiesis and modifiers of blood diseases to be defined. More generally, this functional screen provides a paradigm for gene-centric follow-up of GWAS for a variety of human diseases and traits.

View details for DOI 10.7554/eLife.44080

View details for Web of Science ID 000468967900001

View details for PubMedID 31070582

View details for PubMedCentralID PMC6534380
CRISPR-SURF: discovering regulatory elements by deconvolution of CRISPR tiling screen data NATURE METHODS Hsu, J. Y., Fulco, C. P., Cole, M. A., Canver, M. C., Pellin, D., Sher, F., Farouni, R., Clement, K., Guo, J. A., Biasco, L., Orkin, S. H., Engreitz, J. M., Lander, E. S., Joung, J., Bauer, D. E., Pinello, L. 2018; 15 (12): 992-+

View details for DOI 10.1038/s41592-018-0225-6

View details for Web of Science ID 000451826200006

View details for PubMedID 30504875

View details for PubMedCentralID PMC6620603
The NORAD lncRNA assembles a topoisomerase complex critical for genome stability (vol 561, pg 132, 2018) NATURE Munschauer, M., Nguyen, C. T., Sirokman, K., Hartigan, C. R., Hogstrom, L., Engreitz, J. M., Ulirsch, J. C., Fulco, C. P., Subramanian, V., Chen, J., Schenone, M., Guttman, M., Carr, S. A., Lander, E. S. 2018; 563 (7733): E32

Abstract

A typo in the 'Reviewer information' section of this Letter was corrected online.

View details for DOI 10.1038/s41586-018-0584-2

View details for Web of Science ID 000451599900011

View details for PubMedID 30279576
The NORAD lncRNA assembles a topoisomerase complex critical for genome stability NATURE Munschauer, M., Nguyen, C. T., Sirokman, K., Hartigan, C. R., Hogstrom, L., Engreitz, J. M., Ulirsch, J. C., Fulco, C. P., Subramanian, V., Chen, J., Schenone, M., Guttman, M., Carr, S. A., Lander, E. S. 2018; 561 (7721): 132-+

Abstract

The human genome contains thousands of long non-coding RNAs1, but specific biological functions and biochemical mechanisms have been discovered for only about a dozen2-7. A specific long non-coding RNA, non-coding RNA activated by DNA damage (NORAD), has recently been shown to be required for maintaining genomic stability8, but its molecular mechanism is unknown. Here we combine RNA antisense purification and quantitative mass spectrometry to identify proteins that directly interact with NORAD in living cells. We show that NORAD interacts with proteins involved in DNA replication and repair in steady-state cells and localizes to the nucleus upon stimulation with replication stress or DNA damage. In particular, NORAD interacts with RBMX, a component of the DNA-damage response, and contains the strongest RBMX-binding site in the transcriptome. We demonstrate that NORAD controls the ability of RBMX to assemble a ribonucleoprotein complex, which we term NORAD-activated ribonucleoprotein complex 1 (NARC1), that contains the known suppressors of genomic instability topoisomerase I (TOP1), ALYREF and the PRPF19-CDC5L complex. Cells depleted for NORAD or RBMX display an increased frequency of chromosome segregation defects, reduced replication-fork velocity and altered cell-cycle progression, which represent phenotypes that are mechanistically linked to TOP1 and PRPF19-CDC5L function. Expression of NORAD in trans can rescue defects caused by NORAD depletion, but rescue is significantly impaired when the RBMX-binding site in NORAD is deleted. Our results demonstrate that the interaction between NORAD and RBMX is important for NORAD function, and that NORAD is required for the assembly of the previously unknown topoisomerase complex NARC1, which contributes to maintaining genomic stability. In addition, we uncover a previously unknown function for long non-coding RNAs in modulating the ability of an RNA-binding protein to assemble a higher-order ribonucleoprotein complex.

View details for DOI 10.1038/s41586-018-0453-z

View details for Web of Science ID 000443755200046

View details for PubMedID 30150775
Deep coverage whole genome sequences and plasma lipoprotein(a) in individuals of European and African ancestries (vol 9, 2606, 2018) NATURE COMMUNICATIONS Zekavat, S. M., Ruotsalainen, S., Handsaker, R. E., Alver, M., Bloom, J., Poterba, T., Seed, C., Ernst, J., Chaffin, M., Engreitz, J., Peloso, G. M., Manichaikul, A., Yang, C., Ryan, K. A., Fu, M., Johnson, W., Tsai, M., Budoff, M., Vasan, R. S., Cupples, L., Rotter, J. I., Rich, S. S., Post, W., Mitchell, B. D., Correa, A., Metspalu, A., Wilson, J. G., Salomaa, V., Kellis, M., Daly, M. J., Neale, B. M., McCarroll, S., Surakka, I., Esko, T., Ganna, A., Ripatti, S., Kathiresan, S., Natarajan, P., NHLBI TOPMed Lipids Working Grp 2018; 9: 3493

Abstract

The original version of this article contained an error in the name of the author Ramachandran S. Vasan, which was incorrectly given as Vasan S. Ramachandran. This has now been corrected in both the PDF and HTML versions of the article.

View details for DOI 10.1038/s41467-018-05975-y

View details for Web of Science ID 000442522400001

View details for PubMedID 30140049

View details for PubMedCentralID PMC6107495
Positional specificity of different transcription factor classes within enhancers PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Grossman, S. R., Engreitz, J., Ray, J. P., Nguyen, T. H., Hacohen, N., Lander, E. S. 2018; 115 (30): E7222–E7230

Abstract

Gene expression is controlled by sequence-specific transcription factors (TFs), which bind to regulatory sequences in DNA. TF binding occurs in nucleosome-depleted regions of DNA (NDRs), which generally encompass regions with lengths similar to those protected by nucleosomes. However, less is known about where within these regions specific TFs tend to be found. Here, we characterize the positional bias of inferred binding sites for 103 TFs within ∼500,000 NDRs across 47 cell types. We find that distinct classes of TFs display different binding preferences: Some tend to have binding sites toward the edges, some toward the center, and some at other positions within the NDR. These patterns are highly consistent across cell types, suggesting that they may reflect TF-specific intrinsic structural or functional characteristics. In particular, TF classes with binding sites at NDR edges are enriched for those known to interact with histones and chromatin remodelers, whereas TFs with central enrichment interact with other TFs and cofactors such as p300. Our results suggest distinct regiospecific binding patterns and functions of TF classes within enhancers.

View details for DOI 10.1073/pnas.1804663115

View details for Web of Science ID 000439574700030

View details for PubMedID 29987030

View details for PubMedCentralID PMC6065035
Ribosome Levels Selectively Regulate Translation and Lineage Commitment in Human Hematopoiesis CELL Khajuria, R. K., Munschauer, M., Ulirsch, J. C., Fiorini, C., Ludwig, L. S., McFarland, S. K., Abdulhay, N. J., Specht, H., Keshishian, H., Mani, D. R., Jovanovic, M., Ellis, S. R., Fulco, C. P., Engreitz, J. M., Schutz, S., Lian, J., Gripp, K. W., Weinberg, O. K., Pinkus, G. S., Gehrke, L., Regev, A., Lander, E. S., Gazda, H. T., Lee, W. Y., Panse, V. G., Carr, S. A., Sankaran, V. G. 2018; 173 (1): 90-+

Abstract

Blood cell formation is classically thought to occur through a hierarchical differentiation process, although recent studies have shown that lineage commitment may occur earlier in hematopoietic stem and progenitor cells (HSPCs). The relevance to human blood diseases and the underlying regulation of these refined models remain poorly understood. By studying a genetic blood disorder, Diamond-Blackfan anemia (DBA), where the majority of mutations affect ribosomal proteins and the erythroid lineage is selectively perturbed, we are able to gain mechanistic insight into how lineage commitment is programmed normally and disrupted in disease. We show that in DBA, the pool of available ribosomes is limited, while ribosome composition remains constant. Surprisingly, this global reduction in ribosome levels more profoundly alters translation of a select subset of transcripts. We show how the reduced translation of select transcripts in HSPCs can impair erythroid lineage commitment, illuminating a regulatory role for ribosome levels in cellular differentiation.

View details for DOI 10.1016/j.cell.2018.02.036

View details for Web of Science ID 000428234200010

View details for PubMedID 29551269

View details for PubMedCentralID PMC5866246
Deep coverage whole genome sequences and plasma lipoprotein(a) in individuals of European and African ancestries. Nature communications Zekavat, S. M., Ruotsalainen, S., Handsaker, R. E., Alver, M., Bloom, J., Poterba, T., Seed, C., Ernst, J., Chaffin, M., Engreitz, J., Peloso, G. M., Manichaikul, A., Yang, C., Ryan, K. A., Fu, M., Johnson, W. C., Tsai, M., Budoff, M., Vasan, R. S., Cupples, L. A., Rotter, J. I., Rich, S. S., Post, W., Mitchell, B. D., Correa, A., Metspalu, A., Wilson, J. G., Salomaa, V., Kellis, M., Daly, M. J., Neale, B. M., McCarroll, S., Surakka, I., Esko, T., Ganna, A., Ripatti, S., Kathiresan, S., Natarajan, P. 2018; 9 (1): 2606

Abstract

Lipoprotein(a), Lp(a), is a modified low-density lipoprotein particle that contains apolipoprotein(a), encoded by LPA, and is a highly heritable, causal risk factor for cardiovascular diseases that varies in concentration across ancestries. Here, we use deep-coverage whole genome sequencing in 8392 individuals of European and African ancestry to discover and interpret both single-nucleotide variants and copy number (CN) variation associated with Lp(a). We observe that Europeans and Africans have several unique genetic determinants. The common variant rs12740374 associated with Lp(a) cholesterol is an eQTL for SORT1 and independent of LDL cholesterol. Observed associations of aggregates of rare non-coding variants are largely explained by LPA structural variation, namely the LPA kringle IV 2 (KIV2)-CN. Finally, we find that LPA risk genotypes confer greater relative risk for incident atherosclerotic cardiovascular diseases compared to directly measured Lp(a), and are significantly associated with measures of subclinical atherosclerosis in African Americans.

View details for DOI 10.1038/s41467-018-04668-w

View details for PubMedID 29973585

View details for PubMedCentralID PMC6031652
Deep-coverage whole genome sequences and blood lipids among 16,324 individuals. Nature communications Natarajan, P., Peloso, G. M., Zekavat, S. M., Montasser, M., Ganna, A., Chaffin, M., Khera, A. V., Zhou, W., Bloom, J. M., Engreitz, J. M., Ernst, J., O'Connell, J. R., Ruotsalainen, S. E., Alver, M., Manichaikul, A., Johnson, W. C., Perry, J. A., Poterba, T., Seed, C., Surakka, I. L., Esko, T., Ripatti, S., Salomaa, V., Correa, A., Vasan, R. S., Kellis, M., Neale, B. M., Lander, E. S., Abecasis, G., Mitchell, B., Rich, S. S., Wilson, J. G., Cupples, L. A., Rotter, J. I., Willer, C. J., Kathiresan, S. 2018; 9 (1): 3391

Abstract

Large-scale deep-coverage whole-genome sequencing (WGS) is now feasible and offers potential advantages for locus discovery. We perform WGS in 16,324 participants from four ancestries at mean depth >29X and analyze genotypes with four quantitative traits: plasma total cholesterol, low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol, and triglycerides. Common variant association yields known loci except for a few variants previously poorly imputed. Rare coding variant association yields known Mendelian dyslipidemia genes but rare non-coding variant association detects no signals. A high 2M-SNP LDL-C polygenic score (top 5th percentile) confers similar effect size to a monogenic mutation (~30 mg/dl higher for each); however, among those with severe hypercholesterolemia, 23% have a high polygenic score and only 2% carry a monogenic mutation. At these sample sizes and for these phenotypes, the incremental value of WGS for discovery is limited but WGS permits simultaneous assessment of monogenic and polygenic models to severe hypercholesterolemia.

View details for DOI 10.1038/s41467-018-05747-8

View details for PubMedID 30140000
Genome-scale activation screen identifies a lncRNA locus regulating a gene neighbourhood NATURE Joung, J., Engreitz, J. M., Konermann, S., Abudayyeh, O. O., Verdine, V. K., Aguet, F., Gootenberg, J. S., Sanjana, N. E., Wright, J. B., Fulco, C. P., Tseng, Y., Yoon, C. H., Boehm, J. S., Lander, E. S., Zhang, F. 2017; 548 (7667): 343-+

Abstract

Mammalian genomes contain thousands of loci that transcribe long noncoding RNAs (lncRNAs), some of which are known to carry out critical roles in diverse cellular processes through a variety of mechanisms. Although some lncRNA loci encode RNAs that act non-locally (in trans), there is emerging evidence that many lncRNA loci act locally (in cis) to regulate the expression of nearby genes, for example, through functions of the lncRNA promoter, transcription, or transcript itself. Despite their potentially important roles, it remains challenging to identify functional lncRNA loci and distinguish among these and other mechanisms. Here, to address these challenges, we developed a genome-scale CRISPR-Cas9 activation screen that targets more than 10,000 lncRNA transcriptional start sites to identify noncoding loci that influence a phenotype of interest. We found 11 lncRNA loci that, upon recruitment of an activator, mediate resistance to BRAF inhibitors in human melanoma cells. Most candidate loci appear to regulate nearby genes. Detailed analysis of one candidate, termed EMICERI, revealed that its transcriptional activation resulted in dosage-dependent activation of four neighbouring protein-coding genes, one of which confers the resistance phenotype. Our screening and characterization approach provides a CRISPR toolkit with which to systematically discover the functions of noncoding loci and elucidate their diverse roles in gene regulation and cellular function.

View details for DOI 10.1038/nature23451

View details for Web of Science ID 000407748400035

View details for PubMedID 28792927

View details for PubMedCentralID PMC5706657
A Genetic Variant Associated with Five Vascular Diseases Is a Distal Regulator of Endothelin-1 Gene Expression CELL Gupta, R. M., Hadaya, J., Trehan, A., Zekavat, S. M., Roselli, C., Klarin, D., Emdin, C. A., Hilvering, C. R. E., Bianchi, V., Mueller, C., Khera, A. V., Ryan, R. J. H., Engreitz, J. M., Issner, R., Shoresh, N., Epstein, C. B., De laat, W., Brown, J. D., Schnabel, R. B., Bernstein, B. E., Kathiresan, S. 2017; 170 (3): 522-+

Abstract

Genome-wide association studies (GWASs) implicate the PHACTR1 locus (6p24) in risk for five vascular diseases, including coronary artery disease, migraine headache, cervical artery dissection, fibromuscular dysplasia, and hypertension. Through genetic fine mapping, we prioritized rs9349379, a common SNP in the third intron of the PHACTR1 gene, as the putative causal variant. Epigenomic data from human tissue revealed an enhancer signature at rs9349379 exclusively in aorta, suggesting a regulatory function for this SNP in the vasculature. CRISPR-edited stem cell-derived endothelial cells demonstrate rs9349379 regulates expression of endothelin 1 (EDN1), a gene located 600 kb upstream of PHACTR1. The known physiologic effects of EDN1 on the vasculature may explain the pattern of risk for the five associated diseases. Overall, these data illustrate the integration of genetic, phenotypic, and epigenetic analysis to identify the biologic mechanism by which a common, non-coding variant can distally regulate a gene and contribute to the pathogenesis of multiple vascular diseases.

View details for DOI 10.1016/j.cell.2017.06.049

View details for Web of Science ID 000406462400011

View details for PubMedID 28753427

View details for PubMedCentralID PMC5785707
Recurrent and functional regulatory mutations in breast cancer NATURE Rheinbay, E., Parasuraman, P., Grimsby, J., Tiao, G., Engreitz, J. M., Kim, J., Lawrence, M. S., Taylor-Weiner, A., Rodriguez-Cuevas, S., Rosenberg, M., Hess, J., Stewart, C., Maruvka, Y. E., Stojanov, P., Cortes, M. L., Seepo, S., Cibulskis, C., Tracy, A., Pugh, T. J., Lee, J., Zheng, Z., Ellisen, L. W., Iafrate, A., Boehm, J. S., Gabriel, S. B., Meyerson, M., Golub, T. R., Baselga, J., Hidalgo-Miranda, A., Shioda, T., Bernards, A., Lander, E. S., Getz, G. 2017; 547 (7661): 55-+

Abstract

Genomic analysis of tumours has led to the identification of hundreds of cancer genes on the basis of the presence of mutations in protein-coding regions. By contrast, much less is known about cancer-causing mutations in non-coding regions. Here we perform deep sequencing in 360 primary breast cancers and develop computational methods to identify significantly mutated promoters. Clear signals are found in the promoters of three genes. FOXA1, a known driver of hormone-receptor positive breast cancer, harbours a mutational hotspot in its promoter leading to overexpression through increased E2F binding. RMRP and NEAT1, two non-coding RNA genes, carry mutations that affect protein binding to their promoters and alter expression levels. Our study shows that promoter regions harbour recurrent mutations in cancer with functional consequences and that the mutations occur at similar frequencies as in coding regions. Power analyses indicate that more such regions remain to be discovered through deep sequencing of adequately sized cohorts of patients.

View details for DOI 10.1038/nature22992

View details for Web of Science ID 000404839900030

View details for PubMedID 28658208

View details for PubMedCentralID PMC5563978
Systematic dissection of genomic features determining transcription factor binding and enhancer function PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Grossman, S. R., Zhang, X., Wang, L., Engreitz, J., Melnikov, A., Rogov, P., Tewhey, R., Isakova, A., Deplancke, B., Bernstein, B. E., Mikkelsen, T. S., Lander, E. S. 2017; 114 (7): E1291–E1300

Abstract

Enhancers regulate gene expression through the binding of sequence-specific transcription factors (TFs) to cognate motifs. Various features influence TF binding and enhancer function, including the chromatin state of the genomic locus, the affinities of the binding site, the activity of the bound TFs, and interactions among TFs. However, the precise nature and relative contributions of these features remain unclear. Here, we used massively parallel reporter assays (MPRAs) involving 32,115 natural and synthetic enhancers, together with high-throughput in vivo binding assays, to systematically dissect the contribution of each of these features to the binding and activity of genomic regulatory elements that contain motifs for PPARγ, a TF that serves as a key regulator of adipogenesis. We show that distinct sets of features govern PPARγ binding vs. enhancer activity. PPARγ binding is largely governed by the affinity of the specific motif site and higher-order features of the larger genomic locus, such as chromatin accessibility. In contrast, the enhancer activity of PPARγ binding sites depends on varying contributions from dozens of TFs in the immediate vicinity, including interactions between combinations of these TFs. Different pairs of motifs follow different interaction rules, including subadditive, additive, and superadditive interactions among specific classes of TFs, with both spatially constrained and flexible grammars. Our results provide a paradigm for the systematic characterization of the genomic features underlying regulatory elements, applicable to the design of synthetic regulatory elements or the interpretation of human genetic variation.

View details for DOI 10.1073/pnas.1621150114

View details for Web of Science ID 000393989300030

View details for PubMedID 28137873

View details for PubMedCentralID PMC5321001
Cohesin Loss Eliminates All Loop Domains. Cell Rao, S. S., Huang, S. C., Glenn St Hilaire, B., Engreitz, J. M., Perez, E. M., Kieffer-Kwon, K. R., Sanborn, A. L., Johnstone, S. E., Bascom, G. D., Bochkov, I. D., Huang, X., Shamim, M. S., Shin, J., Turner, D., Ye, Z., Omer, A. D., Robinson, J. T., Schlick, T., Bernstein, B. E., Casellas, R., Lander, E. S., Aiden, E. L. 2017; 171 (2): 305–20.e24

Abstract

The human genome folds to create thousands of intervals, called "contact domains," that exhibit enhanced contact frequency within themselves. "Loop domains" form because of tethering between two loci, almost always bound by CTCF and cohesin, lying on the same chromosome. "Compartment domains" form when genomic intervals with similar histone marks co-segregate. Here, we explore the effects of degrading cohesin. All loop domains are eliminated, but neither compartment domains nor histone marks are affected. Loss of loop domains does not lead to widespread ectopic gene activation but does affect a significant minority of active genes. In particular, cohesin loss causes superenhancers to co-localize, forming hundreds of links within and across chromosomes and affecting the regulation of nearby genes. We then restore cohesin and monitor the re-formation of each loop. Although re-formation rates vary greatly, many megabase-sized loops recovered in under an hour, consistent with a model where loop extrusion is rapid.

View details for PubMedID 28985562

View details for PubMedCentralID PMC5846482
Long non-coding RNAs: spatial amplifiers that control nuclear structure and gene expression NATURE REVIEWS MOLECULAR CELL BIOLOGY Engreitz, J. M., Ollikainen, N., Guttman, M. 2016; 17 (12): 756–70

Abstract

Over the past decade, it has become clear that mammalian genomes encode thousands of long non-coding RNAs (lncRNAs), many of which are now implicated in diverse biological processes. Recent work studying the molecular mechanisms of several key examples (including Xist, which orchestrates X chromosome inactivation) has provided new insights into how lncRNAs can control cellular functions by acting in the nucleus. Here we discuss emerging mechanistic insights into how lncRNAs can regulate gene expression by coordinating regulatory proteins, localizing to target loci and shaping three-dimensional (3D) nuclear organization. We explore these principles to highlight biological challenges in gene regulation, in which lncRNAs are well-suited to perform roles that cannot be carried out by DNA elements or protein regulators alone, such as acting as spatial amplifiers of regulatory signals in the nucleus.

View details for DOI 10.1038/nrm.2016.126

View details for Web of Science ID 000388967900010

View details for PubMedID 27780979
Principles of Systems Biology-No. 10 CELL SYSTEMS Sanjana, N., Zhang, F., Fulco, C., Lander, E., Engreitz, J., Costanzo, M., Myers, C. L., Andrews, B., Boone, C., Kwiecien, N., Coon, J., Stefely, J., Pagliarini, D., Lewitus, E., Morlon, H., Zhang, R., Heyde, K. C., Scott, F. Y., Paek, S., Ruder, W. C., Liang, K., Doonan, C., Falcaro, P. 2016; 3 (4): 318–20

Abstract

CRISPR analysis of gene regulatory elements, a near-complete yeast genetic interaction map, and multi-omics mass spectrometry are milestones covered in this month's Cell Systems Call (Cell Systems 1, 307).

View details for Web of Science ID 000395781400002

View details for PubMedID 27788354
Eradication of large established tumors in mice by combination immunotherapy that engages innate and adaptive immune responses. Nature medicine Moynihan, K. D., Opel, C. F., Szeto, G. L., Tzeng, A., Zhu, E. F., Engreitz, J. M., Williams, R. T., Rakhra, K., Zhang, M. H., Rothschilds, A. M., Kumari, S., Kelly, R. L., Kwan, B. H., Abraham, W., Hu, K., Mehta, N. K., Kauke, M. J., Suh, H., Cochran, J. R., Lauffenburger, D. A., Wittrup, K. D., Irvine, D. J. 2016

Abstract

Checkpoint blockade with antibodies specific for cytotoxic T lymphocyte-associated protein (CTLA)-4 or programmed cell death 1 (PDCD1; also known as PD-1) elicits durable tumor regression in metastatic cancer, but these dramatic responses are confined to a minority of patients. This suboptimal outcome is probably due in part to the complex network of immunosuppressive pathways present in advanced tumors, which are unlikely to be overcome by intervention at a single signaling checkpoint. Here we describe a combination immunotherapy that recruits a variety of innate and adaptive immune cells to eliminate large tumor burdens in syngeneic tumor models and a genetically engineered mouse model of melanoma; to our knowledge tumors of this size have not previously been curable by treatments relying on endogenous immunity. Maximal antitumor efficacy required four components: a tumor-antigen-targeting antibody, a recombinant interleukin-2 with an extended half-life, anti-PD-1 and a powerful T cell vaccine. Depletion experiments revealed that CD8+ T cells, cross-presenting dendritic cells and several other innate immune cell subsets were required for tumor regression. Effective treatment induced infiltration of immune cells and production of inflammatory cytokines in the tumor, enhanced antibody-mediated tumor antigen uptake and promoted antigen spreading. These results demonstrate the capacity of an elicited endogenous immune response to destroy large, established tumors and elucidate essential characteristics of combination immunotherapies that are capable of curing a majority of tumors in experimental settings typically viewed as intractable.

View details for DOI 10.1038/nm.4200

View details for PubMedID 27775706

View details for PubMedCentralID PMC5209798
RNA Antisense Purification (RAP) for Mapping RNA Interactions with Chromatin NUCLEAR BODIES AND NONCODING RNAS: METHODS AND PROTOCOLS Engreitz, J., Lander, E. S., Guttman, M. edited by Nakagawa, S., Hirose, T. 2015; 1262: 183–97

Abstract

RNA-centric biochemical purification is a general approach for studying the functions and mechanisms of noncoding RNAs. Here, we describe the experimental procedures for RNA antisense purification (RAP), a method for selective purification of endogenous RNA complexes from cell extracts that enables mapping of RNA interactions with chromatin. In RAP, the user cross-links cells to fix endogenous RNA complexes and purifies these complexes through hybrid capture with biotinylated antisense oligos. DNA loci that interact with the target RNA are identified using high-throughput DNA sequencing.

View details for DOI 10.1007/978-1-4939-2253-6_11

View details for Web of Science ID 000357692500012

View details for PubMedID 25555582
RNA-RNA Interactions Enable Specific Targeting of Noncoding RNAs to Nascent Pre-mRNAs and Chromatin Sites CELL Engreitz, J. M., Sirokman, K., McDonel, P., Shishkin, A. A., Surka, C., Russell, P., Grossman, S. R., Chow, A. Y., Guttman, M., Lander, E. S. 2014; 159 (1): 188–99

Abstract

Intermolecular RNA-RNA interactions are used by many noncoding RNAs (ncRNAs) to achieve their diverse functions. To identify these contacts, we developed a method based on RNA antisense purification to systematically map RNA-RNA interactions (RAP-RNA) and applied it to investigate two ncRNAs implicated in RNA processing: U1 small nuclear RNA, a component of the spliceosome, and Malat1, a large ncRNA that localizes to nuclear speckles. U1 and Malat1 interact with nascent transcripts through distinct targeting mechanisms. Using differential crosslinking, we confirmed that U1 directly hybridizes to 5' splice sites and 5' splice site motifs throughout introns and found that Malat1 interacts with pre-mRNAs indirectly through protein intermediates. Interactions with nascent pre-mRNAs cause U1 and Malat1 to localize proximally to chromatin at active genes, demonstrating that ncRNAs can use RNA-RNA interactions to target specific pre-mRNAs and genomic sites. RAP-RNA is sensitive to lower abundance RNAs as well, making it generally applicable for investigating ncRNAs.

View details for DOI 10.1016/j.cell.2014.08.018

View details for Web of Science ID 000343095000019

View details for PubMedID 25259926

View details for PubMedCentralID PMC4177037
Transcriptome-wide Mapping Reveals Widespread Dynamic-Regulated Pseudouridylation of ncRNA and mRNA CELL Schwartz, S., Bernstein, D. A., Mumbach, M. R., Jovanovic, M., Herbst, R. H., Leon-Ricardo, B. X., Engreitz, J. M., Guttman, M., Satija, R., Lander, E. S., Fink, G., Regev, A. 2014; 159 (1): 148–62

Abstract

Pseudouridine is the most abundant RNA modification, yet except for a few well-studied cases, little is known about the modified positions and their function(s). Here, we develop Ψ-seq for transcriptome-wide quantitative mapping of pseudouridine. We validate Ψ-seq with spike-ins and de novo identification of previously reported positions and discover hundreds of unique sites in human and yeast mRNAs and snoRNAs. Perturbing pseudouridine synthases (PUS) uncovers which pseudouridine synthase modifies each site and their target sequence features. mRNA pseudouridinylation depends on both site-specific and snoRNA-guided pseudouridine synthases. Upon heat shock in yeast, Pus7p-mediated pseudouridylation is induced at >200 sites, and PUS7 deletion decreases the levels of otherwise pseudouridylated mRNA, suggesting a role in enhancing transcript stability. rRNA pseudouridine stoichiometries are conserved but reduced in cells from dyskeratosis congenita patients, where the PUS DKC1 is mutated. Our work identifies an enhanced, transcriptome-wide scope for pseudouridine and methods to dissect its underlying mechanisms and function.

View details for DOI 10.1016/j.cell.2014.08.028

View details for Web of Science ID 000343095000016

View details for PubMedID 25219674

View details for PubMedCentralID PMC4180118
Topological organization of multichromosomal regions by the long intergenic noncoding RNA Firre NATURE STRUCTURAL & MOLECULAR BIOLOGY Hacisuleyman, E., Goff, L. A., Trapnell, C., Williams, A., Henao-Mejia, J., Sun, L., McClanahan, P., Hendrickson, D. G., Sauvageau, M., Kelley, D. R., Morse, M., Engreitz, J., Lander, E. S., Guttman, M., Lodish, H. F., Flavell, R., Raj, A., Rinn, J. L. 2014; 21 (2): 198-+

Abstract

RNA, including long noncoding RNA (lncRNA), is known to be an abundant and important structural component of the nuclear matrix. However, the molecular identities, functional roles and localization dynamics of lncRNAs that influence nuclear architecture remain poorly understood. Here, we describe one lncRNA, Firre, that interacts with the nuclear-matrix factor hnRNPU through a 156-bp repeating sequence and localizes across an ~5-Mb domain on the X chromosome. We further observed Firre localization across five distinct trans-chromosomal loci, which reside in spatial proximity to the Firre genomic locus on the X chromosome. Both genetic deletion of the Firre locus and knockdown of hnRNPU resulted in loss of colocalization of these trans-chromosomal interacting loci. Thus, our data suggest a model in which lncRNAs such as Firre can interface with and modulate nuclear architecture across chromosomes.

View details for DOI 10.1038/nsmb.2764

View details for Web of Science ID 000331093600013

View details for PubMedID 24463464

View details for PubMedCentralID PMC3950333
Neuregulin Autocrine Signaling Promotes Self-Renewal of Breast Tumor-Initiating Cells by Triggering HER2/HER3 Activation CANCER RESEARCH Lee, C. Y., Lin, Y., Bratman, S. V., Feng, W., Kuo, A. H., Scheeren, F. A., Engreitz, J. M., Varma, S., West, R. B., Diehn, M. 2014; 74 (1): 341-352

Abstract

Currently, only patients with HER2-positive tumors are candidates for HER2-targeted therapies. However, recent clinical observations suggest that the survival of patients with HER2-low breast cancers, who lack HER2 amplification, may benefit from adjuvant therapy that targets HER2. In this study, we explored a mechanism through which these benefits may be obtained. Prompted by the hypothesis that HER2/HER3 signaling in breast tumor-initiating cells (TIC) promotes self-renewal and survival, we obtained evidence that neuregulin 1 (NRG1) produced by TICs promotes their proliferation and self-renewal in HER2-low tumors, including in triple-negative breast tumors. Pharmacologic inhibition of EGFR, HER2, or both receptors reduced breast TIC survival and self-renewal in vitro and in vivo and increased TIC sensitivity to ionizing radiation. Through a tissue microarray analysis, we found that NRG1 expression and associated HER2 activation occurred in a subset of HER2-low breast cancers. Our results offer an explanation for why HER2 inhibition blocks the growth of HER2-low breast tumors. Moreover, they argue that dual inhibition of EGFR and HER2 may offer a useful therapeutic strategy to target TICs in these tumors. In generating a mechanistic rationale to apply HER2-targeting therapies in patients with HER2-low tumors, this work shows why these therapies could benefit a considerably larger number of patients with breast cancer than they currently reach.

View details for DOI 10.1158/0008-5472.CAN-13-1055

View details for Web of Science ID 000329297600033

View details for PubMedID 24177178

View details for PubMedCentralID PMC3917843
Three-Dimensional Genome Architecture Influences Partner Selection for Chromosomal Translocations in Human Disease PLOS ONE Engreitz, J. M., Agarwala, V., Mirny, L. A. 2012; 7 (9): e44196

Abstract

Chromosomal translocations are frequent features of cancer genomes that contribute to disease progression. These rearrangements result from formation and illegitimate repair of DNA double-strand breaks (DSBs), a process that requires spatial colocalization of chromosomal breakpoints. The "contact first" hypothesis suggests that translocation partners colocalize in the nuclei of normal cells, prior to rearrangement. It is unclear, however, the extent to which spatial interactions based on three-dimensional genome architecture contribute to chromosomal rearrangements in human disease. Here we intersect Hi-C maps of three-dimensional chromosome conformation with collections of 1,533 chromosomal translocations from cancer and germline genomes. We show that many translocation-prone pairs of regions genome-wide, including the cancer translocation partners BCR-ABL and MYC-IGH, display elevated Hi-C contact frequencies in normal human cells. Considering tissue specificity, we find that translocation breakpoints reported in human hematologic malignancies have higher Hi-C contact frequencies in lymphoid cells than those reported in sarcomas and epithelial tumors. However, translocations from multiple tissue types show significant correlation with Hi-C contact frequencies, suggesting that both tissue-specific and universal features of chromatin structure contribute to chromosomal alterations. Our results demonstrate that three-dimensional genome architecture shapes the landscape of rearrangements directly observed in human disease and establish Hi-C as a key method for dissecting these effects.

View details for DOI 10.1371/journal.pone.0044196

View details for Web of Science ID 000309973900005

View details for PubMedID 23028501

View details for PubMedCentralID PMC3460994
ProfileChaser: searching microarray repositories based on genome-wide patterns of differential expression BIOINFORMATICS Engreitz, J. M., Chen, R., Morgan, A. A., Dudley, J. T., Mallelwar, R., Butte, A. J. 2011; 27 (23): 3317-3318

Abstract

We introduce ProfileChaser, a web server that allows for querying the Gene Expression Omnibus based on genome-wide patterns of differential expression. Using a novel, content-based approach, ProfileChaser retrieves expression profiles that match the differentially regulated transcriptional programs in a user-supplied experiment. This analysis identifies statistical links to similar expression experiments from the vast array of publicly available data on diseases, drugs, phenotypes and other experimental conditions. Availability: http://profilechaser.stanford.edu. Contact: abutte@stanford.edu. Supplementary data are available at Bioinformatics online.

View details for DOI 10.1093/bioinformatics/btr548

View details for Web of Science ID 000297352100015

View details for PubMedID 21967760

View details for PubMedCentralID PMC3223361
The Lin28/let-7 Axis Regulates Glucose Metabolism CELL Zhu, H., Shyh-Chang, N., Segre, A. V., Shinoda, G., Shah, S. P., Einhorn, W. S., Takeuchi, A., Engreitz, J. M., Hagan, J. P., Kharas, M. G., Urbach, A., Thornton, J. E., Triboulet, R., Gregory, R. I., Altshuler, D., Daley, G. Q., DIAGRAM Consortium, MAGIC Investigators 2011; 147 (1): 81–94

Abstract

The let-7 tumor suppressor microRNAs are known for their regulation of oncogenes, while the RNA-binding proteins Lin28a/b promote malignancy by inhibiting let-7 biogenesis. We have uncovered unexpected roles for the Lin28/let-7 pathway in regulating metabolism. When overexpressed in mice, both Lin28a and LIN28B promote an insulin-sensitized state that resists high-fat-diet induced diabetes. Conversely, muscle-specific loss of Lin28a or overexpression of let-7 results in insulin resistance and impaired glucose tolerance. These phenomena occur, in part, through the let-7-mediated repression of multiple components of the insulin-PI3K-mTOR pathway, including IGF1R, INSR, and IRS2. In addition, the mTOR inhibitor, rapamycin, abrogates Lin28a-mediated insulin sensitivity and enhanced glucose uptake. Moreover, let-7 targets are enriched for genes containing SNPs associated with type 2 diabetes and control of fasting glucose in human genome-wide association studies. These data establish the Lin28/let-7 pathway as a central regulator of mammalian glucose metabolism.

View details for DOI 10.1016/j.cell.2011.08.033

View details for Web of Science ID 000295396700017

View details for PubMedID 21962509

View details for PubMedCentralID PMC3353524
Content-based microarray search using differential expression profiles BMC BIOINFORMATICS Engreitz, J. M., Morgan, A. A., Dudley, J. T., Chen, R., Thathoo, R., Altman, R. B., Butte, A. J. 2010; 11

Abstract

With the expansion of public repositories such as the Gene Expression Omnibus (GEO), we are rapidly cataloging cellular transcriptional responses to diverse experimental conditions. Methods that query these repositories based on gene expression content, rather than textual annotations, may enable more effective experiment retrieval as well as the discovery of novel associations between drugs, diseases, and other perturbations. We develop methods to retrieve gene expression experiments that differentially express the same transcriptional programs as a query experiment. Avoiding thresholds, we generate differential expression profiles that include a score for each gene measured in an experiment. We use existing and novel dimension reduction and correlation measures to rank relevant experiments in an entirely data-driven manner, allowing emergent features of the data to drive the results. A combination of matrix decomposition and p-weighted Pearson correlation proves the most suitable for comparing differential expression profiles. We apply this method to index all GEO DataSets, and demonstrate the utility of our approach by identifying pathways and conditions relevant to transcription factors Nanog and FoxO3. Content-based gene expression search generates relevant hypotheses for biological inquiry. Experiments across platforms, tissue types, and protocols inform the analysis of new datasets.

View details for DOI 10.1186/1471-2105-11-603

View details for Web of Science ID 000286192100001

View details for PubMedID 21172034

View details for PubMedCentralID PMC3022631
Independent component analysis: Mining microarray data for fundamental human gene expression modules JOURNAL OF BIOMEDICAL INFORMATICS Engreitz, J. M., Daigle, B. J., Marshall, J. J., Altman, R. B. 2010; 43 (6): 932-944

Abstract

As public microarray repositories rapidly accumulate gene expression data, these resources contain increasingly valuable information about cellular processes in human biology. This presents a unique opportunity for intelligent data mining methods to extract information about the transcriptional modules underlying these biological processes. Modeling cellular gene expression as a combination of functional modules, we use independent component analysis (ICA) to derive 423 fundamental components of human biology from a 9395-array compendium of heterogeneous expression data. Annotation using the Gene Ontology (GO) suggests that while some of these components represent known biological modules, others may describe biology not well characterized by existing manually-curated ontologies. In order to understand the biological functions represented by these modules, we investigate the mechanism of the preclinical anti-cancer drug parthenolide (PTL) by analyzing the differential expression of our fundamental components. Our method correctly identifies known pathways and predicts that N-glycan biosynthesis and T-cell receptor signaling may contribute to PTL response. The fundamental gene modules we describe have the potential to provide pathway-level insight into new gene expression datasets.

View details for DOI 10.1016/j.jbi.2010.07.001

View details for Web of Science ID 000285036700009

View details for PubMedID 20619355

View details for PubMedCentralID PMC2991480
