I am a scientist with expertise in single-cell genomics, immunology, and molecular genetics. I am passionate about the development and application of new technologies to study human disease and design new therapeutic interventions. My research focuses on how cells evolve within an individual’s lifetime in response to molecular triggers, including somatic mutations and exposures to pathogens, and how these changes can predispose to age-associated diseases.
Honors & Awards
30 Under 30 - Science, Forbes (2022)
K99/R00 Pathway to Independence Award, National Human Genome Research Institute (2022)
STAT Wunderkind, STAT News (2022)
Parker Scholar, Parker Institute for Cancer Immunotherapy (2021)
Stanford Science Fellow, Stanford University (2020)
NIH Ruth L. Kirschstein National Research Service Award (F31), National Cancer Institute (2018)
NSF GRFP, National Science Foundation (2015)
DAAD Rise Fellow, Deutscher Akademischer Austauschdienst (2013, 2014)
Barry M. Goldwater Scholar, Goldwater Foundation (2013)
PhD, Harvard Medical School, Biological and Biomedical Sciences (2020)
Bachelor of Science, University of Tulsa, Biochemistry and Mathematics (2015)
BAF complex maintains glioma stem cells in pediatric H3K27M-glioma.
Diffuse midline gliomas are uniformly fatal pediatric central nervous system cancers, refractory to standard of care therapeutic modalities. The primary genetic drivers are a set of recurrent amino acid substitutions in genes encoding histone H3 (H3K27M), which are currently undruggable. These H3K27M oncohistones perturb normal chromatin architecture, resulting in an aberrant epigenetic landscape. To identify epigenetic dependencies, we performed a CRISPR screen and show that patient-derived H3K27M-glioma neurospheres are dependent on core components of the mammalian BAF (SWI/SNF) chromatin remodeling complex. The BAF complex maintains glioma stem cells in a cycling, oligodendrocyte precursor cell (OPC)-like state, where genetic perturbation of the BAF catalytic subunit SMARCA4 (BRG1), as well as pharmacological suppression, opposes proliferation, promotes progression of differentiation along the astrocytic lineage, and improves overall survival of patient-derived xenograft models. In summary, we demonstrate that therapeutic inhibition of the BAF complex has translational potential for children with H3K27M-gliomas.
View details for DOI 10.1158/2159-8290.CD-21-1491
View details for PubMedID 36305736
Clonal expansion and epigenetic inheritance of long-lasting NK cell memory.
Clonal expansion of cells with somatically diversified receptors and their long-term maintenance as memory cells is a hallmark of adaptive immunity. Here, we studied pathogen-specific adaptation within the innate immune system, tracking natural killer (NK) cell memory to human cytomegalovirus (HCMV) infection. Leveraging single-cell multiomic maps of ex vivo NK cells and somatic mitochondrial DNA mutations as endogenous barcodes, we reveal substantial clonal expansion of adaptive NK cells in HCMV+ individuals. NK cell clonotypes were characterized by a convergent inflammatory memory signature enriched for AP1 motifs superimposed on a private set of clone-specific accessible chromatin regions. NK cell clones were stably maintained in specific epigenetic states over time, revealing that clonal inheritance of chromatin accessibility shapes the epigenetic memory repertoire. Together, we identify clonal expansion and persistence within the human innate immune system, suggesting that these mechanisms have evolved independent of antigen-receptor diversification.
View details for DOI 10.1038/s41590-022-01327-7
View details for PubMedID 36289449
- Editorial: Lineage tracing, hematopoietic stem cell and immune cell dynamics. Frontiers in immunology 2022; 13: 1062415
Functional inference of gene regulation using single-cell multi-omics.
2022; 2 (9)
Cells require coordinated control over gene expression when responding to environmental stimuli. Here we apply scATAC-seq and single-cell RNA sequencing (scRNA-seq) in resting and stimulated human blood cells. Collectively, we generate ~91,000 single-cell profiles, allowing us to probe the cis-regulatory landscape of the immunological response across cell types, stimuli, and time. Advancing tools to integrate multi-omics data, we develop functional inference of gene regulation (FigR), a framework to computationally pair scATAC-seq with scRNA-seq cells, connect distal cis-regulatory elements to genes, and infer gene-regulatory networks (GRNs) to identify candidate transcription factor (TF) regulators. Utilizing these paired multi-omics data, we define domains of regulatory chromatin (DORCs) of immune stimulation and find that cells alter chromatin accessibility and gene expression at timescales of minutes. Construction of the stimulation GRN elucidates TF activity at disease-associated DORCs. Overall, FigR enables elucidation of regulatory interactions across single-cell data, providing new opportunities to understand the function of cells within tissues.
View details for DOI 10.1016/j.xgen.2022.100166
View details for PubMedID 36204155
A RORgammat+ cell instructs gut microbiota-specific Treg cell differentiation.
The mutualistic relationship of gut-resident microbiota and the host immune system promotes homeostasis that ensures maintenance of the microbial community and of a largely non-aggressive immune cell compartment [1,2]. The consequences of disturbing this balance include proximal inflammatory conditions, such as Crohn's disease, and systemic illnesses. This equilibrium is achieved in part through the induction of both effector and suppressor arms of the adaptive immune system. Helicobacter species induce T regulatory (Treg) and T follicular helper (TFH) cells under homeostatic conditions, but induce inflammatory T helper 17 (TH17) cells when induced Treg (iTreg) cells are compromised [3,4]. How Helicobacter and other gut bacteria direct T cells to adopt distinct functions remains poorly understood. Here we investigated the cells and molecular components required for iTreg cell differentiation. We found that antigen presentation by cells expressing RORgammat, rather than by classical dendritic cells, was required and sufficient for induction of Treg cells. These RORgammat+ cells (probably type 3 innate lymphoid cells and/or Janus cells [5]) require the antigen-presentation machinery, the chemokine receptor CCR7 and the TGFbeta activator alphav integrin. In the absence of any of these factors, there was expansion of pathogenic TH17 cells instead of iTreg cells, induced by CCR7-independent antigen-presenting cells. Thus, intestinal commensal microbes and their products target multiple antigen-presenting cells with pre-determined features suited to directing appropriate T cell differentiation programmes, rather than a common antigen-presenting cell that they endow with appropriate functions.
View details for DOI 10.1038/s41586-022-05089-y
View details for PubMedID 36071167
- Advancing T cell-based cancer therapy with single-cell technologies. Nature medicine 2022; 28 (9): 1761-1764
Runx3 drives a CD8+ T cell tissue residency program that is absent in CD4+ T cells.
Tissue-resident memory T cells (TRM cells) provide rapid and superior control of localized infections. While the transcription factor Runx3 is a critical regulator of CD8+ T cell tissue residency, its expression is repressed in CD4+ T cells. Here, we show that, as a direct consequence of this Runx3-deficiency, CD4+ TRM cells lacked the transforming growth factor (TGF)-beta-responsive transcriptional network that underpins the tissue residency of epithelial CD8+ TRM cells. While CD4+ TRM cell formation required Runx1, this, along with the modest expression of Runx3 in CD4+ TRM cells, was insufficient to engage the TGF-beta-driven residency program. Ectopic expression of Runx3 in CD4+ T cells incited this TGF-beta-transcriptional network to promote prolonged survival, decreased tissue egress, a microanatomical redistribution towards epithelial layers and enhanced effector functionality. Thus, our results reveal distinct programming of tissue residency in CD8+ and CD4+ TRM cell subsets that is attributable to divergent Runx3 activity.
View details for DOI 10.1038/s41590-022-01273-4
View details for PubMedID 35882933
Mitochondrial variant enrichment from high-throughput single-cell RNA sequencing resolves clonal populations.
The combination of single-cell transcriptomics with mitochondrial DNA variant detection can be used to establish lineage relationships in primary human cells, but current methods are not scalable to interrogate complex tissues. Here, we combine common 3' single-cell RNA-sequencing protocols with mitochondrial transcriptome enrichment to increase coverage by more than 50-fold, enabling high-confidence mutation detection. The method successfully identifies skewed immune-cell expansions in primary human clonal hematopoiesis.
View details for DOI 10.1038/s41587-022-01210-8
View details for PubMedID 35210612
JAK inhibition in a patient with a STAT1 gain-of-function variant reveals STAT1 dysregulation as a common feature of aplastic anemia.
Med (New York, N.Y.)
2022; 3 (1): 42-57.e5
Idiopathic aplastic anemia is a potentially lethal disease, characterized by T cell-mediated autoimmune attack of bone marrow hematopoietic stem cells. Standard of care therapies (stem cell transplantation or immunosuppression) are effective but associated with a risk of serious toxicities. An 18-year-old man presented with aplastic anemia in the context of a germline gain-of-function variant in STAT1. Treatment with the JAK1 inhibitor itacitinib resulted in a rapid resolution of aplastic anemia and a sustained recovery of hematopoiesis. Peripheral blood and bone marrow samples were compared before and after JAK1 inhibitor therapy. Following therapy, samples showed a decrease in the plasma concentration of interferon-γ, a decrease in PD1-positive exhausted CD8+ T cell population, and a decrease in an interferon responsive myeloid population. Single-cell analysis of chromatin accessibility showed decreased accessibility of STAT1 across CD4+ and CD8+ T cells, as well as CD14+ monocytes. To query whether other cases of aplastic anemia share a similar STAT1-mediated pathophysiology, we examined a cohort of 9 patients with idiopathic aplastic anemia. Bone marrow from six of nine patients also displayed abnormal STAT1 hyperactivation. These findings raise the possibility that STAT1 hyperactivation defines a subset of idiopathic aplastic anemia patients for whom JAK inhibition may be an efficacious therapy. Funding was provided by the Massachusetts General Hospital Department of Medicine Pathways Program and NIH T32 AI007387. A trial registration is at https://clinicaltrials.gov/ct2/show/NCT03906318.
View details for DOI 10.1016/j.medj.2021.12.003
View details for PubMedID 35590143
Functional dissection of inherited non-coding variation influencing multiple myeloma risk.
2022; 13 (1): 151
Thousands of non-coding variants have been associated with increased risk of human diseases, yet the causal variants and their mechanisms-of-action remain obscure. In an integrative study combining massively parallel reporter assays (MPRA), expression analyses (eQTL, meQTL, PCHiC) and chromatin accessibility analyses in primary cells (caQTL), we investigate 1,039 variants associated with multiple myeloma (MM). We demonstrate that MM susceptibility is mediated by gene-regulatory changes in plasma cells and B-cells, and identify putative causal variants at six risk loci (SMARCD3, WAC, ELL2, CDCA7L, CEP120, and PREX1). Notably, three of these variants co-localize with significant plasma cell caQTLs, signaling the presence of causal activity at these precise genomic positions in an endogenous chromosomal context in vivo. Our results provide a systematic functional dissection of risk loci for a hematologic malignancy.
View details for DOI 10.1038/s41467-021-27666-x
View details for PubMedID 35013207
A Congenital Anemia Reveals Distinct Targeting Mechanisms for Master Transcription Factor GATA1.
Master regulators, such as the hematopoietic transcription factor (TF) GATA1, play an essential role in orchestrating lineage commitment and differentiation. However, the precise mechanisms by which such TFs regulate transcription through interactions with specific cis-regulatory elements remain incompletely understood. Here, we describe a form of congenital hemolytic anemia caused by missense mutations in an intrinsically disordered region of GATA1, with a poorly understood role in transcriptional regulation. Through integrative functional approaches, we demonstrate that these mutations perturb GATA1 transcriptional activity by partially impairing nuclear localization and selectively altering precise chromatin occupancy by GATA1. These alterations in chromatin occupancy and concordant chromatin accessibility changes alter faithful gene expression, with failure to both effectively silence and activate select genes necessary for effective terminal red cell production. We demonstrate how disease-causing mutations can reveal regulatory mechanisms that enable the faithful genomic targeting of master TFs during cellular differentiation.
View details for DOI 10.1182/blood.2021013753
View details for PubMedID 35030251
Spatial genomics enables multi-modal study of clonal heterogeneity in tissues.
The state and behaviour of a cell can be influenced by both genetic and environmental factors. In particular, tumour progression is determined by underlying genetic aberrations [1-4] as well as the makeup of the tumour microenvironment [5,6]. Quantifying the contributions of these factors requires new technologies that can accurately measure the spatial location of genomic sequence together with phenotypic readouts. Here we developed slide-DNA-seq, a method for capturing spatially resolved DNA sequences from intact tissue sections. We demonstrate that this method accurately preserves local tumour architecture and enables the de novo discovery of distinct tumour clones and their copy number alterations. We then apply slide-DNA-seq to a mouse model of metastasis and a primary human cancer, revealing that clonal populations are confined to distinct spatial regions. Moreover, through integration with spatial transcriptomics, we uncover distinct sets of genes that are associated with clone-specific genetic aberrations, the local tumour microenvironment, or both. Together, this multi-modal spatial genomics approach provides a versatile platform for quantifying how cell-intrinsic and cell-extrinsic factors contribute to gene expression, protein abundance and other cellular phenotypes.
View details for DOI 10.1038/s41586-021-04217-4
View details for PubMedID 34912115
Charting the tumor antigen maps drawn by single-cell genomics.
2021; 39 (12): 1553-1557
The remarkable specificity of antibodies has enabled precision cancer immunotherapies, including chimeric antigen receptor T cells and antibody-drug conjugates. In parallel, single-cell genomics technologies present the possibility of a comprehensive annotation of antigen expression throughout tissues of the human body and on cancer cells. We reflect on the rationale for antigen targets currently used in immunotherapies, their adverse effects revealed in the clinic, and the opportunity to utilize large genomics datasets to de-risk potential targets and nominate optimal antigens for therapy.
View details for DOI 10.1016/j.ccell.2021.11.005
View details for PubMedID 34906314
- Mitochondrial DNA Mutations Distinguish Individual Donor- and Recipient-Derived Immune Cells Following Matched Unrelated Allogeneic Stem Cell Transplantation. American Society of Hematology 2021
- Single-cell multiomics defines tolerogenic extrathymic Aire-expressing populations with unique homology to thymic epithelium. Science immunology 2021; 6 (65): eabl5053
Single-cell chromatin state analysis with Signac.
2021; 18 (11): 1333-1341
The recent development of experimental methods for measuring chromatin state at single-cell resolution has created a need for computational tools capable of analyzing these datasets. Here we developed Signac, a comprehensive toolkit for the analysis of single-cell chromatin data. Signac enables an end-to-end analysis of single-cell chromatin data, including peak calling, quantification, quality control, dimension reduction, clustering, integration with single-cell gene expression datasets, DNA motif analysis and interactive visualization. Through its seamless compatibility with the Seurat package, Signac facilitates the analysis of diverse multimodal single-cell chromatin data, including datasets that co-assay DNA accessibility with gene expression, protein abundance and mitochondrial genotype. We demonstrate scaling of the Signac framework to analyze datasets containing over 700,000 cells.
View details for DOI 10.1038/s41592-021-01282-5
View details for PubMedID 34725479
Single-cell profiling of proteins and chromatin accessibility using PHAGE-ATAC.
Multimodal measurements of single-cell profiles are proving increasingly useful for characterizing cell states and regulatory mechanisms. In the present study, we developed PHAGE-ATAC (Assay for Transposase-Accessible Chromatin), a massively parallel droplet-based method that uses phage displaying, engineered, camelid single-domain antibodies ('nanobodies') for simultaneous single-cell measurements of protein levels and chromatin accessibility profiles, and mitochondrial DNA-based clonal tracing. We use PHAGE-ATAC for multimodal analysis in primary human immune cells, sample multiplexing, intracellular protein analysis and the detection of SARS-CoV-2 spike protein in human cell populations. Finally, we construct a synthetic high-complexity phage library for selection of antigen-specific nanobodies that bind cells of particular molecular profiles, opening an avenue for protein detection, cell characterization and screening with single-cell genomics.
View details for DOI 10.1038/s41587-021-01065-5
View details for PubMedID 34675424
STAG2 loss rewires oncogenic and developmental programs to promote metastasis in Ewing sarcoma.
2021; 39 (6): 827
The core cohesin subunit STAG2 is recurrently mutated in Ewing sarcoma but its biological role is less clear. Here, we demonstrate that cohesin complexes containing STAG2 occupy enhancer and polycomb repressive complex (PRC2)-marked regulatory regions. Genetic suppression of STAG2 leads to a compensatory increase in cohesin-STAG1 complexes, but not in enhancer-rich regions, and results in reprogramming of cis-chromatin interactions. Strikingly, in STAG2 knockout cells the oncogenic genetic program driven by the fusion transcription factor EWS/FLI1 was highly perturbed, in part due to altered enhancer-promoter contacts. Moreover, loss of STAG2 also disrupted PRC2-mediated regulation of gene expression. Combined, these transcriptional changes converged to modulate EWS/FLI1, migratory, and neurodevelopmental programs. Finally, consistent with clinical observations, functional studies revealed that loss of STAG2 enhances the metastatic potential of Ewing sarcoma xenografts. Our findings demonstrate that STAG2 mutations can alter chromatin architecture and transcriptional programs to promote an aggressive cancer phenotype.
View details for DOI 10.1016/j.ccell.2021.05.007
View details for PubMedID 34129824
Longitudinal single-cell dynamics of chromatin accessibility and mitochondrial mutations in chronic lymphocytic leukemia mirror disease history.
While cancers evolve during disease progression and in response to therapy, temporal dynamics remain difficult to study in humans due to the lack of consistent barcodes marking individual clones in vivo. We employ mitochondrial single-cell assay for transposase-accessible chromatin with sequencing to profile 163,279 cells from 9 patients with chronic lymphocytic leukemia (CLL) collected across disease course and utilize mitochondrial DNA (mtDNA) mutations as natural genetic markers of cancer clones. We observe stable propagation of mtDNA mutations over years in the absence of strong selective pressure indicating clonal persistence, but dramatic changes following tight bottlenecks including disease transformation and relapse post-therapy, paralleled by acquisition of copy number variants, changes in chromatin accessibility and gene expression. Furthermore, we link CLL subclones to distinct chromatin states, providing insight into non-genetic sources of relapse. mtDNA mutations thus mirror disease history and provide naturally-occurring genetic barcodes to enable patient-specific study of cancer subclonal dynamics.
View details for DOI 10.1158/2159-8290.CD-21-0276
View details for PubMedID 34112698
A microRNA expression and regulatory element activity atlas of the mouse immune system.
To better define the control of immune system regulation, we generated an atlas of microRNA (miRNA) expression from 63 mouse immune cell populations and connected these signatures with assay for transposase-accessible chromatin using sequencing (ATAC-seq), chromatin immunoprecipitation followed by sequencing (ChIP-seq) and nascent RNA profiles to establish a map of miRNA promoter and enhancer usage in immune cells. miRNA complexity was relatively low, with >90% of the miRNA compartment of each population comprising <75 miRNAs; however, each cell type had a unique miRNA signature. Integration of miRNA expression with chromatin accessibility revealed putative regulatory elements for differentially expressed miRNAs, including miR-21a, miR-146a and miR-223. The integrated maps suggest that many miRNAs utilize multiple promoters to reach high abundance and identified dominant and divergent miRNA regulatory elements between lineages and during development that may be used by clustered miRNAs, such as miR-99a/let-7c/miR-125b, to achieve distinct expression. These studies, with web-accessible data, help delineate the cis-regulatory elements controlling miRNA signatures of the immune system.
View details for DOI 10.1038/s41590-021-00944-y
View details for PubMedID 34099919
Scalable, multimodal profiling of chromatin accessibility, gene expression and protein levels in single cells.
Recent technological advances have enabled massively parallel chromatin profiling with scATAC-seq (single-cell assay for transposase accessible chromatin by sequencing). Here we present ATAC with select antigen profiling by sequencing (ASAP-seq), a tool to simultaneously profile accessible chromatin and protein levels. Our approach pairs sparse scATAC-seq data with robust detection of hundreds of cell surface and intracellular protein markers and optional capture of mitochondrial DNA for clonal tracking, capturing three distinct modalities in single cells. ASAP-seq uses a bridging approach that repurposes antibody:oligonucleotide conjugates designed for existing technologies that pair protein measurements with single-cell RNA sequencing. Together with DOGMA-seq, an adaptation of CITE-seq (cellular indexing of transcriptomes and epitopes by sequencing) for measuring gene activity across the central dogma of gene regulation, we demonstrate the utility of systematic multi-omic profiling by revealing coordinated and distinct changes in chromatin, RNA and surface proteins during native hematopoietic differentiation and peripheral blood mononuclear cell stimulation and as a combinatorial decoder and reporter of multiplexed perturbations in primary T cells.
View details for DOI 10.1038/s41587-021-00927-2
View details for PubMedID 34083792
The neutrotime transcriptional signature defines a single continuum of neutrophils across biological compartments.
2021; 12 (1): 2856
Neutrophils are implicated in multiple homeostatic and pathological processes, but whether functional diversity requires discrete neutrophil subsets is not known. Here, we apply single-cell RNA sequencing to neutrophils from normal and inflamed mouse tissues. Whereas conventional clustering yields multiple alternative organizational structures, diffusion mapping plus RNA velocity discloses a single developmental spectrum, ordered chronologically. Termed here neutrotime, this spectrum extends from immature pre-neutrophils, largely in bone marrow, to mature neutrophils predominantly in blood and spleen. The sharpest increments in neutrotime occur during the transitions from pre-neutrophils to immature neutrophils and from mature marrow neutrophils to those in blood. Human neutrophils exhibit a similar transcriptomic pattern. Neutrophils migrating into inflamed mouse lung, peritoneum and joint maintain the core mature neutrotime signature together with new transcriptional activity that varies with site and stimulus. Together, these data identify a single developmental spectrum as the dominant organizational theme of neutrophil heterogeneity.
View details for DOI 10.1038/s41467-021-22973-9
View details for PubMedID 34001893
Distinct Foxp3 enhancer elements coordinate development, maintenance, and function of regulatory T cells.
The transcription factor Foxp3 plays crucial roles for Treg cell development and function. Conserved non-coding sequences (CNSs) at the Foxp3 locus control Foxp3 transcription, but how they developmentally contribute to Treg cell lineage specification remains obscure. Here, we show that among Foxp3 CNSs, the promoter-upstream CNS0 and the intergenic CNS3, which bind distinct transcription factors, were activated at early stages of thymocyte differentiation prior to Foxp3 promoter activation, with sequential genomic looping bridging these regions and the promoter. While deletion of either CNS0 or CNS3 partially compromised thymic Treg cell generation, deletion of both completely abrogated the generation and impaired the stability of Foxp3 expression in residual Treg cells. As a result, CNS0 and CNS3 double-deleted mice succumbed to lethal systemic autoimmunity and inflammation. Thus, hierarchical and coordinated activation of Foxp3 CNS0 and CNS3 initiates and stabilizes Foxp3 gene expression, thereby crucially controlling Treg cell development, maintenance, and consequently immunological self-tolerance.
View details for DOI 10.1016/j.immuni.2021.04.005
View details for PubMedID 33930308
- Integrated single-cell transcriptomics and epigenomics reveals strong germinal center-associated etiology of autoimmune risk loci. Science immunology 2021; 6 (64): eabh3768
The SARS-CoV-2 RNA-protein interactome in infected human cells.
Characterizing the interactions that SARS-CoV-2 viral RNAs make with host cell proteins during infection can improve our understanding of viral RNA functions and the host innate immune response. Using RNA antisense purification and mass spectrometry, we identified up to 104 human proteins that directly and specifically bind to SARS-CoV-2 RNAs in infected human cells. We integrated the SARS-CoV-2 RNA interactome with changes in proteome abundance induced by viral infection and linked interactome proteins to cellular pathways relevant to SARS-CoV-2 infections. We demonstrated by genetic perturbation that cellular nucleic acid-binding protein (CNBP) and La-related protein 1 (LARP1), two of the most strongly enriched viral RNA binders, restrict SARS-CoV-2 replication in infected cells and provide a global map of their direct RNA contact sites. Pharmacological inhibition of three other RNA interactome members, PPIA, ATP1A1, and the ARP2/3 complex, reduced viral replication in two human cell lines. The identification of host dependency factors and defence strategies as presented in this work will improve the design of targeted therapeutics against SARS-CoV-2.
View details for DOI 10.1038/s41564-020-00846-z
View details for PubMedID 33349665
Chromatin Potential Identified by Shared Single-Cell Profiling of RNA and Chromatin.
Cell differentiation and function are regulated across multiple layers of gene regulation, including modulation of gene expression by changes in chromatin accessibility. However, differentiation is an asynchronous process precluding a temporal understanding of regulatory events leading to cell fate commitment. Here we developed simultaneous high-throughput ATAC and RNA expression with sequencing (SHARE-seq), a highly scalable approach for measurement of chromatin accessibility and gene expression in the same single cell, applicable to different tissues. Using 34,774 joint profiles from mouse skin, we develop a computational strategy to identify cis-regulatory interactions and define domains of regulatory chromatin (DORCs) that significantly overlap with super-enhancers. During lineage commitment, chromatin accessibility at DORCs precedes gene expression, suggesting that changes in chromatin accessibility may prime cells for lineage commitment. We computationally infer chromatin potential as a quantitative measure of chromatin lineage-priming and use it to predict cell fate outcomes. SHARE-seq is an extensible platform to study regulatory circuitry across diverse cells in tissues.
View details for DOI 10.1016/j.cell.2020.09.056
View details for PubMedID 33098772
Inherited myeloproliferative neoplasm risk affects haematopoietic stem cells.
Myeloproliferative neoplasms (MPNs) are blood cancers that are characterized by the excessive production of mature myeloid cells and arise from the acquisition of somatic driver mutations in haematopoietic stem cells (HSCs). Epidemiological studies indicate a substantial heritable component of MPNs that is among the highest known for cancers [1]. However, only a limited number of genetic risk loci have been identified, and the underlying biological mechanisms that lead to the acquisition of MPNs remain unclear. Here, by conducting a large-scale genome-wide association study (3,797 cases and 1,152,977 controls), we identify 17 MPN risk loci (P < 5.0 × 10⁻⁸), 7 of which have not been previously reported. We find that there is a shared genetic architecture between MPN risk and several haematopoietic traits from distinct lineages; that there is an enrichment for MPN risk variants within accessible chromatin of HSCs; and that increased MPN risk is associated with longer telomere length in leukocytes and other clonal haematopoietic states, collectively suggesting that MPN risk is associated with the function and self-renewal of HSCs. We use gene mapping to identify modulators of HSC biology linked to MPN risk, and show through targeted variant-to-function assays that CHEK2 and GFI1B have roles in altering the function of HSCs to confer disease risk. Overall, our results reveal a previously unappreciated mechanism for inherited MPN risk through the modulation of HSC function.
View details for DOI 10.1038/s41586-020-2786-7
View details for PubMedID 33057200
The Polygenic and Monogenic Basis of Blood Traits and Diseases.
2020; 182 (5): 1214
Blood cells play essential roles in human health, underpinning physiological processes such as immunity, oxygen transport, and clotting, which when perturbed cause a significant global health burden. Here we integrate data from UK Biobank and a large-scale international collaborative effort, including data for 563,085 European ancestry participants, and discover 5,106 new genetic variants independently associated with 29 blood cell phenotypes covering a range of variation impacting hematopoiesis. We holistically characterize the genetic architecture of hematopoiesis, assess the relevance of the omnigenic model to blood cell phenotypes, delineate relevant hematopoietic cell states influenced by regulatory genetic variants and gene networks, identify novel splice-altering variants mediating the associations, and assess the polygenic prediction potential for blood traits and clinical disorders at the interface of complex and Mendelian genetics. These results show the power of large-scale blood cell trait GWAS to interrogate clinically meaningful variants across a wide allelic spectrum of human variation.
View details for DOI 10.1016/j.cell.2020.08.008
View details for PubMedID 32888494
Trans-ethnic and Ancestry-Specific Blood-Cell Genetics in 746,667 Individuals from 5 Global Populations.
2020; 182 (5): 1198
Most loci identified by GWASs have been found in populations of European ancestry (EUR). In trans-ethnic meta-analyses for 15 hematological traits in 746,667 participants, including 184,535 non-EUR individuals, we identified 5,552 trait-variant associations at p < 5 × 10^-9, including 71 novel associations not found in EUR populations. We also identified 28 additional novel variants in ancestry-specific, non-EUR meta-analyses, including an IL7 missense variant in South Asians associated with lymphocyte count in vivo and IL-7 secretion levels in vitro. Fine-mapping prioritized variants annotated as functional and generated 95% credible sets that were 30% smaller when using the trans-ethnic as opposed to the EUR-only results. We explored the clinical significance and predictive value of trans-ethnic variants in multiple populations and compared genetic architecture and the effect of natural selection on these blood phenotypes between populations. Altogether, our results for hematological traits highlight the value of a more global representation of populations in genetic studies.
View details for DOI 10.1016/j.cell.2020.06.045
View details for PubMedID 32888493
Epigenomic State Transitions Characterize Tumor Progression in Mouse Lung Adenocarcinoma.
2020; 38 (2): 212
Regulatory networks that maintain functional, differentiated cell states are often dysregulated in tumor development. Here, we use single-cell epigenomics to profile chromatin state transitions in a mouse model of lung adenocarcinoma (LUAD). We identify an epigenomic continuum representing loss of cellular identity and progression toward a metastatic state. We define co-accessible regulatory programs and infer key activating and repressive chromatin regulators of these cell states. Among these co-accessibility programs, we identify a pre-metastatic transition, characterized by activation of RUNX transcription factors, which mediates extracellular matrix remodeling to promote metastasis and is predictive of survival across human LUAD patients. Together, these results demonstrate the power of single-cell epigenomics to identify regulatory programs to uncover mechanisms and key biomarkers of tumor progression.
View details for DOI 10.1016/j.ccell.2020.06.006
View details for PubMedID 32707078
A dual-deaminase CRISPR base editor enables concurrent adenine and cytosine editing
2020; 38 (7): 861–U27
Existing adenine and cytosine base editors induce only a single type of modification, limiting the range of DNA alterations that can be created. Here we describe a CRISPR-Cas9-based synchronous programmable adenine and cytosine editor (SPACE) that can concurrently introduce A-to-G and C-to-T substitutions with minimal RNA off-target edits. SPACE expands the range of possible DNA sequence alterations, broadening the research applications of CRISPR base editors.
View details for DOI 10.1038/s41587-020-0535-y
View details for Web of Science ID 000537041400001
View details for PubMedID 32483364
Prioritizing disease and trait causal variants at the TNFAIP3 locus using functional and genomic features
2020; 11 (1): 1237
Genome-wide association studies have associated thousands of genetic variants with complex traits and diseases, but pinpointing the causal variant(s) among those in tight linkage disequilibrium with each associated variant remains a major challenge. Here, we use seven experimental assays to characterize all common variants at the multiple disease-associated TNFAIP3 locus in five disease-relevant immune cell lines, based on a set of features related to regulatory potential. Trait/disease-associated variants are enriched among SNPs prioritized based on either: (1) residing within CRISPRi-sensitive regulatory regions, or (2) localizing in a chromatin accessible region while displaying allele-specific reporter activity. Of the 15 trait/disease-associated haplotypes at TNFAIP3, 9 have at least one variant meeting one or both of these criteria, 5 of which are further supported by genetic fine-mapping. Our work provides a comprehensive strategy to characterize genetic variation at important disease-associated loci, and aids in the effort to identify trait causal genetic variants.
View details for DOI 10.1038/s41467-020-15022-4
View details for Web of Science ID 000549162600014
View details for PubMedID 32144282
View details for PubMedCentralID PMC7060350
Inference and effects of barcode multiplets in droplet-based single-cell assays
2020; 11 (1): 866
A widespread assumption for single-cell analyses specifies that one cell's nucleic acids are predominantly captured by one oligonucleotide barcode. Here, we show that ~13-21% of cell barcodes from the 10x Chromium scATAC-seq assay may have been derived from a droplet with more than one oligonucleotide sequence, which we call "barcode multiplets". We demonstrate that barcode multiplets can be derived from at least two different sources. First, we confirm that approximately 4% of droplets from the 10x platform may contain multiple beads. Additionally, we find that approximately 5% of beads may contain detectable levels of multiple oligonucleotide barcodes. We show that this artifact can confound single-cell analyses, including the interpretation of clonal diversity and proliferation of intra-tumor lymphocytes. Overall, our work provides a conceptual and computational framework to identify and assess the impacts of barcode multiplets in single-cell data.
View details for DOI 10.1038/s41467-020-14667-5
View details for Web of Science ID 000514928000007
View details for PubMedID 32054859
View details for PubMedCentralID PMC7018801
Control of human hemoglobin switching by LIN28B-mediated regulation of BCL11A translation
2020; 52 (2): 138-+
Increased production of fetal hemoglobin (HbF) can ameliorate the severity of sickle cell disease and β-thalassemia1. BCL11A represses the genes encoding HbF and regulates human hemoglobin switching through variation in its expression during development2-7. However, the mechanisms underlying the developmental expression of BCL11A remain mysterious. Here we show that BCL11A is regulated at the level of messenger RNA (mRNA) translation during human hematopoietic development. Despite decreased BCL11A protein synthesis earlier in development, BCL11A mRNA continues to be associated with ribosomes. Through unbiased genomic and proteomic analyses, we demonstrate that the RNA-binding protein LIN28B, which is developmentally expressed in a pattern reciprocal to that of BCL11A, directly interacts with ribosomes and BCL11A mRNA. Furthermore, we show that BCL11A mRNA translation is suppressed by LIN28B through direct interactions, independently of its role in regulating let-7 microRNAs, and that BCL11A is the major target of LIN28B-mediated HbF induction. Our results reveal a previously unappreciated mechanism underlying human hemoglobin switching that illuminates new therapeutic opportunities.
View details for DOI 10.1038/s41588-019-0568-7
View details for Web of Science ID 000508324400002
View details for PubMedID 31959994
View details for PubMedCentralID PMC7031047
- An old BATF's new T-ricks. Nature immunology 2020
Purifying Selection against Pathogenic Mitochondrial DNA in Human T Cells.
The New England journal of medicine
Many mitochondrial diseases are caused by mutations in mitochondrial DNA (mtDNA). Patients' cells contain a mixture of mutant and nonmutant mtDNA (a phenomenon called heteroplasmy). The proportion of mutant mtDNA varies across patients and among tissues within a patient. We simultaneously assayed single-cell heteroplasmy and cell state in thousands of blood cells obtained from three unrelated patients who had A3243G-associated mitochondrial encephalomyopathy, lactic acidosis, and strokelike episodes. We observed a broad range of heteroplasmy across all cell types but also found markedly reduced heteroplasmy in T cells, a finding consistent with purifying selection within this lineage. We observed this pattern in six additional patients who had heteroplasmic A3243G without strokelike episodes. (Funded by the Marriott Foundation and others.)
View details for DOI 10.1056/NEJMoa2001265
View details for PubMedID 32786181
Massively parallel single-cell mitochondrial DNA genotyping and chromatin profiling.
Natural mitochondrial DNA (mtDNA) mutations enable the inference of clonal relationships among cells. mtDNA can be profiled along with measures of cell state, but has not yet been combined with the massively parallel approaches needed to tackle the complexity of human tissue. Here, we introduce a high-throughput, droplet-based mitochondrial single-cell assay for transposase-accessible chromatin with sequencing (scATAC-seq), a method that combines high-confidence mtDNA mutation calling in thousands of single cells with their concomitant high-quality accessible chromatin profile. This enables the inference of mtDNA heteroplasmy, clonal relationships, cell state and accessible chromatin variation in individual cells. We reveal single-cell variation in heteroplasmy of a pathologic mtDNA variant, which we associate with intra-individual chromatin variability and clonal evolution. We clonally trace thousands of cells from cancers, linking epigenomic variability to subclonal evolution, and infer cellular dynamics of differentiating hematopoietic cells in vitro and in vivo. Taken together, our approach enables the study of cellular population dynamics and clonal properties in vivo.
View details for DOI 10.1038/s41587-020-0645-6
View details for PubMedID 32788668
Single Cell Transcriptomics Implicate Novel Monocyte and T Cell Immune Dysregulation in Sarcoidosis.
Frontiers in immunology
2020; 11: 567342
Sarcoidosis is a systemic inflammatory disease characterized by infiltration of immune cells into granulomas. Previous gene expression studies using heterogeneous cell mixtures lack insight into cell-type-specific immune dysregulation. We performed the first single-cell RNA-sequencing study of sarcoidosis in peripheral immune cells in 48 patients and controls. Following unbiased clustering, differentially expressed genes were identified for 18 cell types and bioinformatically assessed for function and pathway enrichment. Our results reveal persistent activation of circulating classical monocytes with subsequent upregulation of trafficking molecules. Specifically, classical monocytes upregulated distinct markers of activation including adhesion molecules, pattern recognition receptors, and chemokine receptors, as well as enrichment of immunoregulatory pathways HMGB1, mTOR, and ephrin receptor signaling. Predictive modeling implicated TGFbeta and mTOR signaling as drivers of persistent monocyte activation. Additionally, sarcoidosis T cell subsets displayed patterns of dysregulation. CD4 naive T cells were enriched for markers of apoptosis and Th17/Treg differentiation, while effector T cells showed enrichment of anergy-related pathways. Differentially expressed genes in regulatory T cells suggested dysfunctional p53, cell death, and TNFR2 signaling. Using more sensitive technology and more precise units of measure, we identify cell-type specific, novel inflammatory and regulatory pathways. Based on our findings, we suggest a novel model involving four convergent arms of dysregulation: persistent hyperactivation of innate and adaptive immunity via classical monocytes and CD4 naive T cells, regulatory T cell dysfunction, and effector T cell anergy. We further our understanding of the immunopathology of sarcoidosis and point to novel therapeutic targets.
View details for DOI 10.3389/fimmu.2020.567342
View details for PubMedID 33363531
Large-Scale Topological Changes Restrain Malignant Progression in Colorectal Cancer.
Widespread changes to DNA methylation and chromatin are well documented in cancer, but the fate of higher-order chromosomal structure remains obscure. Here we integrated topological maps for colon tumors and normal colons with epigenetic, transcriptional, and imaging data to characterize alterations to chromatin loops, topologically associated domains, and large-scale compartments. We found that spatial partitioning of the open and closed genome compartments is profoundly compromised in tumors. This reorganization is accompanied by compartment-specific hypomethylation and chromatin changes. Additionally, we identify a compartment at the interface between the canonical A and B compartments that is reorganized in tumors. Remarkably, similar shifts were evident in non-malignant cells that have accumulated excess divisions. Our analyses suggest that these topological changes repress stemness and invasion programs while inducing anti-tumor immunity genes and may therefore restrain malignant progression. Our findings call into question the conventional view that tumor-associated epigenomic alterations are primarily oncogenic.
View details for DOI 10.1016/j.cell.2020.07.030
View details for PubMedID 32841603
Longitudinal assessment of clonal mosaicism in human hematopoiesis via mitochondrial mutation tracking
2019; 3 (24): 4161–65
Our ability to track cellular dynamics in humans over time in vivo has been limited. Here, we demonstrate how somatic mutations in mitochondrial DNA (mtDNA) can be used to longitudinally track the dynamic output of hematopoietic stem and progenitor cells in humans. Over the course of 3 years of blood sampling in a single individual, our analyses reveal somatic mtDNA sequence variation and evolution reminiscent of models of hematopoiesis established by genetic labeling approaches. Furthermore, we observe fluctuations in mutation heteroplasmy, coinciding with specific clinical events, such as infections, and further identify lineage-specific somatic mtDNA mutations in longitudinally sampled circulating blood cell subsets in individuals with leukemia. Collectively, these observations indicate the significant potential of using tracking of somatic mtDNA sequence variation as a broadly applicable approach to systematically assess hematopoietic clonal dynamics in human health and disease.
View details for DOI 10.1182/bloodadvances.2019001196
View details for Web of Science ID 000504042200003
View details for PubMedID 31841597
View details for PubMedCentralID PMC6929387
Activity-by-contact model of enhancer-promoter regulation from thousands of CRISPR perturbations
2019; 51 (12): 1664-+
Enhancer elements in the human genome control how genes are expressed in specific cell types and harbor thousands of genetic variants that influence risk for common diseases1-4. Yet, we still do not know how enhancers regulate specific genes, and we lack general rules to predict enhancer-gene connections across cell types5,6. We developed an experimental approach, CRISPRi-FlowFISH, to perturb enhancers in the genome, and we applied it to test >3,500 potential enhancer-gene connections for 30 genes. We found that a simple activity-by-contact model substantially outperformed previous methods at predicting the complex connections in our CRISPR dataset. This activity-by-contact model allows us to construct genome-wide maps of enhancer-gene connections in a given cell type, on the basis of chromatin state measurements. Together, CRISPRi-FlowFISH and the activity-by-contact model provide a systematic approach to map and predict which enhancers regulate which genes, and will help to interpret the functions of the thousands of disease risk variants in the noncoding genome.
View details for DOI 10.1038/s41588-019-0538-0
View details for Web of Science ID 000499696700003
View details for PubMedID 31784727
View details for PubMedCentralID PMC6886585
Assessment of computational methods for the analysis of single-cell ATAC-seq data
2019; 20 (1): 241
Recent innovations in single-cell Assay for Transposase Accessible Chromatin using sequencing (scATAC-seq) enable profiling of the epigenetic landscape of thousands of individual cells. scATAC-seq data analysis presents unique methodological challenges. scATAC-seq experiments sample DNA, which, due to low copy numbers (diploid in humans), lead to inherent data sparsity (1-10% of peaks detected per cell) compared to transcriptomic (scRNA-seq) data (10-45% of expressed genes detected per cell). Such challenges in data generation emphasize the need for informative features to assess cell heterogeneity at the chromatin level. We present a benchmarking framework that is applied to 10 computational methods for scATAC-seq on 13 synthetic and real datasets from different assays, profiling cell types from diverse tissues and organisms. Methods for processing and featurizing scATAC-seq data were compared by their ability to discriminate cell types when combined with common unsupervised clustering approaches. We rank evaluated methods and discuss computational challenges associated with scATAC-seq analysis including inherently sparse data, determination of features, peak calling, the effects of sequencing coverage and noise, and clustering performance. Running times and memory requirements are also discussed. This reference summary of scATAC-seq methods offers recommendations for best practices with consideration for both the non-expert user and the methods developer. Despite variation across methods and datasets, SnapATAC, Cusanovich2018, and cisTopic outperform other methods in separating cell populations of different coverages and noise levels in both synthetic and real datasets. Notably, SnapATAC is the only method able to analyze a large dataset (> 80,000 cells).
View details for DOI 10.1186/s13059-019-1854-5
View details for Web of Science ID 000501809500001
View details for PubMedID 31739806
View details for PubMedCentralID PMC6859644
CRISPR DNA base editors with reduced RNA off-target and self-editing activities
2019; 37 (9): 1041-+
Cytosine or adenine base editors (CBEs or ABEs) can introduce specific DNA C-to-T or A-to-G alterations1-4. However, we recently demonstrated that they can also induce transcriptome-wide guide-RNA-independent editing of RNA bases5, and created selective curbing of unwanted RNA editing (SECURE)-BE3 variants that have reduced unwanted RNA-editing activity5. Here we describe structure-guided engineering of SECURE-ABE variants with reduced off-target RNA-editing activity and comparable on-target DNA-editing activity that are also among the smallest Streptococcus pyogenes Cas9 base editors described to date. We also tested CBEs with cytidine deaminases other than APOBEC1 and found that the human APOBEC3A-based CBE induces substantial editing of RNA bases, whereas an enhanced APOBEC3A-based CBE6, human activation-induced cytidine deaminase-based CBE7, and the Petromyzon marinus cytidine deaminase-based CBE Target-AID4 induce less editing of RNA. Finally, we found that CBEs and ABEs that exhibit RNA off-target editing activity can also self-edit their own transcripts, thereby leading to heterogeneity in base-editor coding sequences.
View details for DOI 10.1038/s41587-019-0236-6
View details for Web of Science ID 000488532200020
View details for PubMedID 31477922
View details for PubMedCentralID PMC6730565
Droplet-based combinatorial indexing for massive-scale single-cell chromatin accessibility
2019; 37 (8): 916-+
Recent technical advancements have facilitated the mapping of epigenomes at single-cell resolution; however, the throughput and quality of these methods have limited their widespread adoption. Here we describe a high-quality (10^5 nuclear fragments per cell) droplet-microfluidics-based method for single-cell profiling of chromatin accessibility. We use this approach, named 'droplet single-cell assay for transposase-accessible chromatin using sequencing' (dscATAC-seq), to assay 46,653 cells for the unbiased discovery of cell types and regulatory elements in adult mouse brain. We further increase the throughput of this platform by combining it with combinatorial indexing (dsciATAC-seq), enabling single-cell studies at a massive scale. We demonstrate the utility of this approach by measuring chromatin accessibility across 136,463 resting and stimulated human bone marrow-derived cells to reveal changes in the cis- and trans-regulatory landscape across cell types and under stimulatory conditions at single-cell resolution. Altogether, we describe a total of 510,123 single-cell profiles, demonstrating the scalability and flexibility of this droplet-based platform.
View details for DOI 10.1038/s41587-019-0147-6
View details for Web of Science ID 000482876100023
View details for PubMedID 31235917
Transcriptional States and Chromatin Accessibility Underlying Human Erythropoiesis
2019; 27 (11): 3228-+
Human erythropoiesis serves as a paradigm of physiologic cellular differentiation. This process is also of considerable interest for better understanding anemias and identifying new therapies. Here, we apply deep transcriptomic and accessible chromatin profiling to characterize a faithful ex vivo human erythroid differentiation system from hematopoietic stem and progenitor cells. We reveal stage-specific transcriptional states and chromatin accessibility during various stages of erythropoiesis, including 14,260 differentially expressed genes and 63,659 variably accessible chromatin peaks. Our analysis suggests differentiation stage-predominant roles for specific master regulators, including GATA1 and KLF1. We integrate chromatin profiles with common and rare genetic variants associated with erythroid cell traits and diseases, finding that variants regulating different erythroid phenotypes likely act at variable points during differentiation. In addition, we identify a regulator of terminal erythropoiesis, TMCC2, more broadly illustrating the value of this comprehensive analysis to improve our understanding of erythropoiesis in health and disease.
View details for DOI 10.1016/j.celrep.2019.05.046
View details for Web of Science ID 000470993200011
View details for PubMedID 31189107
View details for PubMedCentralID PMC6579117
Transcriptome-wide off-target RNA editing induced by CRISPR-guided DNA base editors
2019; 569 (7756): 433-+
CRISPR-Cas base-editor technology enables targeted nucleotide alterations, and is being increasingly used for research and potential therapeutic applications1,2. The most widely used cytosine base editors (CBEs) induce deamination of DNA cytosines using the rat APOBEC1 enzyme, which is targeted by a linked Cas protein-guide RNA complex3,4. Previous studies of the specificity of CBEs have identified off-target DNA edits in mammalian cells5,6. Here we show that a CBE with rat APOBEC1 can cause extensive transcriptome-wide deamination of RNA cytosines in human cells, inducing tens of thousands of C-to-U edits with frequencies ranging from 0.07% to 100% in 38-58% of expressed genes. CBE-induced RNA edits occur in both protein-coding and non-protein-coding sequences and generate missense, nonsense, splice site, and 5' and 3' untranslated region mutations. We engineered two CBE variants bearing mutations in rat APOBEC1 that substantially decreased the number of RNA edits (by more than 390-fold and more than 3,800-fold) in human cells. These variants also showed more precise on-target DNA editing than the wild-type CBE and, for most guide RNAs tested, no substantial reduction in editing efficiency. Finally, we show that an adenine base editor7 can also induce transcriptome-wide RNA edits. These results have implications for the use of base editors in both research and clinical settings, illustrate the feasibility of engineering improved variants with reduced RNA editing activities, and suggest the need to more fully define and characterize the RNA off-target effects of deaminase enzymes in base editor platforms.
View details for DOI 10.1038/s41586-019-1161-z
View details for Web of Science ID 000468123700044
View details for PubMedID 30995674
View details for PubMedCentralID PMC6657343
Gene-centric functional dissection of human genetic variation uncovers regulators of hematopoiesis
Genome-wide association studies (GWAS) have identified thousands of variants associated with human diseases and traits. However, the majority of GWAS-implicated variants are in non-coding regions of the genome and require in depth follow-up to identify target genes and decipher biological mechanisms. Here, rather than focusing on causal variants, we have undertaken a pooled loss-of-function screen in primary hematopoietic cells to interrogate 389 candidate genes contained in 75 loci associated with red blood cell traits. Using this approach, we identify 77 genes at 38 GWAS loci, with most loci harboring 1-2 candidate genes. Importantly, the hit set was strongly enriched for genes validated through orthogonal genetic approaches. Genes identified by this approach are enriched in specific and relevant biological pathways, allowing regulators of human erythropoiesis and modifiers of blood diseases to be defined. More generally, this functional screen provides a paradigm for gene-centric follow up of GWAS for a variety of human diseases and traits.
View details for DOI 10.7554/eLife.44080
View details for Web of Science ID 000468967900001
View details for PubMedID 31070582
View details for PubMedCentralID PMC6534380
Impaired human hematopoiesis due to a cryptic intronic GATA1 splicing mutation
JOURNAL OF EXPERIMENTAL MEDICINE
2019; 216 (5): 1050–60
Studies of allelic variation underlying genetic blood disorders have provided important insights into human hematopoiesis. Most often, the identified pathogenic mutations result in loss-of-function or missense changes. However, assessing the pathogenicity of noncoding variants can be challenging. Here, we characterize two unrelated patients with a distinct presentation of dyserythropoietic anemia and other impairments in hematopoiesis associated with an intronic mutation in GATA1 that is 24 nucleotides upstream of the canonical splice acceptor site. Functional studies demonstrate that this single-nucleotide alteration leads to reduced canonical splicing and increased use of an alternative splice acceptor site that causes a partial intron retention event. The resultant altered GATA1 contains a five-amino acid insertion at the C-terminus of the C-terminal zinc finger and has no observable activity. Collectively, our results demonstrate how altered splicing of GATA1, which reduces levels of the normal form of this master transcription factor, can result in distinct changes in human hematopoiesis.
View details for DOI 10.1084/jem.20181625
View details for Web of Science ID 000466981400009
View details for PubMedID 30914438
View details for PubMedCentralID PMC6504223
Heritability of fetal hemoglobin, white cell count, and other clinical traits from a sickle cell disease family cohort
AMERICAN JOURNAL OF HEMATOLOGY
2019; 94 (5): 522–27
Sickle cell disease (SCD) is the most common monogenic disorder in the world. Notably, there is extensive clinical heterogeneity in SCD that cannot be fully accounted for by known factors, and in particular, the extent to which the phenotypic diversity of SCD can be explained by genetic variation has not been reliably quantified. Here, in a family-based cohort of 449 patients with SCD and 755 relatives, we first show that 5 known modifiers affect 11 adverse outcomes in SCD to varying degrees. We then utilize a restricted maximum likelihood procedure to estimate the heritability of 20 hematologic traits, including fetal hemoglobin (HbF) and white blood cell count (WBC), in the clinically relevant context of inheritance from healthy carriers to SCD patients. We report novel estimations of heritability for HbF at 31.6% (±5.4%) and WBC at 41.2% (±6.8%) in our cohort. Finally, we demonstrate shared genetic bases between HbF, WBC, and other hematologic traits, but surprisingly little overlap between HbF and WBC themselves. In total, our analyses show that HbF and WBC have significant heritable components among individuals with SCD and their relatives, demonstrating the value of using family-based studies to better understand modifiers of SCD.
View details for DOI 10.1002/ajh.25421
View details for Web of Science ID 000468303900002
View details for PubMedID 30680775
View details for PubMedCentralID PMC6449202
Single-cell trajectories reconstruction, exploration and mapping of omics data with STREAM
2019; 10: 1903
Single-cell transcriptomic assays have enabled the de novo reconstruction of lineage differentiation trajectories, along with the characterization of cellular heterogeneity and state transitions. Several methods have been developed for reconstructing developmental trajectories from single-cell transcriptomic data, but efforts on analyzing single-cell epigenomic data and on trajectory visualization remain limited. Here we present STREAM, an interactive pipeline capable of disentangling and visualizing complex branching trajectories from both single-cell transcriptomic and epigenomic data. We have tested STREAM on several synthetic and real datasets generated with different single-cell technologies. We further demonstrate its utility for understanding myoblast differentiation and disentangling known heterogeneity in hematopoiesis for different organisms. STREAM is an open-source software package.
View details for DOI 10.1038/s41467-019-09670-4
View details for Web of Science ID 000465202300008
View details for PubMedID 31015418
View details for PubMedCentralID PMC6478907
Novel CRISPR Cytosine Base Editors with Minimized Off-Target Effects and Improved Editing Properties
CELL PRESS. 2019: 295
View details for Web of Science ID 000464381003087
The ATPase module of mammalian SWI/SNF family complexes mediates subcomplex identity and catalytic activity-independent genomic targeting
2019; 51 (4): 618-+
Perturbations to mammalian switch/sucrose non-fermentable (mSWI/SNF) chromatin remodeling complexes have been widely implicated as driving events in cancer1. One such perturbation is the dual loss of the SMARCA4 and SMARCA2 ATPase subunits in small cell carcinoma of the ovary, hypercalcemic type (SCCOHT)2-5, SMARCA4-deficient thoracic sarcomas6 and dedifferentiated endometrial carcinomas7. However, the consequences of dual ATPase subunit loss on mSWI/SNF complex subunit composition, chromatin targeting, DNA accessibility and gene expression remain unknown. Here we identify an ATPase module of subunits that is required for functional specification of the Brahma-related gene-associated factor (BAF) and polybromo-associated BAF (PBAF) mSWI/SNF family subcomplexes. Using SMARCA4/2 ATPase mutant variants, we define the catalytic activity-dependent and catalytic activity-independent contributions of the ATPase module to the targeting of BAF and PBAF complexes on chromatin genome-wide. Finally, by linking distinct mSWI/SNF complex target sites to tumor-suppressive gene expression programs, we clarify the transcriptional consequences of SMARCA4/2 dual loss in SCCOHT.
View details for DOI 10.1038/s41588-019-0363-5
View details for Web of Science ID 000462767500009
View details for PubMedID 30858614
View details for PubMedCentralID PMC6755913
- Interrogation of human hematopoiesis at single-cell and single-variant resolution NATURE GENETICS 2019; 51 (4): 683-+
Lineage Tracing in Humans Enabled by Mitochondrial Mutations and Single-Cell Genomics
2019; 176 (6): 1325-+
Lineage tracing provides key insights into the fate of individual cells in complex organisms. Although effective genetic labeling approaches are available in model systems, in humans, most approaches require detection of nuclear somatic mutations, which have high error rates, limited scale, and do not capture cell state information. Here, we show that somatic mutations in mtDNA can be tracked by single-cell RNA or assay for transposase accessible chromatin (ATAC) sequencing. We leverage somatic mtDNA mutations as natural genetic barcodes and demonstrate their utility as highly accurate clonal markers to infer cellular relationships. We track native human cells both in vitro and in vivo and relate clonal dynamics to gene expression and chromatin accessibility. Our approach should allow clonal tracking at a 1,000-fold greater scale than with nuclear genome sequencing, with simultaneous information on cell state, opening the way to chart cellular dynamics in human health and disease.
View details for DOI 10.1016/j.cell.2019.01.022
View details for Web of Science ID 000460509600009
View details for PubMedID 30827679
View details for PubMedCentralID PMC6408267
The cis-Regulatory Atlas of the Mouse Immune System
2019; 176 (4): 897-+
A complete chart of cis-regulatory elements and their dynamic activity is necessary to understand the transcriptional basis of differentiation and function of an organ system. We generated matched epigenome and transcriptome measurements in 86 primary cell types that span the mouse immune system and its differentiation cascades. This breadth of data enables variance components analysis that suggests that genes fall into two distinct classes, controlled by either enhancer- or promoter-driven logic, and multiple regression that connects genes to the enhancers that regulate them. Relating transcription factor (TF) expression to the genome-wide accessibility of their binding motifs classifies them as predominantly openers or closers of local chromatin accessibility, pinpointing specific cis-regulatory elements where binding of given TFs is likely functionally relevant, validated by chromatin immunoprecipitation sequencing (ChIP-seq). Overall, this cis-regulatory atlas provides a trove of information on transcriptional regulation through immune differentiation and a foundational scaffold to define key regulatory events throughout the immunological genome.
View details for DOI 10.1016/j.cell.2018.12.036
View details for Web of Science ID 000457969200019
View details for PubMedID 30686579
View details for PubMedCentralID PMC6785993
Preprocessing and Computational Analysis of Single-Cell Epigenomic Datasets.
Methods in molecular biology (Clifton, N.J.)
2019; 1935: 187–202
Recent technological developments have enabled the characterization of the epigenetic landscape of single cells across a range of tissues in normal and diseased states and under various biological and chemical perturbations. While analysis of these profiles resembles methods from single-cell transcriptomic studies, unique challenges are associated with bioinformatics processing of single-cell epigenetic data, including a much larger (10-1,000×) feature set and significantly greater sparsity, requiring customized solutions. Here, we discuss the essentials of the computational methodology required for analyzing common single-cell epigenomic measurements for DNA methylation using bisulfite sequencing and open chromatin using ATAC-Seq.
View details for DOI 10.1007/978-1-4939-9057-3_13
View details for PubMedID 30758828
A non-canonical SWI/SNF complex is a synthetic lethal target in cancers driven by BAF complex perturbation
NATURE CELL BIOLOGY
2018; 20 (12): 1410-+
Mammalian SWI/SNF chromatin remodelling complexes exist in three distinct, final-form assemblies: canonical BAF (cBAF), PBAF and a newly characterized non-canonical complex (ncBAF). However, their complex-specific targeting on chromatin, functions and roles in disease remain largely undefined. Here, we comprehensively mapped complex assemblies on chromatin and found that ncBAF complexes uniquely localize to CTCF sites and promoters. We identified ncBAF subunits as synthetic lethal targets specific to synovial sarcoma and malignant rhabdoid tumours, which both exhibit cBAF complex (SMARCB1 subunit) perturbation. Chemical and biological depletion of the ncBAF subunit, BRD9, rapidly attenuates synovial sarcoma and malignant rhabdoid tumour cell proliferation. Importantly, in cBAF-perturbed cancers, ncBAF complexes maintain gene expression at retained CTCF-promoter sites and function in a manner distinct from fusion oncoprotein-bound complexes. Together, these findings unmask the unique targeting and functional roles of ncBAF complexes and present new cancer-specific therapeutic targets.
View details for DOI 10.1038/s41556-018-0221-1
View details for Web of Science ID 000451328500013
View details for PubMedID 30397315
View details for PubMedCentralID PMC6698386
Enhancer histone-QTLs are enriched on autoimmune risk haplotypes and influence gene expression within chromatin networks
2018; 9: 2905
Genetic variants can confer risk to complex genetic diseases by modulating gene expression through changes to the epigenome. To assess the degree to which genetic variants influence epigenome activity, we integrate epigenetic and genotypic data from lupus patient lymphoblastoid cell lines to identify variants that induce allelic imbalance in the magnitude of histone post-translational modifications, referred to herein as histone quantitative trait loci (hQTLs). We demonstrate that enhancer hQTLs are enriched on autoimmune disease risk haplotypes and disproportionately influence gene expression variability compared with non-hQTL variants in strong linkage disequilibrium. We show that the epigenome regulates HLA class II genes differently in individuals who carry HLA-DR3 or HLA-DR15 haplotypes, resulting in differential 3D chromatin conformation and gene expression. Finally, we identify significant expression QTL (eQTL) x hQTL interactions that reveal substructure within eQTL gene expression, suggesting potential implications for functional genomic studies that leverage eQTL data for subject selection and stratification.
View details for DOI 10.1038/s41467-018-05328-9
View details for Web of Science ID 000439687600003
View details for PubMedID 30046115
View details for PubMedCentralID PMC6060153
Integrated Single-Cell Analysis Maps the Continuous Regulatory Landscape of Human Hematopoietic Differentiation
2018; 173 (6): 1535-+
Human hematopoiesis involves cellular differentiation of multipotent cells into progressively more lineage-restricted states. While the chromatin accessibility landscape of this process has been explored in defined populations, single-cell regulatory variation has been hidden by ensemble averaging. We collected single-cell chromatin accessibility profiles across 10 populations of immunophenotypically defined human hematopoietic cell types and constructed a chromatin accessibility landscape of human hematopoiesis to characterize differentiation trajectories. We find variation consistent with lineage bias toward different developmental branches in multipotent cell types. We observe heterogeneity within common myeloid progenitors (CMPs) and granulocyte-macrophage progenitors (GMPs) and develop a strategy to partition GMPs along their differentiation trajectory. Furthermore, we integrated single-cell RNA sequencing (scRNA-seq) data to associate transcription factors to chromatin accessibility changes and regulatory elements to target genes through correlations of expression and regulatory element accessibility. Overall, this work provides a framework for integrative exploration of complex regulatory dynamics in a primary human tissue at single-cell resolution.
View details for PubMedID 29706549
Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types
2018; 50 (4): 621-+
We introduce an approach to identify disease-relevant tissues and cell types by analyzing gene expression data together with genome-wide association study (GWAS) summary statistics. Our approach uses stratified linkage disequilibrium (LD) score regression to test whether disease heritability is enriched in regions surrounding genes with the highest specific expression in a given tissue. We applied our approach to gene expression data from several sources together with GWAS summary statistics for 48 diseases and traits (average N = 169,331) and found significant tissue-specific enrichments (false discovery rate (FDR) < 5%) for 34 traits. In our analysis of multiple tissues, we detected a broad range of enrichments that recapitulated known biology. In our brain-specific analysis, significant enrichments included an enrichment of inhibitory over excitatory neurons for bipolar disorder, and excitatory over inhibitory neurons for schizophrenia and body mass index. Our results demonstrate that our polygenic approach is a powerful way to leverage gene expression data for interpreting GWAS signals.
View details for DOI 10.1038/s41588-018-0081-4
View details for Web of Science ID 000429529300022
View details for PubMedID 29632380
View details for PubMedCentralID PMC5896795
Response to "Unexpected mutations after CRISPR-Cas9 editing in vivo"
NATURE METHODS
2018; 15 (4): 238–39
hichipper: a preprocessing pipeline for calling DNA loops from HiChIP data
NATURE METHODS
2018; 15 (3): 155–56
diffloop: a computational framework for identifying and analyzing differential DNA loops from sequencing data
2018; 34 (4): 672–74
The 3D architecture of DNA within the nucleus is a key determinant of interactions between genes, regulatory elements, and transcriptional machinery. As a result, differences in DNA looping structure are associated with variation in gene expression and cell state. To systematically assess changes in DNA looping architecture between samples, we introduce diffloop, an R/Bioconductor package that provides a suite of functions for the quality control, statistical testing, annotation, and visualization of DNA loops. We demonstrate this functionality by detecting differences between ENCODE ChIA-PET samples and relate looping to variability in epigenetic state. Diffloop is implemented as an R/Bioconductor package. Supplementary data are available at Bioinformatics online.
View details for DOI 10.1093/bioinformatics/btx623
View details for Web of Science ID 000424889300017
View details for PubMedID 29028898
View details for PubMedCentralID PMC5860605
Common genes associated with antidepressant response in mouse and man identify key role of glucocorticoid receptor sensitivity
2017; 15 (12): e2002690
Response to antidepressant treatment in major depressive disorder (MDD) cannot be predicted currently, leading to uncertainty in medication selection, increasing costs, and prolonged suffering for many patients. Despite tremendous efforts in identifying response-associated genes in large genome-wide association studies, the results have been fairly modest, underlining the need to establish conceptually novel strategies. For the identification of transcriptome signatures that can distinguish between treatment responders and nonresponders, we herein submit a novel animal experimental approach focusing on extreme phenotypes. We utilized the large variance in response to antidepressant treatment occurring in DBA/2J mice, enabling sample stratification into subpopulations of good and poor treatment responders to delineate response-associated signature transcript profiles in peripheral blood samples. As a proof of concept, we translated our murine data to the transcriptome data of a clinically relevant human cohort. A cluster of 259 differentially regulated genes was identified when peripheral transcriptome profiles of good and poor treatment responders were compared in the murine model. Differences in expression profiles from baseline to week 12 of the human orthologues selected on the basis of the murine transcript signature allowed prediction of response status with an accuracy of 76% in the patient population. Finally, we show that glucocorticoid receptor (GR)-regulated genes are significantly enriched in this cluster of antidepressant-response genes. Our findings point to the involvement of GR sensitivity as a potential key mechanism shaping response to antidepressant treatment and support the hypothesis that antidepressants could stimulate resilience-promoting molecular mechanisms. Our data highlight the suitability of an appropriate animal experimental approach for the discovery of treatment response-associated pathways across species.
View details for DOI 10.1371/journal.pbio.2002690
View details for Web of Science ID 000418943900003
View details for PubMedID 29283992
View details for PubMedCentralID PMC5746203
A B Cell Regulome Links Notch to Downstream Oncogenic Pathways in Small B Cell Lymphomas
2017; 21 (3): 784–97
Gain-of-function Notch mutations are recurrent in mature small B cell lymphomas such as mantle cell lymphoma (MCL) and chronic lymphocytic leukemia (CLL), but the Notch target genes that contribute to B cell oncogenesis are largely unknown. We performed integrative analysis of Notch-regulated transcripts, genomic binding of Notch transcription complexes, and genome conformation data to identify direct Notch target genes in MCL cell lines. This B cell Notch regulome is largely controlled through Notch-bound distal enhancers and includes genes involved in B cell receptor and cytokine signaling and the oncogene MYC, which sustains proliferation of Notch-dependent MCL cell lines via a Notch-regulated lineage-restricted enhancer complex. Expression of direct Notch target genes is associated with Notch activity in an MCL xenograft model and in CLL lymph node biopsies. Our findings provide key insights into the role of Notch in MCL and other B cell malignancies and have important implications for therapeutic targeting of Notch-dependent oncogenic pathways.
View details for DOI 10.1016/j.celrep.2017.09.066
View details for Web of Science ID 000413090600019
View details for PubMedID 29045844
View details for PubMedCentralID PMC5687286
An Epigenome-Guided Approach to Causal Variant Discovery in Autoimmune Disease
View details for Web of Science ID 000411824100184
Dissecting hematopoietic and renal cell heterogeneity in adult zebrafish at single-cell resolution using RNA sequencing
JOURNAL OF EXPERIMENTAL MEDICINE
2017; 214 (10): 2875–87
Recent advances in single-cell, transcriptomic profiling have provided unprecedented access to investigate cell heterogeneity during tissue and organ development. In this study, we used massively parallel, single-cell RNA sequencing to define cell heterogeneity within the zebrafish kidney marrow, constructing a comprehensive molecular atlas of definitive hematopoiesis and functionally distinct renal cells found in adult zebrafish. Because our method analyzed blood and kidney cells in an unbiased manner, our approach was useful in characterizing immune-cell deficiencies within DNA-protein kinase catalytic subunit (prkdc), interleukin-2 receptor γ a (il2rga), and double-homozygous-mutant fish, identifying blood cell losses in T, B, and natural killer cells within specific genetic mutants. Our analysis also uncovered novel cell types, including two classes of natural killer immune cells, classically defined and erythroid-primed hematopoietic stem and progenitor cells, mucin-secreting kidney cells, and kidney stem/progenitor cells. In total, our work provides the first, comprehensive, single-cell, transcriptomic analysis of kidney and marrow cells in the adult zebrafish.
View details for DOI 10.1084/jem.20170976
View details for Web of Science ID 000412015600006
View details for PubMedID 28878000
View details for PubMedCentralID PMC5626406
Confounding in ex vivo models of Diamond-Blackfan anemia
BLOOD
2017; 130 (9): 1165–68
Notch-Regulated Enhancers in B-Cell Lymphoma Activate MYC and Potentiate B-Cell Receptor Signaling
AMER SOC HEMATOLOGY. 2016
Computationally Efficient Solutions for Functionalizing Common Variants in Three-Dimensional Models
WILEY-BLACKWELL. 2015: 562
View details for Web of Science ID 000363340500092
Fine mapping of chromosome 15q25 implicates ZNF592 in neurosarcoidosis patients
ANNALS OF CLINICAL AND TRANSLATIONAL NEUROLOGY
2015; 2 (10): 972–77
Neurosarcoidosis is a clinical subtype of sarcoidosis characterized by the presence of granulomas in the nervous system. Here, we report a highly significant association with a variant (rs75652600, P = 3.12 × 10⁻⁸, odds ratio = 4.34) within a zinc finger gene, ZNF592, from an imputation-based fine-mapping study of the chromosomal region 15q25 in African-Americans with neurosarcoidosis. We validate the association with ZNF592, a gene previously shown to cause cerebellar ataxia, in a cohort of European-Americans with neurosarcoidosis by uncovering low-frequency variants with a similar risk effect size (chr15:85309284, P = 0.0021, odds ratio = 5.36).
View details for DOI 10.1002/acn3.229
View details for Web of Science ID 000367239000004
View details for PubMedID 26478897
View details for PubMedCentralID PMC4603380