Doctor of Philosophy, Tongji University (2016)
Stanley Qi, Postdoctoral Faculty Sponsor
CRISPhieRmix: a hierarchical mixture model for CRISPR pooled screens.
2018; 19 (1): 159
Pooled CRISPR screens allow researchers to interrogate genetic causes of complex phenotypes at the genome-wide scale and promise higher specificity and sensitivity compared to competing technologies. Unfortunately, two problems exist, particularly for CRISPRi/a screens: variability in guide efficiency and large rare off-target effects. We present a method, CRISPhieRmix, that resolves these issues by using a hierarchical mixture model with a broad-tailed null distribution. We show that CRISPhieRmix allows for more accurate and powerful inferences in large-scale pooled CRISPRi/a screens. We discuss key issues in the analysis and design of screens, particularly the number of guides needed for faithful full discovery.
View details for PubMedID 30296940
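The hierarchical mixture idea in the abstract above can be illustrated with a minimal sketch. This is not the CRISPhieRmix implementation: the Student-t null (standing in for a broad-tailed null), the normal alternative, and every parameter name and default value (`prior_hit`, `guide_efficiency`, `alt_mean`, `alt_sd`) are assumptions chosen for illustration only.

```python
import math


def student_t_pdf(x, df=3.0, scale=1.0):
    """Density of a scaled Student-t; an assumed stand-in for a broad-tailed null."""
    z = x / scale
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(z * z / df)) / scale


def normal_pdf(x, mean, sd):
    """Density of a normal distribution; an assumed alternative for effective guides."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def gene_posterior(log_fold_changes, prior_hit=0.1, guide_efficiency=0.5,
                   alt_mean=-2.0, alt_sd=1.0):
    """Posterior probability that a gene is a hit, given its guides' log fold changes.

    Null gene: every guide is drawn from the null density f0.
    Hit gene: each guide is effective with probability `guide_efficiency`
    (drawn from f1) and otherwise behaves like a null guide (drawn from f0).
    All parameter values here are hypothetical.
    """
    log_null = 0.0
    log_alt = 0.0
    for x in log_fold_changes:
        f0 = student_t_pdf(x)
        f1 = normal_pdf(x, alt_mean, alt_sd)
        log_null += math.log(f0)
        log_alt += math.log(guide_efficiency * f1 + (1 - guide_efficiency) * f0)
    log_hit = math.log(prior_hit) + log_alt
    log_not = math.log(1 - prior_hit) + log_null
    m = max(log_hit, log_not)  # subtract the max for numerical stability
    return math.exp(log_hit - m) / (math.exp(log_hit - m) + math.exp(log_not - m))


hit_like = gene_posterior([-2.1, -1.8, 0.1, -2.5, 0.0])   # several shifted guides
null_like = gene_posterior([0.1, -0.2, 0.3, 0.0, -0.1])   # all guides near zero
outlier = gene_posterior([-8.0, 0.0, 0.0, 0.0, 0.0])      # one extreme guide
```

In this toy setup, a gene with several moderately shifted guides scores higher than a gene with a single extreme guide, because the heavy-tailed null absorbs isolated outliers while the per-guide mixture tolerates ineffective guides within a true hit.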
CRISPR-Mediated Programmable 3D Genome Positioning and Nuclear Organization.
Programmable control of spatial genome organization is a powerful approach for studying how nuclear structure affects gene regulation and cellular function. Here, we develop a versatile CRISPR-genome organization (CRISPR-GO) system that can efficiently control the spatial positioning of genomic loci relative to specific nuclear compartments, including the nuclear periphery, Cajal bodies, and promyelocytic leukemia (PML) bodies. CRISPR-GO is chemically inducible and reversible, enabling interrogation of real-time dynamics of chromatin interactions with nuclear compartments in living cells. Inducible repositioning of genomic loci to the nuclear periphery allows for dissection of mitosis-dependent and -independent relocalization events and also for interrogation of the relationship between gene position and gene expression. CRISPR-GO mediates rapid de novo formation of Cajal bodies at desired chromatin loci and causes significant repression of endogenous gene expression over long distances (30-600 kb). The CRISPR-GO system offers a programmable platform to investigate large-scale spatial genome organization and function.
View details for PubMedID 30318144
CRISPR Activation Screens Systematically Identify Factors that Drive Neuronal Fate and Reprogramming.
Cell Stem Cell
Comprehensive identification of factors that can specify neuronal fate could provide valuable insights into lineage specification and reprogramming, but systematic interrogation of transcription factors, and their interactions with each other, has proven technically challenging. We developed a CRISPR activation (CRISPRa) approach to systematically identify regulators of neuronal-fate specification. We activated expression of all endogenous transcription factors and other regulators via a pooled CRISPRa screen in embryonic stem cells, revealing genes including epigenetic regulators such as Ezh2 that can induce neuronal fate. Systematic CRISPR-based activation of factor pairs allowed us to generate a genetic interaction map for neuronal differentiation, with confirmation of top individual and combinatorial hits as bona fide inducers of neuronal fate. Several factor pairs could directly reprogram fibroblasts into neurons, which shared similar transcriptional programs with endogenous neurons. This study provides an unbiased discovery approach for systematic identification of genes that drive cell-fate acquisition.
View details for PubMedID 30318302
DNMT3A and TET1 cooperate to regulate promoter epigenetic landscapes in mouse embryonic stem cells
2018; 19: 88
DNA methylation is a heritable epigenetic mark, enabling stable but reversible gene repression. In mammalian cells, DNA methyltransferases (DNMTs) are responsible for modifying cytosine to 5-methylcytosine (5mC), which can be further oxidized by the TET dioxygenases to ultimately cause DNA demethylation. However, the genome-wide cooperation and functions of these two families of proteins, especially at large under-methylated regions, called canyons, remain largely unknown. Here we demonstrate that DNMT3A and TET1 function in a complementary and competitive manner in mouse embryonic stem cells to mediate proper epigenetic landscapes and gene expression. The longer isoform of DNMT3A, DNMT3A1, exhibits significant enrichment at distal promoters and canyon edges, but is excluded from proximal promoters and canyons where TET1 shows prominent binding. Deletion of Tet1 increases DNMT3A1 binding capacity at and around genes with wild-type TET1 binding. However, deletion of Dnmt3a has a minor effect on TET1 binding on chromatin, indicating that TET1 may limit DNA methylation partially by protecting its targets from DNMT3A and establishing boundaries for DNA methylation. Local CpG density may determine their complementary binding patterns, suggesting that the methylation landscape is encoded in the DNA sequence. Furthermore, DNMT3A and TET1 impact histone modifications, which in turn regulate gene expression. In particular, they regulate Polycomb Repressive Complex 2 (PRC2)-mediated H3K27me3 enrichment to constrain gene expression from bivalent promoters. We conclude that DNMT3A and TET1 regulate the epigenome and gene expression at specific targets via their functional interplay.
View details for PubMedID 30001199
Homeobox oncogene activation by pan-cancer DNA hypermethylation.
2018; 19 (1): 108
Cancers have long been recognized to be not only genetically but also epigenetically distinct from their tissues of origin. Although genetic alterations underlying oncogene upregulation have been well studied, to what extent epigenetic mechanisms, such as DNA methylation, can also induce oncogene expression remains unknown. Here, through pan-cancer analysis of 4174 genome-wide profiles, including whole-genome bisulfite sequencing data from 30 normal tissues and 35 solid tumors, we discover a strong correlation between gene-body hypermethylation of DNA methylation canyons, defined as broad under-methylated regions, and overexpression of approximately 43% of homeobox genes, many of which are also oncogenes. To gain insights into the cause-and-effect relationship, we use a newly developed dCas9-SunTag-DNMT3A system to methylate genomic sites of interest. The locus-specific hypermethylation of gene-body canyon, but not promoter, of homeobox oncogene DLX1, can directly increase its gene expression. Our pan-cancer analysis followed by functional validation reveals DNA hypermethylation as a novel epigenetic mechanism for homeobox oncogene upregulation.
View details for PubMedID 30097071
View details for PubMedCentralID PMC6085761
Sparse conserved under-methylated CpGs are associated with high-order chromatin structure
2017; 18: 163
Whole-genome bisulfite sequencing (WGBS) is the gold standard for studying landscape DNA methylation. Current computational methods for WGBS are mainly designed for gene regulatory regions with multiple under-methylated CpGs (UMCs), such as promoters and enhancers. To reliably predict the functional importance of single isolated UMCs across the genome, which is usually not achievable using traditional methods, we develop a multi-sample-based method. We identified 9421 sparse conserved under-methylated CpGs (scUMCs) from 31 high-quality methylomes, which are enriched in distal interacting anchor regions co-occupied by multiple chromatin-loop factors and are flanked by highly methylated CpGs. Moreover, cell lineage-specific scUMCs are associated with essential developmental genes, regulators of cell differentiation, and chromatin remodeling enzymes. Dynamic methylation levels of scUMCs correlate with the intensity of chromatin interactions and binding of looping factors as well as patterns of gene expression. We introduce an innovative computational method for the identification of scUMCs, which are novel epigenetic features associated with high-order chromatin structure, opening new directions in the study of the inter-relationships between DNA methylation and chromatin structure.
View details for PubMedID 28859663
DNMT3A Loss Drives Enhancer Hypomethylation in FLT3-ITD-Associated Leukemias.
2016; 29 (6): 922–34
DNMT3A, the gene encoding the de novo DNA methyltransferase 3A, is among the most frequently mutated genes in hematologic malignancies. However, the mechanisms through which DNMT3A normally suppresses malignancy development are unknown. Here, we show that DNMT3A loss synergizes with the FLT3 internal tandem duplication in a dose-influenced fashion to generate rapid lethal lymphoid or myeloid leukemias similar to their human counterparts. Loss of DNMT3A leads to reduced DNA methylation, predominantly at hematopoietic enhancer regions in both mouse and human samples. Myeloid and lymphoid diseases arise from transformed murine hematopoietic stem cells. Broadly, our findings support a role for DNMT3A as a guardian of the epigenetic state at enhancer regions, critical for inhibition of leukemic transformation.
View details for PubMedID 27300438
View details for PubMedCentralID PMC4908977
Broad H3K4me3 is associated with increased transcription elongation and enhancer activity at tumor-suppressor genes.
2015; 47 (10): 1149–57
Tumor suppressors are mostly defined by inactivating mutations in tumors, yet little is known about their epigenetic features in normal cells. Through integrative analysis of 1,134 genome-wide epigenetic profiles, mutations from >8,200 tumor-normal pairs and our experimental data from clinical samples, we discovered broad peaks for trimethylation of histone H3 at lysine 4 (H3K4me3; wider than 4 kb) as the first epigenetic signature for tumor suppressors in normal cells. Broad H3K4me3 is associated with increased transcription elongation and enhancer activity, which together lead to exceptionally high gene expression, and is distinct from other broad epigenetic features, such as super-enhancers. Genes with broad H3K4me3 peaks conserved across normal cells may represent pan-cancer tumor suppressors, such as TP53 and PTEN, whereas genes with cell type-specific broad H3K4me3 peaks may represent cell identity genes and cell type-specific tumor suppressors. Furthermore, widespread shortening of broad H3K4me3 peaks in cancers is associated with repression of tumor suppressors. Thus, the broad H3K4me3 epigenetic signature provides mutation-independent information for the discovery and characterization of new tumor suppressors.
View details for PubMedID 26301496
View details for PubMedCentralID PMC4780747
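The width criterion stated in the abstract above (broad H3K4me3 peaks are those wider than 4 kb) can be sketched as a simple interval filter. The peak representation as `(chrom, start, end)` tuples, the function name `broad_peaks`, and the example coordinates are hypothetical; peak calling itself and the rest of the study's integrative analysis are not reproduced here.

```python
BROAD_WIDTH_BP = 4_000  # "wider than 4 kb", as stated in the abstract


def broad_peaks(peaks, min_width=BROAD_WIDTH_BP):
    """Return the peaks whose width (end - start) strictly exceeds min_width.

    `peaks` is an iterable of (chrom, start, end) tuples with start < end,
    a hypothetical in-memory form of already-called H3K4me3 peaks.
    """
    return [(chrom, start, end) for chrom, start, end in peaks
            if end - start > min_width]


# Made-up coordinates, for illustration only.
example_peaks = [
    ("chr1", 10_000, 11_500),   # 1.5 kb: a typical sharp promoter peak
    ("chr2", 50_000, 54_000),   # exactly 4 kb: not "wider than" 4 kb
    ("chr3", 70_000, 76_200),   # 6.2 kb: broad
]
selected = broad_peaks(example_peaks)
```

The strict inequality follows the abstract's wording; a peak of exactly 4 kb is excluded.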
Comparative analysis of metazoan chromatin organization.
2014; 512 (7515): 449-452
Genome function is dynamically regulated in part by chromatin, which consists of the histones, non-histone proteins and RNA molecules that package DNA. Studies in Caenorhabditis elegans and Drosophila melanogaster have contributed substantially to our understanding of molecular mechanisms of genome function in humans, and have revealed conservation of chromatin components and mechanisms. Nevertheless, the three organisms have markedly different genome sizes, chromosome architecture and gene organization. On human and fly chromosomes, for example, pericentric heterochromatin flanks single centromeres, whereas worm chromosomes have dispersed heterochromatin-like regions enriched in the distal chromosomal 'arms', and centromeres distributed along their lengths. To systematically investigate chromatin organization and associated gene regulation across species, we generated and analysed a large collection of genome-wide chromatin data sets from cell lines and developmental stages in worm, fly and human. Here we present over 800 new data sets from our ENCODE and modENCODE consortia, bringing the total to over 1,400. Comparison of combinatorial patterns of histone modifications, nuclear lamina-associated domains, organization of large-scale topological domains, chromatin environment at promoters and enhancers, nucleosome positioning, and DNA replication patterns reveals many conserved features of chromatin organization among the three organisms. We also find notable differences in the composition and locations of repressive chromatin. These data sets and analyses provide a rich resource for comparative and species-specific investigations of chromatin composition, organization and function.
View details for DOI 10.1038/nature13415
View details for PubMedID 25164756
BSeQC: quality control of bisulfite sequencing experiments
2013; 29 (24): 3227–29
Bisulfite sequencing (BS-seq) has emerged as the gold standard to study genome-wide DNA methylation at single-nucleotide resolution. Quality control (QC) is a critical step in the analysis pipeline to ensure that BS-seq data are of high quality and suitable for subsequent analysis. Although several QC tools are available for next-generation sequencing data, most of them were not designed to handle QC issues specific to BS-seq protocols. Therefore, there is a strong need for a dedicated QC tool to evaluate and remove potential technical biases in BS-seq experiments. We developed a package named BSeQC to comprehensively evaluate the quality of BS-seq experiments and automatically trim nucleotides with potential technical biases that may result in inaccurate methylation estimation. BSeQC takes standard SAM/BAM files as input and generates bias-free SAM/BAM files for downstream analysis. Evaluation based on real BS-seq data indicates that the use of the bias-free SAM/BAM file substantially improves the quantification of methylation level. BSeQC is freely available at: http://code.google.com/p/bseqc/.
View details for PubMedID 24064417
View details for PubMedCentralID PMC3842756