Doctor of Philosophy, Tongji University (2016)
Stanley Qi, Postdoctoral Faculty Sponsor
Nested epistasis enhancer networks for robust genome regulation.
Science (New York, N.Y.)
Mammalian genomes possess multiple enhancers spanning an ultralong distance (>megabases) to modulate important genes, yet it is unclear how these enhancers coordinate to achieve this task. Here, we combine multiplexed CRISPRi screening with machine learning to define quantitative enhancer-enhancer interactions. We find that the ultralong distance enhancer network possesses a nested multi-layer architecture that confers functional robustness of gene expression. Experimental characterization reveals that enhancer epistasis is maintained by three-dimensional chromosomal interactions and BRD4 condensation. Machine learning prediction of synergistic enhancers provides an effective strategy to identify non-coding variant pairs associated with pathogenic genes in diseases beyond Genome-Wide Association Studies (GWAS) analysis. Our work unveils nested epistasis enhancer networks, which can better explain enhancer functions within cells and in diseases.
View details for DOI 10.1126/science.abk3512
View details for PubMedID 35951677
Multiplex CRISPR genome regulation in the retina
ASSOC RESEARCH VISION OPHTHALMOLOGY INC. 2022
View details for Web of Science ID 000844401300064
Broad-spectrum CRISPR-mediated inhibition of SARS-CoV-2 variants and endemic coronaviruses in vitro.
2022; 13 (1): 2766
A major challenge in coronavirus vaccination and treatment is to counteract rapid viral evolution and mutations. Here we demonstrate that CRISPR-Cas13d offers a broad-spectrum antiviral (BSA) to inhibit many SARS-CoV-2 variants and diverse human coronavirus strains with >99% reduction of the viral titer. We show that Cas13d-mediated coronavirus inhibition is dependent on the crRNA cellular spatial colocalization with Cas13d and target viral RNA. Cas13d can significantly enhance the therapeutic effects of diverse small molecule drugs against coronaviruses for prophylaxis or treatment purposes, and the best combination reduced viral titer by over four orders of magnitude. Using lipid nanoparticle-mediated RNA delivery, we demonstrate that the Cas13d system can effectively treat infection from multiple variants of coronavirus, including Omicron SARS-CoV-2, in human primary airway epithelium air-liquid interface (ALI) cultures. Our study establishes CRISPR-Cas13 as a BSA which is highly complementary to existing vaccination and antiviral treatment strategies.
View details for DOI 10.1038/s41467-022-30546-7
View details for PubMedID 35589813
The disordered N-terminal domain of DNMT3A recognizes H2AK119ub and is required for postnatal development.
DNA methyltransferase 3a (DNMT3A) plays a crucial role during mammalian development. Two isoforms of DNMT3A are differentially expressed from stem cells to somatic tissues, but their individual functions remain largely uncharacterized. Here we report that the long isoform DNMT3A1, but not the short DNMT3A2, is essential for mouse postnatal development. DNMT3A1 binds to and regulates bivalent neurodevelopmental genes in the brain. Strikingly, Dnmt3a1 knockout perinatal lethality could be partially rescued by DNMT3A1 restoration in the nervous system. We further show that the intrinsically disordered N terminus of DNMT3A1 is required for normal development and DNA methylation at DNMT3A1-enriched regions. Mechanistically, a ubiquitin-interacting motif embedded in a putative alpha-helix within the N terminus binds to mono-ubiquitinated histone H2AK119, probably mediating recruitment of DNMT3A1 to Polycomb-regulated regions. These data demonstrate an isoform-specific role for DNMT3A1 in mouse postnatal development and reveal the N terminus as a necessary regulatory domain for DNMT3A1 chromatin occupancy and functions in the nervous system.
View details for DOI 10.1038/s41588-022-01063-6
View details for PubMedID 35534561
Multiplexed genome regulation in vivo with hyper-efficient Cas12a.
Nature cell biology
Multiplexed modulation of endogenous genes is crucial for sophisticated gene therapy and cell engineering. CRISPR-Cas12a systems enable versatile multiple-genomic-loci targeting by processing numerous CRISPR RNAs (crRNAs) from a single transcript; however, their low efficiency has hindered in vivo applications. Through structure-guided protein engineering, we developed a hyper-efficient Lachnospiraceae bacterium Cas12a variant, termed hyperCas12a, with its catalytically dead version hyperdCas12a showing significantly enhanced efficacy for gene activation, particularly at low concentrations of crRNA. We demonstrate that hyperdCas12a has comparable off-target effects compared with the wild-type system and exhibits enhanced activity for gene editing and repression. Delivery of the hyperdCas12a activator and a single crRNA array simultaneously activating the endogenous Oct4, Sox2 and Klf4 genes in the retina of post-natal mice alters the differentiation of retinal progenitor cells. The hyperCas12a system offers a versatile in vivo tool for a broad range of gene-modulation and gene-therapy applications.
View details for DOI 10.1038/s41556-022-00870-7
View details for PubMedID 35414015
Multiplexed Genome Regulation In Vivo with Hyper-Efficient Cas12a
CELL PRESS. 2022: 103
View details for Web of Science ID 000794043700207
A comprehensive analysis and resource to use CRISPR-Cas13 for broad-spectrum targeting of RNA viruses.
Cell reports. Medicine
The COVID-19 pandemic caused by SARS-CoV-2 and variants has led to significant mortality. We recently reported that an RNA-targeting CRISPR-Cas13 system, termed prophylactic antiviral CRISPR in human (PAC-MAN), offered an antiviral strategy against SARS-CoV-2 and influenza A virus. Here, we expand in silico analysis to use PAC-MAN to target a broad spectrum of human- or livestock-infectious RNA viruses with high specificity, coverage, and predicted efficiency. Our analysis reveals that a minimal set of 14 crRNAs is able to target >90% of human-infectious viruses across 10 RNA virus families. We predict that a set of 5 experimentally validated crRNAs can target new SARS-CoV-2 variant sequences with zero mismatches. We also build an online resource (crispr-pacman.stanford.edu) to support community use of CRISPR-Cas13 for broad-spectrum RNA virus targeting. Our work provides a new bioinformatic resource for using CRISPR-Cas13 to target diverse RNA viruses in order to facilitate development of CRISPR-based antivirals.
View details for DOI 10.1016/j.xcrm.2021.100245
View details for PubMedID 33778788
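The idea of a "minimal set" of crRNAs covering most viruses can be illustrated as a coverage problem. The sketch below is a greedy set-cover heuristic, not the method used in the paper. The function name `greedy_crrna_panel`, the input mapping, and the toy data are all hypothetical, and a greedy pick is not guaranteed to find the smallest possible panel.

```python
from typing import Dict, List, Set


def greedy_crrna_panel(
    targets_by_crrna: Dict[str, Set[str]],
    all_viruses: Set[str],
    coverage_goal: float = 0.9,
) -> List[str]:
    """Pick crRNAs greedily until the covered fraction reaches coverage_goal.

    targets_by_crrna maps a (hypothetical) crRNA name to the set of virus
    identifiers it is predicted to target.
    """
    if not all_viruses or not targets_by_crrna:
        return []
    covered: Set[str] = set()
    panel: List[str] = []
    while len(covered) / len(all_viruses) < coverage_goal:
        # Choose the crRNA that adds the most not-yet-covered viruses;
        # sorting the names makes tie-breaking deterministic.
        best = max(
            sorted(targets_by_crrna),
            key=lambda name: len((targets_by_crrna[name] & all_viruses) - covered),
        )
        gain = (targets_by_crrna[best] & all_viruses) - covered
        if not gain:
            break  # no remaining crRNA improves coverage
        panel.append(best)
        covered |= gain
    return panel


# Toy data, invented for illustration only.
toy_targets = {
    "cr1": {"A", "B", "C"},
    "cr2": {"C", "D"},
    "cr3": {"D", "E"},
    "cr4": {"E"},
}
toy_viruses = {"A", "B", "C", "D", "E"}
panel = greedy_crrna_panel(toy_targets, toy_viruses)
```

With the toy data, the heuristic picks `cr1` first (three new viruses) and then `cr3` (two new viruses), which together cover all five.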
A benchmark of algorithms for the analysis of pooled CRISPR screens.
2020; 21 (1): 62
Genome-wide pooled CRISPR-Cas-mediated knockout, activation, and repression screens are powerful tools for functional genomic investigations. Despite their increasing importance, there is currently little guidance on how to design and analyze CRISPR-pooled screens. Here, we provide a review of the commonly used algorithms in the computational analysis of pooled CRISPR screens. We develop a comprehensive simulation framework to benchmark and compare the performance of these algorithms using both synthetic and real datasets. Our findings inform parameter choices of CRISPR screens and provide guidance to researchers on the design and analysis of pooled CRISPR screens.
View details for DOI 10.1186/s13059-020-01972-x
View details for PubMedID 32151271
Computational Methods for Analysis of Large-Scale CRISPR Screens
ANNUAL REVIEW OF BIOMEDICAL DATA SCIENCE, VOL 3, 2020
2020; 3: 137–62
View details for DOI 10.1146/annurev-biodatasci-020520-113523
View details for Web of Science ID 000613910200006
Development of CRISPR as an Antiviral Strategy to Combat SARS-CoV-2 and Influenza.
The coronavirus disease 2019 (COVID-19) pandemic, caused by the SARS-CoV-2 virus, has highlighted the need for antiviral approaches that can target emerging viruses with no effective vaccines or pharmaceuticals. Here, we demonstrate a CRISPR-Cas13-based strategy, PAC-MAN (prophylactic antiviral CRISPR in human cells), for viral inhibition that can effectively degrade RNA from SARS-CoV-2 sequences and live influenza A virus (IAV) in human lung epithelial cells. We designed and screened CRISPR RNAs (crRNAs) targeting conserved viral regions and identified functional crRNAs targeting SARS-CoV-2. This approach effectively reduced H1N1 IAV load in respiratory epithelial cells. Our bioinformatic analysis showed that a group of only six crRNAs can target more than 90% of all coronaviruses. With the development of a safe and effective system for respiratory tract delivery, PAC-MAN has the potential to become an important pan-coronavirus inhibition strategy.
View details for DOI 10.1016/j.cell.2020.04.020
View details for PubMedID 32353252
CRISPhieRmix: a hierarchical mixture model for CRISPR pooled screens.
2018; 19 (1): 159
Pooled CRISPR screens allow researchers to interrogate genetic causes of complex phenotypes at the genome-wide scale and promise higher specificity and sensitivity compared to competing technologies. Unfortunately, two problems exist, particularly for CRISPRi/a screens: variability in guide efficiency and large rare off-target effects. We present a method, CRISPhieRmix, that resolves these issues by using a hierarchical mixture model with a broad-tailed null distribution. We show that CRISPhieRmix allows for more accurate and powerful inferences in large-scale pooled CRISPRi/a screens. We discuss key issues in the analysis and design of screens, particularly the number of guides needed for faithful full discovery.
View details for PubMedID 30296940
CRISPR-Mediated Programmable 3D Genome Positioning and Nuclear Organization.
Programmable control of spatial genome organization is a powerful approach for studying how nuclear structure affects gene regulation and cellular function. Here, we develop a versatile CRISPR-genome organization (CRISPR-GO) system that can efficiently control the spatial positioning of genomic loci relative to specific nuclear compartments, including the nuclear periphery, Cajal bodies, and promyelocytic leukemia (PML) bodies. CRISPR-GO is chemically inducible and reversible, enabling interrogation of real-time dynamics of chromatin interactions with nuclear compartments in living cells. Inducible repositioning of genomic loci to the nuclear periphery allows for dissection of mitosis-dependent and -independent relocalization events and also for interrogation of the relationship between gene position and gene expression. CRISPR-GO mediates rapid de novo formation of Cajal bodies at desired chromatin loci and causes significant repression of endogenous gene expression over long distances (30-600 kb). The CRISPR-GO system offers a programmable platform to investigate large-scale spatial genome organization and function.
View details for PubMedID 30318144
CRISPR Activation Screens Systematically Identify Factors that Drive Neuronal Fate and Reprogramming.
Cell stem cell
Comprehensive identification of factors that can specify neuronal fate could provide valuable insights into lineage specification and reprogramming, but systematic interrogation of transcription factors, and their interactions with each other, has proven technically challenging. We developed a CRISPR activation (CRISPRa) approach to systematically identify regulators of neuronal-fate specification. We activated expression of all endogenous transcription factors and other regulators via a pooled CRISPRa screen in embryonic stem cells, revealing genes including epigenetic regulators such as Ezh2 that can induce neuronal fate. Systematic CRISPR-based activation of factor pairs allowed us to generate a genetic interaction map for neuronal differentiation, with confirmation of top individual and combinatorial hits as bona fide inducers of neuronal fate. Several factor pairs could directly reprogram fibroblasts into neurons, which shared similar transcriptional programs with endogenous neurons. This study provides an unbiased discovery approach for systematic identification of genes that drive cell-fate acquisition.
View details for PubMedID 30318302
Homeobox oncogene activation by pan-cancer DNA hypermethylation.
2018; 19 (1): 108
Cancers have long been recognized to be not only genetically but also epigenetically distinct from their tissues of origin. Although genetic alterations underlying oncogene upregulation have been well studied, to what extent epigenetic mechanisms, such as DNA methylation, can also induce oncogene expression remains unknown. Here, through pan-cancer analysis of 4174 genome-wide profiles, including whole-genome bisulfite sequencing data from 30 normal tissues and 35 solid tumors, we discover a strong correlation between gene-body hypermethylation of DNA methylation canyons, defined as broad under-methylated regions, and overexpression of approximately 43% of homeobox genes, many of which are also oncogenes. To gain insights into the cause-and-effect relationship, we use a newly developed dCas9-SunTag-DNMT3A system to methylate genomic sites of interest. The locus-specific hypermethylation of gene-body canyon, but not promoter, of homeobox oncogene DLX1, can directly increase its gene expression. Our pan-cancer analysis followed by functional validation reveals DNA hypermethylation as a novel epigenetic mechanism for homeobox oncogene upregulation.
View details for DOI 10.1186/s13059-018-1492-3
View details for PubMedID 30097071
View details for PubMedCentralID PMC6085761
DNMT3A and TET1 cooperate to regulate promoter epigenetic landscapes in mouse embryonic stem cells
2018; 19: 88
DNA methylation is a heritable epigenetic mark, enabling stable but reversible gene repression. In mammalian cells, DNA methyltransferases (DNMTs) are responsible for modifying cytosine to 5-methylcytosine (5mC), which can be further oxidized by the TET dioxygenases to ultimately cause DNA demethylation. However, the genome-wide cooperation and functions of these two families of proteins, especially at large under-methylated regions, called canyons, remain largely unknown. Here we demonstrate that DNMT3A and TET1 function in a complementary and competitive manner in mouse embryonic stem cells to mediate proper epigenetic landscapes and gene expression. The longer isoform of DNMT3A, DNMT3A1, exhibits significant enrichment at distal promoters and canyon edges, but is excluded from proximal promoters and canyons where TET1 shows prominent binding. Deletion of Tet1 increases DNMT3A1 binding capacity at and around genes with wild-type TET1 binding. However, deletion of Dnmt3a has a minor effect on TET1 binding on chromatin, indicating that TET1 may limit DNA methylation partially by protecting its targets from DNMT3A and establishing boundaries for DNA methylation. Local CpG density may determine their complementary binding patterns, suggesting that the methylation landscape is encoded in the DNA sequence. Furthermore, DNMT3A and TET1 impact histone modifications which in turn regulate gene expression. In particular, they regulate Polycomb Repressive Complex 2 (PRC2)-mediated H3K27me3 enrichment to constrain gene expression from bivalent promoters. We conclude that DNMT3A and TET1 regulate the epigenome and gene expression at specific targets via their functional interplay.
View details for PubMedID 30001199
Sparse conserved under-methylated CpGs are associated with high-order chromatin structure
2017; 18: 163
Whole-genome bisulfite sequencing (WGBS) is the gold standard for studying landscape DNA methylation. Current computational methods for WGBS are mainly designed for gene regulatory regions with multiple under-methylated CpGs (UMCs), such as promoters and enhancers. To reliably predict the functional importance of single isolated UMCs across the genome, which is usually not achievable using traditional methods, we develop a multi-sample-based method. We identified 9421 sparse conserved under-methylated CpGs (scUMCs) from 31 high-quality methylomes, which are enriched in distal interacting anchor regions co-occupied by multiple chromatin-loop factors and are flanked by highly methylated CpGs. Moreover, cell lineage-specific scUMCs are associated with essential developmental genes, regulators of cell differentiation, and chromatin remodeling enzymes. Dynamic methylation levels of scUMCs correlate with the intensity of chromatin interactions and binding of looping factors as well as patterns of gene expression. We introduce an innovative computational method for the identification of scUMCs, which are novel epigenetic features associated with high-order chromatin structure, opening new directions in the study of the inter-relationships between DNA methylation and chromatin structure.
View details for PubMedID 28859663
DNMT3A mutation leads to leukemic extramedullary infiltration mediated by TWIST1.
Journal of hematology & oncology
2016; 9 (1): 106
DNMT3A mutations are frequently discovered in acute myeloid leukemia (AML), associated with poor outcome. Recently, a relapse case report of AML extramedullary disease has shown that AML cells harboring DNMT3A variation were detected in the cerebral spinal fluid. However, whether a causal relationship exists between DNMT3A mutation (D3Amut) and extramedullary infiltration (EMI) is unclear. We took advantage of DNMT3A (R882C) mutation-carrying AML cell strain, that is, OCI-AML3, assessing its migration ability in vitro and in vivo. By RNA interference technology and a xenograft mouse model, we evaluated the effect of DNMT3A mutation on cell mobility and explored the possible mechanism. OCI-AML3 displayed extraordinary migration ability in vitro and infiltrated into meninges of NOD/SCID mice after intravenous transfusion. We found that this leukemic migration or infiltration capacity was significantly compromised by the knockdown of DNMT3A mutant. Notably, TWIST1, a critical inducer of epithelial-mesenchymal transition, which underlies the metastasis of carcinomas, was highly expressed in association with R882 mutations. Abrogation of TWIST1 in DNMT3A mutated cells considerably weakened their mobility or infiltration. Our results demonstrate that D3Amut in OCI-AML3 strain enhances leukemic aggressiveness by promoting EMI process, which is partially through upregulating TWIST1.
View details for DOI 10.1186/s13045-016-0337-3
View details for PubMedID 27724883
View details for PubMedCentralID PMC5057205
DNMT3A Loss Drives Enhancer Hypomethylation in FLT3-ITD-Associated Leukemias.
2016; 29 (6): 922-934
DNMT3A, the gene encoding the de novo DNA methyltransferase 3A, is among the most frequently mutated genes in hematologic malignancies. However, the mechanisms through which DNMT3A normally suppresses malignancy development are unknown. Here, we show that DNMT3A loss synergizes with the FLT3 internal tandem duplication in a dose-influenced fashion to generate rapid lethal lymphoid or myeloid leukemias similar to their human counterparts. Loss of DNMT3A leads to reduced DNA methylation, predominantly at hematopoietic enhancer regions in both mouse and human samples. Myeloid and lymphoid diseases arise from transformed murine hematopoietic stem cells. Broadly, our findings support a role for DNMT3A as a guardian of the epigenetic state at enhancer regions, critical for inhibition of leukemic transformation.
View details for DOI 10.1016/j.ccell.2016.05.003
View details for PubMedID 27300438
View details for PubMedCentralID PMC4908977
Broad H3K4me3 is associated with increased transcription elongation and enhancer activity at tumor-suppressor genes.
2015; 47 (10): 1149-57
Tumor suppressors are mostly defined by inactivating mutations in tumors, yet little is known about their epigenetic features in normal cells. Through integrative analysis of 1,134 genome-wide epigenetic profiles, mutations from >8,200 tumor-normal pairs and our experimental data from clinical samples, we discovered broad peaks for trimethylation of histone H3 at lysine 4 (H3K4me3; wider than 4 kb) as the first epigenetic signature for tumor suppressors in normal cells. Broad H3K4me3 is associated with increased transcription elongation and enhancer activity, which together lead to exceptionally high gene expression, and is distinct from other broad epigenetic features, such as super-enhancers. Genes with broad H3K4me3 peaks conserved across normal cells may represent pan-cancer tumor suppressors, such as TP53 and PTEN, whereas genes with cell type-specific broad H3K4me3 peaks may represent cell identity genes and cell type-specific tumor suppressors. Furthermore, widespread shortening of broad H3K4me3 peaks in cancers is associated with repression of tumor suppressors. Thus, the broad H3K4me3 epigenetic signature provides mutation-independent information for the discovery and characterization of new tumor suppressors.
View details for DOI 10.1038/ng.3385
View details for PubMedID 26301496
View details for PubMedCentralID PMC4780747
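The width criterion mentioned in the abstract above (H3K4me3 peaks wider than 4 kb) amounts to a simple interval filter. The sketch below assumes peaks are already called and held as half-open (start, end) coordinates. The `Peak` type, the `broad_peaks` function, and the example coordinates are hypothetical and are not taken from the paper's pipeline.

```python
from typing import List, NamedTuple


class Peak(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive, so width = end - start


def broad_peaks(peaks: List[Peak], min_width_bp: int = 4000) -> List[Peak]:
    """Keep peaks strictly wider than min_width_bp (4 kb by default)."""
    return [p for p in peaks if p.end - p.start > min_width_bp]


# Invented coordinates, for illustration only.
example = [
    Peak("chr1", 1_000, 2_500),      # 1.5 kb, narrow
    Peak("chr1", 10_000, 14_000),    # exactly 4 kb, not "wider than"
    Peak("chr2", 50_000, 56_200),    # 6.2 kb, broad
]
broad = broad_peaks(example)
```

Only the 6.2 kb peak passes, because the comparison is strict.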
Comparative analysis of metazoan chromatin organization.
2014; 512 (7515): 449-452
Genome function is dynamically regulated in part by chromatin, which consists of the histones, non-histone proteins and RNA molecules that package DNA. Studies in Caenorhabditis elegans and Drosophila melanogaster have contributed substantially to our understanding of molecular mechanisms of genome function in humans, and have revealed conservation of chromatin components and mechanisms. Nevertheless, the three organisms have markedly different genome sizes, chromosome architecture and gene organization. On human and fly chromosomes, for example, pericentric heterochromatin flanks single centromeres, whereas worm chromosomes have dispersed heterochromatin-like regions enriched in the distal chromosomal 'arms', and centromeres distributed along their lengths. To systematically investigate chromatin organization and associated gene regulation across species, we generated and analysed a large collection of genome-wide chromatin data sets from cell lines and developmental stages in worm, fly and human. Here we present over 800 new data sets from our ENCODE and modENCODE consortia, bringing the total to over 1,400. Comparison of combinatorial patterns of histone modifications, nuclear lamina-associated domains, organization of large-scale topological domains, chromatin environment at promoters and enhancers, nucleosome positioning, and DNA replication patterns reveals many conserved features of chromatin organization among the three organisms. We also find notable differences in the composition and locations of repressive chromatin. These data sets and analyses provide a rich resource for comparative and species-specific investigations of chromatin composition, organization and function.
View details for DOI 10.1038/nature13415
View details for PubMedID 25164756
BSeQC: quality control of bisulfite sequencing experiments
2013; 29 (24): 3227–29
Bisulfite sequencing (BS-seq) has emerged as the gold standard to study genome-wide DNA methylation at single-nucleotide resolution. Quality control (QC) is a critical step in the analysis pipeline to ensure that BS-seq data are of high quality and suitable for subsequent analysis. Although several QC tools are available for next-generation sequencing data, most of them were not designed to handle QC issues specific to BS-seq protocols. Therefore, there is a strong need for a dedicated QC tool to evaluate and remove potential technical biases in BS-seq experiments. We developed a package named BSeQC to comprehensively evaluate the quality of BS-seq experiments and automatically trim nucleotides with potential technical biases that may result in inaccurate methylation estimation. BSeQC takes standard SAM/BAM files as input and generates bias-free SAM/BAM files for downstream analysis. Evaluation based on real BS-seq data indicates that the use of the bias-free SAM/BAM file substantially improves the quantification of methylation level. BSeQC is freely available at: http://code.google.com/p/bseqc/.
View details for PubMedID 24064417
View details for PubMedCentralID PMC3842756