Dr. Cong's group develops novel technologies for genome engineering and single-cell genomics, leveraging scalable computational methods. The group applies genome technologies such as gene editing, together with data science methods and analysis, to study immunological and neurological diseases. His work has led to one of the first CRISPR/Cas9 gene-editing tools for in vivo gene therapy. More recently, his group invented tools for cleavage-free large gene insertion by mining microbial recombination proteins, and developed a single-cell tracking approach for studying cancer biology and cancer immunology. Dr. Cong is a recipient of the NIH/NHGRI Genomic Innovator Award, is a Baxter Foundation Faculty Scholar, and has been selected by Clarivate Web of Science as a Highly Cited Researcher.
Honors & Awards
Genomic Innovator Award, National Institutes of Health (NIH), National Human Genome Research Institute (NHGRI)
Donald and Delia Baxter Foundation Faculty Scholar, Baxter Foundation
CRI Irvington Fellow, Cancer Research Institute
HHMI International Fellow, Howard Hughes Medical Institute
Boards, Advisory Committees, Professional Organizations
Genome Editing and New Investigator Committee Member, American Society of Gene & Cell Therapy (ASGCT) (2019 - Present)
PhD, Harvard University, Harvard Medical School, Biological and Biomedical Sciences (2014)
LHB, Harvard Medical School, Certificate in Leder Human Biology and Translational Medicine
B.S., Tsinghua University, Biological Sciences, Electronic Engineering (2009)
Community and International Work
Neuro-engineering and Gene-editing, Cold Spring Harbor Laboratory
Advanced Techniques in Molecular Neuroscience
Cold Spring Harbor Laboratory
Opportunities for Student Involvement
Feng Zhang, Le Cong, Patrick Hsu, Fei Ann Ran. "United States Patent 8,906,616 Engineering of systems, methods and optimized guide compositions for sequence manipulation"
Le Cong, Feng Zhang. "United States Patent 8,932,814 CRISPR-Cas nickase systems, methods and compositions for sequence manipulation in eukaryotes."
Feng Zhang, Le Cong, Randall Platt, Neville Sanjana, Fei Ann Ran. "United States Patent 8,993,233 Engineering and optimization of systems, methods and compositions for sequence manipulation with functional domains"
Cong, Egloff, Garraway, Grandis, Lander, Stransky, Tward, Zhang. "United States Patent 9,370,551. Compositions and methods of treating head and neck cancer."
Le Cong, Feng Zhang, Patrick Hsu, Fei Ann Ran. "United States Patent. Engineering of systems, methods and optimized guide compositions for sequence manipulation."
Graduate and Fellowship Programs
Biology (School of Humanities and Sciences) (PhD Program)
Biomedical Informatics (PhD Program)
- Long sequence insertion via CRISPR/Cas gene-editing with transposase, recombinase, and integrase. Current Opinion in Biomedical Engineering 2023; 28
Integrative analysis of functional genomic screening and clinical data identifies a protective role for spironolactone in severe COVID-19.
Cell reports methods
2023; 3 (7): 100503
We demonstrate that integrative analysis of CRISPR screening datasets enables network-based prioritization of prescription drugs modulating viral entry in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) by developing a network-based approach called Rapid proXimity Guidance for Repurposing Investigational Drugs (RxGRID). We use our results to guide a propensity-score-matched, retrospective cohort study of 64,349 COVID-19 patients, showing that a top candidate drug, spironolactone, is associated with improved clinical prognosis, measured by intensive care unit (ICU) admission and mechanical ventilation rates. Finally, we show that spironolactone exerts a dose-dependent inhibitory effect on viral entry in human lung epithelial cells. Our RxGRID method presents a computational framework, implemented as an open-source software package, enabling genomics researchers to identify drugs likely to modulate a molecular phenotype of interest based on high-throughput screening data. Our results, derived from this method and supported by experimental and clinical analysis, add additional supporting evidence for a potential protective role of the potassium-sparing diuretic spironolactone in severe COVID-19.
View details for DOI 10.1016/j.crmeth.2023.100503
View details for PubMedID 37529368
Gene set proximity analysis: expanding gene set enrichment analysis through learned geometric embeddings, with drug-repurposing applications in COVID-19.
Bioinformatics (Oxford, England)
MOTIVATION: Gene set analysis methods rely on knowledge-based representations of genetic interactions in the form of both gene set collections and protein-protein interaction (PPI) networks. However, explicit representations of genetic interactions often fail to capture complex interdependencies among genes, limiting the analytic power of such methods. RESULTS: We propose an extension of gene set enrichment analysis to a latent embedding space reflecting PPI network topology, called gene set proximity analysis (GSPA). Compared with existing methods, GSPA provides improved ability to identify disease-associated pathways in disease-matched gene expression datasets, while improving reproducibility of enrichment statistics for similar gene sets. GSPA is statistically straightforward, reducing to a version of traditional gene set enrichment analysis through a single user-defined parameter. We apply our method to identify novel drug associations with SARS-CoV-2 viral entry. Finally, we validate our drug association predictions through retrospective clinical analysis of claims data from 8 million patients, supporting a role for gabapentin as a risk factor and metformin as a protective factor for severe COVID-19. AVAILABILITY: GSPA is available for download as a command-line Python package at https://github.com/henrycousins/gspa. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
View details for DOI 10.1093/bioinformatics/btac735
View details for PubMedID 36394254
Single-cell transcriptome analysis of regenerating RGCs reveals potent glaucoma neural repair genes.
Axon regeneration holds great promise for neural repair of CNS axonopathies, including glaucoma. Pten deletion in retinal ganglion cells (RGCs) promotes potent optic nerve regeneration, but only a small population of Pten-null RGCs are actually regenerating RGCs (regRGCs); most surviving RGCs (surRGCs) remain non-regenerative. Here, we developed a strategy to specifically label and purify regRGCs and surRGCs, respectively, from the same Pten-deletion mice after optic nerve crush, in which they differ only in their regeneration capability. Smart-Seq2 single-cell transcriptome analysis revealed novel regeneration-associated genes that significantly promote axon regeneration. The most potent of these, Anxa2, acts synergistically with its ligand tPA in Pten-deletion-induced axon regeneration. Anxa2, its downstream effector ILK, and Mpp1 dramatically protect RGC somata and axons and preserve visual function in a clinically relevant model of glaucoma, demonstrating the exciting potential of this innovative strategy to identify novel effective neural repair candidates.
View details for DOI 10.1016/j.neuron.2022.06.022
View details for PubMedID 35952672
Machine-learning-optimized Cas12a barcoding enables the recovery of single-cell lineages and transcriptional profiles.
The development of CRISPR-based barcoding methods creates an exciting opportunity to understand cellular phylogenies. We present a compact, tunable, high-capacity Cas12a barcoding system called dual acting inverted site array (DAISY). We combined high-throughput screening and machine learning to predict and optimize the 60-bp DAISY barcode sequences. After optimization, top-performing barcodes had ∼10-fold increased capacity relative to the best random-screened designs and performed reliably across diverse cell types. DAISY barcode arrays generated ∼12 bits of entropy and ∼66,000 unique barcodes. Thus, DAISY barcodes, at a fraction of the size of Cas9 barcodes, achieved high-capacity barcoding. We coupled DAISY barcoding with single-cell RNA-seq to recover lineages and gene expression profiles from ∼47,000 human melanoma cells. A single DAISY barcode recovered up to ∼700 lineages from one parental cell. This analysis revealed heritable single-cell gene expression and potential epigenetic modulation of memory gene transcription. Overall, Cas12a DAISY barcoding is an efficient tool for investigating cell-state dynamics.
View details for DOI 10.1016/j.molcel.2022.06.001
View details for PubMedID 35752172
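The "∼12 bits of entropy" figure in the DAISY abstract above can be read against the standard definition of Shannon entropy over barcode frequencies. The sketch below is illustrative only and is not the analysis code from the paper; the function names and the toy counts are assumptions made for this example. It shows that 4,096 equally frequent barcodes carry exactly 12 bits, and that a skewed distribution over a larger number of distinct barcodes carries less than its uniform maximum.

```python
import math
from collections import Counter


def shannon_entropy_bits(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    # Zero counts contribute nothing and are skipped to avoid log2(0).
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def entropy_from_reads(barcodes):
    """Entropy of observed barcode labels (hypothetical helper for this sketch)."""
    return shannon_entropy_bits(list(Counter(barcodes).values()))


# 4,096 equally frequent barcodes: log2(4096) = 12 bits.
uniform = [1] * 4096
print(shannon_entropy_bits(uniform))  # 12.0

# One dominant barcode plus 8,191 rare ones: more distinct barcodes,
# but far less entropy than the 13-bit uniform maximum for 8,192 states.
skewed = [1000] + [1] * 8191
print(shannon_entropy_bits(skewed) < 13.0)  # True
```

The number of unique barcodes bounds the entropy from above (log2 of the count), so a library with many rare barcodes can report an entropy well below that bound.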
- Editorial: CRISPR and alternative approaches. Biotechnology journal 2022: e2200290
dCas9-based gene editing for cleavage-free genomic knock-in of long sequences.
Nature cell biology
Gene editing is a powerful tool for genome and cell engineering. Exemplified by CRISPR-Cas, gene editing could cause DNA damage and trigger DNA repair processes that are often error-prone. Such unwanted mutations and safety concerns can be exacerbated when altering long sequences. Here we couple microbial single-strand annealing proteins (SSAPs) with catalytically inactive dCas9 for gene editing. This cleavage-free gene editor, dCas9-SSAP, promotes the knock-in of long sequences in mammalian cells. The dCas9-SSAP editor has low on-target errors and minimal off-target effects, showing higher accuracy than canonical Cas9 methods. It is effective for inserting kilobase-scale sequences, with an efficiency of up to approximately 20% and robust performance across donor designs and cell types, including human stem cells. We show that dCas9-SSAP is less sensitive to inhibition of DNA repair enzymes than Cas9 references. We further performed truncation and aptamer engineering to minimize its size to fit into a single adeno-associated-virus vector for future application. Together, this tool opens opportunities towards safer long-sequence genome engineering.
View details for DOI 10.1038/s41556-021-00836-1
View details for PubMedID 35145221
- Neural Bandits for Protein Sequence Optimization. IEEE. 2022: 188-193
The role of p53 in the development of pancreatic ductal adenocarcinoma.
American Association for Cancer Research. 2021: 58
View details for Web of Science ID 000720117400098
Deciphering pathogenicity of variants of uncertain significance with CRISPR-edited iPSCs.
Trends in genetics : TIG
Genetic variants play an important role in conferring risk for cardiovascular diseases (CVDs). With the rapid development of next-generation sequencing (NGS), thousands of genetic variants associated with CVDs have been identified by genome-wide association studies (GWAS), but the function of more than 40% of genetic variants is still unknown. This gap of knowledge is a barrier to the clinical application of the genetic information. However, determining the pathogenicity of a variant of uncertain significance (VUS) is challenging due to the lack of suitable model systems and accessible technologies. By combining clustered regularly interspaced short palindromic repeats (CRISPR) and human induced pluripotent stem cells (iPSCs), unprecedented advances are now possible in determining the pathogenicity of VUS in CVDs. Here, we summarize recent progress and new strategies in deciphering pathogenic variants for CVDs using CRISPR-edited human iPSCs.
View details for DOI 10.1016/j.tig.2021.08.009
View details for PubMedID 34509299
Conventional type I dendritic cells maintain a reservoir of proliferative tumor-antigen specific TCF-1+ CD8+ T cells in tumor-draining lymph nodes.
In tumors, a subset of CD8+ T cells expressing the transcription factor TCF-1 drives the response to immune checkpoint blockade. We examined the mechanisms that maintain these cells in an autochthonous model of lung adenocarcinoma. Longitudinal sampling and single-cell sequencing of tumor-antigen specific TCF-1+ CD8+ T cells revealed that while intratumoral TCF-1+ CD8+ T cells acquired dysfunctional features and decreased in number as tumors progressed, TCF-1+ CD8+ T cell frequency in the tumor-draining LN (dLN) remained stable. Two discrete intratumoral TCF-1+ CD8+ T cell subsets developed over time: a proliferative SlamF6+ subset and a non-cycling SlamF6- subset. Blocking dLN egress decreased the frequency of intratumoral SlamF6+ TCF-1+ CD8+ T cells. Conventional type I dendritic cells (cDC1s) in dLN decreased in number with tumor progression, and Flt3L+anti-CD40 treatment recovered SlamF6+ T cell frequencies and decreased tumor burden. Thus, cDC1s in tumor dLN maintain a reservoir of TCF-1+ CD8+ T cells and their decrease contributes to failed anti-tumor immunity.
View details for DOI 10.1016/j.immuni.2021.08.026
View details for PubMedID 34534439
CRISPR-Cas12a System With Synergistic Phage Recombination Proteins for Multiplex Precision Editing in Human Cells.
Frontiers in cell and developmental biology
2021; 9: 719705
The development of CRISPR-based gene-editing technologies has brought an unprecedented revolution in the field of genome engineering. Cas12a, a member of the Class 2 Type V CRISPR-associated endonuclease family distinct from Cas9, has been repurposed and developed into versatile gene-editing tools with distinct PAM recognition sites and multiplexed gene targeting capability. However, with current CRISPR/Cas12a technologies, it remains a challenge to perform efficient and precise genome editing of long sequences in mammalian cells. To address this limitation, we utilized phage recombination enzymes and developed an efficient CRISPR/Cas12a tool for multiplexed precision editing in mammalian cells. Through protein engineering, we were able to recruit phage recombination proteins to Cas12a to enhance its homology-directed repair efficiencies. Our phage-recombination-assisted Cas12a system achieved up to 3-fold improvements for kilobase-scale knock-ins in human cells without compromising the specificity of the enzyme. The performance of this system compares favorably against Cas9 references, the commonly used enzyme for gene-editing tasks, with improved specificity. Additionally, we demonstrated multi-target editing with similar improved activities thanks to the RNA-processing activity of the Cas12a system. This compact, multi-target editing tool has the potential to assist in understanding multi-gene interactions. In particular, it paves the way for a gene therapy method for human diseases that complements existing tools and is suitable for polygenic disorders and diseases requiring long-sequence corrections.
View details for DOI 10.3389/fcell.2021.719705
View details for PubMedID 35774104
View details for PubMedCentralID PMC9237396
Cleavage-Free dCas9 Knock-In Gene-Editing Tool Leveraging RNA-Guided Targeting of Recombineering Proteins
CELL PRESS. 2021: 107
View details for Web of Science ID 000645188700204
- A CRISPR Landing for Genome Rewriting at Locus-Scale. The CRISPR journal 2021; 4 (2): 163–66
Microbial single-strand annealing proteins enable CRISPR gene-editing tools with improved knock-in efficiencies and reduced off-target effects.
Nucleic acids research
Several existing technologies enable short genomic alterations, including generating indels and short nucleotide variants; however, engineering more significant genomic changes is more challenging due to reduced efficiency and precision. Here, we developed RecT Editor via Designer-Cas9-Initiated Targeting (REDIT), which leverages phage single-stranded DNA-annealing proteins (SSAP) RecT for mammalian genome engineering. Relative to Cas9-mediated homology-directed repair (HDR), REDIT yielded up to a 5-fold increase of efficiency to insert kilobase-scale exogenous sequences at defined genomic regions. We validated our REDIT approach using different formats and lengths of knock-in templates. We further demonstrated that REDIT tools using Cas9 nickase have efficient gene-editing activities and reduced off-target errors, measured using a combination of targeted sequencing, genome-wide indel, and insertion mapping assays. Our experiments inhibiting repair enzyme activities suggested that REDIT has the potential to overcome limitations of endogenous DNA repair steps. Finally, our REDIT method is applicable across cell types including human stem cells, and is generalizable to different Cas9 enzymes.
View details for DOI 10.1093/nar/gkaa1264
View details for PubMedID 33619540
A functional taxonomy of tumor suppression in oncogenic KRAS-driven lung cancer.
Cancer genotyping has identified a large number of putative tumor suppressor genes. Carcinogenesis is a multi-step process; however, the importance and specific roles of many of these genes during tumor initiation, growth and progression remain unknown. Here we use a multiplexed mouse model of oncogenic KRAS-driven lung cancer to quantify the impact of forty-eight known and putative tumor suppressor genes on diverse aspects of carcinogenesis at an unprecedented scale and resolution. We uncover many previously understudied functional tumor suppressors that constrain cancer in vivo. Inactivation of some genes substantially increased growth, while the inactivation of others increased tumor initiation and/or the emergence of exceptionally large tumors. These functional in vivo analyses revealed an unexpectedly complex landscape of tumor suppression that has implications for understanding cancer evolution, interpreting clinical cancer genome sequencing data, and directing approaches to limit tumor initiation and progression.
View details for DOI 10.1158/2159-8290.CD-20-1325
View details for PubMedID 33608386
Adeno-associated viral vector-mediated immune responses: Understanding barriers to gene delivery.
Pharmacology & therapeutics
Adeno-associated viral (AAV) vectors have emerged as the leading gene delivery platform for gene therapy and vaccination. Three AAV-based gene therapy drugs, Glybera, LUXTURNA, and ZOLGENSMA were approved between 2012 and 2019 by the European Medicines Agency and the United States Food and Drug Administration as treatments for genetic diseases hereditary lipoprotein lipase deficiency (LPLD), inherited retinal disease (IRD), and spinal muscular atrophy (SMA), respectively. Despite these therapeutic successes, clinical trials have demonstrated that host anti-viral immune responses can prevent the long-term gene expression of AAV vector-encoded genes. Therefore, it is critical that we understand the complex relationship between AAV vectors and the host immune response. This knowledge could allow for the rational design of optimized gene transfer vectors capable of either subverting host immune responses in the context of gene therapy applications, or stimulating desirable immune responses that generate protective immunity in vaccine applications to AAV vector-encoded antigens. This review provides an overview of our current understanding of the AAV-induced immune response and discusses potential strategies by which these responses can be manipulated to improve AAV vector-mediated gene transfer.
View details for DOI 10.1016/j.pharmthera.2019.107453
View details for PubMedID 31836454
- Take Risks and Constantly Challenge the Status Quo. Stem Cells and Development 2019
- Combined Computational-Experimental Approach to Explore the Molecular Mechanism of SaCas9 with a Broadened DNA Targeting Range. Journal of the American Chemical Society 2019; 141 (16): 6545–52
IL-33 Signaling Alters Regulatory T Cell Diversity in Support of Tumor Development.
2019; 29 (10): 2998–3008.e8
Regulatory T cells (Tregs) can impair anti-tumor immune responses and are associated with poor prognosis in multiple cancer types. Tregs in human tumors span diverse transcriptional states distinct from those of peripheral Tregs, but their contribution to tumor development remains unknown. Here, we use single-cell RNA sequencing (RNA-seq) to longitudinally profile dynamic shifts in the distribution of Tregs in a genetically engineered mouse model of lung adenocarcinoma. In this model, interferon-responsive Tregs are more prevalent early in tumor development, whereas a specialized effector phenotype characterized by enhanced expression of the interleukin-33 receptor ST2 is predominant in advanced disease. Treg-specific deletion of ST2 alters the evolution of effector Treg diversity, increases infiltration of CD8+ T cells into tumors, and decreases tumor burden. Our study shows that ST2 plays a critical role in Treg-mediated immunosuppression in cancer, highlighting potential paths for therapeutic intervention.
View details for DOI 10.1016/j.celrep.2019.10.120
View details for PubMedID 31801068
Efficient Generation of Transcriptomic Profiles by Random Composite Measurements.
2017; 171 (6): 1424-1436.e18
RNA profiles are an informative phenotype of cellular and tissue states but can be costly to generate at massive scale. Here, we describe how gene expression levels can be efficiently acquired with random composite measurements, in which abundances are combined in a random weighted sum. We show (1) that the similarity between pairs of expression profiles can be approximated with very few composite measurements; (2) that by leveraging sparse, modular representations of gene expression, we can use random composite measurements to recover high-dimensional gene expression levels (with 100 times fewer measurements than genes); and (3) that it is possible to blindly recover gene expression from composite measurements, even without access to training data. Our results suggest new compressive modalities as a foundation for massive scaling in high-throughput measurements and new insights into the interpretation of high-dimensional data.
View details for DOI 10.1016/j.cell.2017.10.023
View details for PubMedID 29153835
View details for PubMedCentralID PMC5726792
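Point (1) of the abstract above, that pairwise similarity survives a small number of random weighted sums, can be illustrated with a toy random projection. This is a minimal sketch under stated assumptions and not the method from the paper: the Gaussian weights, the synthetic expression profiles, and all function names are hypothetical choices made for this example, and the sparse recovery in points (2) and (3) is not covered.

```python
import math
import random


def random_measurement_matrix(n_measurements, n_genes, rng):
    """Gaussian weights, scaled by 1/sqrt(m) so vector norms are roughly preserved."""
    scale = 1.0 / math.sqrt(n_measurements)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n_genes)]
            for _ in range(n_measurements)]


def composite_measurements(weights, profile):
    """Each measurement is one random weighted sum over all gene abundances."""
    return [sum(w * x for w, x in zip(row, profile)) for row in weights]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def demo(n_genes=500, n_measurements=100, seed=0):
    rng = random.Random(seed)
    # Two synthetic, related expression profiles (assumed data, not real RNA-seq).
    x = [rng.expovariate(1.0) for _ in range(n_genes)]
    y = [v * rng.uniform(0.5, 1.5) for v in x]
    weights = random_measurement_matrix(n_measurements, n_genes, rng)
    full = cosine(x, y)
    sketch = cosine(composite_measurements(weights, x),
                    composite_measurements(weights, y))
    return full, sketch


full_similarity, sketch_similarity = demo()
print(f"cosine on all genes:          {full_similarity:.3f}")
print(f"cosine on composite readouts: {sketch_similarity:.3f}")
```

In this toy setting, 100 composite readouts stand in for 500 genes; the two cosine values should be close, with the gap shrinking as the number of measurements grows.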
A Distinct Gene Module for Dysfunction Uncoupled from Activation in Tumor-Infiltrating T Cells
2016; 166 (6): 1500-?
Reversing the dysfunctional T cell state that arises in cancer and chronic viral infections is the focus of therapeutic interventions; however, current therapies are effective in only some patients and some tumor types. To gain a deeper molecular understanding of the dysfunctional T cell state, we analyzed population and single-cell RNA profiles of CD8(+) tumor-infiltrating lymphocytes (TILs) and used genetic perturbations to identify a distinct gene module for T cell dysfunction that can be uncoupled from T cell activation. This distinct dysfunction module is downstream of intracellular metallothioneins that regulate zinc metabolism and can be identified at single-cell resolution. We further identify Gata-3, a zinc-finger transcription factor in the dysfunctional module, as a regulator of dysfunction, and we use CRISPR-Cas9 genome editing to show that it drives a dysfunctional phenotype in CD8(+) TILs. Our results open novel avenues for targeting dysfunctional T cell states while leaving activation programs intact.
View details for DOI 10.1016/j.cell.2016.08.052
View details for Web of Science ID 000386339900021
View details for PubMedID 27610572
View details for PubMedCentralID PMC5019125
RBPJ Controls Development of Pathogenic Th17 Cells by Regulating IL-23 Receptor Expression.
2016; 16 (2): 392-404
Interleukin-17 (IL-17)-producing helper T cells (Th17 cells) play an important role in autoimmune diseases. However, not all Th17 cells induce tissue inflammation or autoimmunity. Th17 cells require IL-23 receptor (IL-23R) signaling to become pathogenic. The transcriptional mechanisms controlling the pathogenicity of Th17 cells and IL-23R expression are unknown. Here, we demonstrate that the canonical Notch signaling mediator RBPJ is a key driver of IL-23R expression. In the absence of RBPJ, Th17 cells fail to upregulate IL-23R, lack stability, and do not induce autoimmune tissue inflammation in vivo, whereas overexpression of IL-23R rescues this defect and promotes pathogenicity of RBPJ-deficient Th17 cells. RBPJ binds and trans-activates the Il23r promoter and induces IL-23R expression and represses anti-inflammatory IL-10 production in Th17 cells. We thus find that Notch signaling influences the development of pathogenic and non-pathogenic Th17 cells by reciprocally regulating IL-23R and IL-10 expression.
View details for DOI 10.1016/j.celrep.2016.05.088
View details for PubMedID 27346359
View details for PubMedCentralID PMC4984261
Definitive localization of intracellular proteins: Novel approach using CRISPR-Cas9 genome editing, with glucose 6-phosphate dehydrogenase as a model.
2016; 494: 55-67
Studies to determine subcellular localization and translocation of proteins are important because subcellular localization of proteins affects every aspect of cellular function. Such studies frequently utilize mutagenesis to alter amino acid sequences hypothesized to constitute subcellular localization signals. These studies often utilize fluorescent protein tags to facilitate live cell imaging. These methods are excellent for studies of monomeric proteins, but for multimeric proteins, they are unable to rule out artifacts from native protein subunits already present in the cells. That is, native monomers might direct the localization of fluorescent proteins with their localization signals obliterated. We have developed a method for ruling out such artifacts, and we use glucose 6-phosphate dehydrogenase (G6PD) as a model to demonstrate the method's utility. Because G6PD is capable of homodimerization, we employed a novel approach to remove interference from native G6PD. We produced a G6PD knockout somatic (hepatic) cell line using CRISPR-Cas9 mediated genome engineering. Transfection of G6PD knockout cells with G6PD fluorescent mutant proteins demonstrated that the major subcellular localization sequences of G6PD are within the N-terminal portion of the protein. This approach sets a new gold standard for similar studies of subcellular localization signals in all homodimerization-capable proteins.
View details for DOI 10.1016/j.ab.2015.11.002
View details for PubMedID 26576833
View details for PubMedCentralID PMC4695245
In vivo gene editing in dystrophic mouse muscle and muscle stem cells
2016; 351 (6271): 407-411
Frame-disrupting mutations in the DMD gene, encoding dystrophin, compromise myofiber integrity and drive muscle deterioration in Duchenne muscular dystrophy (DMD). Removing one or more exons from the mutated transcript can produce an in-frame mRNA and a truncated, but still functional, protein. In this study, we developed and tested a direct gene-editing approach to induce exon deletion and recover dystrophin expression in the mdx mouse model of DMD. Delivery by adeno-associated virus (AAV) of clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 endonucleases coupled with paired guide RNAs flanking the mutated Dmd exon23 resulted in excision of intervening DNA and restored the Dmd reading frame in myofibers, cardiomyocytes, and muscle stem cells after local or systemic delivery. AAV-Dmd CRISPR treatment partially recovered muscle functional deficiencies and generated a pool of endogenously corrected myogenic precursors in mdx mouse muscle.
View details for DOI 10.1126/science.aad5177
View details for Web of Science ID 000368440500046
View details for PubMedID 26721686
View details for PubMedCentralID PMC4924477
Crystal Structure of Staphylococcus aureus Cas9
2015; 162 (5): 1113-1126
The RNA-guided DNA endonuclease Cas9 cleaves double-stranded DNA targets with a protospacer adjacent motif (PAM) and complementarity to the guide RNA. Recently, we harnessed Staphylococcus aureus Cas9 (SaCas9), which is significantly smaller than Streptococcus pyogenes Cas9 (SpCas9), to facilitate efficient in vivo genome editing. Here, we report the crystal structures of SaCas9 in complex with a single guide RNA (sgRNA) and its double-stranded DNA targets, containing the 5'-TTGAAT-3' PAM and the 5'-TTGGGT-3' PAM, at 2.6 and 2.7 Å resolutions, respectively. The structures revealed the mechanism of the relaxed recognition of the 5'-NNGRRT-3' PAM by SaCas9. A structural comparison of SaCas9 with SpCas9 highlighted both structural conservation and divergence, explaining their distinct PAM specificities and orthologous sgRNA recognition. Finally, we applied the structural information about this minimal Cas9 to rationally design compact transcriptional activators and inducible nucleases, to further expand the CRISPR-Cas9 genome editing toolbox.
View details for DOI 10.1016/j.cell.2015.08.007
View details for Web of Science ID 000360589900020
View details for PubMedID 26317473
View details for PubMedCentralID PMC4670267
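The relaxed 5'-NNGRRT-3' PAM described in the abstract above uses IUPAC degenerate codes (N is any base, R is A or G), so both PAMs in the crystal structures, 5'-TTGAAT-3' and 5'-TTGGGT-3', are instances of it. The sketch below shows one way to check this with a regular expression. It is an illustration, not a guide-design tool: the function names are hypothetical, the IUPAC table is a subset, and only the given strand is scanned.

```python
import re

# Subset of IUPAC nucleotide codes, enough for the NNGRRT motif.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}


def pam_regex(motif):
    """Translate an IUPAC motif such as 'NNGRRT' into a regex pattern string."""
    return "".join(IUPAC[base] for base in motif.upper())


def matches_pam(site, motif="NNGRRT"):
    """True if `site` is exactly one instance of the motif."""
    return re.fullmatch(pam_regex(motif), site.upper()) is not None


def find_pam_sites(sequence, motif="NNGRRT"):
    """Return (start, site) for every, possibly overlapping, motif match."""
    lookahead = re.compile("(?=(" + pam_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(sequence.upper())]


print(matches_pam("TTGAAT"))          # True
print(matches_pam("TTGGGT"))          # True
print(matches_pam("TTGCAT"))          # False: C is not a purine (R)
print(find_pam_sites("ACGTTGAATCC"))  # [(3, 'TTGAAT')]
```

A lookahead pattern is used in `find_pam_sites` so that overlapping candidate sites are all reported, which a plain `finditer` over the motif would skip.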
Sequence determinants of improved CRISPR sgRNA design.
2015; 25 (8): 1147-57
The CRISPR/Cas9 system has revolutionized mammalian somatic cell genetics. Genome-wide functional screens using CRISPR/Cas9-mediated knockout or dCas9 fusion-mediated inhibition/activation (CRISPRi/a) are powerful techniques for discovering phenotype-associated gene function. We systematically assessed the DNA sequence features that contribute to single guide RNA (sgRNA) efficiency in CRISPR-based screens. Leveraging the information from multiple designs, we derived a new sequence model for predicting sgRNA efficiency in CRISPR/Cas9 knockout experiments. Our model confirmed known features and suggested new features including a preference for cytosine at the cleavage site. The model was experimentally validated for sgRNA-mediated mutation rate and protein knockout efficiency. Tested on independent data sets, the model achieved significant results in both positive and negative selection conditions and outperformed existing models. We also found that the sequence preference for CRISPRi/a is substantially different from that for CRISPR/Cas9 knockout and propose a new model for predicting sgRNA efficiency in CRISPRi/a experiments. These results facilitate the genome-wide design of improved sgRNA for both knockout and CRISPRi/a studies.
View details for DOI 10.1101/gr.191452.115
View details for PubMedID 26063738
View details for PubMedCentralID PMC4509999
In vivo genome editing using Staphylococcus aureus Cas9
2015; 520 (7546): 186-U98
The RNA-guided endonuclease Cas9 has emerged as a versatile genome-editing platform. However, the size of the commonly used Cas9 from Streptococcus pyogenes (SpCas9) limits its utility for basic research and therapeutic applications that use the highly versatile adeno-associated virus (AAV) delivery vehicle. Here, we characterize six smaller Cas9 orthologues and show that Cas9 from Staphylococcus aureus (SaCas9) can edit the genome with efficiencies similar to those of SpCas9, while being more than 1 kilobase shorter. We packaged SaCas9 and its single guide RNA expression cassette into a single AAV vector and targeted the cholesterol regulatory gene Pcsk9 in the mouse liver. Within one week of injection, we observed >40% gene modification, accompanied by significant reductions in serum Pcsk9 and total cholesterol levels. We further assess the genome-wide targeting specificity of SaCas9 and SpCas9 using BLESS, and demonstrate that SaCas9-mediated in vivo genome editing has the potential to be efficient and specific.
View details for DOI 10.1038/nature14299
View details for Web of Science ID 000352454600031
View details for PubMedID 25830891
View details for PubMedCentralID PMC4393360
Genome engineering using CRISPR-Cas9 system.
Methods in molecular biology (Clifton, N.J.)
2015; 1239: 197-217
The Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-Cas9 system is an adaptive immune system that exists in a variety of microbes. It could be engineered to function in eukaryotic cells as a fast, low-cost, efficient, and scalable tool for manipulating genomic sequences. In this chapter, detailed protocols are described for harnessing the CRISPR-Cas9 system from Streptococcus pyogenes to enable RNA-guided genome engineering applications in mammalian cells. We present all relevant methods including the initial site selection, molecular cloning, delivery of guide RNAs (gRNAs) and Cas9 into mammalian cells, verification of target cleavage, and assays for detecting genomic modification including indels and homologous recombination. These tools provide researchers with new instruments that accelerate both forward and reverse genetics efforts.
View details for DOI 10.1007/978-1-4939-1862-1_10
View details for PubMedID 25408407
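The "initial site selection" step named in the chapter abstract above can be illustrated with a small sketch. It assumes the widely described SpCas9 targeting rule: a 20-nucleotide protospacer immediately followed by an NGG protospacer-adjacent motif (PAM). The function names `find_spcas9_sites` and `reverse_complement` are hypothetical and are not taken from the chapter, and the sketch does no off-target or efficiency scoring.

```python
import re


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_spcas9_sites(seq, protospacer_len=20):
    """List candidate SpCas9 sites on the given strand.

    Assumption: a candidate is `protospacer_len` bases followed directly
    by an NGG PAM. Returns (start, protospacer, pam) tuples, where start
    is the 0-based index of the protospacer in `seq`.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]
```

To scan the opposite strand as well, the same function can be applied to `reverse_complement(seq)`; the reported coordinates then refer to the reverse-complemented sequence.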
Global microRNA depletion suppresses tumor angiogenesis.
Genes & development
2014; 28 (10): 1054-67
MicroRNAs delicately regulate the balance of angiogenesis. Here we show that depletion of all microRNAs suppresses tumor angiogenesis. We generated microRNA-deficient tumors by knocking out Dicer1. These tumors are highly hypoxic but poorly vascularized, suggestive of deficient angiogenesis signaling. Expression profiling revealed that angiogenesis genes were significantly down-regulated as a result of the microRNA deficiency. Factor inhibiting hypoxia-inducible factor 1 (HIF-1), FIH1, is derepressed under these conditions and suppresses HIF transcription. Knocking out FIH1 using CRISPR/Cas9-mediated genome engineering reversed the phenotypes of microRNA-deficient cells in HIF transcriptional activity, VEGF production, tumor hypoxia, and tumor angiogenesis. Using multiplexed CRISPR/Cas9, we deleted regions in FIH1 3' untranslated regions (UTRs) that contain microRNA-binding sites, which derepresses FIH1 protein and represses hypoxia response. These data suggest that microRNAs promote tumor responses to hypoxia and angiogenesis by repressing FIH1.
View details for DOI 10.1101/gad.239681.114
View details for PubMedID 24788094
View details for PubMedCentralID PMC4035535
Optical control of mammalian endogenous transcription and epigenetic states.
Nature
2013; 500 (7463): 472-6
The dynamic nature of gene expression enables cellular programming, homeostasis and environmental adaptation in living systems. Dissection of causal gene functions in cellular and organismal processes therefore necessitates approaches that enable spatially and temporally precise modulation of gene expression. Recently, a variety of microbial and plant-derived light-sensitive proteins have been engineered as optogenetic actuators, enabling high-precision spatiotemporal control of many cellular functions. However, versatile and robust technologies that enable optical modulation of transcription in the mammalian endogenous genome remain elusive. Here we describe the development of light-inducible transcriptional effectors (LITEs), an optogenetic two-hybrid system integrating the customizable TALE DNA-binding domain with the light-sensitive cryptochrome 2 protein and its interacting partner CIB1 from Arabidopsis thaliana. LITEs do not require additional exogenous chemical cofactors, are easily customized to target many endogenous genomic loci, and can be activated within minutes with reversibility. LITEs can be packaged into viral vectors and genetically targeted to probe specific cell populations. We have applied this system in primary mouse neurons, as well as in the brain of freely behaving mice in vivo to mediate reversible modulation of mammalian endogenous gene expression as well as targeted epigenetic chromatin modifications. The LITE system establishes a novel mode of optogenetic control of endogenous cellular processes and enables direct testing of the causal roles of genetic and epigenetic regulation in normal biological processes and disease states.
View details for DOI 10.1038/nature12466
View details for PubMedID 23877069
View details for PubMedCentralID PMC3856241
Multiplex Genome Engineering Using CRISPR/Cas Systems
Science
2013; 339 (6121): 819-823
Functional elucidation of causal genetic variants and elements requires precise genome editing technologies. The type II prokaryotic CRISPR (clustered regularly interspaced short palindromic repeats)/Cas adaptive immune system has been shown to facilitate RNA-guided site-specific DNA cleavage. We engineered two different type II CRISPR/Cas systems and demonstrate that Cas9 nucleases can be directed by short RNAs to induce precise cleavage at endogenous genomic loci in human and mouse cells. Cas9 can also be converted into a nicking enzyme to facilitate homology-directed repair with minimal mutagenic activity. Lastly, multiple guide sequences can be encoded into a single CRISPR array to enable simultaneous editing of several sites within the mammalian genome, demonstrating easy programmability and wide applicability of the RNA-guided nuclease technology.
View details for DOI 10.1126/science.1231143
View details for Web of Science ID 000314874400049
View details for PubMedID 23287718
Comprehensive interrogation of natural TALE DNA-binding modules and transcriptional repressor domains
Nature Communications
2012; 3: 968
Transcription activator-like effectors are sequence-specific DNA-binding proteins that harbour modular, repetitive DNA-binding domains. Transcription activator-like effectors have enabled the creation of customizable designer transcriptional factors and sequence-specific nucleases for genome engineering. Here we report two improvements of the transcription activator-like effector toolbox for achieving efficient activation and repression of endogenous gene expression in mammalian cells. We show that the naturally occurring repeat-variable diresidue Asn-His (NH) has high biological activity and specificity for guanine, a highly prevalent base in mammalian genomes. We also report an effective transcription activator-like effector transcriptional repressor architecture for targeted inhibition of transcription in mammalian cells. These findings will improve the precision and effectiveness of genome engineering that can be achieved using transcription activator-like effectors.
View details for DOI 10.1038/ncomms1962
View details for Web of Science ID 000306995000040
View details for PubMedID 22828628
View details for PubMedCentralID PMC3556390
A transcription activator-like effector toolbox for genome engineering.
Nature Protocols
2012; 7 (1): 171-92
Transcription activator-like effectors (TALEs) are a class of naturally occurring DNA-binding proteins found in the plant pathogen Xanthomonas sp. The DNA-binding domain of each TALE consists of tandem 34-amino acid repeat modules that can be rearranged according to a simple cipher to target new DNA sequences. Customized TALEs can be used for a wide variety of genome engineering applications, including transcriptional modulation and genome editing. Here we describe a toolbox for rapid construction of custom TALE transcription factors (TALE-TFs) and nucleases (TALENs) using a hierarchical ligation procedure. This toolbox facilitates affordable and rapid construction of custom TALE-TFs and TALENs within 1 week and can be easily scaled up to construct TALEs for multiple targets in parallel. We also provide details for testing the activity in mammalian cells of custom TALE-TFs and TALENs using quantitative reverse-transcription PCR and Surveyor nuclease, respectively. The TALE toolbox described here will enable a broad range of biological applications.
View details for DOI 10.1038/nprot.2011.431
View details for PubMedID 22222791
View details for PubMedCentralID PMC3684555
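The "simple cipher" mentioned in the toolbox abstract above is commonly described as a one-to-one mapping from each target base to the repeat-variable diresidue (RVD) of one repeat module. The sketch below assumes the usual assignments (NI for A, HD for C, NG for T, NN for G) and offers NH as an alternative for guanine, following the specificity reported in the TALE DNA-binding module abstract listed earlier. The names `RVD_FOR_BASE` and `rvds_for_target` are hypothetical, and the sketch ignores design constraints such as flanking-base requirements or repeat count.

```python
# Assumed base-to-RVD assignments; NN is often reported to bind A as well as G.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def rvds_for_target(target, guanine_rvd="NN"):
    """Return the ordered list of RVDs for a target DNA sequence.

    `guanine_rvd` selects the module used for G: "NN" (default) or "NH".
    """
    if guanine_rvd not in ("NN", "NH"):
        raise ValueError("guanine_rvd must be 'NN' or 'NH'")
    code = dict(RVD_FOR_BASE, G=guanine_rvd)
    rvds = []
    for base in target.upper():
        if base not in code:
            raise ValueError("unsupported base: %r" % base)
        rvds.append(code[base])
    return rvds
```

For example, `rvds_for_target("GATC", guanine_rvd="NH")` yields one RVD per base in target order, which is the order in which repeat monomers would be assembled.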
Efficient construction of sequence-specific TAL effectors for modulating mammalian transcription
Nature Biotechnology
2011; 29 (2): 149-U90
The ability to direct functional proteins to specific DNA sequences is a long-sought goal in the study and engineering of biological processes. Transcription activator-like effectors (TALEs) from Xanthomonas sp. are site-specific DNA-binding proteins that can be readily designed to target new sequences. Because TALEs contain a large number of repeat domains, it can be difficult to synthesize new variants. Here we describe a method that overcomes this problem. We leverage codon degeneracy and type IIs restriction enzymes to generate orthogonal ligation linkers between individual repeat monomers, thus allowing full-length, customized, repeat domains to be constructed by hierarchical ligation. We synthesized 17 TALEs that are customized to recognize specific DNA-binding sites, and demonstrate that they can specifically modulate transcription of endogenous genes (SOX2 and KLF4) in human cells.
View details for DOI 10.1038/nbt.1775
View details for Web of Science ID 000287023000022
View details for PubMedID 21248753
View details for PubMedCentralID PMC3084533