Le Cong
Assistant Professor of Pathology (Pathology Research) and of Genetics
Web page: http://www.conglab.com/
Bio
Dr. Cong's group is developing technology for large-scale genome editing and gene insertion for gene and cell therapy, integrating advances from metagenomics, computational biology, and high-throughput engineering. In parallel, the group leverages these gene-editing tools for single-cell functional screening to probe the molecular mechanisms of innate immunity in cancer and neuro-immune diseases. To accelerate these efforts, Dr. Cong's team integrates AI and machine learning into genome technologies to design and evolve gene-editing proteins and RNAs in silico, enhancing the efficiency and capabilities of these therapeutic molecules.
Dr. Cong's work has led to one of the first CRISPR/Cas9 gene-editing tools for in vivo gene therapy. More recently, his group invented tools for cleavage-free large gene insertion with novel recombination proteins (SSAP editor), and developed machine-learning optimized single-cell methods (DAISY) for studying cancer and immune diseases. Dr. Cong is a recipient of the NHGRI Genomic Innovator Award, Baxter Foundation Faculty Scholar, Genetic Engineering and Biotechnology News (GEN) Top 10 Under 40, Clinical OMICs Pioneers Under 40, and Clarivate Web of Science Highly Cited Researcher.
Academic Appointments
-
Assistant Professor, Pathology
-
Assistant Professor, Genetics
-
Member, Bio-X
-
Member, Wu Tsai Neurosciences Institute
Honors & Awards
-
Genomic Innovator Award, National Institutes of Health (NIH), National Human Genome Research Institute (NHGRI)
-
Donald and Delia Baxter Foundation Faculty Scholar, Baxter Foundation
-
CRI Irvington Fellow, Cancer Research Institute
-
HHMI International Fellow, Howard Hughes Medical Institute
Boards, Advisory Committees, Professional Organizations
-
Genome Editing and New Investigator Committee Member, American Society of Gene & Cell Therapy (ASGCT) (2019 - Present)
-
Editorial Board, Gene and Genome Editing (2021 - Present)
Professional Education
-
PhD, Harvard University, Harvard Medical School, Biological and Biomedical Sciences (2014)
-
LHB, Harvard Medical School, Certificate in Leder Human Biology and Translational Medicine
-
B.S., Tsinghua University, Biological Sciences, Electronic Engineering (2009)
Community and International Work
-
Neuro-engineering and Gene-editing, Cold Spring Harbor Laboratory
Topic
Advanced Techniques in Molecular Neuroscience
Partnering Organization(s)
Cold Spring Harbor Laboratory
Location
International
Ongoing Project
Yes
Opportunities for Student Involvement
Yes
Patents
-
Feng Zhang, Le Cong, Patrick Hsu, Fei Ann Ran. "United States Patent 8,906,616 Engineering of systems, methods and optimized guide compositions for sequence manipulation"
-
Le Cong, Feng Zhang. "United States Patent 8,932,814 CRISPR-Cas nickase systems, methods and compositions for sequence manipulation in eukaryotes."
-
Feng Zhang, Le Cong, Randall Platt, Neville Sanjana, Fei Ann Ran. "United States Patent 8,993,233 Engineering and optimization of systems, methods and compositions for sequence manipulation with functional domains"
-
Cong, Egloff, Garraway, Grandis, Lander, Stransky, Tward, Zhang. "United States Patent 9,370,551. Compositions and methods of treating head and neck cancer."
-
Le Cong, Feng Zhang, Patrick Hsu, Fei Ann Ran. "United StatesEngineering of systems, methods and optimized guide compositions for sequence manipulation."
Current Research and Scholarly Interests
Our lab develops gene-editing technologies, such as novel CRISPR systems and large gene insertion techniques, for gene and cell therapy. We also leverage these gene-editing tools for single-cell functional screening to probe the molecular mechanisms of cancer and immunological diseases. To accelerate our work, we integrate AI and machine learning to design and evolve gene-editing proteins and RNAs in silico, pushing the frontier that bridges computational and experimental biology.
2024-25 Courses
-
Independent Studies (5)
- Directed Study: BIOE 391 (Aut, Win, Spr, Sum)
- Graduate Research: CBIO 399 (Aut, Win, Spr, Sum)
- Graduate Research: GENE 399 (Aut, Win, Spr, Sum)
- Supervised Study: GENE 260 (Aut, Win, Spr, Sum)
- Undergraduate Research: PATH 199 (Aut, Win, Spr, Sum)
Stanford Advisees
-
Doctoral Dissertation Reader (AC)
Henry Cousins, Kathryn Hanson, Yannick Lee-Yow
-
Postdoctoral Faculty Sponsor
Ravi Dinesh, Xiaotong Wang, Guangxue Xu, Di Yin
-
Doctoral Dissertation Advisor (AC)
Yuanhao Qu
Graduate and Fellowship Programs
-
Biology (School of Humanities and Sciences) (PhD Program)
-
Biomedical Data Science (Masters Program)
-
Biomedical Data Science (PhD Program)
-
Cytopathology (Fellowship Program)
-
Hematopathology (Fellowship Program)
-
Neuropathology (Fellowship Program)
All Publications
-
FoldMark: Protecting Protein Generative Models with Watermarking.
bioRxiv : the preprint server for biology
2024
Abstract
Protein structure is key to understanding protein function and is essential for progress in bioengineering, drug discovery, and molecular biology. Recently, with the incorporation of generative AI, the power and accuracy of computational protein structure prediction/design have been improved significantly. However, ethical concerns such as copyright protection and harmful content generation (biosecurity) pose challenges to the wide implementation of protein generative models. Here, we investigate whether it is possible to embed watermarks into protein generative models and their outputs for copyright authentication and the tracking of generated structures. As a proof of concept, we propose a two-stage method, FoldMark, as a generalized watermarking strategy for protein generative models. FoldMark first pretrains a watermark encoder and decoder, which can slightly adjust protein structures to embed user-specific information and faithfully recover the information from the encoded structure. In the second step, protein generative models are fine-tuned with Low-Rank Adaptation modules, with the watermark as a condition, to preserve generation quality while learning to generate watermarked structures with high recovery rates. Extensive experiments are conducted on open-source protein structure prediction models (e.g., ESMFold and MultiFlow) and de novo structure design models (e.g., FrameDiff and FoldFlow), and we demonstrate that our method is effective across all these generative models. Meanwhile, our watermarking framework only exerts a negligible impact on the original protein structure quality and is robust under potential post-processing and adaptive attacks.
View details for DOI 10.1101/2024.10.23.619960
View details for PubMedID 39554012
View details for PubMedCentralID PMC11565776
-
Computationally guided high-throughput engineering of an anti-CRISPR protein for precise genome editing in human cells.
Cell reports methods
2024; 4 (10): 100882
Abstract
The application of CRISPR-Cas systems to genome editing has revolutionized experimental biology and is an emerging gene and cell therapy modality. CRISPR-Cas systems target off-target regions within the human genome, which is a challenge that must be addressed. Phages have evolved anti-CRISPR proteins (Acrs) to evade CRISPR-Cas-based immunity. Here, we engineer an Acr (AcrIIA4) to increase the precision of CRISPR-Cas-based genome targeting. We developed an approach that leveraged (1) computational guidance, (2) deep mutational scanning, and (3) highly parallel DNA repair measurements within human cells. In a single experiment, 10,000 Acr variants were tested. Variants that improved editing precision were tested in additional validation experiments that revealed robust enhancement of gene editing precision and synergy with a high-fidelity version of Cas9. This scalable high-throughput screening framework is a promising methodology to engineer Acrs to increase gene editing precision, which could be used to improve the safety of gene editing-based therapeutics.
View details for DOI 10.1016/j.crmeth.2024.100882
View details for PubMedID 39437714
-
Systematic Discovery, In Vivo Delivery, and DNA Repair Mechanism of Single-Strand Annealing Protein for Precision Integration of Large DNA Sequences
CELL PRESS. 2024: 9-10
View details for Web of Science ID 001332783400018
-
A 5′ UTR language model for decoding untranslated regions of mRNA and function predictions
NATURE MACHINE INTELLIGENCE
2024
View details for DOI 10.1038/s42256-024-00823-9
View details for Web of Science ID 001197347300002
-
A 5' UTR Language Model for Decoding Untranslated Regions of mRNA and Function Predictions.
Nature machine intelligence
2024; 6 (4): 449-460
Abstract
The 5' UTR, a regulatory region at the beginning of an mRNA molecule, plays a crucial role in regulating the translation process and impacts the protein expression level. Language models have showcased their effectiveness in decoding the functions of protein and genome sequences. Here, we introduced a language model for 5' UTR, which we refer to as the UTR-LM. The UTR-LM is pre-trained on endogenous 5' UTRs from multiple species and is further augmented with supervised information including secondary structure and minimum free energy. We fine-tuned the UTR-LM in a variety of downstream tasks. The model outperformed the best known benchmark by up to 5% for predicting the Mean Ribosome Loading, and by up to 8% for predicting the Translation Efficiency and the mRNA Expression Level. The model also applies to identifying unannotated Internal Ribosome Entry Sites within the untranslated region and improves the AUPR from 0.37 to 0.52 compared to the best baseline. Further, we designed a library of 211 novel 5' UTRs with high predicted values of translation efficiency and evaluated them via a wet-lab assay. Experiment results confirmed that our top designs achieved a 32.5% increase in protein production level relative to well-established 5' UTR optimized for therapeutics.
View details for DOI 10.1038/s42256-024-00823-9
View details for PubMedID 38855263
View details for PubMedCentralID PMC11155392
-
APOE loss-of-function variants: Compatible with longevity and associated with resistance to Alzheimer's disease pathology.
Neuron
2024
Abstract
The ε4 allele of apolipoprotein E (APOE) is the strongest genetic risk factor for sporadic Alzheimer's disease (AD). Knockdown of ε4 may provide a therapeutic strategy for AD, but the effect of APOE loss of function (LoF) on AD pathogenesis is unknown. We searched for APOE LoF variants in a large cohort of controls and patients with AD and identified seven heterozygote carriers of APOE LoF variants. Five carriers were controls (aged 71-90 years), one carrier was affected by progressive supranuclear palsy, and one carrier was affected by AD with an unremarkable age at onset of 75 years. Two APOE ε3/ε4 controls carried a stop-gain affecting ε4: one was cognitively normal at 90 years and had no neuritic plaques at autopsy; the other was cognitively healthy at 79 years, and lumbar puncture at 76 years showed normal levels of amyloid. These results suggest that ε4 drives AD risk through the gain of abnormal function and support ε4 knockdown as a viable therapeutic option.
View details for DOI 10.1016/j.neuron.2024.01.008
View details for PubMedID 38301647
-
Long sequence insertion via CRISPR/Cas gene-editing with transposase, recombinase, and integrase.
Current opinion in biomedical engineering
2023; 28
Abstract
CRISPR/Cas-based gene-editing technologies have emerged as one of the most transformative tools in genome science over the past decade, providing unprecedented possibilities for both fundamental and translational research. Following the initial wave of innovations for gene knock-out, epigenetic/RNA modulation, and nickase-mediated base-editing, recent efforts have pivoted towards long-sequence gene editing, specifically the insertion of large fragments (>1 kb) into the endogenous genome. In this review, we survey the development of these CRISPR/Cas-based sequence insertion methodologies in conjunction with the emergence of novel families of editing enzymes, such as transposases, single-stranded DNA-annealing proteins, recombinases, and integrases. Despite facing a number of challenges, this field continues to evolve rapidly and holds the potential to catalyze a new wave of revolutionary biomedical applications.
View details for DOI 10.1016/j.cobme.2023.100491
View details for PubMedID 38549686
View details for PubMedCentralID PMC10976843
-
Long sequence insertion via CRISPR/Cas gene-editing with transposase, recombinase, and integrase
CURRENT OPINION IN BIOMEDICAL ENGINEERING
2023; 28
View details for DOI 10.1016/j.cobme.2023.100491
View details for Web of Science ID 001062222100001
-
Integrative analysis of functional genomic screening and clinical data identifies a protective role for spironolactone in severe COVID-19.
Cell reports methods
2023; 3 (7): 100503
Abstract
We demonstrate that integrative analysis of CRISPR screening datasets enables network-based prioritization of prescription drugs modulating viral entry in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) by developing a network-based approach called Rapid proXimity Guidance for Repurposing Investigational Drugs (RxGRID). We use our results to guide a propensity-score-matched, retrospective cohort study of 64,349 COVID-19 patients, showing that a top candidate drug, spironolactone, is associated with improved clinical prognosis, measured by intensive care unit (ICU) admission and mechanical ventilation rates. Finally, we show that spironolactone exerts a dose-dependent inhibitory effect on viral entry in human lung epithelial cells. Our RxGRID method presents a computational framework, implemented as an open-source software package, enabling genomics researchers to identify drugs likely to modulate a molecular phenotype of interest based on high-throughput screening data. Our results, derived from this method and supported by experimental and clinical analysis, add additional supporting evidence for a potential protective role of the potassium-sparing diuretic spironolactone in severe COVID-19.
View details for DOI 10.1016/j.crmeth.2023.100503
View details for PubMedID 37529368
-
APOE loss-of-function variants: Compatible with longevity and associated with resistance to Alzheimer's Disease pathology.
medRxiv : the preprint server for health sciences
2023
Abstract
The ε4 allele of apolipoprotein E (APOE) is the strongest genetic risk factor for sporadic Alzheimer's Disease (AD). Knockdown of this allele may provide a therapeutic strategy for AD, but the effect of APOE loss-of-function (LoF) on AD pathogenesis is unknown. We searched for APOE LoF variants in a large cohort of older controls and patients with AD and identified six heterozygote carriers of APOE LoF variants. Five carriers were controls (ages 71-90) and one was an AD case with an unremarkable age-at-onset between 75 and 79. Two APOE ε3/ε4 controls (Subjects 1 and 2) carried a stop-gain affecting the ε4 allele. Subject 1 was cognitively normal at 90+ and had no neuritic plaques at autopsy. Subject 2 was cognitively healthy within the age range 75-79 and underwent lumbar puncture between ages 75 and 79, which showed normal levels of amyloid. The results provide the strongest human genetics evidence yet available suggesting that ε4 drives AD risk through a gain of abnormal function and support knockdown of APOE ε4 or its protein product as a viable therapeutic option.
View details for DOI 10.1101/2023.07.20.23292771
View details for PubMedID 37547016
View details for PubMedCentralID PMC10402217
-
Gene set proximity analysis: expanding gene set enrichment analysis through learned geometric embeddings, with drug-repurposing applications in COVID-19.
Bioinformatics (Oxford, England)
2022
Abstract
MOTIVATION: Gene set analysis methods rely on knowledge-based representations of genetic interactions in the form of both gene set collections and protein-protein interaction (PPI) networks. However, explicit representations of genetic interactions often fail to capture complex interdependencies among genes, limiting the analytic power of such methods. RESULTS: We propose an extension of gene set enrichment analysis to a latent embedding space reflecting PPI network topology, called gene set proximity analysis (GSPA). Compared with existing methods, GSPA provides improved ability to identify disease-associated pathways in disease-matched gene expression datasets, while improving reproducibility of enrichment statistics for similar gene sets. GSPA is statistically straightforward, reducing to a version of traditional gene set enrichment analysis through a single user-defined parameter. We apply our method to identify novel drug associations with SARS-CoV-2 viral entry. Finally, we validate our drug association predictions through retrospective clinical analysis of claims data from 8 million patients, supporting a role for gabapentin as a risk factor and metformin as a protective factor for severe COVID-19. AVAILABILITY: GSPA is available for download as a command-line Python package at https://github.com/henrycousins/gspa. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
View details for DOI 10.1093/bioinformatics/btac735
View details for PubMedID 36394254
-
Single-cell transcriptome analysis of regenerating RGCs reveals potent glaucoma neural repair genes.
Neuron
2022
Abstract
Axon regeneration holds great promise for neural repair of CNS axonopathies, including glaucoma. Pten deletion in retinal ganglion cells (RGCs) promotes potent optic nerve regeneration, but only a small population of Pten-null RGCs are actually regenerating RGCs (regRGCs); most surviving RGCs (surRGCs) remain non-regenerative. Here, we developed a strategy to specifically label and purify regRGCs and surRGCs, respectively, from the same Pten-deletion mice after optic nerve crush, in which they differ only in their regeneration capability. Smart-Seq2 single-cell transcriptome analysis revealed novel regeneration-associated genes that significantly promote axon regeneration. The most potent of these, Anxa2, acts synergistically with its ligand tPA in Pten-deletion-induced axon regeneration. Anxa2, its downstream effector ILK, and Mpp1 dramatically protect RGC somata and axons and preserve visual function in a clinically relevant model of glaucoma, demonstrating the exciting potential of this innovative strategy to identify novel effective neural repair candidates.
View details for DOI 10.1016/j.neuron.2022.06.022
View details for PubMedID 35952672
-
Machine-learning-optimized Cas12a barcoding enables the recovery of single-cell lineages and transcriptional profiles.
Molecular cell
2022
Abstract
The development of CRISPR-based barcoding methods creates an exciting opportunity to understand cellular phylogenies. We present a compact, tunable, high-capacity Cas12a barcoding system called dual acting inverted site array (DAISY). We combined high-throughput screening and machine learning to predict and optimize the 60-bp DAISY barcode sequences. After optimization, top-performing barcodes had ∼10-fold increased capacity relative to the best random-screened designs and performed reliably across diverse cell types. DAISY barcode arrays generated ∼12 bits of entropy and ∼66,000 unique barcodes. Thus, DAISY barcodes, at a fraction of the size of Cas9 barcodes, achieved high-capacity barcoding. We coupled DAISY barcoding with single-cell RNA-seq to recover lineages and gene expression profiles from ∼47,000 human melanoma cells. A single DAISY barcode recovered up to ∼700 lineages from one parental cell. This analysis revealed heritable single-cell gene expression and potential epigenetic modulation of memory gene transcription. Overall, Cas12a DAISY barcoding is an efficient tool for investigating cell-state dynamics.
View details for DOI 10.1016/j.molcel.2022.06.001
View details for PubMedID 35752172
-
Editorial: CRISPR and alternative approaches.
Biotechnology journal
2022: e2200290
View details for DOI 10.1002/biot.202200290
View details for PubMedID 35726663
-
Neural Bandits for Protein Sequence Optimization
IEEE. 2022: 188-193
View details for DOI 10.1109/CISS53076.2022.9751154
View details for Web of Science ID 000945325900018
-
dCas9-based gene editing for cleavage-free genomic knock-in of long sequences.
Nature cell biology
2022
Abstract
Gene editing is a powerful tool for genome and cell engineering. Exemplified by CRISPR-Cas, gene editing could cause DNA damage and trigger DNA repair processes that are often error-prone. Such unwanted mutations and safety concerns can be exacerbated when altering long sequences. Here we couple microbial single-strand annealing proteins (SSAPs) with catalytically inactive dCas9 for gene editing. This cleavage-free gene editor, dCas9-SSAP, promotes the knock-in of long sequences in mammalian cells. The dCas9-SSAP editor has low on-target errors and minimal off-target effects, showing higher accuracy than canonical Cas9 methods. It is effective for inserting kilobase-scale sequences, with an efficiency of up to approximately 20% and robust performance across donor designs and cell types, including human stem cells. We show that dCas9-SSAP is less sensitive to inhibition of DNA repair enzymes than Cas9 references. We further performed truncation and aptamer engineering to minimize its size to fit into a single adeno-associated-virus vector for future application. Together, this tool opens opportunities towards safer long-sequence genome engineering.
View details for DOI 10.1038/s41556-021-00836-1
View details for PubMedID 35145221
-
The role of p53 in the development of pancreatic ductal adenocarcinoma.
AMER ASSOC CANCER RESEARCH. 2021: 58
View details for Web of Science ID 000720117400098
-
Deciphering pathogenicity of variants of uncertain significance with CRISPR-edited iPSCs.
Trends in genetics : TIG
2021
Abstract
Genetic variants play an important role in conferring risk for cardiovascular diseases (CVDs). With the rapid development of next-generation sequencing (NGS), thousands of genetic variants associated with CVDs have been identified by genome-wide association studies (GWAS), but the function of more than 40% of genetic variants is still unknown. This gap of knowledge is a barrier to the clinical application of the genetic information. However, determining the pathogenicity of a variant of uncertain significance (VUS) is challenging due to the lack of suitable model systems and accessible technologies. By combining clustered regularly interspaced short palindromic repeats (CRISPR) and human induced pluripotent stem cells (iPSCs), unprecedented advances are now possible in determining the pathogenicity of VUS in CVDs. Here, we summarize recent progress and new strategies in deciphering pathogenic variants for CVDs using CRISPR-edited human iPSCs.
View details for DOI 10.1016/j.tig.2021.08.009
View details for PubMedID 34509299
-
Conventional type I dendritic cells maintain a reservoir of proliferative tumor-antigen specific TCF-1+ CD8+ T cells in tumor-draining lymph nodes.
Immunity
2021
Abstract
In tumors, a subset of CD8+ T cells expressing the transcription factor TCF-1 drives the response to immune checkpoint blockade. We examined the mechanisms that maintain these cells in an autochthonous model of lung adenocarcinoma. Longitudinal sampling and single-cell sequencing of tumor-antigen specific TCF-1+ CD8+ T cells revealed that while intratumoral TCF-1+ CD8+ T cells acquired dysfunctional features and decreased in number as tumors progressed, TCF-1+ CD8+ T cell frequency in the tumor-draining LN (dLN) remained stable. Two discrete intratumoral TCF-1+ CD8+ T cell subsets developed over time: a proliferative SlamF6+ subset and a non-cycling SlamF6- subset. Blocking dLN egress decreased the frequency of intratumoral SlamF6+ TCF-1+ CD8+ T cells. Conventional type I dendritic cells (cDC1s) in the dLN decreased in number with tumor progression, and Flt3L+anti-CD40 treatment recovered SlamF6+ T cell frequencies and decreased tumor burden. Thus, cDC1s in tumor dLN maintain a reservoir of TCF-1+ CD8+ T cells and their decrease contributes to failed anti-tumor immunity.
View details for DOI 10.1016/j.immuni.2021.08.026
View details for PubMedID 34534439
-
CRISPR-Cas12a System With Synergistic Phage Recombination Proteins for Multiplex Precision Editing in Human Cells.
Frontiers in cell and developmental biology
2021; 9: 719705
Abstract
The development of CRISPR-based gene-editing technologies has brought an unprecedented revolution in the field of genome engineering. Cas12a, a member of the Class 2 Type V CRISPR-associated endonuclease family distinct from Cas9, has been repurposed and developed into versatile gene-editing tools with distinct PAM recognition sites and multiplexed gene targeting capability. However, with current CRISPR/Cas12a technologies, it remains a challenge to perform efficient and precise genome editing of long sequences in mammalian cells. To address this limitation, we utilized phage recombination enzymes and developed an efficient CRISPR/Cas12a tool for multiplexed precision editing in mammalian cells. Through protein engineering, we were able to recruit phage recombination proteins to Cas12a to enhance its homology-directed repair efficiencies. Our phage-recombination-assisted Cas12a system achieved up to 3-fold improvements for kilobase-scale knock-ins in human cells without compromising the specificity of the enzyme. The performance of this system compares favorably against Cas9 references, the commonly used enzyme for gene-editing tasks, with improved specificity. Additionally, we demonstrated multi-target editing with similar improved activities thanks to the RNA-processing activity of the Cas12a system. This compact, multi-target editing tool has the potential to assist in understanding multi-gene interactions. In particular, it paves the way for a gene therapy method for human diseases that complements existing tools and is suitable for polygenic disorders and diseases requiring long-sequence corrections.
View details for DOI 10.3389/fcell.2021.719705
View details for PubMedID 35774104
View details for PubMedCentralID PMC9237396
-
Cleavage-Free dCas9 Knock-In Gene-Editing Tool Leveraging RNA-Guided Targeting of Recombineering Proteins
CELL PRESS. 2021: 107
View details for Web of Science ID 000645188700204
-
A CRISPR Landing for Genome Rewriting at Locus-Scale.
The CRISPR journal
2021; 4 (2): 163–66
View details for DOI 10.1089/crispr.2021.29124.lec
View details for PubMedID 33876954
-
Microbial single-strand annealing proteins enable CRISPR gene-editing tools with improved knock-in efficiencies and reduced off-target effects.
Nucleic acids research
2021
Abstract
Several existing technologies enable short genomic alterations, including generating indels and short nucleotide variants; however, engineering more significant genomic changes is more challenging due to reduced efficiency and precision. Here, we developed RecT Editor via Designer-Cas9-Initiated Targeting (REDIT), which leverages phage single-stranded DNA-annealing proteins (SSAP) RecT for mammalian genome engineering. Relative to Cas9-mediated homology-directed repair (HDR), REDIT yielded up to a 5-fold increase of efficiency to insert kilobase-scale exogenous sequences at defined genomic regions. We validated our REDIT approach using different formats and lengths of knock-in templates. We further demonstrated that REDIT tools using Cas9 nickase have efficient gene-editing activities and reduced off-target errors, measured using a combination of targeted sequencing, genome-wide indel, and insertion mapping assays. Our experiments inhibiting repair enzyme activities suggested that REDIT has the potential to overcome limitations of endogenous DNA repair steps. Finally, our REDIT method is applicable across cell types, including human stem cells, and is generalizable to different Cas9 enzymes.
View details for DOI 10.1093/nar/gkaa1264
View details for PubMedID 33619540
-
A functional taxonomy of tumor suppression in oncogenic KRAS-driven lung cancer.
Cancer discovery
2021
Abstract
Cancer genotyping has identified a large number of putative tumor suppressor genes. Carcinogenesis is a multi-step process; however, the importance and specific roles of many of these genes during tumor initiation, growth and progression remain unknown. Here we use a multiplexed mouse model of oncogenic KRAS-driven lung cancer to quantify the impact of forty-eight known and putative tumor suppressor genes on diverse aspects of carcinogenesis at an unprecedented scale and resolution. We uncover many previously understudied functional tumor suppressors that constrain cancer in vivo. Inactivation of some genes substantially increased growth, while the inactivation of others increased tumor initiation and/or the emergence of exceptionally large tumors. These functional in vivo analyses revealed an unexpectedly complex landscape of tumor suppression that has implications for understanding cancer evolution, interpreting clinical cancer genome sequencing data, and directing approaches to limit tumor initiation and progression.
View details for DOI 10.1158/2159-8290.CD-20-1325
View details for PubMedID 33608386
-
Adeno-associated viral vector-mediated immune responses: Understanding barriers to gene delivery.
Pharmacology & therapeutics
2019: 107453
Abstract
Adeno-associated viral (AAV) vectors have emerged as the leading gene delivery platform for gene therapy and vaccination. Three AAV-based gene therapy drugs, Glybera, LUXTURNA, and ZOLGENSMA were approved between 2012 and 2019 by the European Medicines Agency and the United States Food and Drug Administration as treatments for genetic diseases hereditary lipoprotein lipase deficiency (LPLD), inherited retinal disease (IRD), and spinal muscular atrophy (SMA), respectively. Despite these therapeutic successes, clinical trials have demonstrated that host anti-viral immune responses can prevent the long-term gene expression of AAV vector-encoded genes. Therefore, it is critical that we understand the complex relationship between AAV vectors and the host immune response. This knowledge could allow for the rational design of optimized gene transfer vectors capable of either subverting host immune responses in the context of gene therapy applications, or stimulating desirable immune responses that generate protective immunity in vaccine applications to AAV vector-encoded antigens. This review provides an overview of our current understanding of the AAV-induced immune response and discusses potential strategies by which these responses can be manipulated to improve AAV vector-mediated gene transfer.
View details for DOI 10.1016/j.pharmthera.2019.107453
View details for PubMedID 31836454
-
Take Risks and Constantly Challenge the Status Quo
STEM CELLS AND DEVELOPMENT
2019
View details for DOI 10.1089/scd.2019.0082
View details for Web of Science ID 000469297600001
-
Combined Computational-Experimental Approach to Explore the Molecular Mechanism of SaCas9 with a Broadened DNA Targeting Range
JOURNAL OF THE AMERICAN CHEMICAL SOCIETY
2019; 141 (16): 6545–52
View details for DOI 10.1021/jacs.8b13144
View details for Web of Science ID 000466053400019
-
IL-33 Signaling Alters Regulatory T Cell Diversity in Support of Tumor Development.
Cell reports
2019; 29 (10): 2998–3008.e8
Abstract
Regulatory T cells (Tregs) can impair anti-tumor immune responses and are associated with poor prognosis in multiple cancer types. Tregs in human tumors span diverse transcriptional states distinct from those of peripheral Tregs, but their contribution to tumor development remains unknown. Here, we use single-cell RNA sequencing (RNA-seq) to longitudinally profile dynamic shifts in the distribution of Tregs in a genetically engineered mouse model of lung adenocarcinoma. In this model, interferon-responsive Tregs are more prevalent early in tumor development, whereas a specialized effector phenotype characterized by enhanced expression of the interleukin-33 receptor ST2 is predominant in advanced disease. Treg-specific deletion of ST2 alters the evolution of effector Treg diversity, increases infiltration of CD8+ T cells into tumors, and decreases tumor burden. Our study shows that ST2 plays a critical role in Treg-mediated immunosuppression in cancer, highlighting potential paths for therapeutic intervention.
View details for DOI 10.1016/j.celrep.2019.10.120
View details for PubMedID 31801068
-
Efficient Generation of Transcriptomic Profiles by Random Composite Measurements.
Cell
2017; 171 (6): 1424-1436.e18
Abstract
RNA profiles are an informative phenotype of cellular and tissue states but can be costly to generate at massive scale. Here, we describe how gene expression levels can be efficiently acquired with random composite measurements, in which abundances are combined in a random weighted sum. We show (1) that the similarity between pairs of expression profiles can be approximated with very few composite measurements; (2) that by leveraging sparse, modular representations of gene expression, we can use random composite measurements to recover high-dimensional gene expression levels (with 100 times fewer measurements than genes); and (3) that it is possible to blindly recover gene expression from composite measurements, even without access to training data. Our results suggest new compressive modalities as a foundation for massive scaling in high-throughput measurements and new insights into the interpretation of high-dimensional data.
View details for DOI 10.1016/j.cell.2017.10.023
View details for PubMedID 29153835
View details for PubMedCentralID PMC5726792
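As a rough illustration of point (1) in the abstract above, the sketch below forms each composite measurement as a random weighted sum of gene abundances and compares the distance between two profiles before and after compression. It is a minimal sketch under stated assumptions, not the paper's method: the Gaussian weights, the scaling, and the function names `random_weights`, `composite_measurements`, and `squared_distance` are all hypothetical choices made for this example.

```python
import math
import random


def random_weights(n_measurements, n_genes, seed=0):
    """Build an n_measurements x n_genes matrix of random weights.

    Assumption: Gaussian weights scaled by 1/sqrt(n_measurements), so that
    squared distances are preserved in expectation.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_measurements)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n_genes)]
            for _ in range(n_measurements)]


def composite_measurements(weights, profile):
    """Each measurement is one random weighted sum over all gene abundances."""
    return [sum(w * x for w, x in zip(row, profile)) for row in weights]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


if __name__ == "__main__":
    # Toy data (hypothetical): two profiles over 500 genes, 200 measurements.
    rng = random.Random(1)
    profile_a = [rng.random() for _ in range(500)]
    profile_b = [rng.random() for _ in range(500)]
    weights = random_weights(200, 500)
    y_a = composite_measurements(weights, profile_a)
    y_b = composite_measurements(weights, profile_b)
    print("distance in gene space:     ", squared_distance(profile_a, profile_b))
    print("distance in composite space:", squared_distance(y_a, y_b))
```

The same weight matrix must be applied to every profile being compared. Recovering full expression levels, as in points (2) and (3), additionally requires a sparse or modular model and is not attempted here.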
-
A Distinct Gene Module for Dysfunction Uncoupled from Activation in Tumor-Infiltrating T Cells
CELL
2016; 166 (6): 1500-?
Abstract
Reversing the dysfunctional T cell state that arises in cancer and chronic viral infections is the focus of therapeutic interventions; however, current therapies are effective in only some patients and some tumor types. To gain a deeper molecular understanding of the dysfunctional T cell state, we analyzed population and single-cell RNA profiles of CD8+ tumor-infiltrating lymphocytes (TILs) and used genetic perturbations to identify a distinct gene module for T cell dysfunction that can be uncoupled from T cell activation. This distinct dysfunction module is downstream of intracellular metallothioneins that regulate zinc metabolism and can be identified at single-cell resolution. We further identify Gata-3, a zinc-finger transcription factor in the dysfunctional module, as a regulator of dysfunction, and we use CRISPR-Cas9 genome editing to show that it drives a dysfunctional phenotype in CD8+ TILs. Our results open novel avenues for targeting dysfunctional T cell states while leaving activation programs intact.
View details for DOI 10.1016/j.cell.2016.08.052
View details for Web of Science ID 000386339900021
View details for PubMedID 27610572
View details for PubMedCentralID PMC5019125
-
RBPJ Controls Development of Pathogenic Th17 Cells by Regulating IL-23 Receptor Expression.
Cell reports
2016; 16 (2): 392-404
Abstract
Interleukin-17 (IL-17)-producing helper T cells (Th17 cells) play an important role in autoimmune diseases. However, not all Th17 cells induce tissue inflammation or autoimmunity. Th17 cells require IL-23 receptor (IL-23R) signaling to become pathogenic. The transcriptional mechanisms controlling the pathogenicity of Th17 cells and IL-23R expression are unknown. Here, we demonstrate that the canonical Notch signaling mediator RBPJ is a key driver of IL-23R expression. In the absence of RBPJ, Th17 cells fail to upregulate IL-23R, lack stability, and do not induce autoimmune tissue inflammation in vivo, whereas overexpression of IL-23R rescues this defect and promotes pathogenicity of RBPJ-deficient Th17 cells. RBPJ binds and trans-activates the Il23r promoter and induces IL-23R expression and represses anti-inflammatory IL-10 production in Th17 cells. We thus find that Notch signaling influences the development of pathogenic and non-pathogenic Th17 cells by reciprocally regulating IL-23R and IL-10 expression.
View details for DOI 10.1016/j.celrep.2016.05.088
View details for PubMedID 27346359
View details for PubMedCentralID PMC4984261
-
Definitive localization of intracellular proteins: Novel approach using CRISPR-Cas9 genome editing, with glucose 6-phosphate dehydrogenase as a model.
Analytical biochemistry
2016; 494: 55-67
Abstract
Studies to determine subcellular localization and translocation of proteins are important because subcellular localization of proteins affects every aspect of cellular function. Such studies frequently utilize mutagenesis to alter amino acid sequences hypothesized to constitute subcellular localization signals. These studies often utilize fluorescent protein tags to facilitate live cell imaging. These methods are excellent for studies of monomeric proteins, but for multimeric proteins, they are unable to rule out artifacts from native protein subunits already present in the cells. That is, native monomers might direct the localization of fluorescent proteins with their localization signals obliterated. We have developed a method for ruling out such artifacts, and we use glucose 6-phosphate dehydrogenase (G6PD) as a model to demonstrate the method's utility. Because G6PD is capable of homodimerization, we employed a novel approach to remove interference from native G6PD. We produced a G6PD knockout somatic (hepatic) cell line using CRISPR-Cas9 mediated genome engineering. Transfection of G6PD knockout cells with G6PD fluorescent mutant proteins demonstrated that the major subcellular localization sequences of G6PD are within the N-terminal portion of the protein. This approach sets a new gold standard for similar studies of subcellular localization signals in all homodimerization-capable proteins.
View details for DOI 10.1016/j.ab.2015.11.002
View details for PubMedID 26576833
View details for PubMedCentralID PMC4695245
-
In vivo gene editing in dystrophic mouse muscle and muscle stem cells
SCIENCE
2016; 351 (6271): 407-411
Abstract
Frame-disrupting mutations in the DMD gene, encoding dystrophin, compromise myofiber integrity and drive muscle deterioration in Duchenne muscular dystrophy (DMD). Removing one or more exons from the mutated transcript can produce an in-frame mRNA and a truncated, but still functional, protein. In this study, we developed and tested a direct gene-editing approach to induce exon deletion and recover dystrophin expression in the mdx mouse model of DMD. Delivery by adeno-associated virus (AAV) of clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 endonucleases coupled with paired guide RNAs flanking the mutated Dmd exon23 resulted in excision of intervening DNA and restored the Dmd reading frame in myofibers, cardiomyocytes, and muscle stem cells after local or systemic delivery. AAV-Dmd CRISPR treatment partially recovered muscle functional deficiencies and generated a pool of endogenously corrected myogenic precursors in mdx mouse muscle.
View details for DOI 10.1126/science.aad5177
View details for Web of Science ID 000368440500046
View details for PubMedID 26721686
View details for PubMedCentralID PMC4924477
-
Crystal Structure of Staphylococcus aureus Cas9
CELL
2015; 162 (5): 1113-1126
Abstract
The RNA-guided DNA endonuclease Cas9 cleaves double-stranded DNA targets with a protospacer adjacent motif (PAM) and complementarity to the guide RNA. Recently, we harnessed Staphylococcus aureus Cas9 (SaCas9), which is significantly smaller than Streptococcus pyogenes Cas9 (SpCas9), to facilitate efficient in vivo genome editing. Here, we report the crystal structures of SaCas9 in complex with a single guide RNA (sgRNA) and its double-stranded DNA targets, containing the 5'-TTGAAT-3' PAM and the 5'-TTGGGT-3' PAM, at 2.6 and 2.7 Å resolutions, respectively. The structures revealed the mechanism of the relaxed recognition of the 5'-NNGRRT-3' PAM by SaCas9. A structural comparison of SaCas9 with SpCas9 highlighted both structural conservation and divergence, explaining their distinct PAM specificities and orthologous sgRNA recognition. Finally, we applied the structural information about this minimal Cas9 to rationally design compact transcriptional activators and inducible nucleases, to further expand the CRISPR-Cas9 genome editing toolbox.
View details for DOI 10.1016/j.cell.2015.08.007
View details for Web of Science ID 000360589900020
View details for PubMedID 26317473
View details for PubMedCentralID PMC4670267
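The 5'-NNGRRT-3' PAM named in the abstract above uses IUPAC degenerate codes (N = any base, R = A or G), so both 5'-TTGAAT-3' and 5'-TTGGGT-3' match it. The sketch below shows one way to scan a forward-strand DNA sequence for such motifs. It is only an illustration: the helper names `pam_regex` and `find_pam_sites` are hypothetical, and the code does not check the reverse strand or model guide RNA complementarity.

```python
import re

# IUPAC codes needed for the NNGRRT motif: N = any base, R = purine (A or G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "N": "[ACGT]"}


def pam_regex(motif):
    """Translate an IUPAC motif into a regex that also finds overlapping hits."""
    body = "".join(IUPAC[base] for base in motif.upper())
    # A lookahead consumes no characters, so overlapping matches are reported.
    return re.compile("(?=(" + body + "))")


def find_pam_sites(seq, motif="NNGRRT"):
    """Return (start index, matched bases) for each forward-strand motif hit."""
    pattern = pam_regex(motif)
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    for site in find_pam_sites("ACGTTGAATCC"):
        print(site)
```

Start indices are 0-based and refer to the first N of the motif.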
-
Sequence determinants of improved CRISPR sgRNA design.
Genome research
2015; 25 (8): 1147-57
Abstract
The CRISPR/Cas9 system has revolutionized mammalian somatic cell genetics. Genome-wide functional screens using CRISPR/Cas9-mediated knockout or dCas9 fusion-mediated inhibition/activation (CRISPRi/a) are powerful techniques for discovering phenotype-associated gene function. We systematically assessed the DNA sequence features that contribute to single guide RNA (sgRNA) efficiency in CRISPR-based screens. Leveraging the information from multiple designs, we derived a new sequence model for predicting sgRNA efficiency in CRISPR/Cas9 knockout experiments. Our model confirmed known features and suggested new features including a preference for cytosine at the cleavage site. The model was experimentally validated for sgRNA-mediated mutation rate and protein knockout efficiency. Tested on independent data sets, the model achieved significant results in both positive and negative selection conditions and outperformed existing models. We also found that the sequence preference for CRISPRi/a is substantially different from that for CRISPR/Cas9 knockout and propose a new model for predicting sgRNA efficiency in CRISPRi/a experiments. These results facilitate the genome-wide design of improved sgRNA for both knockout and CRISPRi/a studies.
View details for DOI 10.1101/gr.191452.115
View details for PubMedID 26063738
View details for PubMedCentralID PMC4509999
-
In vivo genome editing using Staphylococcus aureus Cas9
NATURE
2015; 520 (7546): 186-U98
Abstract
The RNA-guided endonuclease Cas9 has emerged as a versatile genome-editing platform. However, the size of the commonly used Cas9 from Streptococcus pyogenes (SpCas9) limits its utility for basic research and therapeutic applications that use the highly versatile adeno-associated virus (AAV) delivery vehicle. Here, we characterize six smaller Cas9 orthologues and show that Cas9 from Staphylococcus aureus (SaCas9) can edit the genome with efficiencies similar to those of SpCas9, while being more than 1 kilobase shorter. We packaged SaCas9 and its single guide RNA expression cassette into a single AAV vector and targeted the cholesterol regulatory gene Pcsk9 in the mouse liver. Within one week of injection, we observed >40% gene modification, accompanied by significant reductions in serum Pcsk9 and total cholesterol levels. We further assess the genome-wide targeting specificity of SaCas9 and SpCas9 using BLESS, and demonstrate that SaCas9-mediated in vivo genome editing has the potential to be efficient and specific.
View details for DOI 10.1038/nature14299
View details for Web of Science ID 000352454600031
View details for PubMedID 25830891
View details for PubMedCentralID PMC4393360
-
Genome engineering using CRISPR-Cas9 system.
Methods in molecular biology (Clifton, N.J.)
2015; 1239: 197-217
Abstract
The Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-Cas9 system is an adaptive immune system that exists in a variety of microbes. It could be engineered to function in eukaryotic cells as a fast, low-cost, efficient, and scalable tool for manipulating genomic sequences. In this chapter, detailed protocols are described for harnessing the CRISPR-Cas9 system from Streptococcus pyogenes to enable RNA-guided genome engineering applications in mammalian cells. We present all relevant methods including the initial site selection, molecular cloning, delivery of guide RNAs (gRNAs) and Cas9 into mammalian cells, verification of target cleavage, and assays for detecting genomic modification including indels and homologous recombination. These tools provide researchers with new instruments that accelerate both forward and reverse genetics efforts.
View details for DOI 10.1007/978-1-4939-1862-1_10
View details for PubMedID 25408407
-
Global microRNA depletion suppresses tumor angiogenesis.
Genes & development
2014; 28 (10): 1054-67
Abstract
MicroRNAs delicately regulate the balance of angiogenesis. Here we show that depletion of all microRNAs suppresses tumor angiogenesis. We generated microRNA-deficient tumors by knocking out Dicer1. These tumors are highly hypoxic but poorly vascularized, suggestive of deficient angiogenesis signaling. Expression profiling revealed that angiogenesis genes were significantly down-regulated as a result of the microRNA deficiency. Factor inhibiting hypoxia-inducible factor 1 (HIF-1), FIH1, is derepressed under these conditions and suppresses HIF transcription. Knocking out FIH1 using CRISPR/Cas9-mediated genome engineering reversed the phenotypes of microRNA-deficient cells in HIF transcriptional activity, VEGF production, tumor hypoxia, and tumor angiogenesis. Using multiplexed CRISPR/Cas9, we deleted regions in FIH1 3' untranslated regions (UTRs) that contain microRNA-binding sites, which derepresses FIH1 protein and represses hypoxia response. These data suggest that microRNAs promote tumor responses to hypoxia and angiogenesis by repressing FIH1.
View details for DOI 10.1101/gad.239681.114
View details for PubMedID 24788094
View details for PubMedCentralID PMC4035535
-
Optical control of mammalian endogenous transcription and epigenetic states.
Nature
2013; 500 (7463): 472-6
Abstract
The dynamic nature of gene expression enables cellular programming, homeostasis and environmental adaptation in living systems. Dissection of causal gene functions in cellular and organismal processes therefore necessitates approaches that enable spatially and temporally precise modulation of gene expression. Recently, a variety of microbial and plant-derived light-sensitive proteins have been engineered as optogenetic actuators, enabling high-precision spatiotemporal control of many cellular functions. However, versatile and robust technologies that enable optical modulation of transcription in the mammalian endogenous genome remain elusive. Here we describe the development of light-inducible transcriptional effectors (LITEs), an optogenetic two-hybrid system integrating the customizable TALE DNA-binding domain with the light-sensitive cryptochrome 2 protein and its interacting partner CIB1 from Arabidopsis thaliana. LITEs do not require additional exogenous chemical cofactors, are easily customized to target many endogenous genomic loci, and can be activated within minutes with reversibility. LITEs can be packaged into viral vectors and genetically targeted to probe specific cell populations. We have applied this system in primary mouse neurons, as well as in the brain of freely behaving mice in vivo to mediate reversible modulation of mammalian endogenous gene expression as well as targeted epigenetic chromatin modifications. The LITE system establishes a novel mode of optogenetic control of endogenous cellular processes and enables direct testing of the causal roles of genetic and epigenetic regulation in normal biological processes and disease states.
View details for DOI 10.1038/nature12466
View details for PubMedID 23877069
View details for PubMedCentralID PMC3856241
-
Multiplex Genome Engineering Using CRISPR/Cas Systems
SCIENCE
2013; 339 (6121): 819-823
Abstract
Functional elucidation of causal genetic variants and elements requires precise genome editing technologies. The type II prokaryotic CRISPR (clustered regularly interspaced short palindromic repeats)/Cas adaptive immune system has been shown to facilitate RNA-guided site-specific DNA cleavage. We engineered two different type II CRISPR/Cas systems and demonstrate that Cas9 nucleases can be directed by short RNAs to induce precise cleavage at endogenous genomic loci in human and mouse cells. Cas9 can also be converted into a nicking enzyme to facilitate homology-directed repair with minimal mutagenic activity. Lastly, multiple guide sequences can be encoded into a single CRISPR array to enable simultaneous editing of several sites within the mammalian genome, demonstrating easy programmability and wide applicability of the RNA-guided nuclease technology.
View details for DOI 10.1126/science.1231143
View details for Web of Science ID 000314874400049
View details for PubMedID 23287718
-
Comprehensive interrogation of natural TALE DNA-binding modules and transcriptional repressor domains
NATURE COMMUNICATIONS
2012; 3
Abstract
Transcription activator-like effectors are sequence-specific DNA-binding proteins that harbour modular, repetitive DNA-binding domains. Transcription activator-like effectors have enabled the creation of customizable designer transcriptional factors and sequence-specific nucleases for genome engineering. Here we report two improvements of the transcription activator-like effector toolbox for achieving efficient activation and repression of endogenous gene expression in mammalian cells. We show that the naturally occurring repeat-variable diresidue Asn-His (NH) has high biological activity and specificity for guanine, a highly prevalent base in mammalian genomes. We also report an effective transcription activator-like effector transcriptional repressor architecture for targeted inhibition of transcription in mammalian cells. These findings will improve the precision and effectiveness of genome engineering that can be achieved using transcription activator-like effectors.
View details for DOI 10.1038/ncomms1962
View details for Web of Science ID 000306995000040
View details for PubMedID 22828628
View details for PubMedCentralID PMC3556390
-
A transcription activator-like effector toolbox for genome engineering.
Nature protocols
2012; 7 (1): 171-92
Abstract
Transcription activator-like effectors (TALEs) are a class of naturally occurring DNA-binding proteins found in the plant pathogen Xanthomonas sp. The DNA-binding domain of each TALE consists of tandem 34-amino acid repeat modules that can be rearranged according to a simple cipher to target new DNA sequences. Customized TALEs can be used for a wide variety of genome engineering applications, including transcriptional modulation and genome editing. Here we describe a toolbox for rapid construction of custom TALE transcription factors (TALE-TFs) and nucleases (TALENs) using a hierarchical ligation procedure. This toolbox facilitates affordable and rapid construction of custom TALE-TFs and TALENs within 1 week and can be easily scaled up to construct TALEs for multiple targets in parallel. We also provide details for testing the activity in mammalian cells of custom TALE-TFs and TALENs using quantitative reverse-transcription PCR and Surveyor nuclease, respectively. The TALE toolbox described here will enable a broad range of biological applications.
View details for DOI 10.1038/nprot.2011.431
View details for PubMedID 22222791
View details for PubMedCentralID PMC3684555
-
Efficient construction of sequence-specific TAL effectors for modulating mammalian transcription
NATURE BIOTECHNOLOGY
2011; 29 (2): 149-U90
Abstract
The ability to direct functional proteins to specific DNA sequences is a long-sought goal in the study and engineering of biological processes. Transcription activator-like effectors (TALEs) from Xanthomonas sp. are site-specific DNA-binding proteins that can be readily designed to target new sequences. Because TALEs contain a large number of repeat domains, it can be difficult to synthesize new variants. Here we describe a method that overcomes this problem. We leverage codon degeneracy and type IIs restriction enzymes to generate orthogonal ligation linkers between individual repeat monomers, thus allowing full-length, customized, repeat domains to be constructed by hierarchical ligation. We synthesized 17 TALEs that are customized to recognize specific DNA-binding sites, and demonstrate that they can specifically modulate transcription of endogenous genes (SOX2 and KLF4) in human cells.
View details for DOI 10.1038/nbt.1775
View details for Web of Science ID 000287023000022
View details for PubMedID 21248753
View details for PubMedCentralID PMC3084533