Junior Fellow, Harvard Society of Fellows (2022)
PhD, Massachusetts Institute of Technology, Biological Engineering (2020)
MS, Stanford University, Electrical Engineering (2014)
BS, Stanford University, Chemistry (2013)
Current Research and Scholarly Interests
Nature has created many powerful biomolecules that are hidden in organisms across kingdoms of life. Many of these biomolecules originate from microbes, which contain the most diverse gene pool among living organisms. We are integrating high-throughput computational and experimental approaches to harness the vast diversity of genes in microbes to develop new antibiotics and molecular biotechnology, and to investigate the evolution of proteins and molecular mechanisms in innate immunity.
Prokaryotic innate immunity through pattern recognition of conserved viral proteins
2022; 377 (6607): 726-+
Many organisms have evolved specialized immune pattern-recognition receptors, including nucleotide-binding oligomerization domain-like receptors (NLRs) of the STAND superfamily that are ubiquitous in plants, animals, and fungi. Although the roles of NLRs in eukaryotic immunity are well established, it is unknown whether prokaryotes use similar defense mechanisms. Here, we show that antiviral STAND (Avs) homologs in bacteria and archaea detect hallmark viral proteins, triggering Avs tetramerization and the activation of diverse N-terminal effector domains, including DNA endonucleases, to abrogate infection. Cryo-electron microscopy reveals that Avs sensor domains recognize conserved folds, active-site residues, and enzyme ligands, allowing a single Avs receptor to detect a wide variety of viruses. These findings extend the paradigm of pattern recognition of pathogen-specific proteins across all three domains of life.
DOI 10.1126/science.abm4096
Web of Science ID 000841499400029
PubMedID 35951700
UG/Abi: a highly diverse family of prokaryotic reverse transcriptases associated with defense functions
NUCLEIC ACIDS RESEARCH
2022; 50 (11): 6084-6101
Reverse transcriptases (RTs) are enzymes capable of synthesizing DNA using RNA as a template. Within the last few years, a burst of research has led to the discovery of novel prokaryotic RTs with diverse antiviral properties, such as DRTs (Defense-associated RTs), which belong to the so-called group of unknown RTs (UG) and are closely related to the Abortive Infection system (Abi) RTs. In this work, we performed a systematic analysis of UG and Abi RTs, increasing the number of UG/Abi members up to 42 highly diverse groups, most of which are predicted to be functionally associated with other gene(s) or domain(s). Based on this information, we classified these systems into three major classes. In addition, we reveal that most of these groups are associated with defense functions and/or mobile genetic elements, and demonstrate the antiphage role of four novel groups. We also highlight the presence of one of these systems in novel families of human gut viruses infecting members of the Bacteroidetes and Firmicutes phyla. This work lays the foundation for a comprehensive and unified understanding of these highly diverse RTs with enormous biotechnological potential.
DOI 10.1093/nar/gkac467
Web of Science ID 000804243900001
PubMedID 35648479
PubMedCentralID PMC9226505
A highly homogeneous polymer composed of tetrahedron-like monomers for high-isotropy expansion microscopy
2021; 16 (6): 698-+
Expansion microscopy (ExM) physically magnifies biological specimens to enable nanoscale-resolution imaging using conventional microscopes. Current ExM methods permeate specimens with free-radical-chain-growth-polymerized polyacrylate hydrogels, whose network structure limits the local isotropy of expansion as well as the preservation of morphology and shape at the nanoscale. Here we report that ExM is possible using hydrogels that have a more homogeneous network structure, assembled via non-radical terminal linking of tetrahedral monomers. As with earlier forms of ExM, such 'tetra-gel'-embedded specimens can be iteratively expanded for greater physical magnification. Iterative tetra-gel expansion of herpes simplex virus type 1 (HSV-1) virions by ~10× in linear dimension results in a median spatial error of 9.2 nm for localizing the viral envelope layer, rather than 14.3 nm from earlier versions of ExM. Moreover, tetra-gel-based expansion better preserves the virion spherical shape. Thus, tetra-gels may support ExM with reduced spatial errors and improved local isotropy, pointing the way towards single-biomolecule accuracy ExM.
DOI 10.1038/s41565-021-00875-7
Web of Science ID 000634625300004
PubMedID 33782587
PubMedCentralID PMC8197733
Diverse enzymatic activities mediate antiviral immunity in prokaryotes
2020; 369 (6507): 1077-+
Bacteria and archaea are frequently attacked by viruses and other mobile genetic elements and rely on dedicated antiviral defense systems, such as restriction endonucleases and CRISPR, to survive. The enormous diversity of viruses suggests that more types of defense systems exist than are currently known. By systematic defense gene prediction and heterologous reconstitution, here we discover 29 widespread antiviral gene cassettes, collectively present in 32% of all sequenced bacterial and archaeal genomes, that mediate protection against specific bacteriophages. These systems incorporate enzymatic activities not previously implicated in antiviral defense, including RNA editing and retron satellite DNA synthesis. In addition, we computationally predict a diverse set of other putative defense genes that remain to be characterized. These results highlight an immense array of molecular functions that microbes use against viruses.
DOI 10.1126/science.aba0372
Web of Science ID 000567522200040
PubMedID 32855333
PubMedCentralID PMC7985843
Highly Parallel Profiling of Cas9 Variant Specificity
2020; 78 (4): 794-+
Determining the off-target cleavage profile of programmable nucleases is an important consideration for any genome editing experiment, and a number of Cas9 variants have been reported that improve specificity. We describe here tagmentation-based tag integration site sequencing (TTISS), an efficient, scalable method for analyzing double-strand breaks (DSBs) that we apply in parallel to eight Cas9 variants across 59 targets. Additionally, we generated thousands of other Cas9 variants and screened for variants with enhanced specificity and activity, identifying LZ3 Cas9, a high-specificity variant with a unique +1 insertion profile. This comprehensive comparison reveals a general trade-off between Cas9 activity and specificity and provides information about the frequency of generation of +1 insertions, which has implications for correcting frameshift mutations.
DOI 10.1016/j.molcel.2020.02.023
Web of Science ID 000535936200021
PubMedID 32187529
PubMedCentralID PMC7370240
Computational identification of repeat-containing proteins and systems
2020; 1: e10
DOI 10.1017/qrd.2020.14
Unexpected connections between type VI-B CRISPR-Cas systems, bacterial natural competence, ubiquitin signaling network and DNA modification through a distinct family of membrane proteins
FEMS MICROBIOLOGY LETTERS
2019; 366 (8)
In addition to core Cas proteins, CRISPR-Cas loci often encode ancillary proteins that modulate the activity of the respective effectors in interference. Subtype VI-B1 CRISPR-Cas systems encode the Csx27 protein that down-regulates the activity of Cas13b when the type VI-B locus is expressed in Escherichia coli. We show that Csx27 belongs to an expansive family of proteins that contain four predicted transmembrane helices and are typically encoded in predicted operons with components of the bacterial natural transformation machinery, multidomain proteins that consist of components of the ubiquitin signaling system and proteins containing the ligand-binding WYL domain and a helix-turn-helix domain. The Csx27 family proteins are predicted to form membrane channels for ssDNA that might comprise the core of a putative novel, Ub-regulated system for DNA uptake and, possibly, degradation. In addition to these associations, a distinct subfamily of the Csx27 family appears to be a part of a novel, membrane-associated system for DNA modification. In Bacteroidetes, subtype VI-B1 systems might degrade nascent transcripts of foreign DNA in conjunction with its uptake by the bacterial cell. These predictions suggest several experimental directions for the study of type VI CRISPR-Cas systems and distinct mechanisms of foreign DNA uptake and degradation in bacteria.
DOI 10.1093/femsle/fnz088
Web of Science ID 000482130000006
PubMedID 31089700
PubMedCentralID PMC6700684
Engineering of CRISPR-Cas12b for human genome editing
2019; 10: 212
The type-V CRISPR effector Cas12b (formerly known as C2c1) has been challenging to develop for genome editing in human cells, at least in part due to the high temperature requirement of the characterized family members. Here we explore the diversity of the Cas12b family and identify a promising candidate for human gene editing from Bacillus hisashii, BhCas12b. However, at 37 °C, wild-type BhCas12b preferentially nicks the non-target DNA strand instead of forming a double-strand break, leading to lower editing efficiency. Using a combination of approaches, we identify gain-of-function mutations for BhCas12b that overcome this limitation. Mutant BhCas12b facilitates robust genome editing in human cell lines and ex vivo in primary human T cells, and exhibits greater specificity compared to S. pyogenes Cas9. This work establishes a third RNA-guided nuclease platform, in addition to Cas9 and Cpf1/Cas12a, for genome editing in human cells.
DOI 10.1038/s41467-018-08224-4
Web of Science ID 000456285200001
PubMedID 30670702
PubMedCentralID PMC6342934
Engineered CRISPR-Cas9 nuclease with expanded targeting space
2018; 361 (6408): 1259-1262
The RNA-guided endonuclease Cas9 cleaves its target DNA and is a powerful genome-editing tool. However, the widely used Streptococcus pyogenes Cas9 enzyme (SpCas9) requires an NGG protospacer adjacent motif (PAM) for target recognition, thereby restricting the targetable genomic loci. Here, we report a rationally engineered SpCas9 variant (SpCas9-NG) that can recognize relaxed NG PAMs. The crystal structure revealed that the loss of the base-specific interaction with the third nucleobase is compensated by newly introduced non-base-specific interactions, thereby enabling the NG PAM recognition. We showed that SpCas9-NG induces indels at endogenous target sites bearing NG PAMs in human cells. Furthermore, we found that the fusion of SpCas9-NG and the activation-induced cytidine deaminase (AID) mediates the C-to-T conversion at target sites with NG PAMs in human cells.
DOI 10.1126/science.aas9129
Web of Science ID 000445152500043
PubMedID 30166441
PubMedCentralID PMC6368452
Effects of 3D culturing conditions on the transcriptomic profile of stem-cell-derived neurons
NATURE BIOMEDICAL ENGINEERING
2018; 2 (7): 540-554
Understanding neurological diseases requires tractable genetic systems. Engineered 3D neural tissues are an attractive choice, but how the cellular transcriptomic profiles in these tissues are affected by the encapsulating materials and are related to the human-brain transcriptome is not well understood. Here, we report the characterization of the effects of culturing conditions on the transcriptomic profiles of induced neuronal cells, as well as a method for the rapid generation of 3D co-cultures of neuronal and astrocytic cells from the same pool of human embryonic stem cells. By comparing the gene-expression profiles of neuronal cells in culture conditions relevant to the developing human brain, we found that modifying the degree of crosslinking of composite hydrogels can tune expression patterns so they correlate with those of specific brain regions and developmental stages. Moreover, by using single-cell sequencing, we show that our engineered tissues recapitulate transcriptional patterns of cell types in the human brain. The analysis of culturing conditions will inform the development of 3D neural tissues for use as tractable models of brain diseases.
DOI 10.1038/s41551-018-0219-9
Web of Science ID 000438459800011
PubMedID 30271673
PubMedCentralID PMC6157920
Engineered Cpf1 variants with altered PAM specificities
2017; 35 (8): 789-792
The RNA-guided endonuclease Cpf1 is a promising tool for genome editing in eukaryotic cells. However, the utility of the commonly used Acidaminococcus sp. BV3L6 Cpf1 (AsCpf1) and Lachnospiraceae bacterium ND2006 Cpf1 (LbCpf1) is limited by their requirement of a TTTV protospacer adjacent motif (PAM) in the DNA substrate. To address this limitation, we performed a structure-guided mutagenesis screen to increase the targeting range of Cpf1. We engineered two AsCpf1 variants carrying the mutations S542R/K607R and S542R/K548V/N552R, which recognize TYCV and TATV PAMs, respectively, with enhanced activities in vitro and in human cells. Genome-wide assessment of off-target activity using BLISS indicated that these variants retain high DNA-targeting specificity, which we further improved by introducing an additional non-PAM-interacting mutation. Introducing the identified PAM-interacting mutations at their corresponding positions in LbCpf1 similarly altered its PAM specificity. Together, these variants increase the targeting range of Cpf1 by approximately threefold in human coding sequences to one cleavage site per ∼11 bp.
DOI 10.1038/nbt.3900
Web of Science ID 000407125800025
PubMedID 28581492
PubMedCentralID PMC5548640
Structural Basis for the Altered PAM Recognition by Engineered CRISPR-Cpf1
2017; 67 (1): 139-+
The RNA-guided Cpf1 nuclease cleaves double-stranded DNA targets complementary to the CRISPR RNA (crRNA), and it has been harnessed for genome editing technologies. Recently, Acidaminococcus sp. BV3L6 (AsCpf1) was engineered to recognize altered DNA sequences as the protospacer adjacent motif (PAM), thereby expanding the target range of Cpf1-mediated genome editing. Whereas wild-type AsCpf1 recognizes the TTTV PAM, the RVR (S542R/K548V/N552R) and RR (S542R/K607R) variants can efficiently recognize the TATV and TYCV PAMs, respectively. However, their PAM recognition mechanisms remained unknown. Here we present the 2.0 Å resolution crystal structures of the RVR and RR variants bound to a crRNA and its target DNA. The structures revealed that the RVR and RR variants primarily recognize the PAM-complementary nucleotides via the substituted residues. Our high-resolution structures delineated the altered PAM recognition mechanisms of the AsCpf1 variants, providing a basis for the further engineering of CRISPR-Cpf1.
DOI 10.1016/j.molcel.2017.04.019
Web of Science ID 000404897300014
PubMedID 28595896
PubMedCentralID PMC5957533
BLISS is a versatile and quantitative method for genome-wide profiling of DNA double-strand breaks
2017; 8: 15058
Precisely measuring the location and frequency of DNA double-strand breaks (DSBs) along the genome is instrumental to understanding genomic fragility, but current methods are limited in versatility, sensitivity or practicality. Here we present Breaks Labeling In Situ and Sequencing (BLISS), featuring the following: (1) direct labelling of DSBs in fixed cells or tissue sections on a solid surface; (2) low-input requirement by linear amplification of tagged DSBs by in vitro transcription; (3) quantification of DSBs through unique molecular identifiers; and (4) easy scalability and multiplexing. We apply BLISS to profile endogenous and exogenous DSBs in low-input samples of cancer cells, embryonic stem cells and liver tissue. We demonstrate the sensitivity of BLISS by assessing the genome-wide off-target activity of two CRISPR-associated RNA-guided endonucleases, Cas9 and Cpf1, observing that Cpf1 has higher specificity than Cas9. Our results establish BLISS as a versatile, sensitive and efficient method for genome-wide DSB mapping in many applications.
DOI 10.1038/ncomms15058
Web of Science ID 000401222100001
PubMedID 28497783
PubMedCentralID PMC5437291
Protein-retention expansion microscopy of cells and tissues labeled using standard fluorescent proteins and antibodies
2016; 34 (9): 987-+
Expansion microscopy (ExM) enables imaging of preserved specimens with nanoscale precision on diffraction-limited instead of specialized super-resolution microscopes. ExM works by physically separating fluorescent probes after anchoring them to a swellable gel. The first ExM method did not result in the retention of native proteins in the gel and relied on custom-made reagents that are not widely available. Here we describe protein retention ExM (proExM), a variant of ExM in which proteins are anchored to the swellable gel, allowing the use of conventional fluorescently labeled antibodies and streptavidin, and fluorescent proteins. We validated and demonstrated the utility of proExM for multicolor super-resolution (∼70 nm) imaging of cells and mammalian tissues on conventional microscopes.
DOI 10.1038/nbt.3625
Web of Science ID 000383348500026
PubMedID 27376584
PubMedCentralID PMC5068827
Rationally engineered Cas9 nucleases with improved specificity
2016; 351 (6268): 84-88
The RNA-guided endonuclease Cas9 is a versatile genome-editing tool with a broad range of applications from therapeutics to functional annotation of genes. Cas9 creates double-strand breaks (DSBs) at targeted genomic loci complementary to a short RNA guide. However, Cas9 can cleave off-target sites that are not fully complementary to the guide, which poses a major challenge for genome editing. Here, we use structure-guided protein engineering to improve the specificity of Streptococcus pyogenes Cas9 (SpCas9). Using targeted deep sequencing and unbiased whole-genome off-target analysis to assess Cas9-mediated DNA cleavage in human cells, we demonstrate that "enhanced specificity" SpCas9 (eSpCas9) variants reduce off-target effects and maintain robust on-target cleavage. Thus, eSpCas9 could be broadly useful for genome-editing applications requiring a high level of specificity.
DOI 10.1126/science.aad5227
Web of Science ID 000367364200048
PubMedID 26628643
PubMedCentralID PMC4714946