Senior Fellow, University of Washington
Ph.D., California Institute of Technology, Biochemistry and Molecular Biophysics
B.A., UC Berkeley, MCB - Biochemistry
Current Research and Scholarly Interests
Protein design: molecular engineering, method development and novel therapeutics
Structural and mechanistic studies of pore forming toxins by protein design and artificial cells, Stanford University and Osaka University
Japan, United States
- Computational protein modeling laboratory
BIOE 301E (Win)
Independent Studies (6)
- Bioengineering Problems and Experimental Investigation
BIOE 191 (Aut, Win, Spr, Sum)
- Directed Investigation
BIOE 392 (Aut, Win, Spr, Sum)
- Directed Reading and Research
BIOMEDIN 299 (Sum)
- Directed Study
BIOE 391 (Aut, Win, Spr, Sum)
- Graduate Research
BIOPHYS 300 (Aut, Win, Spr, Sum)
- Out-of-Department Graduate Research
BIO 300X (Aut)
Prior Year Courses
- Computational Biology: Structure and Organization of Biomolecules and Cells
BIOE 279, BIOMEDIN 279, BIOPHYS 279, CME 279, CS 279 (Aut)
- Molecular and Cellular Engineering Lab
BIOE 301A (Aut)
- Molecular and Cellular Bioengineering
BIOE 300A (Win)
- Molecular and Cellular Engineering Lab
BIOE 301A (Aut)
Doctoral Dissertation Reader (AC)
Jason Cheng, Michael Hollander, Pengyang Li, Andras Sagi, Sasha Zemsky
Doctoral Dissertation Advisor (AC)
Christian Choe, Alex Chu, Haotian Du, Raphael Eguchi, R. Andres Parra Sperberg, Carla Perez
Doctoral Dissertation Co-Advisor (AC)
Vandon Duong, Natalie Kolber, Sharon Newman, Jingyi Wei, Xinzhi Zou
Optical control of fast and processive engineered myosins in vitro and in living cells.
Nature chemical biology
Precision tools for spatiotemporal control of cytoskeletal motor function are needed to dissect fundamental biological processes ranging from intracellular transport to cell migration and division. Direct optical control of motor speed and direction is one promising approach, but it remains a challenge to engineer controllable motors with desirable properties such as the speed and processivity required for transport applications in living cells. Here, we develop engineered myosin motors that combine large optical modulation depths with high velocities, and create processive myosin motors with optically controllable directionality. We characterize the performance of the motors using in vitro motility assays, single-molecule tracking and live-cell imaging. Bidirectional processive motors move efficiently toward the tips of cellular protrusions in the presence of blue light, and can transport molecular cargo in cells. Robust gearshifting myosins will further enable programmable transport in contexts ranging from in vitro active matter reconstitutions to microfabricated systems that harness molecular propulsion.
View details for DOI 10.1038/s41589-021-00740-7
View details for PubMedID 33603247
Identification of N-Terminally Diversified GLP-1R Agonists Using Saturation Mutagenesis and Chemical Design.
ACS chemical biology
The glucagon-like peptide 1 receptor (GLP-1R) is a class B G-protein coupled receptor (GPCR) and diabetes drug target expressed mainly in pancreatic beta-cells that, when activated by its agonist glucagon-like peptide 1 (GLP-1) after a meal, stimulates insulin secretion and beta-cell survival and proliferation. The N-terminal region of GLP-1 interacts with membrane-proximal residues of GLP-1R, stabilizing its active conformation to trigger intracellular signaling. The best-studied agonist peptides, GLP-1 and exendin-4, share sequence homology at their N-terminal region; however, modifications that can be tolerated here are not fully understood. In this work, a functional screen of GLP-1 variants with randomized N-terminal domains reveals new GLP-1R agonists and uncovers a pattern whereby a negative charge is preferred at the third position in various sequence contexts. We further tested this sequence-structure-activity principle by synthesizing peptide analogues where this position was mutated to both canonical and noncanonical amino acids. We discovered a highly active GLP-1 analogue in which the native glutamate residue three positions from the N-terminus was replaced with the sulfo-containing amino acid cysteic acid (GLP-1-CYA). The receptor binding and downstream signaling properties elicited by GLP-1-CYA were similar to those of the wild-type GLP-1 peptide. Computational modeling identified a likely mode of interaction of the negatively charged side chain in GLP-1-CYA with an arginine on GLP-1R. This work highlights a strategy of combinatorial peptide screening coupled with chemical exploration that could be used to generate novel agonists for other receptors with peptide ligands.
View details for DOI 10.1021/acschembio.0c00722
View details for PubMedID 33307682
Tight and specific lanthanide binding in a de novo TIM barrel with a large internal cavity designed by symmetric domain fusion.
Proceedings of the National Academy of Sciences of the United States of America
De novo protein design has succeeded in generating a large variety of globular proteins, but the construction of protein scaffolds with cavities that could accommodate large signaling molecules, cofactors, and substrates remains an outstanding challenge. The long, often flexible loops that form such cavities in many natural proteins are difficult to precisely program and thus challenging for computational protein design. Here we describe an alternative approach to this problem. We fused two stable proteins with C2 symmetry (a de novo designed dimeric ferredoxin fold and a de novo designed TIM barrel) such that their symmetry axes are aligned to create scaffolds with large cavities that can serve as binding pockets or enzymatic reaction chambers. The crystal structures of two such designs confirm the presence of a 420 cubic Angstrom chamber defined by the top of the designed TIM barrel and the bottom of the ferredoxin dimer. We functionalized the scaffold by installing a metal-binding site consisting of four glutamate residues close to the symmetry axis. The protein binds lanthanide ions with very high affinity as demonstrated by tryptophan-enhanced terbium luminescence. This approach can be extended to other metals and cofactors, making this scaffold a modular platform for the design of binding proteins and biocatalysts.
View details for DOI 10.1073/pnas.2008535117
View details for PubMedID 33203677
HIV-1 VRC01 Germline-Targeting Immunogens Select Distinct Epitope-Specific B Cell Receptors.
2020; 53 (4): 840
Activating precursor B cell receptors of HIV-1 broadly neutralizing antibodies requires specifically designed immunogens. Here, we compared the abilities of three such germline-targeting immunogens against the VRC01-class receptors to activate the targeted B cells in transgenic mice expressing the germline VH of the VRC01 antibody but diverse mouse light chains. Immunogen-specific VRC01-like B cells were isolated at different time points after immunization, their VH and VL genes were sequenced, and the corresponding antibodies characterized. VRC01 B cell sub-populations with distinct cross-reactivity properties were activated by each immunogen, and these differences correlated with distinct biophysical and biochemical features of the germline-targeting immunogens. Our study indicates that the design of effective immunogens to activate B cell receptors leading to protective HIV-1 antibodies will require a better understanding of how the biophysical properties of the epitope and its surrounding surface on the germline-targeting immunogen influence its interaction with the available receptor variants in vivo.
View details for DOI 10.1016/j.immuni.2020.09.007
View details for PubMedID 33053332
Computational design of transmembrane pores.
Transmembrane channels and pores have key roles in fundamental biological processes and in biotechnological applications such as DNA nanopore sequencing, resulting in considerable interest in the design of pore-containing proteins. Synthetic amphiphilic peptides have been found to form ion channels, and there have been recent advances in de novo membrane protein design and in redesigning naturally occurring channel-containing proteins. However, the de novo design of stable, well-defined transmembrane protein pores that are capable of conducting ions selectively or are large enough to enable the passage of small-molecule fluorophores remains an outstanding challenge. Here we report the computational design of protein pores formed by two concentric rings of alpha-helices that are stable and monodisperse in both their water-soluble and their transmembrane forms. Crystal structures of the water-soluble forms of a 12-helical pore and a 16-helical pore closely match the computational design models. Patch-clamp electrophysiology experiments show that, when expressed in insect cells, the transmembrane form of the 12-helix pore enables the passage of ions across the membrane with high selectivity for potassium over sodium; ion passage is blocked by specific chemical modification at the pore entrance. When incorporated into liposomes using in vitro protein synthesis, the transmembrane form of the 16-helix pore, but not the 12-helix pore, enables the passage of biotinylated Alexa Fluor 488. A cryo-electron microscopy structure of the 16-helix transmembrane pore closely matches the design model. The ability to produce structurally and functionally well-defined transmembrane pores opens the door to the creation of designer channels and pores for a wide variety of applications.
View details for DOI 10.1038/s41586-020-2646-5
View details for PubMedID 32848250
Engineering a potent receptor superagonist or antagonist from a novel IL-6 family cytokine ligand.
Proceedings of the National Academy of Sciences of the United States of America
Interleukin-6 (IL-6) family cytokines signal through multimeric receptor complexes, providing unique opportunities to create novel ligand-based therapeutics. The cardiotrophin-like cytokine factor 1 (CLCF1) ligand has been shown to play a role in cancer, osteoporosis, and atherosclerosis. Once bound to ciliary neurotrophic factor receptor (CNTFR), CLCF1 mediates interactions with the coreceptors glycoprotein 130 (gp130) and leukemia inhibitory factor receptor (LIFR). By increasing CNTFR-mediated binding to these coreceptors, we generated a receptor superagonist that surpassed the potency of natural CNTFR ligands in neuronal signaling. Through additional mutations, we generated a receptor antagonist with increased binding to CNTFR but no binding to the coreceptors; this antagonist inhibited tumor progression in murine xenograft models of non-small cell lung cancer. These studies further validate the CLCF1-CNTFR signaling axis as a therapeutic target and highlight an approach of engineering cytokine activity through a small number of mutations.
View details for DOI 10.1073/pnas.1922729117
View details for PubMedID 32522868
Macromolecular modeling and design in Rosetta: recent methods and frameworks.
The Rosetta software for macromolecular modeling, docking and design is extensively used in laboratories worldwide. During two decades of development by a community of laboratories at more than 60 institutions, Rosetta has been continuously refactored and extended. Its advantages are its performance and interoperability between broad modeling capabilities. Here we review tools developed in the last 5 years, including over 80 methods. We discuss improvements to the score function, user interfaces and usability. Rosetta is available at http://www.rosettacommons.org.
View details for DOI 10.1038/s41592-020-0848-2
View details for PubMedID 32483333
Computational design of closely related proteins that adopt two well-defined but structurally divergent folds.
Proceedings of the National Academy of Sciences of the United States of America
The plasticity of naturally occurring protein structures, which can change shape considerably in response to changes in environmental conditions, is critical to biological function. While computational methods have been used for de novo design of proteins that fold to a single state with a deep free-energy minimum [P.-S. Huang, S. E. Boyken, D. Baker, Nature 537, 320-327 (2016)], and to reengineer natural proteins to alter their dynamics [J. A. Davey, A. M. Damry, N. K. Goto, R. A. Chica, Nat. Chem. Biol. 13, 1280-1285 (2017)] or fold [P. A. Alexander, Y. He, Y. Chen, J. Orban, P. N. Bryan, Proc. Natl. Acad. Sci. U.S.A. 106, 21149-21154 (2009)], the de novo design of closely related sequences which adopt well-defined but structurally divergent structures remains an outstanding challenge. We designed closely related sequences (over 94% identity) that can adopt two very different homotrimeric helical bundle conformations, one short (66 Å height) and the other long (100 Å height), reminiscent of the conformational transition of viral fusion proteins. Crystallographic and NMR spectroscopic characterization shows that both the short- and long-state sequences fold as designed. We sought to design bistable sequences for which both states are accessible, and obtained a single designed protein sequence that populates either the short state or the long state depending on the measurement conditions. The design of sequences which are poised to adopt two very different conformations sets the stage for creating large-scale conformational switches between structurally divergent forms.
View details for DOI 10.1073/pnas.1914808117
View details for PubMedID 32188784
- Harnessing Human Neural Networks for Protein Design. Biochemistry 2019
Heterodimer assembly from de novo repeat protein structures
AMER CHEMICAL SOC. 2019
View details for Web of Science ID 000525055503547
Multi-Scale Structural Analysis of Proteins by Deep Semantic Segmentation.
Bioinformatics (Oxford, England)
MOTIVATION: Recent advances in computational methods have facilitated large-scale sampling of protein structures, leading to breakthroughs in protein structural prediction and enabling de novo protein design. Establishing methods to identify candidate structures that can lead to native folds or designable structures remains a challenge, since few existing metrics capture high-level structural features such as architectures, folds, and conformity to conserved structural motifs. Convolutional Neural Networks (CNNs) have been successfully used in semantic segmentation - a subfield of image classification in which a class label is predicted for every pixel. Here, we apply semantic segmentation to protein structures as a novel strategy for fold identification and structure quality assessment. RESULTS: We train a CNN that assigns each residue in a multi-domain protein to one of 38 architecture classes designated by the CATH database. Our model achieves a high per-residue accuracy of 90.8% on the test set (95.0% average per-class accuracy; 87.8% average per-structure accuracy). We demonstrate that individual class probabilities can be used as a metric that indicates the degree to which a randomly generated structure assumes a specific fold, as well as a metric that highlights non-conformative regions of a protein belonging to a known class. These capabilities yield a powerful tool for guiding structural sampling for both structural prediction and design. AVAILABILITY: The trained classifier network, parser network, and entropy calculation scripts are available for download at https://git.io/fp6bd, with detailed usage instructions provided at the download page. A step-by-step tutorial for setup is provided at https://goo.gl/e8GB2S. All Rosetta commands, RosettaRemodel blueprints, and predictions for all datasets used in the study are available in the Supplementary Information. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
View details for DOI 10.1093/bioinformatics/btz650
View details for PubMedID 31424530
The molecular basis of chaperone-mediated interleukin 23 assembly control.
2019; 10 (1): 4121
The functionality of most secreted proteins depends on their assembly into a defined quaternary structure. Despite this, it remains unclear how cells discriminate unassembled proteins en route to the native state from misfolded ones that need to be degraded. Here we show how chaperones can regulate and control assembly of heterodimeric proteins, using interleukin 23 (IL-23) as a model. We find that the IL-23 α-subunit remains partially unstructured until assembly with its β-subunit occurs and identify a major site of incomplete folding. Incomplete folding is recognized by different chaperones along the secretory pathway, realizing reliable assembly control by sequential checkpoints. Structural optimization of the chaperone recognition site allows it to bypass quality control checkpoints and provides a secretion-competent IL-23α subunit, which can still form functional heterodimeric IL-23. Thus, locally-restricted incomplete folding within single-domain proteins can be used to regulate and control their assembly.
View details for DOI 10.1038/s41467-019-12006-x
View details for PubMedID 31511508
Structure and Functional Binding Epitope of V-domain Ig Suppressor of T Cell Activation.
2019; 28 (10): 2509–16.e5
V-domain immunoglobulin (Ig) suppressor of T cell activation (VISTA) is an immune checkpoint protein that inhibits the T cell response against cancer. Similar to PD-1 and CTLA-4, a blockade of VISTA promotes tumor clearance by the immune system. Here, we report a 1.85 Å crystal structure of the elusive human VISTA extracellular domain, whose lack of homology necessitated a combinatorial MR-Rosetta approach for structure determination. We highlight features that make the VISTA immunoglobulin variable (IgV)-like fold unique among B7 family members, including two additional disulfide bonds and an extended loop region with an attached helix that we show forms a contiguous binding epitope for a clinically relevant anti-VISTA antibody. We propose an overlap of this antibody-binding region with the binding epitope for V-set and Ig domain containing 3 (VSIG3), a purported functional binding partner of VISTA. The structure and functional epitope presented here will help guide future drug development efforts against this important checkpoint target.
View details for DOI 10.1016/j.celrep.2019.07.073
View details for PubMedID 31484064
De novo design of a fluorescence-activating beta-barrel.
The regular arrangements of beta-strands around a central axis in beta-barrels and of alpha-helices in coiled coils contrast with the irregular tertiary structures of most globular proteins, and have fascinated structural biologists since they were first discovered. Simple parametric models have been used to design a wide range of alpha-helical coiled-coil structures, but to date there has been no success with beta-barrels. Here we show that accurate de novo design of beta-barrels requires considerable symmetry-breaking to achieve continuous hydrogen-bond connectivity and eliminate backbone strain. We then build ensembles of beta-barrel backbone models with cavity shapes that match the fluorogenic compound DFHBI, and use a hierarchical grid-based search method to simultaneously optimize the rigid-body placement of DFHBI in these cavities and the identities of the surrounding amino acids to achieve high shape and chemical complementarity. The designs have high structural accuracy and bind and fluorescently activate DFHBI in vitro and in Escherichia coli, yeast and mammalian cells. This de novo design of small-molecule binding activity, using backbones custom-built to bind the ligand, should enable the design of increasingly sophisticated ligand-binding proteins, sensors and catalysts that are not limited by the backbone geometries available in known protein structures.
View details for PubMedID 30209393
Generative Modeling for Protein Structures
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2018
View details for Web of Science ID 000461852002008
Designing repeat proteins: a modular approach to protein design.
Current opinion in structural biology
2017; 45: 116-123
Repeat proteins present unique opportunities for engineering because of their modular nature that potentially allows LEGO®-like construction of macromolecules. Nature takes advantage of these properties and uses this type of scaffold for recognition, structure, and even signaling purposes. In recent years, new protein modeling tools facilitated the design of novel repeat proteins, creating possibilities beyond naturally occurring scaffolds alone. We highlight here the different design strategies and summarize the various structural families and novel proteins achieved.
View details for DOI 10.1016/j.sbi.2017.02.001
View details for PubMedID 28267654
Protein structure determination using metagenome sequence data
2017; 355 (6322): 294-297
Despite decades of work by structural biologists, there are still ~5200 protein families with unknown structure outside the range of comparative modeling. We show that Rosetta structure prediction guided by residue-residue contacts inferred from evolutionary information can accurately model proteins that belong to large families and that metagenome sequence data more than triple the number of protein families with sufficient sequences for accurate modeling. We then integrate metagenome data, contact-based structure matching, and Rosetta structure calculations to generate models for 614 protein families with currently unknown structures; 206 are membrane proteins and 137 have folds not represented in the Protein Data Bank. This approach provides the representative models for large protein families originally envisioned as the goal of the Protein Structure Initiative at a fraction of the cost.
View details for DOI 10.1126/science.aah4043
View details for Web of Science ID 000392204800039
View details for PubMedID 28104891
View details for PubMedCentralID PMC5493203
A computationally engineered RAS rheostat reveals RAS-ERK signaling dynamics
NATURE CHEMICAL BIOLOGY
2017; 13 (1): 119-126
Synthetic protein switches controlled with user-defined inputs are powerful tools for studying and controlling dynamic cellular processes. To date, these approaches have relied primarily on intermolecular regulation. Here we report a computationally guided framework for engineering intramolecular regulation of protein function. We utilize this framework to develop chemically inducible activator of RAS (CIAR), a single-component RAS rheostat that directly activates endogenous RAS in response to a small molecule. Using CIAR, we show that direct RAS activation elicits markedly different RAS-ERK signaling dynamics from growth factor stimulation, and that these dynamics differ among cell types. We also found that the clinically approved RAF inhibitor vemurafenib potently primes cells to respond to direct wild-type RAS activation. These results demonstrate the utility of CIAR for quantitatively interrogating RAS signaling. Finally, we demonstrate the general utility of our approach in design of intramolecularly regulated protein tools by applying it to the Rho family of guanine nucleotide exchange factors.
View details for DOI 10.1038/NCHEMBIO.2244
View details for Web of Science ID 000393267200022
View details for PubMedID 27870838
View details for PubMedCentralID PMC5161653
Accurate de novo design of hyperstable constrained peptides
2016; 538 (7625): 329-?
Naturally occurring, pharmacologically active peptides constrained with covalent crosslinks generally have shapes that have evolved to fit precisely into binding pockets on their targets. Such peptides can have excellent pharmaceutical properties, combining the stability and tissue penetration of small-molecule drugs with the specificity of much larger protein therapeutics. The ability to design constrained peptides with precisely specified tertiary structures would enable the design of shape-complementary inhibitors of arbitrary targets. Here we describe the development of computational methods for accurate de novo design of conformationally restricted peptides, and the use of these methods to design 18-47 residue, disulfide-crosslinked peptides, a subset of which are heterochiral and/or N-C backbone-cyclized. Both genetically encodable and non-canonical peptides are exceptionally stable to thermal and chemical denaturation, and 12 experimentally determined X-ray and NMR structures are nearly identical to the computational design models. The computational design methods and stable scaffolds presented here provide the basis for development of a new generation of peptide-based drugs.
View details for DOI 10.1038/nature19791
View details for Web of Science ID 000386673100029
View details for PubMedID 27626386
View details for PubMedCentralID PMC5161715
The coming of age of de novo protein design
2016; 537 (7620): 320-327
There are 20^200 possible amino-acid sequences for a 200-residue protein, of which the natural evolutionary process has sampled only an infinitesimal subset. De novo protein design explores the full sequence space, guided by the physical principles that underlie protein folding. Computational methodology has advanced to the point that a wide range of structures can be designed from scratch with atomic-level accuracy. Almost all protein engineering so far has involved the modification of naturally occurring proteins; it should now be possible to design new functional proteins from the ground up to tackle current challenges in biomedicine and nanotechnology.
View details for DOI 10.1038/nature19946
View details for Web of Science ID 000383098000041
View details for PubMedID 27629638
- Design of a hyperstable 60-subunit protein icosahedron NATURE 2016; 535 (7610): 136-?
De novo design of a four-fold symmetric TIM-barrel protein with atomic-level accuracy.
Nature chemical biology
2016; 12 (1): 29-34
Despite efforts for over 25 years, de novo protein design has not succeeded in achieving the TIM-barrel fold. Here we describe the computational design of four-fold symmetrical (β/α)8 barrels guided by geometrical and chemical principles. Experimental characterization of 33 designs revealed the importance of side chain-backbone hydrogen bonds for defining the strand register between repeat units. The X-ray crystal structure of a designed thermostable 184-residue protein is nearly identical to that of the designed TIM-barrel model. PSI-BLAST searches do not identify sequence similarities to known TIM-barrel proteins, and sensitive profile-profile searches indicate that the design sequence is distant from other naturally occurring TIM-barrel superfamilies, suggesting that Nature has sampled only a subset of the sequence space available to the TIM-barrel fold. The ability to design TIM barrels de novo opens new possibilities for custom-made enzymes.
View details for DOI 10.1038/nchembio.1966
View details for PubMedID 26595462
View details for PubMedCentralID PMC4684731
Exploring the repeat protein universe through computational protein design
2015; 528 (7583): 580-?
A central question in protein evolution is the extent to which naturally occurring proteins sample the space of folded structures accessible to the polypeptide chain. Repeat proteins composed of multiple tandem copies of a modular structure unit are widespread in nature and have critical roles in molecular recognition, signalling, and other essential biological processes. Naturally occurring repeat proteins have been re-engineered for molecular recognition and modular scaffolding applications. Here we use computational protein design to investigate the space of folded structures that can be generated by tandem repeating a simple helix-loop-helix-loop structural motif. Eighty-three designs with sequences unrelated to known repeat proteins were experimentally characterized. Of these, 53 are monomeric and stable at 95 °C, and 43 have solution X-ray scattering spectra consistent with the design models. Crystal structures of 15 designs spanning a broad range of curvatures are in close agreement with the design models with root mean square deviations ranging from 0.7 to 2.5 Å. Our results show that existing repeat proteins occupy only a small fraction of the possible repeat protein sequence and structure space and that it is possible to design novel repeat proteins with precisely specified geometries, opening up a wide array of new possibilities for biomolecular engineering.
View details for DOI 10.1038/nature16162
View details for Web of Science ID 000366991900058
View details for PubMedID 26675729
View details for PubMedCentralID PMC4845728
Computational design and experimental verification of a symmetric protein homodimer
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
2015; 112 (34): 10714-10719
Homodimers are the most common type of protein assembly in nature and have distinct features compared with heterodimers and higher order oligomers. Understanding homodimer interactions at the atomic level is critical both for elucidating their biological mechanisms of action and for accurate modeling of complexes of unknown structure. Computation-based design of novel protein-protein interfaces can serve as a bottom-up method to further our understanding of protein interactions. Previous studies have demonstrated that the de novo design of homodimers can be achieved to atomic-level accuracy by β-strand assembly or through metal-mediated interactions. Here, we report the design and experimental characterization of an α-helix-mediated homodimer with C2 symmetry based on a monomeric Drosophila engrailed homeodomain scaffold. A solution NMR structure shows that the homodimer exhibits parallel helical packing similar to the design model. Because the mutations leading to dimer formation resulted in poor thermostability of the system, design success was facilitated by the introduction of independent thermostabilizing mutations into the scaffold. This two-step design approach, function and stabilization, is likely to be generally applicable, especially if the desired scaffold is of low thermostability.
View details for DOI 10.1073/pnas.1505072112
View details for Web of Science ID 000360005600056
View details for PubMedID 26269568
View details for PubMedCentralID PMC4553821
Using Molecular Dynamics Simulations as an Aid in the Prediction of Domain Swapping of Computationally Designed Protein Variants
JOURNAL OF MOLECULAR BIOLOGY
2015; 427 (16): 2697-2706
In standard implementations of computational protein design, a positive-design approach is used to predict sequences that will be stable on a given backbone structure. Possible competing states are typically not considered, primarily because appropriate structural models are not available. One potential competing state, the domain-swapped dimer, is especially compelling because it is often nearly identical with its monomeric counterpart, differing by just a few mutations in a hinge region. Molecular dynamics (MD) simulations provide a computational method to sample different conformational states of a structure. Here, we tested whether MD simulations could be used as a post-design screening tool to identify sequence mutations leading to domain-swapped dimers. We hypothesized that a successful computationally designed sequence would have backbone structure and dynamics characteristics similar to those of the input structure and that, in contrast, domain-swapped dimers would exhibit increased backbone flexibility and/or altered structure in the hinge-loop region to accommodate the large conformational change required for domain swapping. While attempting to engineer a homodimer from a 51-amino-acid fragment of the monomeric protein engrailed homeodomain (ENH), we had instead generated a domain-swapped dimer (ENH_DsD). MD simulations on these proteins showed increased simulation-derived B-factors in the hinge loop of the ENH_DsD domain-swapped dimer relative to monomeric ENH. Two point mutants of ENH_DsD designed to recover the monomeric fold were then tested with an MD simulation protocol. The MD simulations suggested that one of these mutants would adopt the target monomeric structure, which was subsequently confirmed by X-ray crystallography.
View details for DOI 10.1016/j.jmb.2015.06.006
View details for Web of Science ID 000359960900011
View details for PubMedID 26101839
Control of repeat-protein curvature by computational protein design
NATURE STRUCTURAL & MOLECULAR BIOLOGY
2015; 22 (2): 167-174
Shape complementarity is an important component of molecular recognition, and the ability to precisely adjust the shape of a binding scaffold to match a target of interest would greatly facilitate the creation of high-affinity protein reagents and therapeutics. Here we describe a general approach to control the shape of the binding surface on repeat-protein scaffolds and apply it to leucine-rich-repeat proteins. First, self-compatible building-block modules are designed that, when polymerized, generate surfaces with unique but constant curvatures. Second, a set of junction modules that connect the different building blocks are designed. Finally, new proteins with custom-designed shapes are generated by appropriately combining building-block and junction modules. Crystal structures of the designs illustrate the power of the approach in controlling repeat-protein curvature.
View details for DOI 10.1038/nsmb.2938
View details for Web of Science ID 000348967400013
View details for PubMedID 25580576
View details for PubMedCentralID PMC4318719
A General Computational Approach for Repeat Protein Design
JOURNAL OF MOLECULAR BIOLOGY
2015; 427 (2): 563-575
Repeat proteins have considerable potential for use as modular binding reagents or biomaterials in biomedical and nanotechnology applications. Here we describe a general computational method for building idealized repeats that integrates available family sequences and structural information with Rosetta de novo protein design calculations. Idealized designs from six different repeat families were generated and experimentally characterized; 80% of the proteins were expressed and soluble and more than 40% were folded and monomeric with high thermal stability. Crystal structures determined for members of three families are within 1 Å root-mean-square deviation to the design models. The method provides a general approach for fast and reliable generation of stable modular repeat protein scaffolds.
View details for DOI 10.1016/j.jmb.2014.11.005
View details for Web of Science ID 000348888200030
View details for PubMedID 25451037
View details for PubMedCentralID PMC4303030
Computational De Novo Design of a Self-Assembling Peptide with Predefined Structure
JOURNAL OF MOLECULAR BIOLOGY
2015; 427 (2): 550-562
Protein and peptide self-assembly is a powerful design principle for engineering of new biomolecules. More sophisticated biomaterials could be built if both the structure of the overall assembly and that of the self-assembling building block could be controlled. To approach this problem, we developed a computational design protocol to enable de novo design of self-assembling peptides with predefined structure. The protocol was used to design a peptide building block with a βαβ fold that self-assembles into fibrillar structures. The peptide associates into a double β-sheet structure with tightly packed α-helices decorating the exterior of the fibrils. Using circular dichroism, Fourier transform infrared spectroscopy, electron microscopy and X-ray fiber diffraction, we demonstrate that the peptide adopts the designed conformation. The results demonstrate that computational protein design can be used to engineer protein and peptide assemblies with predefined three-dimensional structures, which can serve as scaffolds for the development of functional biomaterials. Rationally designed proteins and peptides could also be used to investigate the subtle energetic and entropic tradeoffs in natural self-assembly processes and the relation between assembly structure and assembly mechanism. We demonstrate that the de novo designed peptide self-assembles with a mechanism that is more complicated than expected, in a process where small changes in solution conditions can lead to significant differences in assembly properties and conformation. These results highlight that formation of structured protein/peptide assemblies is often dependent on the formation of weak but highly precise intermolecular interactions.
View details for DOI 10.1016/j.jmb.2014.12.002
View details for Web of Science ID 000348888200029
View details for PubMedID 25498388
High thermodynamic stability of parametrically designed helical bundles
SCIENCE
2014; 346 (6208): 481-485
We describe a procedure for designing proteins with backbones produced by varying the parameters in the Crick coiled coil-generating equations. Combinatorial design calculations identify low-energy sequences for alternative helix supercoil arrangements, and the helices in the lowest-energy arrangements are connected by loop building. We design an antiparallel monomeric untwisted three-helix bundle with 80-residue helices, an antiparallel monomeric right-handed four-helix bundle, and a pentameric parallel left-handed five-helix bundle. The designed proteins are extremely stable (extrapolated ΔGfold > 60 kilocalories per mole), and their crystal structures are close to those of the design models with nearly identical core packing between the helices. The approach enables the custom design of hyperstable proteins with fine-tuned geometries for a wide range of applications.
View details for DOI 10.1126/science.1257481
View details for Web of Science ID 000343822900046
View details for PubMedID 25342806
View details for PubMedCentralID PMC4612401
Immune Focusing and Enhanced Neutralization Induced by HIV-1 gp140 Chemical Cross-Linking
JOURNAL OF VIROLOGY
2013; 87 (18): 10163-10172
Experimental vaccine antigens based upon the HIV-1 envelope glycoproteins (Env) have failed to induce neutralizing antibodies (NAbs) against the majority of circulating viral strains as a result of antibody evasion mechanisms, including amino acid variability and conformational instability. A potential vaccine design strategy is to stabilize Env, thereby focusing antibody responses on constitutively exposed, conserved surfaces, such as the CD4 binding site (CD4bs). Here, we show that a largely trimeric form of soluble Env can be stably cross-linked with glutaraldehyde (GLA) without global modification of antigenicity. Cross-linking largely conserved binding of all potent broadly neutralizing antibodies (bNAbs) tested, including CD4bs-specific VRC01 and HJ16, but reduced binding of several non- or weakly neutralizing antibodies and soluble CD4 (sCD4). Adjuvanted administration of cross-linked or unmodified gp140 to rabbits generated indistinguishable total gp140-specific serum IgG binding titers. However, sera from animals receiving cross-linked gp140 showed significantly increased CD4bs-specific antibody binding compared to animals receiving unmodified gp140. Moreover, peptide mapping of sera from animals receiving cross-linked gp140 revealed increased binding to gp120 C1 and V1V2 regions. Finally, neutralization titers were significantly elevated in sera from animals receiving cross-linked gp140 rather than unmodified gp140. We conclude that cross-linking favors antigen stability, imparts antigenic modifications that selectively refocus antibody specificity and improves induction of NAbs, and might be a useful strategy for future vaccine design.
View details for DOI 10.1128/JVI.01161-13
View details for Web of Science ID 000323420800019
View details for PubMedID 23843636
View details for PubMedCentralID PMC3754013
Rational HIV Immunogen Design to Target Specific Germline B Cell Receptors
SCIENCE
2013; 340 (6133): 711-716
Vaccine development to induce broadly neutralizing antibodies (bNAbs) against HIV-1 is a global health priority. Potent VRC01-class bNAbs against the CD4 binding site of HIV gp120 have been isolated from HIV-1-infected individuals; however, such bNAbs have not been induced by vaccination. Wild-type gp120 proteins lack detectable affinity for predicted germline precursors of VRC01-class bNAbs, making them poor immunogens to prime a VRC01-class response. We employed computation-guided, in vitro screening to engineer a germline-targeting gp120 outer domain immunogen that binds to multiple VRC01-class bNAbs and germline precursors, and elucidated germline binding crystallographically. When multimerized on nanoparticles, this immunogen (eOD-GT6) activates germline and mature VRC01-class B cells. Thus, eOD-GT6 nanoparticles have promise as a vaccine prime. In principle, germline-targeting strategies could be applied to other epitopes and pathogens.
View details for DOI 10.1126/science.1234150
View details for Web of Science ID 000318619000030
View details for PubMedID 23539181
View details for PubMedCentralID PMC3689846
Domain 1 of Mucosal Addressin Cell Adhesion Molecule Has an I1-set Fold and a Flexible Integrin-binding Loop
JOURNAL OF BIOLOGICAL CHEMISTRY
2013; 288 (9): 6284-6294
Mucosal addressin cell adhesion molecule (MAdCAM) binds integrin α4β7. Their interaction directs lymphocyte homing to mucosa-associated lymphoid tissues. The interaction between the two immunoglobulin superfamily (IgSF) domains of MAdCAM and integrin α4β7 is unusual in its ability to mediate either rolling adhesion or firm adhesion of lymphocytes on vascular surfaces. We determined four crystal structures of the IgSF domains of MAdCAM to test for unusual structural features that might correlate with this functional diversity. Higher resolution 1.7- and 1.4-Å structures of the IgSF domains of MAdCAM in a previously described crystal lattice revealed two alternative conformations of the integrin-binding loop, which were deformed by large lattice contacts. New crystal forms in the presence of two different Fabs to MAdCAM demonstrate a shift in IgSF domain topology from the I2- to I1-set, with a switch of integrin-binding loop from CC' to CD. The I1-set fold and CD loop appear biologically relevant. The different conformations seen in crystal structures suggest that the integrin-binding loop of MAdCAM is inherently flexible. This contrasts with rigidity of the corresponding loops in vascular cell adhesion molecule, intercellular adhesion molecule (ICAM)-1, ICAM-2, ICAM-3, and ICAM-5 and may reflect a specialization of MAdCAM to mediate both rolling and firm adhesion by binding to different α4β7 integrin conformations.
View details for DOI 10.1074/jbc.M112.413153
View details for Web of Science ID 000315820700023
View details for PubMedID 23297416
View details for PubMedCentralID PMC3585063
A Potent and Broad Neutralizing Antibody Recognizes and Penetrates the HIV Glycan Shield
SCIENCE
2011; 334 (6059): 1097-1103
The HIV envelope (Env) protein gp120 is protected from antibody recognition by a dense glycan shield. However, several of the recently identified PGT broadly neutralizing antibodies appear to interact directly with the HIV glycan coat. Crystal structures of antigen-binding fragments (Fabs) PGT 127 and 128 with Man9 at 1.65 and 1.29 angstrom resolution, respectively, and glycan binding data delineate a specific high mannose-binding site. Fab PGT 128 complexed with a fully glycosylated gp120 outer domain at 3.25 angstroms reveals that the antibody penetrates the glycan shield and recognizes two conserved glycans as well as a short β-strand segment of the gp120 V3 loop, accounting for its high binding affinity and broad specificity. Furthermore, our data suggest that the high neutralization potency of PGT 127 and 128 immunoglobulin Gs may be mediated by cross-linking Env trimers on the viral surface.
View details for DOI 10.1126/science.1213256
View details for Web of Science ID 000297313900041
View details for PubMedID 21998254
View details for PubMedCentralID PMC3280215
High-resolution structure prediction of a circular permutation loop
PROTEIN SCIENCE
2011; 20 (11): 1929-1934
Methods for rapid and reliable design and structure prediction of linker loops would facilitate a variety of protein engineering applications. Circular permutation, in which the existing termini of a protein are linked by the polypeptide chain and new termini are created, is one such application that has been employed for decreasing proteolytic susceptibility and other functional purposes. The length and sequence of the linker can impact the expression level, solubility, structure and function of the permuted variants. Hence it is desirable to achieve atomic-level accuracy in linker design. Here, we describe the use of RosettaRemodel for design and structure prediction of circular permutation linkers on a model protein. A crystal structure of one of the permuted variants confirmed the accuracy of the computational prediction, where the all-atom rmsd of the linker region was 0.89 Å between the model and the crystal structure. This result suggests that RosettaRemodel may be generally useful for the design and structure prediction of protein loop regions for circular permutations or other structure-function manipulations.
View details for DOI 10.1002/pro.725
View details for Web of Science ID 000296273700018
View details for PubMedID 21898647
View details for PubMedCentralID PMC3267956
Computation-Guided Backbone Grafting of a Discontinuous Motif onto a Protein Scaffold
SCIENCE
2011; 334 (6054): 373-376
The manipulation of protein backbone structure to control interaction and function is a challenge for protein engineering. We integrated computational design with experimental selection for grafting the backbone and side chains of a two-segment HIV gp120 epitope, targeted by the cross-neutralizing antibody b12, onto an unrelated scaffold protein. The final scaffolds bound b12 with high specificity and with affinity similar to that of gp120, and crystallographic analysis of a scaffold bound to b12 revealed high structural mimicry of the gp120-b12 complex structure. The method can be generalized to design other functional proteins through backbone grafting.
View details for DOI 10.1126/science.1209368
View details for Web of Science ID 000296052500052
View details for PubMedID 22021856
RosettaRemodel: A Generalized Framework for Flexible Backbone Protein Design
PLOS ONE
2011; 6 (8)
We describe RosettaRemodel, a generalized framework for flexible protein design that provides a versatile and convenient interface to the Rosetta modeling suite. RosettaRemodel employs a unified interface, called a blueprint, which allows detailed control over many aspects of flexible backbone protein design calculations. RosettaRemodel allows the construction and elaboration of customized protocols for a wide range of design problems ranging from loop insertion and deletion, disulfide engineering, domain assembly, loop remodeling, motif grafting, symmetrical units, to de novo structure modeling.
View details for DOI 10.1371/journal.pone.0024109
View details for Web of Science ID 000294680800055
View details for PubMedID 21909381
View details for PubMedCentralID PMC3166072
A Chimeric HIV-1 Envelope Glycoprotein Trimer with an Embedded Granulocyte-Macrophage Colony-stimulating Factor (GM-CSF) Domain Induces Enhanced Antibody and T Cell Responses
JOURNAL OF BIOLOGICAL CHEMISTRY
2011; 286 (25): 22250-22261
An effective HIV-1 vaccine should ideally induce strong humoral and cellular immune responses that provide sterilizing immunity over a prolonged period. Current HIV-1 vaccines have failed in inducing such immunity. The viral envelope glycoprotein complex (Env) can be targeted by neutralizing antibodies to block infection, but several Env properties limit the ability to induce an antibody response of sufficient quantity and quality. We hypothesized that Env immunogenicity could be improved by embedding an immunostimulatory protein domain within its sequence. A stabilized Env trimer was therefore engineered with the granulocyte-macrophage colony-stimulating factor (GM-CSF) inserted into the V1V2 domain of gp120. Probing with neutralizing antibodies showed that both the Env and GM-CSF components of the chimeric protein were folded correctly. Furthermore, the embedded GM-CSF domain was functional as a cytokine in vitro. Mouse immunization studies demonstrated that chimeric Env(GM-CSF) enhanced Env-specific antibody and T cell responses compared with wild-type Env. Collectively, these results show that targeting and activation of immune cells using engineered cytokine domains within the protein can improve the immunogenicity of Env subunit vaccines.
View details for DOI 10.1074/jbc.M111.229625
View details for Web of Science ID 000291719900033
View details for PubMedID 21515681
View details for PubMedCentralID PMC3121371
Modulation of Integrin Activation by an Entropic Spring in the beta-Knee
JOURNAL OF BIOLOGICAL CHEMISTRY
2010; 285 (43): 32954-32966
We show that the length of a loop in the β-knee, between the first and second cysteines (C1-C2) in integrin EGF-like (I-EGF) domain 2, modulates integrin activation. Three independent sets of mutants, including swaps among different integrin β-subunits, show that C1-C2 loop lengths of 12 and longer favor the low affinity state and masking of ligand-induced binding site (LIBS) epitopes. Shortening length from 12 to 4 residues progressively increases ligand binding and LIBS epitope exposure. Compared with length, the loop sequence had a smaller effect, which was ascribable to stabilizing loop conformation, and not interactions with the α-subunit. The data together with structural calculations support the concept that the C1-C2 loop is an entropic spring and an emerging theme that disordered regions can regulate allostery. Diversity in the length of this loop may have evolved among integrin β-subunits to adjust the equilibrium between the bent and extended conformations at different set points.
View details for DOI 10.1074/jbc.M110.145177
View details for Web of Science ID 000283048200033
View details for PubMedID 20670939
View details for PubMedCentralID PMC2963379
A de novo designed protein-protein interface
PROTEIN SCIENCE
2007; 16 (12): 2770-2774
As an approach to both explore the physical/chemical parameters that drive molecular self-assembly and to generate novel protein oligomers, we have developed a procedure to generate protein dimers from monomeric proteins using computational protein docking and amino acid sequence design. A fast Fourier transform-based docking algorithm was used to generate a model for a dimeric version of the 56-amino-acid β1 domain of streptococcal protein G. Computational amino acid sequence design of 24 residues at the dimer interface resulted in a heterodimer comprised of 12-fold and eightfold variants of the wild-type protein. The designed proteins were expressed, purified, and characterized using analytical ultracentrifugation and heteronuclear NMR techniques. Although the measured dissociation constant was modest (approximately 300 μM), 2D [¹H,¹⁵N]-HSQC NMR spectra of one of the designed proteins in the absence and presence of its binding partner showed clear evidence of specific dimer formation.
View details for DOI 10.1110/ps.073125207
View details for Web of Science ID 000251081300023
View details for PubMedID 18029425
View details for PubMedCentralID PMC2222823
Adaptation of a fast Fourier transform-based docking algorithm for protein design
JOURNAL OF COMPUTATIONAL CHEMISTRY
2005; 26 (12): 1222-1232
Designing proteins with novel protein/protein binding properties can be achieved by combining the tools that have been developed independently for protein docking and protein design. We describe here the sequence-independent generation of protein dimer orientations by protein docking for use as scaffolds in protein sequence design algorithms. To dock monomers into sequence-independent dimer conformations, we use a reduced representation in which the side chains are approximated by spheres with atomic radii derived from known C2 symmetry-related homodimers. The interfaces of C2-related homodimers are usually more hydrophobic and protein core-like than the interfaces of heterodimers; we parameterize the radii for docking against this feature to capture and recreate the spatial characteristics of a hydrophobic interface. A fast Fourier transform-based geometric recognition algorithm is used for docking the reduced representation protein models. The resulting docking algorithm successfully predicted the wild-type homodimer orientations in 65 out of 121 dimer test cases. The success rate increases to approximately 70% for the subset of molecules with large surface area burial in the interface relative to their chain length. Forty-five of the predictions exhibited less than 1 Å Cα RMSD compared to the native X-ray structures. The reduced protein representation therefore appears to be a reasonable approximation and can be used to position protein backbones in plausible orientations for homodimer design.
View details for Web of Science ID 000230715200003
View details for PubMedID 15962277
A designed protein interface that blocks fibril formation
JOURNAL OF THE AMERICAN CHEMICAL SOCIETY
2004; 126 (43): 13914-13915
Protein fibril formation is implicated in many diseases, and therefore much effort has been focused toward the development of inhibitors of this process. In a previous project, a monomeric protein was computationally engineered to bind itself and form a heterodimer complex following interfacial redesign. One of the protein monomers, termed monomer-B, was unintentionally destabilized and shown to form macroscopic fibrils. Interestingly, in the presence of the designed binding partner, fibril formation was blocked. Here we describe the complete characterization of the amyloid properties of monomer-B and the inhibition of fiber formation by the designed binding partner, monomer-A. Both proteins are mutants of the β1 domain of streptococcal protein-G. The free monomer-B protein forms amyloid-type fibrils, as determined by transmission electron microscopy and the change in fluorescence of Thioflavin T, an amyloid-specific dye. Fibril formation kinetics are influenced by pH, protein concentration, and seeding with preformed fibrils. Under all conditions tested, monomer-A was able to inhibit the formation of monomer-B fibrils. This inhibition is specific to the engineered interaction, as incubation of monomer-B with wild-type protein-G (a structural homologue) did not result in inhibition under the same conditions. Thus, this de novo-designed heterodimeric complex is an excellent model system for the study of protein-based fibril formation and inhibition. This system provides additional insight into the development of pharmaceuticals for amyloid disorders, as well as the potential use of amyloid fibrils for self-assembling nanostructures.
View details for DOI 10.1021/ja0456858
View details for Web of Science ID 000224873600019
View details for PubMedID 15506739