Possu Huang's Profile | Stanford Profiles

Bio

Dr. Possu Huang received his PhD from Caltech, where he made the first demonstration of a computationally designed novel protein-protein interface. He subsequently conducted postdoctoral research at the University of Washington before starting his group at Stanford. His research focuses on advancing the understanding of proteins in order to engineer novel therapeutics and other protein-based nanotechnology. He has contributed to a large number of de novo designed proteins, most notably to uncovering the design principles behind the TIM barrel fold and to the invention of eOD, an HIV immunogen design. His group uses machine learning, computational modeling, structural biology and experimental library optimization to continue expanding protein-based molecular platforms.

Academic Appointments

Assistant Professor, Bioengineering
Member, Bio-X

Professional Education

Senior Fellow, University of Washington
Ph.D., California Institute of Technology, Biochemistry and Molecular Biophysics
B.A., UC Berkeley, MCB - Biochemistry

Contact

Academic
possu@stanford.edu

University - Faculty
Department: Bioengineering
Position: Asst Professor

Additional Info

Mail Code: 4245
ORCID: https://orcid.org/0000-0002-7948-2895

Current Research and Scholarly Interests

Protein design: molecular engineering, method development and novel therapeutics

Projects

Structural and mechanistic studies of pore forming toxins by protein design and artificial cells, Stanford University and Osaka University

Location

Japan, United States

2025-26 Courses

- Computational Protein Modeling Laboratory
  BIOE 301E (Aut)
- Molecular and Cellular Engineering Lab
  BIOE 301A, EE 235 (Win)
Independent Studies (7)
- Bioengineering Problems and Experimental Investigation
  BIOE 191 (Aut, Win, Spr, Sum)
- Directed Investigation
  BIOE 392 (Aut, Win, Spr, Sum)
- Directed Reading in Biophysics
  BIOPHYS 399 (Aut, Win, Spr, Sum)
- Directed Study
  BIOE 391 (Aut, Win, Spr, Sum)
- Graduate Research
  BIOPHYS 300 (Aut, Win, Spr, Sum)
- Research
  PHYSICS 490 (Aut, Win, Spr, Sum)
- Teaching Practicum in Bioengineering
  BIOE 399 (Aut)
Prior Year Courses
2024-25 Courses
- Molecular and Cellular Engineering Lab
  BIOE 301A, EE 235 (Win)
- Protein Design and Modeling using Machine Learning Methods
  BIOS 429 (Spr)
2023-24 Courses
- Computational Protein Modeling Laboratory
  BIOE 301E (Aut)
2022-23 Courses
- Computational Protein Modeling Laboratory
  BIOE 301E (Aut)
- Molecular and Cellular Engineering Lab
  BIOE 301A (Win)

Stanford Advisees

Doctoral Dissertation Reader (AC)
Nayla Abney, Hajime Fujita, Alana Gudinas, Laura Guerrero, Joshua Sampson, Jason Saunders, Xinyu Xiang, Xiaowei Zhang
Postdoctoral Faculty Sponsor
Wenqi Shen
Doctoral Dissertation Advisor (AC)
Braxton Bell, Wyatt Blackson, Yilin Chen, Gina El Nesr, Jinho Kim, Hyejin Lee, Jingjia Liu, Tianyu Lu, Richard Shuai
Undergraduate Major Advisor
Ayushi Mohanty
Doctoral (Program)
Hajime Fujita, Thomas Lau, David Li, Lucas Sant'Anna, Magda Zaoralova

All Publications

Structural ontogeny of protein-protein interactions. Science (New York, N.Y.) Yang, A., Jiang, H., Jude, K. M., Akpinaroglu, D., Allenspach, S., Li, A. J., Bowden, J., Perez, C. P., Liu, L., Huang, P. S., Kortemme, T., Listgarten, J., Garcia, K. C. 2026; 391 (6786): eadx6931

Abstract

Understanding how protein binding sites evolve interactions with other proteins could hold clues to targeting "undruggable" surfaces. We used synthetic coevolution to engineer new interactions between naïve surfaces, simulating the de novo formation of protein complexes. We isolated seven distinct structural families of protein Z-domain complexes and found that synthetic complexes explore multiple shallow energy wells through ratchet-like docking modes, whereas complexes formed by natural binding sites converged in a deep energy well with a relatively fixed geometry. Epistasis analysis of a machine learning-estimated fitness landscape revealed "seed" contacts between binding partners that anchored the earliest stages of encounter complex formation. Our results suggest that "silent" surfaces have a shallower energy landscape than natural binding sites, disfavoring tight binding, likely owing to evolutionary counterselection.

View details for DOI 10.1126/science.adx6931

View details for PubMedID 41678610

View details for PubMedCentralID PMC12904254
High throughput mutational characterization of the GPCR ligand C5a using yeast display and deep sequencing. Structure (London, England : 1993) Xu, Y., Thakkar, K., Guan, L., Miao, Y., Mehibel, M., Lee, R. B., Marciano, D., Viswanathan, V., Wang, Z., Wang, J., Ji, L., Cao, H., Petrakian, C. F., Valenzuela, J., LaGory, E., Jia, X., Moon, E. J., Martinez, R., Wu, F., Frock, R. L., Moding, E. J., Le, Q. T., Rankin, E. B., Zhang, C., Huang, P., Olcina, M. M., Giaccia, A. J., Graves, E. E. 2025

Abstract

High-throughput mutagenesis approaches are widely employed to systematically characterize protein functions and play a critical role in therapeutic developments. As the largest class of membrane receptors, G protein-coupled receptors (GPCRs) are a primary focus of these studies. However, while significant progress has been made in understanding GPCRs themselves, mutagenesis studies on their ligands have lagged behind, because of the difficulties in solubilizing the target receptor. In this study, we present a novel approach that employs lipid vesicles to embed and stabilize target membrane receptors, allowing direct ligand screening. We applied this platform to investigate the anaphylatoxin complement 5a (C5a) and examined how mutations affect binding to its two native GPCRs: complement 5a receptor 1 (C5aR1) and complement 5a receptor 2 (C5aR2). The screening revealed new insights into the molecular basis of the interaction and led to the discovery of novel ligands that selectively activate C5aR2, but not C5aR1.

View details for DOI 10.1016/j.str.2025.10.002

View details for PubMedID 41151574
ADAPT-M: A workflow for rapid, quantitative in vitro measurements of enriched protein libraries. bioRxiv : the preprint server for biology Perez, C. P., DelRosso, N. V., Noland, C. L., Parekh, U., Choe, C. A., Eguchi, R. R., Wen, Q., Fordyce, P. M., Huang, P. S. 2025

Abstract

Protein-protein interactions underpin most cellular interactions, and engineered binders present powerful tools for probing biology and developing novel therapeutics. One bottleneck in binder generation is the scalable, quantitative characterization of these interactions. We present ADAPT-M (Affinity Determination by Adaptation of ProTein binders for Microfluidics), a streamlined workflow that connects yeast surface display (YSD) with in vitro affinity and kinetic measurements using the high-throughput STAMMPPING microfluidic platform. ADAPT-M quantifies Kd values and dissociation kinetic parameters for hundreds of enriched protein variants in under one week without requiring hands-on protein purification. We applied ADAPT-M to a computationally designed library targeting the SARS-CoV-2 Omicron BA.1 receptor binding domain, successfully recovering and measuring Kd values for most highly enriched YSD variants. Measurements correlate strongly with biolayer interferometry and yeast titration assays. ADAPT-M further enabled selection of lead candidates for structural and mutational analysis, which revealed designed paratopes were preserved despite binding to off-target epitopes. By bridging YSD screening and in vitro validation, ADAPT-M accelerates protein binder discovery and supports data-driven protein engineering.

View details for DOI 10.1101/2025.10.21.683815

View details for PubMedID 41279512

View details for PubMedCentralID PMC12633381
Ensemble-conditioned protein sequence design with Caliby. bioRxiv : the preprint server for biology Shuai, R. W., Lu, T., Bhatti, S., Kouba, P., Huang, P. S. 2025

Abstract

Structure-conditioned sequence design models aim to design a protein sequence that will fold into a given target structure. Deep-learning-based approaches for sequence design have proven highly successful for various protein design applications, but many non-idealized backbones still remain out of reach for current models under typical in silico success criteria. We hypothesize that training objectives prioritizing native sequence recovery unintentionally push models to reproduce non-structural signals (e.g. phylogenetic relatedness, neutral drift, or dataset sampling biases), rather than a broadly generalizable structure-sequence mapping. Inspired by recent work bridging sequence likelihood and fitness prediction in protein language models, we introduce Caliby, a Potts model-based sequence design method capable of conditioning on an ensemble of structures. Conditioning on a synthetic ensemble generated from an input backbone allows sampling of sequences consistent with the structural constraints of the ensemble while averaging out undesired biases towards the native sequence. Ensemble-conditioned sequence design with Caliby reduces native sequence recovery while substantially improving AlphaFold2 self-consistency, outperforming state-of-the-art models ProteinMPNN and ChromaDesign on both native and de novo backbones. Finally, we train a variant of Caliby on only soluble proteins and demonstrate in silico that Protpardelle-1c binder designs that were previously deemed undesignable by SolubleMPNN are actually designable under SolubleCaliby, highlighting limitations of existing filtering pipelines. These results suggest that Caliby can expand the de novo design space beyond highly idealized backbones.

View details for DOI 10.1101/2025.09.30.679633

View details for PubMedID 41256639

View details for PubMedCentralID PMC12621727
SLAE: Strictly Local All-atom Environment for Protein Representation. bioRxiv : the preprint server for biology Chen, Y., Zhao, C., Huang, P. S., Lu, T., Wayment-Steele, H. K. 2025

Abstract

Building physically grounded protein representations is central to computational biology, yet most existing approaches rely on sequence-pretrained language models or backbone-only graphs that overlook side-chain geometry and chemical detail. We present SLAE, a unified all-atom framework for learning protein representations from each residue's local atomic neighborhood using only atom types and interatomic geometries. To encourage expressive feature extraction, we introduce a novel multi-task autoencoder objective that combines coordinate reconstruction, sequence recovery, and energy regression. SLAE reconstructs all-atom structures with high fidelity from latent residue environments and achieves state-of-the-art performance across diverse downstream tasks via transfer learning. SLAE's latent space is chemically informative and environmentally sensitive, enabling quantitative assessment of structural qualities and smooth interpolation between conformations at all-atom resolution.

View details for DOI 10.1101/2025.10.03.680398

View details for PubMedID 41278779

View details for PubMedCentralID PMC12632552
Conditional Protein Structure Generation with Protpardelle-1c. bioRxiv : the preprint server for biology Lu, T., Shuai, R., Kouba, P., Li, Z., Chen, Y., Shirali, A., Kim, J., Huang, P. S. 2025

Abstract

We present Protpardelle-1c, a collection of protein structure generative models with robust motif scaffolding and support for multi-chain complex generation under hotspot-conditioning. Enabling sidechain-conditioning to a backbone-only model increased Protpardelle-1c's MotifBench score from 4.97 to 28.16, outperforming RFdiffusion's 21.27. The crop-conditional all-atom model achieved 208 unique solutions on the La-Proteina all-atom motif scaffolding benchmark, on par with La-Proteina while having ~10 times fewer parameters. At 22M parameters, Protpardelle-1c enables rapid sampling, taking 40 minutes to sample all 3000 MotifBench backbones on an NVIDIA A100-80GB, compared to 31 hours for RFdiffusion.

View details for DOI 10.1101/2025.08.18.670959

View details for PubMedID 40894579

View details for PubMedCentralID PMC12393353
Assessing generative model coverage of protein structures with SHAPES. Cell systems Lu, T., Liu, M., Chen, Y., Kim, J., Huang, P. S. 2025: 101347

Abstract

Recent advances in generative modeling enable efficient sampling of protein structures, but their tendency to optimize for designability imposes a bias toward idealized structures at the expense of loops and other complex structural motifs that are critical for function. We introduce SHAPES (structural and hierarchical assessment of proteins with embedding similarity) to evaluate five state-of-the-art generative models of protein structures. Using structural embeddings across multiple structural hierarchies, ranging from local geometries to global protein architectures, we reveal substantial undersampling of the observed protein structure space by these models. We use Fréchet protein distance (FPD) to quantify distributional coverage. Different models are distinct in their coverage behavior across different sampling noise scales and temperatures. The frequency of tertiary motifs (TERMs) further supports the observations. More robust sequence design and structure prediction methods are likely crucial in guiding the development of models with improved coverage of the designable protein space. A record of this paper's transparent peer review process is included in the supplemental information.

View details for DOI 10.1016/j.cels.2025.101347

View details for PubMedID 40738113
Sidechain conditioning and modeling for full-atom protein sequence design with FAMPNN. Proceedings of machine learning research Widatalla, T., Shuai, R. W., Hie, B. L., Huang, P. S. 2025; 267: 66746-66771

Abstract

Leading deep learning-based methods for fixed-backbone protein sequence design do not model protein sidechain conformation during sequence generation despite the large role the three-dimensional arrangement of sidechain atoms plays in protein conformation, stability, and overall protein function. Instead, these models implicitly reason about crucial sidechain interactions based on backbone geometry and known amino acid sequence labels. To address this, we present FAMPNN (Full-Atom MPNN), a sequence design method that explicitly models both sequence identity and sidechain conformation for each residue, where the per-token distribution of a residue's discrete amino acid identity and its continuous sidechain conformation are learned with a combined categorical cross-entropy and diffusion loss objective. We demonstrate that learning these distributions jointly is a highly synergistic task that improves sequence recovery while achieving state-of-the-art sidechain packing. Furthermore, benefits from full-atom modeling generalize from sequence recovery to practical protein design applications, such as zero-shot prediction of experimental binding and stability measurements.

View details for DOI 10.1101/2024.09.25.614868

View details for PubMedID 41307002

View details for PubMedCentralID PMC12646570
SHAPES of protein generative models Huang, P. SPRINGER. 2025: S44

View details for Web of Science ID 001597460600011
Assessing Generative Model Coverage of Protein Structures with SHAPES. bioRxiv : the preprint server for biology Lu, T., Liu, M., Chen, Y., Kim, J., Huang, P. S. 2025

Abstract

Recent advances in generative modeling enable efficient sampling of protein structures, but their tendency to optimize for designability imposes a bias toward idealized structures at the expense of loops and other complex structural motifs critical for function. We introduce SHAPES (Structural and Hierarchical Assessment of Proteins with Embedding Similarity) to evaluate five state-of-the-art generative models of protein structures. Using structural embeddings across multiple structural hierarchies, ranging from local geometries to global protein architectures, we reveal substantial undersampling of the observed protein structure space by these models. We use Fréchet Protein Distance (FPD) to quantify distributional coverage. Different models are distinct in their coverage behavior across different sampling noise scales and temperatures; the frequency of TERtiary Motifs (TERMs) further supports the observations. More robust sequence design and structure prediction methods are likely crucial in guiding the development of models with improved coverage of the designable protein space.

View details for DOI 10.1101/2025.01.09.632260

View details for PubMedID 39868321

View details for PubMedCentralID PMC11761634
Targeting peptide antigens using a multiallelic MHC I-binding system. Nature biotechnology Du, H., Mallik, L., Hwang, D., Sun, Y., Kaku, C., Hoces, D., Sun, S. M., Ghinnagow, R., Carro, S. D., Phan, H. A., Gupta, S., Blackson, W., Lee, H., Choe, C. A., Dersh, D., Liu, J., Bell, B., Yang, H., Papadaki, G. F., Young, M. C., Zhou, E., El Nesr, G., Goli, K. D., Eisenlohr, L. C., Minn, A. J., Hernandez-Lopez, R. A., Jardine, J. G., Sgourakis, N. G., Huang, P. S. 2024

Abstract

Identifying highly specific T cell receptors (TCRs) or antibodies against epitopic peptides presented by class I major histocompatibility complex (MHC I) proteins remains a bottleneck in the development of targeted therapeutics. Here, we introduce targeted recognition of antigen-MHC complex reporter for MHC I (TRACeR-I), a generalizable platform for targeting peptides on polymorphic HLA-A*, HLA-B* and HLA-C* allotypes while overcoming the cross-reactivity challenges of TCRs. Our TRACeR-MHC I co-crystal structure reveals a unique antigen recognition mechanism, with TRACeR forming extensive contacts across the entire peptide length to confer single-residue specificity at the accessible positions. We demonstrate rapid screening of TRACeR-I against a panel of disease-relevant HLAs with peptides derived from human viruses (human immunodeficiency virus, Epstein-Barr virus and severe acute respiratory syndrome coronavirus 2), and oncoproteins (Kirsten rat sarcoma virus, paired-like homeobox 2b and New York esophageal squamous cell carcinoma 1). TRACeR-based bispecific T cell engagers and chimeric antigen receptor T cells exhibit on-target killing of tumor cells with high efficacy in the low nanomolar range. Our platform empowers the development of broadly applicable MHC I-targeting molecules for research, diagnostic and therapeutic applications.

View details for DOI 10.1038/s41587-024-02505-8

View details for PubMedID 39672954

View details for PubMedCentralID 8363505
A general system for targeting MHC class II-antigen complex via a single adaptable loop. Nature biotechnology Du, H., Liu, J., Jude, K. M., Yang, X., Li, Y., Bell, B., Yang, H., Kassardjian, A., Blackson, W., Mobedi, A., Parekh, U., Parra Sperberg, R. A., Julien, J. P., Mellins, E. D., Garcia, K. C., Huang, P. S. 2024

Abstract

Major histocompatibility complex class II (MHCII) bound to a peptide antigen mediates interactions between CD4+ T cells and antigen-presenting cells. Targeting peptide-MHCII with T cell antigen receptors (TCRs) and TCR-like antibodies has shown promise for autoimmune diseases and microbiome tolerance. To develop a general targeting approach, we introduce targeted recognition of antigen-MHC complex reporter for MHCII (TRACeR-II) for the rapid development of peptide-specific MHCII binders. TRACeR-II binders have a small helical bundle scaffold and use a single loop to recognize peptide-MHCII, which offers versatility and enables structural modeling of the interactions to target MHCII antigens. We demonstrate rapid generation of TRACeR-II binders to multiple molecules with affinities in the low-nanomolar to low-micromolar range, comparable to best-in-class TCRs and antibodies. Through computational protein design, we created specific binding sequences in silico from only the sequence of a severe acute respiratory syndrome coronavirus 2 peptide. TRACeR-II provides a straightforward approach to target antigen-MHCII without relying on combinatorial selection on complementarity-determining region loops.

View details for DOI 10.1038/s41587-024-02466-y

View details for PubMedID 39672953

View details for PubMedCentralID 6977962
A General Platform for Targeting MHC-II Antigens via a Single Loop Huang, P. WILEY. 2024: 55

View details for Web of Science ID 001437110900009
An all-atom protein generative model. Proceedings of the National Academy of Sciences of the United States of America Chu, A. E., Kim, J., Cheng, L., El Nesr, G., Xu, M., Shuai, R. W., Huang, P. S. 2024; 121 (27): e2311500121

Abstract

Proteins mediate their functions through chemical interactions; modeling these interactions, which are typically through sidechains, is an important need in protein design. However, constructing an all-atom generative model requires an appropriate scheme for managing the jointly continuous and discrete nature of proteins encoded in the structure and sequence. We describe an all-atom diffusion model of protein structure, Protpardelle, which represents all sidechain states at once as a "superposition" state; superpositions defining a protein are collapsed into individual residue types and conformations during sample generation. When combined with sequence design methods, our model is able to codesign all-atom protein structure and sequence. Generated proteins are of good quality under the typical quality, diversity, and novelty metrics, and sidechains reproduce the chemical features and behavior of natural proteins. Finally, we explore the potential of our model to conduct all-atom protein design and scaffold functional motifs in a backbone- and rotamer-free way.

View details for DOI 10.1073/pnas.2311500121

View details for PubMedID 38916999
A cysteine-specific solubilizing tag strategy enables efficient chemical protein synthesis of difficult targets. Chemical science Li, W., Jacobsen, M. T., Park, C., Jung, J. U., Lin, N. P., Huang, P. S., Lal, R. A., Chou, D. H. 2024; 15 (9): 3214-3222

Abstract

We developed a new cysteine-specific solubilizing tag strategy via a cysteine-conjugated succinimide. This solubilizing tag remains stable under common native chemical ligation conditions and can be efficiently removed with palladium-based catalysts. Utilizing this approach, we synthesized two proteins containing notably difficult peptide segments: interleukin-2 (IL-2) and insulin. This IL-2 chemical synthesis represents the simplest and most efficient approach to date, which is enabled by the cysteine-specific solubilizing tag to synthesize and ligate long peptide segments. Additionally, we synthesized a T8P insulin variant, previously identified in an infant with neonatal diabetes. We show that T8P insulin exhibits reduced bioactivity (a 30-fold decrease compared to standard insulin), potentially contributing to the onset of diabetes in these patients. In summary, our work provides an efficient tool to synthesize challenging proteins and opens new avenues for exploring research directions in understanding their biological functions.

View details for DOI 10.1039/d3sc06032b

View details for PubMedID 38425513

View details for PubMedCentralID PMC10901488
Sparks of function by de novo protein design. Nature biotechnology Chu, A. E., Lu, T., Huang, P. S. 2024; 42 (2): 203-215

Abstract

Information in proteins flows from sequence to structure to function, with each step causally driven by the preceding one. Protein design is founded on inverting this process: specify a desired function, design a structure executing this function, and find a sequence that folds into this structure. This 'central dogma' underlies nearly all de novo protein-design efforts. Our ability to accomplish these tasks depends on our understanding of protein folding and function and our ability to capture this understanding in computational methods. In recent years, deep learning-derived approaches for efficient and accurate structure modeling and enrichment of successful designs have enabled progression beyond the design of protein structures and towards the design of functional proteins. We examine these advances in the broader context of classical de novo protein design and consider implications for future challenges to come, including fundamental capabilities such as sequence and structure co-design and conformational control considering flexibility, and functional objectives such as antibody and enzyme design.

View details for DOI 10.1038/s41587-024-02133-2

View details for PubMedID 38361073

View details for PubMedCentralID 6423711
A general platform for targeting MHC-II antigens via a single loop. bioRxiv : the preprint server for biology Du, H., Liu, J., Jude, K. M., Yang, X., Li, Y., Bell, B., Yang, H., Kassardjian, A., Mobedi, A., Parekh, U., Sperberg, R. A., Julien, J. P., Mellins, E. D., Garcia, K. C., Huang, P. S. 2024

Abstract

Class-II major histocompatibility complexes (MHC-IIs) are central to the communications between CD4+ T cells and antigen presenting cells (APCs), but intrinsic structural features associated with MHC-II make it difficult to develop a general targeting system with high affinity and antigen specificity. Here, we introduce a protein platform, Targeted Recognition of Antigen-MHC Complex Reporter for MHC-II (TRACeR-II), to enable the rapid development of peptide-specific MHC-II binders. TRACeR-II has a small helical bundle scaffold and uses an unconventional mechanism to recognize antigens via a single loop. This unique antigen-recognition mechanism renders this platform highly versatile and amenable to direct structural modeling of the interactions with the antigen. We demonstrate that TRACeR-II binders can be rapidly evolved across multiple alleles, while computational protein design can produce specific binding sequences for a SARS-CoV-2 peptide of unknown complex structure. TRACeR-II sheds light on a simple and straightforward approach to address the MHC peptide targeting challenge, without relying on combinatorial selection on complementarity determining region (CDR) loops. It presents a promising basis for further exploration in immune response modulation as well as a broad range of theragnostic applications.

View details for DOI 10.1101/2024.01.26.577489

View details for PubMedID 38352315

View details for PubMedCentralID PMC10862749
A cysteine-specific solubilizing tag strategy enables efficient chemical protein synthesis of difficult targets CHEMICAL SCIENCE Li, W., Jacobsen, M. T., Park, C., Jung, J., Lin, N., Huang, P., Lal, R. A., Chou, D. 2024

View details for DOI 10.1039/d3sc06032b

View details for Web of Science ID 001147767400001
How can the protein design community best support biologists who want to harness AI tools for protein structure prediction and design? CELL SYSTEMS Hoecker, B., Lu, P., Glasgow, A., Marks, D. S., Chatterjee, P., Slusky, J. S. G., Schueler-Furman, O., Huang, P. 2023; 14 (8): 629-632

View details for Web of Science ID 001062223800001

View details for PubMedID 37591202
Fully synthetic platform to rapidly generate tetravalent bispecific nanobody-based immunoglobulins. Proceedings of the National Academy of Sciences of the United States of America Misson Mindrebo, L., Liu, H., Ozorowski, G., Tran, Q., Woehl, J., Khalek, I., Smith, J. M., Barman, S., Zhao, F., Keating, C., Limbo, O., Verma, M., Liu, J., Stanfield, R. L., Zhu, X., Turner, H. L., Sok, D., Huang, P. S., Burton, D. R., Ward, A. B., Wilson, I. A., Jardine, J. G. 2023; 120 (24): e2216612120

Abstract

Nanobodies bind a target antigen with a kinetic profile similar to a conventional antibody, but exist as a single heavy chain domain that can be readily multimerized to engage antigen via multiple interactions. Presently, most nanobodies are produced by immunizing camelids; however, platforms for animal-free production are growing in popularity. Here, we describe the development of a fully synthetic nanobody library based on an engineered human VH3-23 variable gene and a multispecific antibody-like format designed for biparatopic target engagement. To validate our library, we selected nanobodies against the SARS-CoV-2 receptor-binding domain and employed an on-yeast epitope binning strategy to rapidly map the specificities of the selected nanobodies. We then generated antibody-like molecules by replacing the VH and VL domains of a conventional antibody with two different nanobodies, designed as a molecular clamp to engage the receptor-binding domain biparatopically. The resulting bispecific tetra-nanobody immunoglobulins neutralized diverse SARS-CoV-2 variants with potencies similar to antibodies isolated from convalescent donors. Subsequent biochemical analyses confirmed the accuracy of the on-yeast epitope binning and structures of both individual nanobodies, and a tetra-nanobody immunoglobulin revealed that the intended mode of interaction had been achieved. This overall workflow is applicable to nearly any protein target and provides a blueprint for a modular workflow for the development of multispecific molecules.

View details for DOI 10.1073/pnas.2216612120

View details for PubMedID 37276407
An all-atom protein generative model. bioRxiv : the preprint server for biology Chu, A. E., Cheng, L., Nesr, G. E., Xu, M., Huang, P. S. 2023

Abstract

Proteins mediate their functions through chemical interactions; modeling these interactions, which are typically through sidechains, is an important need in protein design. However, constructing an all-atom generative model requires an appropriate scheme for managing the jointly continuous and discrete nature of proteins encoded in the structure and sequence. We describe an all-atom diffusion model of protein structure, Protpardelle, which instantiates a "superposition" over the possible sidechain states, and collapses it to conduct reverse diffusion for sample generation. When combined with sequence design methods, our model is able to co-design all-atom protein structure and sequence. Generated proteins are of good quality under the typical quality, diversity, and novelty metrics, and sidechains reproduce the chemical features and behavior of natural proteins. Finally, we explore the potential of our model to conduct all-atom protein design and scaffold functional motifs in a backbone- and rotamer-free way.

View details for DOI 10.1101/2023.05.24.542194

View details for PubMedID 37292974

View details for PubMedCentralID PMC10245864
Rational Design of Improved and Novel Photodissociable GFPs and RFPs Westberg, M., Trigo, M. L., Devenish, S., Huang, P., Lin, M. Z. WILEY. 2023

View details for Web of Science ID 000927844900183
De Novo Design of a Highly Stable Ovoid TIM Barrel: Unlocking Pocket Shape towards Functional Design. Biodesign research Chu, A. E., Fernandez, D., Liu, J., Eguchi, R. R., Huang, P. S. 2022; 2022: 9842315

Abstract

The ability to finely control the structure of protein folds is an important prerequisite to functional protein design. The TIM barrel fold is an important target for these efforts as it is highly enriched for diverse functions in nature. Although a TIM barrel protein has been designed de novo, the ability to finely alter the curvature of the central beta barrel and the overall architecture of the fold remains elusive, limiting its utility for functional design. Here, we report the de novo design of a TIM barrel with ovoid (twofold) symmetry, drawing inspiration from natural beta and TIM barrels with ovoid curvature. We use an autoregressive backbone sampling strategy to implement our hypothesis for elongated barrel curvature, followed by an iterative enrichment sequence design protocol to obtain sequences which yield a high proportion of successfully folding designs. Designed sequences are highly stable and fold to the designed barrel curvature as determined by a 2.1 Å resolution crystal structure. The designs show robustness to drastic mutations, retaining high melting temperatures even when multiple charged residues are buried in the hydrophobic core or when the hydrophobic core is ablated to alanine. As a scaffold with a greater capacity for hosting diverse hydrogen bonding networks and installation of binding pockets or active sites, the ovoid TIM barrel represents a major step towards the de novo design of functional TIM barrels.

View details for DOI 10.34133/2022/9842315

View details for PubMedID 37850141

View details for PubMedCentralID PMC10521652
Ig-VAE: Generative modeling of protein structure by direct 3D coordinate generation. PLoS computational biology Eguchi, R. R., Choe, C. A., Huang, P. S. 2022; 18 (6): e1010271

Abstract

While deep learning models have seen increasing applications in protein science, few have been implemented for protein backbone generation, an important task in structure-based problems such as active site and interface design. We present a new approach to building class-specific backbones, using a variational auto-encoder to directly generate the 3D coordinates of immunoglobulins. Our model is torsion- and distance-aware, learns a high-resolution embedding of the dataset, and generates novel, high-quality structures compatible with existing design tools. We show that the Ig-VAE can be used with Rosetta to create a computational model of a SARS-CoV2-RBD binder via latent space sampling. We further demonstrate that the model's generative prior is a powerful tool for guiding computational protein design, motivating a new paradigm under which backbone design is solved as a constrained optimization problem in the latent space of a generative model.

View details for DOI 10.1371/journal.pcbi.1010271

View details for PubMedID 35759518
Protein sequence design with a learned potential. Nature communications Anand, N., Eguchi, R., Mathews, I. I., Perez, C. P., Derry, A., Altman, R. B., Huang, P. 2022; 13 (1): 746

Abstract

The task of protein sequence design is central to nearly all rational protein engineering problems, and enormous effort has gone into the development of energy functions to guide design. Here, we investigate the capability of a deep neural network model to automate design of sequences onto protein backbones, having learned directly from crystal structure data and without any human-specified priors. The model generalizes to native topologies not seen during training, producing experimentally stable designs. We evaluate the generalizability of our method to a de novo TIM-barrel scaffold. The model produces novel sequences, and high-resolution crystal structures of two designs show excellent agreement with in silico models. Our findings demonstrate the tractability of an entirely learned method for protein sequence design.

View details for DOI 10.1038/s41467-022-28313-9

View details for PubMedID 35136054
Chimeric mutants of staphylococcal hemolysin, which act as both one-component and two-component hemolysin, created by grafting the stem domain. The FEBS journal Ghanem, N., Kanagami, N., Matsui, T., Takeda, K., Kaneko, J., Shiraishi, Y., Choe, C. A., Uchikubo-Kamo, T., Shirouzu, M., Hashimoto, T., Ogawa, T., Matsuura, T., Huang, P. S., Yokoyama, T., Tanaka, Y. 2022

Abstract

Staphylococcus aureus expresses several hemolytic pore-forming toxins (PFTs), which are all commonly composed of three domains: cap, rim and stem. PFTs are expressed as soluble monomers and assemble to form a transmembrane β-barrel pore in the erythrocyte cell membrane. The stem domain undergoes dramatic conformational changes to form a pore. Staphylococcal PFTs are classified into two groups: one-component α-hemolysin (α-HL) and two-component γ-hemolysin (γ-HL). The α-HL forms a homo-heptamer, whereas γ-HL is an octamer composed of F-component (LukF) and S-component (Hlg2). Because PFTs are used as materials for nanopore-based sensors, knowledge of the functional properties of PFTs is used to develop new, engineered PFTs. However, it remains challenging to design PFTs with a β-barrel pore because their formation as transmembrane protein assemblies requires large conformational changes. In the present study, aiming to investigate the design principles of the β-barrel formed as a consequence of the conformational change, chimeric mutants composed of the cap/rim domains of α-HL and the stem of LukF or Hlg2 were prepared. Biochemical characterization and electron microscopy showed that one of them assembles as a heptameric one-component PFT, whereas another participates as both a heptameric one- and heptameric/octameric two-component PFT. All chimeric mutants intrinsically assemble into SDS-resistant oligomers. Based on these observations, the role of the stem domain of these PFTs is discussed. These findings provide clues for the engineering of staphylococcal PFT β-barrels for use in further promising applications.

View details for DOI 10.1111/febs.16354

View details for PubMedID 35030303
Interleukin-2 superkines by computational design. Proceedings of the National Academy of Sciences of the United States of America Ren, J., Chu, A. E., Jude, K. M., Picton, L. K., Kare, A. J., Su, L., Montano Romero, A., Huang, P. S., Garcia, K. C. 2022; 119 (12): e2117401119

Abstract

Significance: While computational engineering of therapeutic proteins is a desirable goal, in practice the optimization of protein-protein interactions requires substantial experimental intervention. We present here a computational approach that focuses on stabilizing core protein structures rather than engineering the protein-protein interface. Using this approach, we designed thermostabilized interleukin-2 (IL-2) variants that bind tightly to their receptor without experimental optimization, mimicking the properties of the yeast-display engineered IL-2 variant "super-2." Our results suggest that structure-guided stabilization may be a general method for in silico affinity maturation of protein-protein interactions.

View details for DOI 10.1073/pnas.2117401119

View details for PubMedID 35294290
Structure-based protein design with deep learning. Current opinion in chemical biology Ovchinnikov, S., Huang, P. 2021; 65: 136-144

Abstract

Since the first revelation of proteins functioning as macromolecular machines through their three dimensional structures, researchers have been intrigued by the marvelous ways the biochemical processes are carried out by proteins. The aspiration to understand protein structures has fueled extensive efforts across different scientific disciplines. In recent years, it has been demonstrated that proteins with new functionality or shapes can be designed via structure-based modeling methods, and the design strategies have combined all available information - but largely piece-by-piece - from sequence derived statistics to the detailed atomic-level modeling of chemical interactions. Despite the significant progress, incorporating data-derived approaches through the use of deep learning methods can be a game changer. In this review, we summarize current progress, compare the arc of developing the deep learning approaches with the conventional methods, and describe the motivation and concepts behind current strategies that may lead to potential future opportunities.

View details for DOI 10.1016/j.cbpa.2021.08.004

View details for PubMedID 34547592
Theoretical basis for stabilizing messenger RNA through secondary structure design. Nucleic acids research Wayment-Steele, H. K., Kim, D. S., Choe, C. A., Nicol, J. J., Wellington-Oguri, R., Watkins, A. M., Parra Sperberg, R. A., Huang, P., Participants, E., Das, R. 2021

Abstract

RNA hydrolysis presents problems in manufacturing, long-term storage, world-wide delivery and in vivo stability of messenger RNA (mRNA)-based vaccines and therapeutics. A largely unexplored strategy to reduce mRNA hydrolysis is to redesign RNAs to form double-stranded regions, which are protected from in-line cleavage and enzymatic degradation, while coding for the same proteins. The amount of stabilization that this strategy can deliver and the most effective algorithmic approach to achieve stabilization remain poorly understood. Here, we present simple calculations for estimating RNA stability against hydrolysis, and a model that links the average unpaired probability of an mRNA, or AUP, to its overall hydrolysis rate. To characterize the stabilization achievable through structure design, we compare AUP optimization by conventional mRNA design methods to results from more computationally sophisticated algorithms and crowdsourcing through the OpenVaccine challenge on the Eterna platform. We find that rational design on Eterna and the more sophisticated algorithms lead to constructs with low AUP, which we term 'superfolder' mRNAs. These designs exhibit a wide diversity of sequence and structure features that may be desirable for translation, biophysical size, and immunogenicity. Furthermore, their folding is robust to temperature, computer modeling method, choice of flanking untranslated regions, and changes in target protein sequence, as illustrated by rapid redesign of superfolder mRNAs for B.1.351, P.1 and B.1.1.7 variants of the prefusion-stabilized SARS-CoV-2 spike protein. Increases in in vitro mRNA half-life by at least two-fold appear immediately achievable.

View details for DOI 10.1093/nar/gkab764

View details for PubMedID 34520542
Optical control of fast and processive engineered myosins in vitro and in living cells. Nature chemical biology Ruijgrok, P. V., Ghosh, R. P., Zemsky, S., Nakamura, M., Gong, R., Ning, L., Chen, R., Vachharajani, V. T., Chu, A. E., Anand, N., Eguchi, R. R., Huang, P. S., Lin, M. Z., Alushin, G. M., Liphardt, J. T., Bryant, Z. 2021

Abstract

Precision tools for spatiotemporal control of cytoskeletal motor function are needed to dissect fundamental biological processes ranging from intracellular transport to cell migration and division. Direct optical control of motor speed and direction is one promising approach, but it remains a challenge to engineer controllable motors with desirable properties such as the speed and processivity required for transport applications in living cells. Here, we develop engineered myosin motors that combine large optical modulation depths with high velocities, and create processive myosin motors with optically controllable directionality. We characterize the performance of the motors using in vitro motility assays, single-molecule tracking and live-cell imaging. Bidirectional processive motors move efficiently toward the tips of cellular protrusions in the presence of blue light, and can transport molecular cargo in cells. Robust gearshifting myosins will further enable programmable transport in contexts ranging from in vitro active matter reconstitutions to microfabricated systems that harness molecular propulsion.

View details for DOI 10.1038/s41589-021-00740-7

View details for PubMedID 33603247
Correction to 'Theoretical basis for stabilizing messenger RNA through secondary structure design'. Nucleic acids research Wayment-Steele, H. K., Kim, D. S., Choe, C. A., Nicol, J. J., Wellington-Oguri, R., Watkins, A. M., Parra Sperberg, R. A., Huang, P. S., Participants, E., Das, R. 2021

View details for DOI 10.1093/nar/gkab911

View details for PubMedID 34591967
Identification of N-Terminally Diversified GLP-1R Agonists Using Saturation Mutagenesis and Chemical Design. ACS chemical biology Longwell, C. K., Hanna, S., Hartrampf, N., Sperberg, R. A., Huang, P., Pentelute, B. L., Cochran, J. R. 2020

Abstract

The glucagon-like peptide 1 receptor (GLP-1R) is a class B G-protein coupled receptor (GPCR) and diabetes drug target expressed mainly in pancreatic beta-cells that, when activated by its agonist glucagon-like peptide 1 (GLP-1) after a meal, stimulates insulin secretion and beta-cell survival and proliferation. The N-terminal region of GLP-1 interacts with membrane-proximal residues of GLP-1R, stabilizing its active conformation to trigger intracellular signaling. The best-studied agonist peptides, GLP-1 and exendin-4, share sequence homology at their N-terminal region; however, modifications that can be tolerated here are not fully understood. In this work, a functional screen of GLP-1 variants with randomized N-terminal domains reveals new GLP-1R agonists and uncovers a pattern whereby a negative charge is preferred at the third position in various sequence contexts. We further tested this sequence-structure-activity principle by synthesizing peptide analogues where this position was mutated to both canonical and noncanonical amino acids. We discovered a highly active GLP-1 analogue in which the native glutamate residue three positions from the N-terminus was replaced with the sulfo-containing amino acid cysteic acid (GLP-1-CYA). The receptor binding and downstream signaling properties elicited by GLP-1-CYA were similar to the wild type GLP-1 peptide. Computational modeling identified a likely mode of interaction of the negatively charged side chain in GLP-1-CYA with an arginine on GLP-1R. This work highlights a strategy of combinatorial peptide screening coupled with chemical exploration that could be used to generate novel agonists for other receptors with peptide ligands.

View details for DOI 10.1021/acschembio.0c00722

View details for PubMedID 33307682
Tight and specific lanthanide binding in a de novo TIM barrel with a large internal cavity designed by symmetric domain fusion. Proceedings of the National Academy of Sciences of the United States of America Caldwell, S. J., Haydon, I. C., Piperidou, N., Huang, P., Bick, M. J., Sjostrom, H. S., Hilvert, D., Baker, D., Zeymer, C. 2020

Abstract

De novo protein design has succeeded in generating a large variety of globular proteins, but the construction of protein scaffolds with cavities that could accommodate large signaling molecules, cofactors, and substrates remains an outstanding challenge. The long, often flexible loops that form such cavities in many natural proteins are difficult to precisely program and thus challenging for computational protein design. Here we describe an alternative approach to this problem. We fused two stable proteins with C2 symmetry (a de novo designed dimeric ferredoxin fold and a de novo designed TIM barrel) such that their symmetry axes are aligned to create scaffolds with large cavities that can serve as binding pockets or enzymatic reaction chambers. The crystal structures of two such designs confirm the presence of a 420 cubic Angstrom chamber defined by the top of the designed TIM barrel and the bottom of the ferredoxin dimer. We functionalized the scaffold by installing a metal-binding site consisting of four glutamate residues close to the symmetry axis. The protein binds lanthanide ions with very high affinity as demonstrated by tryptophan-enhanced terbium luminescence. This approach can be extended to other metals and cofactors, making this scaffold a modular platform for the design of binding proteins and biocatalysts.

View details for DOI 10.1073/pnas.2008535117

View details for PubMedID 33203677
HIV-1 VRC01 Germline-Targeting Immunogens Select Distinct Epitope-Specific B Cell Receptors. Immunity Lin, Y., Parks, K. R., Weidle, C., Naidu, A. S., Khechaduri, A., Riker, A. O., Takushi, B., Chun, J., Borst, A. J., Veesler, D., Stuart, A., Agrawal, P., Gray, M., Pancera, M., Huang, P., Stamatatos, L. 2020; 53 (4): 840

Abstract

Activating precursor B cell receptors of HIV-1 broadly neutralizing antibodies requires specifically designed immunogens. Here, we compared the abilities of three such germline-targeting immunogens against the VRC01-class receptors to activate the targeted B cells in transgenic mice expressing the germline VH of the VRC01 antibody but diverse mouse light chains. Immunogen-specific VRC01-like B cells were isolated at different time points after immunization, their VH and VL genes were sequenced, and the corresponding antibodies characterized. VRC01 B cell sub-populations with distinct cross-reactivity properties were activated by each immunogen, and these differences correlated with distinct biophysical and biochemical features of the germline-targeting immunogens. Our study indicates that the design of effective immunogens to activate B cell receptors leading to protective HIV-1 antibodies will require a better understanding of how the biophysical properties of the epitope and its surrounding surface on the germline-targeting immunogen influence its interaction with the available receptor variants in vivo.

View details for DOI 10.1016/j.immuni.2020.09.007

View details for PubMedID 33053332
Computational design of transmembrane pores. Nature Xu, C., Lu, P., Gamal El-Din, T. M., Pei, X. Y., Johnson, M. C., Uyeda, A., Bick, M. J., Xu, Q., Jiang, D., Bai, H., Reggiano, G., Hsia, Y., Brunette, T. J., Dou, J., Ma, D., Lynch, E. M., Boyken, S. E., Huang, P., Stewart, L., DiMaio, F., Kollman, J. M., Luisi, B. F., Matsuura, T., Catterall, W. A., Baker, D. 2020

Abstract

Transmembrane channels and pores have key roles in fundamental biological processes and in biotechnological applications such as DNA nanopore sequencing, resulting in considerable interest in the design of pore-containing proteins. Synthetic amphiphilic peptides have been found to form ion channels, and there have been recent advances in de novo membrane protein design and in redesigning naturally occurring channel-containing proteins. However, the de novo design of stable, well-defined transmembrane protein pores that are capable of conducting ions selectively or are large enough to enable the passage of small-molecule fluorophores remains an outstanding challenge. Here we report the computational design of protein pores formed by two concentric rings of alpha-helices that are stable and monodisperse in both their water-soluble and their transmembrane forms. Crystal structures of the water-soluble forms of a 12-helical pore and a 16-helical pore closely match the computational design models. Patch-clamp electrophysiology experiments show that, when expressed in insect cells, the transmembrane form of the 12-helix pore enables the passage of ions across the membrane with high selectivity for potassium over sodium; ion passage is blocked by specific chemical modification at the pore entrance. When incorporated into liposomes using in vitro protein synthesis, the transmembrane form of the 16-helix pore, but not the 12-helix pore, enables the passage of biotinylated Alexa Fluor 488. A cryo-electron microscopy structure of the 16-helix transmembrane pore closely matches the design model. The ability to produce structurally and functionally well-defined transmembrane pores opens the door to the creation of designer channels and pores for a wide variety of applications.

View details for DOI 10.1038/s41586-020-2646-5

View details for PubMedID 32848250
Theoretical basis for stabilizing messenger RNA through secondary structure design. bioRxiv : the preprint server for biology Wayment-Steele, H. K., Kim, D. S., Choe, C. A., Nicol, J. J., Wellington-Oguri, R., Watkins, A. M., Sperberg, R. A., Huang, P. S., Participants, E., Das, R. 2020

Abstract

RNA hydrolysis presents problems in manufacturing, long-term storage, world-wide delivery, and in vivo stability of messenger RNA (mRNA)-based vaccines and therapeutics. A largely unexplored strategy to reduce mRNA hydrolysis is to redesign RNAs to form double-stranded regions, which are protected from in-line cleavage and enzymatic degradation, while coding for the same proteins. The amount of stabilization that this strategy can deliver and the most effective algorithmic approach to achieve stabilization remain poorly understood. Motivated by the need for stabilized COVID-19 mRNA vaccines, we present simple calculations for estimating RNA stability against hydrolysis, and a model that links the average unpaired probability of an mRNA, or AUP, to its overall rate of hydrolysis. To characterize the stabilization achievable through structure design, we compare optimization of AUP by conventional mRNA design methods to results from the LinearDesign algorithm, a new Monte Carlo tree search algorithm called RiboTree, and crowdsourcing through the OpenVaccine challenge on the Eterna platform. Tests were carried out on mRNAs encoding nanoluciferase, green fluorescent protein, and COVID-19 mRNA vaccine candidates encoding SARS-CoV-2 epitopes, spike receptor binding domain, and full-length spike protein. We find that Eterna and RiboTree significantly lower AUP while maintaining a large diversity of sequence and structure features that correlate with translation, biophysical size, and immunogenicity. Our results suggest that increases in in vitro mRNA half-life by at least two-fold are immediately achievable and that further stability improvements may be enabled with thorough experimental characterization of RNA hydrolysis.

View details for DOI 10.1101/2020.08.22.262931

View details for PubMedID 32869022

View details for PubMedCentralID PMC7457604
Engineering a potent receptor superagonist or antagonist from a novel IL-6 family cytokine ligand. Proceedings of the National Academy of Sciences of the United States of America Kim, J. W., Marquez, C. P., Sperberg, R. A., Wu, J., Bae, W. G., Huang, P., Sweet-Cordero, E. A., Cochran, J. R. 2020

Abstract

Interleukin-6 (IL-6) family cytokines signal through multimeric receptor complexes, providing unique opportunities to create novel ligand-based therapeutics. The cardiotrophin-like cytokine factor 1 (CLCF1) ligand has been shown to play a role in cancer, osteoporosis, and atherosclerosis. Once bound to ciliary neurotrophic factor receptor (CNTFR), CLCF1 mediates interactions to coreceptors glycoprotein 130 (gp130) and leukemia inhibitory factor receptor (LIFR). By increasing CNTFR-mediated binding to these coreceptors we generated a receptor superagonist which surpassed the potency of natural CNTFR ligands in neuronal signaling. Through additional mutations, we generated a receptor antagonist with increased binding to CNTFR but lack of binding to the coreceptors that inhibited tumor progression in murine xenograft models of nonsmall cell lung cancer. These studies further validate the CLCF1-CNTFR signaling axis as a therapeutic target and highlight an approach of engineering cytokine activity through a small number of mutations.

View details for DOI 10.1073/pnas.1922729117

View details for PubMedID 32522868
Macromolecular modeling and design in Rosetta: recent methods and frameworks. Nature methods Leman, J. K., Weitzner, B. D., Lewis, S. M., Adolf-Bryfogle, J., Alam, N., Alford, R. F., Aprahamian, M., Baker, D., Barlow, K. A., Barth, P., Basanta, B., Bender, B. J., Blacklock, K., Bonet, J., Boyken, S. E., Bradley, P., Bystroff, C., Conway, P., Cooper, S., Correia, B. E., Coventry, B., Das, R., De Jong, R. M., DiMaio, F., Dsilva, L., Dunbrack, R., Ford, A. S., Frenz, B., Fu, D. Y., Geniesse, C., Goldschmidt, L., Gowthaman, R., Gray, J. J., Gront, D., Guffy, S., Horowitz, S., Huang, P., Huber, T., Jacobs, T. M., Jeliazkov, J. R., Johnson, D. K., Kappel, K., Karanicolas, J., Khakzad, H., Khar, K. R., Khare, S. D., Khatib, F., Khramushin, A., King, I. C., Kleffner, R., Koepnick, B., Kortemme, T., Kuenze, G., Kuhlman, B., Kuroda, D., Labonte, J. W., Lai, J. K., Lapidoth, G., Leaver-Fay, A., Lindert, S., Linsky, T., London, N., Lubin, J. H., Lyskov, S., Maguire, J., Malmstrom, L., Marcos, E., Marcu, O., Marze, N. A., Meiler, J., Moretti, R., Mulligan, V. K., Nerli, S., Norn, C., O'Conchuir, S., Ollikainen, N., Ovchinnikov, S., Pacella, M. S., Pan, X., Park, H., Pavlovicz, R. E., Pethe, M., Pierce, B. G., Pilla, K. B., Raveh, B., Renfrew, P. D., Burman, S. S., Rubenstein, A., Sauer, M. F., Scheck, A., Schief, W., Schueler-Furman, O., Sedan, Y., Sevy, A. M., Sgourakis, N. G., Shi, L., Siegel, J. B., Silva, D., Smith, S., Song, Y., Stein, A., Szegedy, M., Teets, F. D., Thyme, S. B., Wang, R. Y., Watkins, A., Zimmerman, L., Bonneau, R. 2020

Abstract

The Rosetta software for macromolecular modeling, docking and design is extensively used in laboratories worldwide. During two decades of development by a community of laboratories at more than 60 institutions, Rosetta has been continuously refactored and extended. Its advantages are its performance and interoperability between broad modeling capabilities. Here we review tools developed in the last 5 years, including over 80 methods. We discuss improvements to the score function, user interfaces and usability. Rosetta is available at http://www.rosettacommons.org.

View details for DOI 10.1038/s41592-020-0848-2

View details for PubMedID 32483333
Computational design of closely related proteins that adopt two well-defined but structurally divergent folds. Proceedings of the National Academy of Sciences of the United States of America Wei, K. Y., Moschidi, D., Bick, M. J., Nerli, S., McShan, A. C., Carter, L. P., Huang, P., Fletcher, D. A., Sgourakis, N. G., Boyken, S. E., Baker, D. 2020

Abstract

The plasticity of naturally occurring protein structures, which can change shape considerably in response to changes in environmental conditions, is critical to biological function. While computational methods have been used for de novo design of proteins that fold to a single state with a deep free-energy minimum [P.-S. Huang, S. E. Boyken, D. Baker, Nature 537, 320-327 (2016)], and to reengineer natural proteins to alter their dynamics [J. A. Davey, A. M. Damry, N. K. Goto, R. A. Chica, Nat. Chem. Biol. 13, 1280-1285 (2017)] or fold [P. A. Alexander, Y. He, Y. Chen, J. Orban, P. N. Bryan, Proc. Natl. Acad. Sci. U.S.A. 106, 21149-21154 (2009)], the de novo design of closely related sequences which adopt well-defined but structurally divergent structures remains an outstanding challenge. We designed closely related sequences (over 94% identity) that can adopt two very different homotrimeric helical bundle conformations, one short (66 Å height) and the other long (100 Å height), reminiscent of the conformational transition of viral fusion proteins. Crystallographic and NMR spectroscopic characterization shows that both the short- and long-state sequences fold as designed. We sought to design bistable sequences for which both states are accessible, and obtained a single designed protein sequence that populates either the short state or the long state depending on the measurement conditions. The design of sequences which are poised to adopt two very different conformations sets the stage for creating large-scale conformational switches between structurally divergent forms.

View details for DOI 10.1073/pnas.1914808117

View details for PubMedID 32188784
Harnessing Human Neural Networks for Protein Design. Biochemistry Huang, P., Thompson, K. A. 2019

View details for DOI 10.1021/acs.biochem.9b00820

View details for PubMedID 31834781
Heterodimer assembly from de novo repeat protein structures Huang, P. AMER CHEMICAL SOC. 2019

View details for Web of Science ID 000525055503547
Multi-Scale Structural Analysis of Proteins by Deep Semantic Segmentation. Bioinformatics (Oxford, England) Eguchi, R. R., Huang, P. 2019

Abstract

MOTIVATION: Recent advances in computational methods have facilitated large-scale sampling of protein structures, leading to breakthroughs in protein structural prediction and enabling de novo protein design. Establishing methods to identify candidate structures that can lead to native folds or designable structures remains a challenge, since few existing metrics capture high-level structural features such as architectures, folds, and conformity to conserved structural motifs. Convolutional Neural Networks (CNNs) have been successfully used in semantic segmentation - a subfield of image classification in which a class label is predicted for every pixel. Here, we apply semantic segmentation to protein structures as a novel strategy for fold identification and structure quality assessment. RESULTS: We train a CNN that assigns each residue in a multi-domain protein to one of 38 architecture classes designated by the CATH database. Our model achieves a high per-residue accuracy of 90.8% on the test set (95.0% average per-class accuracy; 87.8% average per-structure accuracy). We demonstrate that individual class probabilities can be used as a metric that indicates the degree to which a randomly generated structure assumes a specific fold, as well as a metric that highlights non-conformative regions of a protein belonging to a known class. These capabilities yield a powerful tool for guiding structural sampling for both structural prediction and design. AVAILABILITY: The trained classifier network, parser network, and entropy calculation scripts are available for download at https://git.io/fp6bd, with detailed usage instructions provided at the download page. A step-by-step tutorial for setup is provided at https://goo.gl/e8GB2S. All Rosetta commands, RosettaRemodel blueprints, and predictions for all datasets used in the study are available in the Supplementary Information. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

View details for DOI 10.1093/bioinformatics/btz650

View details for PubMedID 31424530
The molecular basis of chaperone-mediated interleukin 23 assembly control. Nature communications Meier, S., Bohnacker, S., Klose, C. J., Lopez, A., Choe, C. A., Schmid, P. W., Bloemeke, N., Rührnößl, F., Haslbeck, M., Bieren, J. E., Sattler, M., Huang, P. S., Feige, M. J. 2019; 10 (1): 4121

Abstract

The functionality of most secreted proteins depends on their assembly into a defined quaternary structure. Despite this, it remains unclear how cells discriminate unassembled proteins en route to the native state from misfolded ones that need to be degraded. Here we show how chaperones can regulate and control assembly of heterodimeric proteins, using interleukin 23 (IL-23) as a model. We find that the IL-23 α-subunit remains partially unstructured until assembly with its β-subunit occurs and identify a major site of incomplete folding. Incomplete folding is recognized by different chaperones along the secretory pathway, realizing reliable assembly control by sequential checkpoints. Structural optimization of the chaperone recognition site allows it to bypass quality control checkpoints and provides a secretion-competent IL-23α subunit, which can still form functional heterodimeric IL-23. Thus, locally-restricted incomplete folding within single-domain proteins can be used to regulate and control their assembly.

View details for DOI 10.1038/s41467-019-12006-x

View details for PubMedID 31511508
Structure and Functional Binding Epitope of V-domain Ig Suppressor of T Cell Activation. Cell reports Mehta, N., Maddineni, S., Mathews, I. I., Andres Parra Sperberg, R., Huang, P. S., Cochran, J. R. 2019; 28 (10): 2509–16.e5

Abstract

V-domain immunoglobulin (Ig) suppressor of T cell activation (VISTA) is an immune checkpoint protein that inhibits the T cell response against cancer. Similar to PD-1 and CTLA-4, a blockade of VISTA promotes tumor clearance by the immune system. Here, we report a 1.85 Å crystal structure of the elusive human VISTA extracellular domain, whose lack of homology necessitated a combinatorial MR-Rosetta approach for structure determination. We highlight features that make the VISTA immunoglobulin variable (IgV)-like fold unique among B7 family members, including two additional disulfide bonds and an extended loop region with an attached helix that we show forms a contiguous binding epitope for a clinically relevant anti-VISTA antibody. We propose an overlap of this antibody-binding region with the binding epitope for V-set and Ig domain containing 3 (VSIG3), a purported functional binding partner of VISTA. The structure and functional epitope presented here will help guide future drug development efforts against this important checkpoint target.

View details for DOI 10.1016/j.celrep.2019.07.073

View details for PubMedID 31484064
De novo design of a fluorescence-activating beta-barrel. Nature Dou, J., Vorobieva, A. A., Sheffler, W., Doyle, L. A., Park, H., Bick, M. J., Mao, B., Foight, G. W., Lee, M. Y., Gagnon, L. A., Carter, L., Sankaran, B., Ovchinnikov, S., Marcos, E., Huang, P., Vaughan, J. C., Stoddard, B. L., Baker, D. 2018

Abstract

The regular arrangements of beta-strands around a central axis in beta-barrels and of alpha-helices in coiled coils contrast with the irregular tertiary structures of most globular proteins, and have fascinated structural biologists since they were first discovered. Simple parametric models have been used to design a wide range of alpha-helical coiled-coil structures, but to date there has been no success with beta-barrels. Here we show that accurate de novo design of beta-barrels requires considerable symmetry-breaking to achieve continuous hydrogen-bond connectivity and eliminate backbone strain. We then build ensembles of beta-barrel backbone models with cavity shapes that match the fluorogenic compound DFHBI, and use a hierarchical grid-based search method to simultaneously optimize the rigid-body placement of DFHBI in these cavities and the identities of the surrounding amino acids to achieve high shape and chemical complementarity. The designs have high structural accuracy and bind and fluorescently activate DFHBI in vitro and in Escherichia coli, yeast and mammalian cells. This de novo design of small-molecule binding activity, using backbones custom-built to bind the ligand, should enable the design of increasingly sophisticated ligand-binding proteins, sensors and catalysts that are not limited by the backbone geometries available in known protein structures.

View details for PubMedID 30209393
Generative Modeling for Protein Structures Anand, N., Huang, P. edited by Bengio, S., Wallach, H., Larochelle, H., Grauman, K., CesaBianchi, N., Garnett, R. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2018

View details for Web of Science ID 000461852002008
Designing repeat proteins: a modular approach to protein design. Current opinion in structural biology Parmeggiani, F., Huang, P. 2017; 45: 116-123

Abstract

Repeat proteins present unique opportunities for engineering because of their modular nature that potentially allows LEGO®-like construction of macromolecules. Nature takes advantage of these properties and uses this type of scaffold for recognition, structure, and even signaling purposes. In recent years, new protein modeling tools facilitated the design of novel repeat proteins, creating possibilities beyond naturally occurring scaffolds alone. We highlight here the different design strategies and summarize the various structural families and novel proteins achieved.

View details for DOI 10.1016/j.sbi.2017.02.001

View details for PubMedID 28267654
Protein structure determination using metagenome sequence data SCIENCE Ovchinnikov, S., Park, H., Varghese, N., Huang, P., Pavlopoulos, G. A., Kim, D. E., Kamisetty, H., Kyrpides, N. C., Baker, D. 2017; 355 (6322): 294-297

Abstract

Despite decades of work by structural biologists, there are still ~5200 protein families with unknown structure outside the range of comparative modeling. We show that Rosetta structure prediction guided by residue-residue contacts inferred from evolutionary information can accurately model proteins that belong to large families and that metagenome sequence data more than triple the number of protein families with sufficient sequences for accurate modeling. We then integrate metagenome data, contact-based structure matching, and Rosetta structure calculations to generate models for 614 protein families with currently unknown structures; 206 are membrane proteins and 137 have folds not represented in the Protein Data Bank. This approach provides the representative models for large protein families originally envisioned as the goal of the Protein Structure Initiative at a fraction of the cost.

View details for DOI 10.1126/science.aah4043

View details for Web of Science ID 000392204800039

View details for PubMedID 28104891

View details for PubMedCentralID PMC5493203
A computationally engineered RAS rheostat reveals RAS-ERK signaling dynamics NATURE CHEMICAL BIOLOGY Rose, J. C., Huang, P., Camp, N. D., Ye, J., Leidal, A. M., Goreshnik, I., Trevillian, B. M., Dickinson, M. S., Cunningham-Bryant, D., Debnath, J., Baker, D., Wolf-Yadlin, A., Maly, D. J. 2017; 13 (1): 119-126

Abstract

Synthetic protein switches controlled with user-defined inputs are powerful tools for studying and controlling dynamic cellular processes. To date, these approaches have relied primarily on intermolecular regulation. Here we report a computationally guided framework for engineering intramolecular regulation of protein function. We utilize this framework to develop chemically inducible activator of RAS (CIAR), a single-component RAS rheostat that directly activates endogenous RAS in response to a small molecule. Using CIAR, we show that direct RAS activation elicits markedly different RAS-ERK signaling dynamics from growth factor stimulation, and that these dynamics differ among cell types. We also found that the clinically approved RAF inhibitor vemurafenib potently primes cells to respond to direct wild-type RAS activation. These results demonstrate the utility of CIAR for quantitatively interrogating RAS signaling. Finally, we demonstrate the general utility of our approach in design of intramolecularly regulated protein tools by applying it to the Rho family of guanine nucleotide exchange factors.

View details for DOI 10.1038/NCHEMBIO.2244

View details for Web of Science ID 000393267200022

View details for PubMedID 27870838

View details for PubMedCentralID PMC5161653
Accurate de novo design of hyperstable constrained peptides NATURE Bhardwaj, G., Mulligan, V. K., Bahl, C. D., Gilmore, J. M., Harvey, P. J., Cheneval, O., Buchko, G. W., Pulavarti, S. V., Kaas, Q., Eletsky, A., Huang, P., Johnsen, W. A., Greisen, P. J., Rocklin, G. J., Song, Y., Linsky, T. W., Watkins, A., Rettie, S. A., Xu, X., Carter, L. P., Bonneau, R., Olson, J. M., Coutsias, E., Correnti, C. E., Szyperski, T., Craik, D. J., Baker, D. 2016; 538 (7625): 329-?

Abstract

Naturally occurring, pharmacologically active peptides constrained with covalent crosslinks generally have shapes that have evolved to fit precisely into binding pockets on their targets. Such peptides can have excellent pharmaceutical properties, combining the stability and tissue penetration of small-molecule drugs with the specificity of much larger protein therapeutics. The ability to design constrained peptides with precisely specified tertiary structures would enable the design of shape-complementary inhibitors of arbitrary targets. Here we describe the development of computational methods for accurate de novo design of conformationally restricted peptides, and the use of these methods to design 18-47 residue, disulfide-crosslinked peptides, a subset of which are heterochiral and/or N-C backbone-cyclized. Both genetically encodable and non-canonical peptides are exceptionally stable to thermal and chemical denaturation, and 12 experimentally determined X-ray and NMR structures are nearly identical to the computational design models. The computational design methods and stable scaffolds presented here provide the basis for development of a new generation of peptide-based drugs.

View details for DOI 10.1038/nature19791

View details for Web of Science ID 000386673100029

View details for PubMedID 27626386

View details for PubMedCentralID PMC5161715
The coming of age of de novo protein design NATURE Huang, P., Boyken, S. E., Baker, D. 2016; 537 (7620): 320-327

Abstract

There are 20^200 possible amino-acid sequences for a 200-residue protein, of which the natural evolutionary process has sampled only an infinitesimal subset. De novo protein design explores the full sequence space, guided by the physical principles that underlie protein folding. Computational methodology has advanced to the point that a wide range of structures can be designed from scratch with atomic-level accuracy. Almost all protein engineering so far has involved the modification of naturally occurring proteins; it should now be possible to design new functional proteins from the ground up to tackle current challenges in biomedicine and nanotechnology.
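
To give a sense of scale for the sequence-space figure in the abstract above (20 amino acids at each of 200 positions), here is a short worked calculation. It is only an illustrative sketch; the helper name `sequence_space_log10` is an assumption and does not come from the paper.

```python
import math


def sequence_space_log10(alphabet_size: int, length: int) -> float:
    """Base-10 logarithm of alphabet_size ** length."""
    return length * math.log10(alphabet_size)


# 20 amino acids at each of 200 positions.
exponent = sequence_space_log10(20, 200)
print(f"20^200 is roughly 10^{exponent:.1f}")

# Python integers are arbitrary precision, so the digit count can be checked exactly.
print(len(str(20 ** 200)), "decimal digits")
```

The count works out to a 261-digit number, on the order of 10^260.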

View details for DOI 10.1038/nature19946

View details for Web of Science ID 000383098000041

View details for PubMedID 27629638
Design of a hyperstable 60-subunit protein icosahedron NATURE Hsia, Y., Bale, J. B., Gonen, S., Shi, D., Sheffler, W., Fong, K. K., Nattermann, U., Xu, C., Huang, P., Ravichandran, R., Yi, S., Davis, T. N., Gonen, T., King, N. P., Baker, D. 2016; 535 (7610): 136-?

View details for DOI 10.1038/nature18010

View details for Web of Science ID 000379015600039
De novo design of a four-fold symmetric TIM-barrel protein with atomic-level accuracy. Nature chemical biology Huang, P., Feldmeier, K., Parmeggiani, F., Fernandez Velasco, D. A., Höcker, B., Baker, D. 2016; 12 (1): 29-34

Abstract

Despite efforts for over 25 years, de novo protein design has not succeeded in achieving the TIM-barrel fold. Here we describe the computational design of four-fold symmetrical (β/α)8 barrels guided by geometrical and chemical principles. Experimental characterization of 33 designs revealed the importance of side chain-backbone hydrogen bonds for defining the strand register between repeat units. The X-ray crystal structure of a designed thermostable 184-residue protein is nearly identical to that of the designed TIM-barrel model. PSI-BLAST searches do not identify sequence similarities to known TIM-barrel proteins, and sensitive profile-profile searches indicate that the design sequence is distant from other naturally occurring TIM-barrel superfamilies, suggesting that Nature has sampled only a subset of the sequence space available to the TIM-barrel fold. The ability to design TIM barrels de novo opens new possibilities for custom-made enzymes.

View details for DOI 10.1038/nchembio.1966

View details for PubMedID 26595462

View details for PubMedCentralID PMC4684731
Exploring the repeat protein universe through computational protein design NATURE Brunette, T. J., Parmeggiani, F., Huang, P., Bhabha, G., Ekiert, D. C., Tsutakawa, S. E., Hura, G. L., Tainer, J. A., Baker, D. 2015; 528 (7583): 580-?

Abstract

A central question in protein evolution is the extent to which naturally occurring proteins sample the space of folded structures accessible to the polypeptide chain. Repeat proteins composed of multiple tandem copies of a modular structure unit are widespread in nature and have critical roles in molecular recognition, signalling, and other essential biological processes. Naturally occurring repeat proteins have been re-engineered for molecular recognition and modular scaffolding applications. Here we use computational protein design to investigate the space of folded structures that can be generated by tandem repeating a simple helix-loop-helix-loop structural motif. Eighty-three designs with sequences unrelated to known repeat proteins were experimentally characterized. Of these, 53 are monomeric and stable at 95 °C, and 43 have solution X-ray scattering spectra consistent with the design models. Crystal structures of 15 designs spanning a broad range of curvatures are in close agreement with the design models with root mean square deviations ranging from 0.7 to 2.5 Å. Our results show that existing repeat proteins occupy only a small fraction of the possible repeat protein sequence and structure space and that it is possible to design novel repeat proteins with precisely specified geometries, opening up a wide array of new possibilities for biomolecular engineering.

View details for DOI 10.1038/nature16162

View details for Web of Science ID 000366991900058

View details for PubMedID 26675729

View details for PubMedCentralID PMC4845728
Computational design and experimental verification of a symmetric protein homodimer PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Mou, Y., Huang, P., Hsu, F., Huang, S., Mayo, S. L. 2015; 112 (34): 10714-10719

Abstract

Homodimers are the most common type of protein assembly in nature and have distinct features compared with heterodimers and higher order oligomers. Understanding homodimer interactions at the atomic level is critical both for elucidating their biological mechanisms of action and for accurate modeling of complexes of unknown structure. Computation-based design of novel protein-protein interfaces can serve as a bottom-up method to further our understanding of protein interactions. Previous studies have demonstrated that the de novo design of homodimers can be achieved to atomic-level accuracy by β-strand assembly or through metal-mediated interactions. Here, we report the design and experimental characterization of an α-helix-mediated homodimer with C2 symmetry based on a monomeric Drosophila engrailed homeodomain scaffold. A solution NMR structure shows that the homodimer exhibits parallel helical packing similar to the design model. Because the mutations leading to dimer formation resulted in poor thermostability of the system, design success was facilitated by the introduction of independent thermostabilizing mutations into the scaffold. This two-step design approach, function and stabilization, is likely to be generally applicable, especially if the desired scaffold is of low thermostability.

View details for DOI 10.1073/pnas.1505072112

View details for Web of Science ID 000360005600056

View details for PubMedID 26269568

View details for PubMedCentralID PMC4553821
Using Molecular Dynamics Simulations as an Aid in the Prediction of Domain Swapping of Computationally Designed Protein Variants JOURNAL OF MOLECULAR BIOLOGY Mou, Y., Huang, P., Thomas, L. M., Mayo, S. L. 2015; 427 (16): 2697-2706

Abstract

In standard implementations of computational protein design, a positive-design approach is used to predict sequences that will be stable on a given backbone structure. Possible competing states are typically not considered, primarily because appropriate structural models are not available. One potential competing state, the domain-swapped dimer, is especially compelling because it is often nearly identical with its monomeric counterpart, differing by just a few mutations in a hinge region. Molecular dynamics (MD) simulations provide a computational method to sample different conformational states of a structure. Here, we tested whether MD simulations could be used as a post-design screening tool to identify sequence mutations leading to domain-swapped dimers. We hypothesized that a successful computationally designed sequence would have backbone structure and dynamics characteristics similar to that of the input structure and that, in contrast, domain-swapped dimers would exhibit increased backbone flexibility and/or altered structure in the hinge-loop region to accommodate the large conformational change required for domain swapping. While attempting to engineer a homodimer from a 51-amino-acid fragment of the monomeric protein engrailed homeodomain (ENH), we had instead generated a domain-swapped dimer (ENH_DsD). MD simulations on these proteins showed increased B-factors derived from MD simulation in the hinge loop of the ENH_DsD domain-swapped dimer relative to monomeric ENH. Two point mutants of ENH_DsD designed to recover the monomeric fold were then tested with an MD simulation protocol. The MD simulations suggested that one of these mutants would adopt the target monomeric structure, which was subsequently confirmed by X-ray crystallography.

View details for DOI 10.1016/j.jmb.2015.06.006

View details for Web of Science ID 000359960900011

View details for PubMedID 26101839
Control of repeat-protein curvature by computational protein design NATURE STRUCTURAL & MOLECULAR BIOLOGY Park, K., Shen, B. W., Parmeggiani, F., Huang, P., Stoddard, B. L., Baker, D. 2015; 22 (2): 167-174

Abstract

Shape complementarity is an important component of molecular recognition, and the ability to precisely adjust the shape of a binding scaffold to match a target of interest would greatly facilitate the creation of high-affinity protein reagents and therapeutics. Here we describe a general approach to control the shape of the binding surface on repeat-protein scaffolds and apply it to leucine-rich-repeat proteins. First, self-compatible building-block modules are designed that, when polymerized, generate surfaces with unique but constant curvatures. Second, a set of junction modules that connect the different building blocks are designed. Finally, new proteins with custom-designed shapes are generated by appropriately combining building-block and junction modules. Crystal structures of the designs illustrate the power of the approach in controlling repeat-protein curvature.

View details for DOI 10.1038/nsmb.2938

View details for Web of Science ID 000348967400013

View details for PubMedID 25580576

View details for PubMedCentralID PMC4318719
A General Computational Approach for Repeat Protein Design JOURNAL OF MOLECULAR BIOLOGY Parmeggiani, F., Huang, P., Vorobiev, S., Xiao, R., Park, K., Caprari, S., Su, M., Seetharaman, J., Mao, L., Janjua, H., Montelione, G. T., Hunt, J., Baker, D. 2015; 427 (2): 563-575

Abstract

Repeat proteins have considerable potential for use as modular binding reagents or biomaterials in biomedical and nanotechnology applications. Here we describe a general computational method for building idealized repeats that integrates available family sequences and structural information with Rosetta de novo protein design calculations. Idealized designs from six different repeat families were generated and experimentally characterized; 80% of the proteins were expressed and soluble and more than 40% were folded and monomeric with high thermal stability. Crystal structures determined for members of three families are within 1 Å root-mean-square deviation to the design models. The method provides a general approach for fast and reliable generation of stable modular repeat protein scaffolds.

View details for DOI 10.1016/j.jmb.2014.11.005

View details for Web of Science ID 000348888200030

View details for PubMedID 25451037

View details for PubMedCentralID PMC4303030
Computational De Novo Design of a Self-Assembling Peptide with Predefined Structure JOURNAL OF MOLECULAR BIOLOGY Kaltofen, S., Li, C., Huang, P., Serpell, L. C., Barth, A., Andre, I. 2015; 427 (2): 550-562

Abstract

Protein and peptide self-assembly is a powerful design principle for engineering of new biomolecules. More sophisticated biomaterials could be built if both the structure of the overall assembly and that of the self-assembling building block could be controlled. To approach this problem, we developed a computational design protocol to enable de novo design of self-assembling peptides with predefined structure. The protocol was used to design a peptide building block with a βαβ fold that self-assembles into fibrillar structures. The peptide associates into a double β-sheet structure with tightly packed α-helices decorating the exterior of the fibrils. Using circular dichroism, Fourier transform infrared spectroscopy, electron microscopy and X-ray fiber diffraction, we demonstrate that the peptide adopts the designed conformation. The results demonstrate that computational protein design can be used to engineer protein and peptide assemblies with predefined three-dimensional structures, which can serve as scaffolds for the development of functional biomaterials. Rationally designed proteins and peptides could also be used to investigate the subtle energetic and entropic tradeoffs in natural self-assembly processes and the relation between assembly structure and assembly mechanism. We demonstrate that the de novo designed peptide self-assembles with a mechanism that is more complicated than expected, in a process where small changes in solution conditions can lead to significant differences in assembly properties and conformation. These results highlight that formation of structured protein/peptide assemblies is often dependent on the formation of weak but highly precise intermolecular interactions.

View details for DOI 10.1016/j.jmb.2014.12.002

View details for Web of Science ID 000348888200029

View details for PubMedID 25498388
High thermodynamic stability of parametrically designed helical bundles SCIENCE Huang, P., Oberdorfer, G., Xu, C., Pei, X. Y., Nannenga, B. L., Rogers, J. M., DiMaio, F., Gonen, T., Luisi, B., Baker, D. 2014; 346 (6208): 481-485

Abstract

We describe a procedure for designing proteins with backbones produced by varying the parameters in the Crick coiled coil-generating equations. Combinatorial design calculations identify low-energy sequences for alternative helix supercoil arrangements, and the helices in the lowest-energy arrangements are connected by loop building. We design an antiparallel monomeric untwisted three-helix bundle with 80-residue helices, an antiparallel monomeric right-handed four-helix bundle, and a pentameric parallel left-handed five-helix bundle. The designed proteins are extremely stable (extrapolated ΔGfold > 60 kilocalories per mole), and their crystal structures are close to those of the design models with nearly identical core packing between the helices. The approach enables the custom design of hyperstable proteins with fine-tuned geometries for a wide range of applications.

View details for DOI 10.1126/science.1257481

View details for Web of Science ID 000343822900046

View details for PubMedID 25342806

View details for PubMedCentralID PMC4612401
Immune Focusing and Enhanced Neutralization Induced by HIV-1 gp140 Chemical Cross-Linking JOURNAL OF VIROLOGY Schiffner, T., Kong, L., Duncan, C. J., Back, J. W., Benschop, J. J., Shen, X., Huang, P. S., Stewart-Jones, G. B., DeStefano, J., Seaman, M. S., Tomaras, G. D., Montefiori, D. C., Schief, W. R., Sattentau, Q. J. 2013; 87 (18): 10163-10172

Abstract

Experimental vaccine antigens based upon the HIV-1 envelope glycoproteins (Env) have failed to induce neutralizing antibodies (NAbs) against the majority of circulating viral strains as a result of antibody evasion mechanisms, including amino acid variability and conformational instability. A potential vaccine design strategy is to stabilize Env, thereby focusing antibody responses on constitutively exposed, conserved surfaces, such as the CD4 binding site (CD4bs). Here, we show that a largely trimeric form of soluble Env can be stably cross-linked with glutaraldehyde (GLA) without global modification of antigenicity. Cross-linking largely conserved binding of all potent broadly neutralizing antibodies (bNAbs) tested, including CD4bs-specific VRC01 and HJ16, but reduced binding of several non- or weakly neutralizing antibodies and soluble CD4 (sCD4). Adjuvanted administration of cross-linked or unmodified gp140 to rabbits generated indistinguishable total gp140-specific serum IgG binding titers. However, sera from animals receiving cross-linked gp140 showed significantly increased CD4bs-specific antibody binding compared to animals receiving unmodified gp140. Moreover, peptide mapping of sera from animals receiving cross-linked gp140 revealed increased binding to gp120 C1 and V1V2 regions. Finally, neutralization titers were significantly elevated in sera from animals receiving cross-linked gp140 rather than unmodified gp140. We conclude that cross-linking favors antigen stability, imparts antigenic modifications that selectively refocus antibody specificity and improves induction of NAbs, and might be a useful strategy for future vaccine design.

View details for DOI 10.1128/JVI.01161-13

View details for Web of Science ID 000323420800019

View details for PubMedID 23843636

View details for PubMedCentralID PMC3754013
Rational HIV Immunogen Design to Target Specific Germline B Cell Receptors SCIENCE Jardine, J., Julien, J., Menis, S., Ota, T., Kalyuzhniy, O., McGuire, A., Sok, D., Huang, P., MacPherson, S., Jones, M., Nieusma, T., Mathison, J., Baker, D., Ward, A. B., Burton, D. R., Stamatatos, L., Nemazee, D., Wilson, I. A., Schief, W. R. 2013; 340 (6133): 711-716

Abstract

Vaccine development to induce broadly neutralizing antibodies (bNAbs) against HIV-1 is a global health priority. Potent VRC01-class bNAbs against the CD4 binding site of HIV gp120 have been isolated from HIV-1-infected individuals; however, such bNAbs have not been induced by vaccination. Wild-type gp120 proteins lack detectable affinity for predicted germline precursors of VRC01-class bNAbs, making them poor immunogens to prime a VRC01-class response. We employed computation-guided, in vitro screening to engineer a germline-targeting gp120 outer domain immunogen that binds to multiple VRC01-class bNAbs and germline precursors, and elucidated germline binding crystallographically. When multimerized on nanoparticles, this immunogen (eOD-GT6) activates germline and mature VRC01-class B cells. Thus, eOD-GT6 nanoparticles have promise as a vaccine prime. In principle, germline-targeting strategies could be applied to other epitopes and pathogens.

View details for DOI 10.1126/science.1234150

View details for Web of Science ID 000318619000030

View details for PubMedID 23539181

View details for PubMedCentralID PMC3689846
Domain 1 of Mucosal Addressin Cell Adhesion Molecule Has an I1-set Fold and a Flexible Integrin-binding Loop JOURNAL OF BIOLOGICAL CHEMISTRY Yu, Y., Zhu, J., Huang, P., Wang, J., Pullen, N., Springer, T. A. 2013; 288 (9): 6284-6294

Abstract

Mucosal addressin cell adhesion molecule (MAdCAM) binds integrin α4β7. Their interaction directs lymphocyte homing to mucosa-associated lymphoid tissues. The interaction between the two immunoglobulin superfamily (IgSF) domains of MAdCAM and integrin α4β7 is unusual in its ability to mediate either rolling adhesion or firm adhesion of lymphocytes on vascular surfaces. We determined four crystal structures of the IgSF domains of MAdCAM to test for unusual structural features that might correlate with this functional diversity. Higher resolution 1.7- and 1.4-Å structures of the IgSF domains of MAdCAM in a previously described crystal lattice revealed two alternative conformations of the integrin-binding loop, which were deformed by large lattice contacts. New crystal forms in the presence of two different Fabs to MAdCAM demonstrate a shift in IgSF domain topology from the I2- to I1-set, with a switch of integrin-binding loop from CC' to CD. The I1-set fold and CD loop appear biologically relevant. The different conformations seen in crystal structures suggest that the integrin-binding loop of MAdCAM is inherently flexible. This contrasts with rigidity of the corresponding loops in vascular cell adhesion molecule, intercellular adhesion molecule (ICAM)-1, ICAM-2, ICAM-3, and ICAM-5 and may reflect a specialization of MAdCAM to mediate both rolling and firm adhesion by binding to different α4β7 integrin conformations.

View details for DOI 10.1074/jbc.M112.413153

View details for Web of Science ID 000315820700023

View details for PubMedID 23297416

View details for PubMedCentralID PMC3585063
A Potent and Broad Neutralizing Antibody Recognizes and Penetrates the HIV Glycan Shield SCIENCE Pejchal, R., Doores, K. J., Walker, L. M., Khayat, R., Huang, P., Wang, S., Stanfield, R. L., Julien, J., Ramos, A., Crispin, M., Depetris, R., Katpally, U., Marozsan, A., Cupo, A., Maloveste, S., Liu, Y., McBride, R., Ito, Y., Sanders, R. W., Ogohara, C., Paulson, J. C., Feizi, T., Scanlan, C. N., Wong, C., Moore, J. P., Olson, W. C., Ward, A. B., Poignard, P., Schief, W. R., Burton, D. R., Wilson, I. A. 2011; 334 (6059): 1097-1103

Abstract

The HIV envelope (Env) protein gp120 is protected from antibody recognition by a dense glycan shield. However, several of the recently identified PGT broadly neutralizing antibodies appear to interact directly with the HIV glycan coat. Crystal structures of antigen-binding fragments (Fabs) PGT 127 and 128 with Man9 at 1.65 and 1.29 angstrom resolution, respectively, and glycan binding data delineate a specific high mannose-binding site. Fab PGT 128 complexed with a fully glycosylated gp120 outer domain at 3.25 angstroms reveals that the antibody penetrates the glycan shield and recognizes two conserved glycans as well as a short β-strand segment of the gp120 V3 loop, accounting for its high binding affinity and broad specificity. Furthermore, our data suggest that the high neutralization potency of PGT 127 and 128 immunoglobulin Gs may be mediated by cross-linking Env trimers on the viral surface.

View details for DOI 10.1126/science.1213256

View details for Web of Science ID 000297313900041

View details for PubMedID 21998254

View details for PubMedCentralID PMC3280215
High-resolution structure prediction of a circular permutation loop PROTEIN SCIENCE Correia, B. E., Holmes, M. A., Huang, P., Strong, R. K., Schief, W. R. 2011; 20 (11): 1929-1934

Abstract

Methods for rapid and reliable design and structure prediction of linker loops would facilitate a variety of protein engineering applications. Circular permutation, in which the existing termini of a protein are linked by the polypeptide chain and new termini are created, is one such application that has been employed for decreasing proteolytic susceptibility and other functional purposes. The length and sequence of the linker can impact the expression level, solubility, structure and function of the permuted variants. Hence it is desirable to achieve atomic-level accuracy in linker design. Here, we describe the use of RosettaRemodel for design and structure prediction of circular permutation linkers on a model protein. A crystal structure of one of the permuted variants confirmed the accuracy of the computational prediction, where the all-atom rmsd of the linker region was 0.89 Å between the model and the crystal structure. This result suggests that RosettaRemodel may be generally useful for the design and structure prediction of protein loop regions for circular permutations or other structure-function manipulations.
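
The abstract above compares a model and a crystal structure by all-atom rmsd over the linker region. As a rough sketch of what that number measures, the function below computes the root-mean-square deviation between two matched lists of atom coordinates. It assumes the two structures have already been superimposed and that atoms are listed in the same order; the function name and the toy coordinates are illustrative and are not taken from RosettaRemodel or the paper.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z).

    Assumes the coordinate sets are already superimposed and atom-matched;
    no alignment is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy coordinates in angstroms: every atom is displaced by exactly 1 Å along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
crystal = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(model, crystal))
```

In practice the superposition step matters as much as the formula, since the reported value depends on which atoms are used for alignment and which for measurement.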

View details for DOI 10.1002/pro.725

View details for Web of Science ID 000296273700018

View details for PubMedID 21898647

View details for PubMedCentralID PMC3267956
Computation-Guided Backbone Grafting of a Discontinuous Motif onto a Protein Scaffold SCIENCE Azoitei, M. L., Correia, B. E., Ban, Y. A., Carrico, C., Kalyuzhniy, O., Chen, L., Schroeter, A., Huang, P., McLellan, J. S., Kwong, P. D., Baker, D., Strong, R. K., Schief, W. R. 2011; 334 (6054): 373-376

Abstract

The manipulation of protein backbone structure to control interaction and function is a challenge for protein engineering. We integrated computational design with experimental selection for grafting the backbone and side chains of a two-segment HIV gp120 epitope, targeted by the cross-neutralizing antibody b12, onto an unrelated scaffold protein. The final scaffolds bound b12 with high specificity and with affinity similar to that of gp120, and crystallographic analysis of a scaffold bound to b12 revealed high structural mimicry of the gp120-b12 complex structure. The method can be generalized to design other functional proteins through backbone grafting.

View details for DOI 10.1126/science.1209368

View details for Web of Science ID 000296052500052

View details for PubMedID 22021856
RosettaRemodel: A Generalized Framework for Flexible Backbone Protein Design PLOS ONE Huang, P., Ban, Y. A., Richter, F., Andre, I., Vernon, R., Schief, W. R., Baker, D. 2011; 6 (8)

Abstract

We describe RosettaRemodel, a generalized framework for flexible protein design that provides a versatile and convenient interface to the Rosetta modeling suite. RosettaRemodel employs a unified interface, called a blueprint, which allows detailed control over many aspects of flexible backbone protein design calculations. RosettaRemodel allows the construction and elaboration of customized protocols for a wide range of design problems ranging from loop insertion and deletion, disulfide engineering, domain assembly, loop remodeling, motif grafting, symmetrical units, to de novo structure modeling.

View details for DOI 10.1371/journal.pone.0024109

View details for Web of Science ID 000294680800055

View details for PubMedID 21909381

View details for PubMedCentralID PMC3166072
A Chimeric HIV-1 Envelope Glycoprotein Trimer with an Embedded Granulocyte-Macrophage Colony-stimulating Factor (GM-CSF) Domain Induces Enhanced Antibody and T Cell Responses JOURNAL OF BIOLOGICAL CHEMISTRY van Montfort, T., Melchers, M., Isik, G., Menis, S., Huang, P., Matthews, K., Michael, E., Berkhout, B., Schief, W. R., Moore, J. P., Sanders, R. W. 2011; 286 (25): 22250-22261

Abstract

An effective HIV-1 vaccine should ideally induce strong humoral and cellular immune responses that provide sterilizing immunity over a prolonged period. Current HIV-1 vaccines have failed in inducing such immunity. The viral envelope glycoprotein complex (Env) can be targeted by neutralizing antibodies to block infection, but several Env properties limit the ability to induce an antibody response of sufficient quantity and quality. We hypothesized that Env immunogenicity could be improved by embedding an immunostimulatory protein domain within its sequence. A stabilized Env trimer was therefore engineered with the granulocyte-macrophage colony-stimulating factor (GM-CSF) inserted into the V1V2 domain of gp120. Probing with neutralizing antibodies showed that both the Env and GM-CSF components of the chimeric protein were folded correctly. Furthermore, the embedded GM-CSF domain was functional as a cytokine in vitro. Mouse immunization studies demonstrated that chimeric Env(GM-CSF) enhanced Env-specific antibody and T cell responses compared with wild-type Env. Collectively, these results show that targeting and activation of immune cells using engineered cytokine domains within the protein can improve the immunogenicity of Env subunit vaccines.

View details for DOI 10.1074/jbc.M111.229625

View details for Web of Science ID 000291719900033

View details for PubMedID 21515681

View details for PubMedCentralID PMC3121371
Modulation of Integrin Activation by an Entropic Spring in the beta-Knee JOURNAL OF BIOLOGICAL CHEMISTRY Smagghe, B. J., Huang, P., Ban, Y. A., Baker, D., Springer, T. A. 2010; 285 (43): 32954-32966

Abstract

We show that the length of a loop in the β-knee, between the first and second cysteines (C1-C2) in integrin EGF-like (I-EGF) domain 2, modulates integrin activation. Three independent sets of mutants, including swaps among different integrin β-subunits, show that C1-C2 loop lengths of 12 and longer favor the low affinity state and masking of ligand-induced binding site (LIBS) epitopes. Shortening length from 12 to 4 residues progressively increases ligand binding and LIBS epitope exposure. Compared with length, the loop sequence had a smaller effect, which was ascribable to stabilizing loop conformation, and not interactions with the α-subunit. The data together with structural calculations support the concept that the C1-C2 loop is an entropic spring and an emerging theme that disordered regions can regulate allostery. Diversity in the length of this loop may have evolved among integrin β-subunits to adjust the equilibrium between the bent and extended conformations at different set points.

View details for DOI 10.1074/jbc.M110.145177

View details for Web of Science ID 000283048200033

View details for PubMedID 20670939

View details for PubMedCentralID PMC2963379
A de novo designed protein-protein interface PROTEIN SCIENCE Huang, P., Love, J. J., Mayo, S. L. 2007; 16 (12): 2770-2774

Abstract

As an approach to both explore the physical/chemical parameters that drive molecular self-assembly and to generate novel protein oligomers, we have developed a procedure to generate protein dimers from monomeric proteins using computational protein docking and amino acid sequence design. A fast Fourier transform-based docking algorithm was used to generate a model for a dimeric version of the 56-amino-acid β1 domain of streptococcal protein G. Computational amino acid sequence design of 24 residues at the dimer interface resulted in a heterodimer comprised of 12-fold and eightfold variants of the wild-type protein. The designed proteins were expressed, purified, and characterized using analytical ultracentrifugation and heteronuclear NMR techniques. Although the measured dissociation constant was modest (approximately 300 μM), 2D [¹H,¹⁵N]-HSQC NMR spectra of one of the designed proteins in the absence and presence of its binding partner showed clear evidence of specific dimer formation.

View details for DOI 10.1110/ps.073125207

View details for Web of Science ID 000251081300023

View details for PubMedID 18029425

View details for PubMedCentralID PMC2222823
Adaptation of a fast Fourier transform-based docking algorithm for protein design JOURNAL OF COMPUTATIONAL CHEMISTRY Huang, P. S., Love, J. J., Mayo, S. L. 2005; 26 (12): 1222-1232

Abstract

Designing proteins with novel protein/protein binding properties can be achieved by combining the tools that have been developed independently for protein docking and protein design. We describe here the sequence-independent generation of protein dimer orientations by protein docking for use as scaffolds in protein sequence design algorithms. To dock monomers into sequence-independent dimer conformations, we use a reduced representation in which the side chains are approximated by spheres with atomic radii derived from known C2 symmetry-related homodimers. The interfaces of C2-related homodimers are usually more hydrophobic and protein core-like than the interfaces of heterodimers; we parameterize the radii for docking against this feature to capture and recreate the spatial characteristics of a hydrophobic interface. A fast Fourier transform-based geometric recognition algorithm is used for docking the reduced representation protein models. The resulting docking algorithm successfully predicted the wild-type homodimer orientations in 65 out of 121 dimer test cases. The success rate increases to approximately 70% for the subset of molecules with large surface area burial in the interface relative to their chain length. Forty-five of the predictions exhibited less than 1 Å Cα RMSD compared to the native X-ray structures. The reduced protein representation therefore appears to be a reasonable approximation and can be used to position protein backbones in plausible orientations for homodimer design.
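To make the idea of grid-based geometric recognition concrete, here is a toy sketch of the translational correlation score that FFT-based docking methods evaluate. It is not the algorithm from the paper: it sums the correlation directly instead of using an FFT, it works on a small 2D grid with cyclic shifts, and the grid size, the surface value of 1, and the core penalty of -5 are all assumptions chosen for illustration. An FFT would produce the same table of scores for all translations at once.

```python
from itertools import product


def translation_scores(receptor, ligand):
    """Correlate two square grids over every cyclic translation of the ligand.

    scores[(dx, dy)] = sum over cells of receptor[x][y] * ligand[x - dx][y - dy]
    """
    n = len(receptor)
    cells = list(product(range(n), repeat=2))
    scores = {}
    for dx, dy in cells:
        scores[(dx, dy)] = sum(
            receptor[x][y] * ligand[(x - dx) % n][(y - dy) % n]
            for x, y in cells
        )
    return scores


N = 5
CORE_PENALTY = -5  # hypothetical value; penalizes overlap with the buried core

# Receptor: a 3x3 block whose outer cells are "surface" (1) and whose centre is "core".
receptor = [[0] * N for _ in range(N)]
for x, y in product(range(3), repeat=2):
    receptor[x][y] = 1
receptor[1][1] = CORE_PENALTY

# Ligand: a single occupied cell, used as a probe.
ligand = [[0] * N for _ in range(N)]
ligand[0][0] = 1

scores = translation_scores(receptor, ligand)
print(scores[(0, 0)], scores[(1, 1)], scores[(4, 4)])  # 1 -5 0
```

Placing the probe on a surface cell scores positively, burying it in the core is penalized, and leaving it in empty space scores zero, which is the qualitative behaviour a geometric recognition score is meant to have.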

View details for Web of Science ID 000230715200003

View details for PubMedID 15962277
A designed protein interface that blocks fibril formation JOURNAL OF THE AMERICAN CHEMICAL SOCIETY Shukla, U. J., Marino, H., Huang, P. S., Mayo, S. L., Love, J. J. 2004; 126 (43): 13914-13915

Abstract

Protein fibril formation is implicated in many diseases, and therefore much effort has been focused toward the development of inhibitors of this process. In a previous project, a monomeric protein was computationally engineered to bind itself and form a heterodimer complex following interfacial redesign. One of the protein monomers, termed monomer-B, was unintentionally destabilized and shown to form macroscopic fibrils. Interestingly, in the presence of the designed binding partner, fibril formation was blocked. Here we describe the complete characterization of the amyloid properties of monomer-B and the inhibition of fiber formation by the designed binding partner, monomer-A. Both proteins are mutants of the β1 domain of streptococcal protein-G. The free monomer-B protein forms amyloid-type fibrils, as determined by transmission electron microscopy and the change in fluorescence of Thioflavin T, an amyloid-specific dye. Fibril formation kinetics are influenced by pH, protein concentration, and seeding with preformed fibrils. Under all conditions tested, monomer-A was able to inhibit the formation of monomer-B fibrils. This inhibition is specific to the engineered interaction, as incubation of monomer-B with wild-type protein-G (a structural homologue) did not result in inhibition under the same conditions. Thus, this de novo-designed heterodimeric complex is an excellent model system for the study of protein-based fibril formation and inhibition. This system provides additional insight into the development of pharmaceuticals for amyloid disorders, as well as the potential use of amyloid fibrils for self-assembling nanostructures.

View details for DOI 10.1021/ja0456858

View details for Web of Science ID 000224873600019

View details for PubMedID 15506739
