Professional Education

  • Doctor of Philosophy, University of Washington (2017)
  • Bachelor of Science, University of Science and Technology of China (2010)


Patents

  • David Baker, Christine Tinberg, Sagar Khare, Jiayi Dou. "United States Patent 9840539 High affinity digoxigenin binding proteins", University of Washington, Dec 12, 2017

All Publications

  • Harnessing backbone strain to design beta-barrel proteins de novo: from first principles to application. Vorobieva, A., Dou, J., Sheffler, W., Mao, B., Bick, M., Doyle, L., Klima, J., Gagnon, L., Kipnis, Y., Stoddard, B., Baker, D. WILEY. 2019: 115–16
  • Improving the Efficiency of Ligand-Binding Protein Design with Molecular Dynamics Simulations. Journal of chemical theory and computation Barros, E. P., Schiffer, J. M., Vorobieva, A., Dou, J., Baker, D., Amaro, R. E. 2019


    Custom-designed ligand-binding proteins represent a promising class of macromolecules with exciting applications toward the design of new enzymes or the engineering of antibodies and small-molecule recruited proteins for therapeutic interventions. However, several challenges remain in designing a protein sequence such that the binding site organization results in high affinity interaction with a bound ligand. Here, we study the dynamics of explicitly solvated designed proteins through all-atom molecular dynamics (MD) simulations to gain insight into the causes that lead to the low affinity or instability of most of these designs, despite the prediction of their success by the computational design methodology. Simulations ranging from 500 to 1000 ns per replicate were conducted on 37 designed protein variants encompassing two distinct folds and a range of ligand affinities, resulting in more than 180 μs of combined sampling. The simulations provide retrospective insights into the properties affecting ligand affinity that can prove useful in guiding further steps of design optimization. Features indicate that entropic components are particularly important for affinity, which are not easily incorporated in the empirical models often used in design protocols. Additionally, we demonstrate that the application of machine learning approaches built upon the output from the simulations can help discriminate between successful and failed binders, such that MD could act as a screening step in protein design, resulting in a more efficient process.

    View details for DOI 10.1021/acs.jctc.9b00483

    View details for PubMedID 31442033

  • De novo design of a fluorescence-activating beta-barrel. Nature Dou, J., Vorobieva, A. A., Sheffler, W., Doyle, L. A., Park, H., Bick, M. J., Mao, B., Foight, G. W., Lee, M. Y., Gagnon, L. A., Carter, L., Sankaran, B., Ovchinnikov, S., Marcos, E., Huang, P., Vaughan, J. C., Stoddard, B. L., Baker, D. 2018


    The regular arrangements of beta-strands around a central axis in beta-barrels and of alpha-helices in coiled coils contrast with the irregular tertiary structures of most globular proteins, and have fascinated structural biologists since they were first discovered. Simple parametric models have been used to design a wide range of alpha-helical coiled-coil structures, but to date there has been no success with beta-barrels. Here we show that accurate de novo design of beta-barrels requires considerable symmetry-breaking to achieve continuous hydrogen-bond connectivity and eliminate backbone strain. We then build ensembles of beta-barrel backbone models with cavity shapes that match the fluorogenic compound DFHBI, and use a hierarchical grid-based search method to simultaneously optimize the rigid-body placement of DFHBI in these cavities and the identities of the surrounding amino acids to achieve high shape and chemical complementarity. The designs have high structural accuracy and bind and fluorescently activate DFHBI in vitro and in Escherichia coli, yeast and mammalian cells. This de novo design of small-molecule binding activity, using backbones custom-built to bind the ligand, should enable the design of increasingly sophisticated ligand-binding proteins, sensors and catalysts that are not limited by the backbone geometries available in known protein structures.

    View details for PubMedID 30209393

  • Sampling and energy evaluation challenges in ligand binding protein design PROTEIN SCIENCE Dou, J., Doyle, L., Greisen, P., Schena, A., Park, H., Johnsson, K., Stoddard, B. L., Baker, D. 2017; 26 (12): 2426–37


    The steroid hormone 17α-hydroxyprogesterone (17-OHP) is a biomarker for congenital adrenal hyperplasia and hence there is considerable interest in development of sensors for this compound. We used computational protein design to generate protein models with binding sites for 17-OHP containing an extended, nonpolar, shape-complementary binding pocket for the four-ring core of the compound, and hydrogen bonding residues at the base of the pocket to interact with carbonyl and hydroxyl groups at the more polar end of the ligand. Eight of 16 designed proteins experimentally tested bind 17-OHP with micromolar affinity. A co-crystal structure of one of the designs revealed that 17-OHP is rotated 180° around a pseudo-two-fold axis in the compound and displays multiple binding modes within the pocket, while still interacting with all of the designed residues in the engineered site. Subsequent rounds of mutagenesis and binding selection improved the ligand affinity to nanomolar range, while appearing to constrain the ligand to a single bound conformation that maintains the same "flipped" orientation relative to the original design. We trace the discrepancy in the design calculations to two sources: first, a failure to model subtle backbone changes which alter the distribution of sidechain rotameric states and second, an underestimation of the energetic cost of desolvating the carbonyl and hydroxyl groups of the ligand. The difference between design model and crystal structure thus arises from both sampling limitations and energy function inaccuracies that are exacerbated by the near two-fold symmetry of the molecule.

    View details for DOI 10.1002/pro.3317

    View details for Web of Science ID 000416063400011

    View details for PubMedID 28980354

    View details for PubMedCentralID PMC5699494

  • Principles for designing proteins with cavities formed by curved beta sheets SCIENCE Marcos, E., Basanta, B., Chidyausiku, T. M., Tang, Y., Oberdorfer, G., Liu, G., Swapna, G. T., Guan, R., Silva, D., Dou, J., Pereira, J., Xiao, R., Sankaran, B., Zwart, P. H., Montelione, G. T., Baker, D. 2017; 355 (6321): 201–6


    Active sites and ligand-binding cavities in native proteins are often formed by curved β sheets, and the ability to control β-sheet curvature would allow design of binding proteins with cavities customized to specific ligands. Toward this end, we investigated the mechanisms controlling β-sheet curvature by studying the geometry of β sheets in naturally occurring protein structures and folding simulations. The principles emerging from this analysis were used to design, de novo, a series of proteins with curved β sheets topped with α helices. Nuclear magnetic resonance and crystal structures of the designs closely match the computational models, showing that β-sheet curvature can be controlled with atomic-level accuracy. Our approach enables the design of proteins with cavities and provides a route to custom design ligand-binding and catalytic sites.

    View details for DOI 10.1126/science.aah7389

    View details for Web of Science ID 000391743700048

    View details for PubMedID 28082595

    View details for PubMedCentralID PMC5588894

  • CSAR Benchmark Exercise 2013: Evaluation of Results from a Combined Computational Protein Design, Docking, and Scoring/Ranking Challenge JOURNAL OF CHEMICAL INFORMATION AND MODELING Smith, R. D., Damm-Ganamet, K. L., Dunbar, J. B., Ahmed, A., Chinnaswamy, K., Delproposto, J. E., Kubish, G. M., Tinberg, C. E., Khare, S. D., Dou, J., Doyle, L., Stuckey, J. A., Baker, D., Carlson, H. A. 2016; 56 (6): 1022–31


    Community Structure-Activity Resource (CSAR) conducted a benchmark exercise to evaluate the current computational methods for protein design, ligand docking, and scoring/ranking. The exercise consisted of three phases. The first phase required the participants to identify and rank order which designed sequences were able to bind the small molecule digoxigenin. The second phase challenged the community to select a near-native pose of digoxigenin from a set of decoy poses for two of the designed proteins. The third phase investigated the ability of current methods to rank/score the binding affinity of 10 related steroids to one of the designed proteins (pKd = 4.1 to 6.7). We found that 11 of 13 groups were able to correctly select the sequence that bound digoxigenin, with most groups providing the correct three-dimensional structure for the backbone of the protein as well as all atoms of the active-site residues. Eleven of the 14 groups were able to select the appropriate pose from a set of plausible decoy poses. The ability to predict absolute binding affinities is still a difficult task, as 8 of 14 groups were able to correlate scores to affinity (Pearson-r > 0.7) of the designed protein for congeneric steroids and only 5 of 14 groups were able to correlate the ranks of the 10 related ligands (Spearman-ρ > 0.7).

    View details for DOI 10.1021/acs.jcim.5b00387

    View details for Web of Science ID 000378826800009

    View details for PubMedID 26419257

  • Computational design of ligand-binding proteins with high affinity and selectivity NATURE Tinberg, C. E., Khare, S. D., Dou, J., Doyle, L., Nelson, J. W., Schena, A., Jankowski, W., Kalodimos, C. G., Johnsson, K., Stoddard, B. L., Baker, D. 2013; 501 (7466): 212–16


    The ability to design proteins with high affinity and selectivity for any given small molecule is a rigorous test of our understanding of the physicochemical principles that govern molecular recognition. Attempts to rationally design ligand-binding proteins have met with little success, however, and the computational design of protein-small-molecule interfaces remains an unsolved problem. Current approaches for designing ligand-binding proteins for medical and biotechnological uses rely on raising antibodies against a target antigen in immunized animals and/or performing laboratory-directed evolution of proteins with an existing low affinity for the desired ligand, neither of which allows complete control over the interactions involved in binding. Here we describe a general computational method for designing pre-organized and shape complementary small-molecule-binding sites, and use it to generate protein binders to the steroid digoxigenin (DIG). Of seventeen experimentally characterized designs, two bind DIG; the model of the higher affinity binder has the most energetically favourable and pre-organized interface in the design set. A comprehensive binding-fitness landscape of this design, generated by library selections and deep sequencing, was used to optimize its binding affinity to a picomolar level, and X-ray co-crystal structures of two variants show atomic-level agreement with the corresponding computational models. The optimized binder is selective for DIG over the related steroids digitoxigenin, progesterone and β-oestradiol, and this steroid binding preference can be reprogrammed by manipulation of explicitly designed hydrogen-bonding interactions. The computational design method presented here should enable the development of a new generation of biosensors, therapeutics and diagnostics.

    View details for DOI 10.1038/nature12443

    View details for Web of Science ID 000324244900039

    View details for PubMedID 24005320

    View details for PubMedCentralID PMC3898436