All Publications


  • The protein domains of vertebrate species in which selection is more effective have greater intrinsic structural disorder. bioRxiv: the preprint server for biology. Weibel, C. A., Wheeler, A. L., James, J. E., Willis, S. M., McShea, H., Masel, J. 2024

    Abstract

    The nearly neutral theory of molecular evolution posits variation among species in the effectiveness of selection. In an idealized model, the census population size determines both this minimum magnitude of the selection coefficient required for deleterious variants to be reliably purged, and the amount of neutral diversity. Empirically, an "effective population size" is often estimated from the amount of putatively neutral genetic diversity and is assumed to also capture a species' effectiveness of selection. A potentially more direct measure of the effectiveness of selection is the degree to which selection maintains preferred codons. However, past metrics that compare codon bias across species are confounded by among-species variation in %GC content and/or amino acid composition. Here we propose a new Codon Adaptation Index of Species (CAIS), based on Kullback-Leibler divergence, that corrects for both confounders. We demonstrate the use of CAIS correlations, as well as the Effective Number of Codons, to show that the protein domains of more highly adapted vertebrate species evolve higher intrinsic structural disorder.
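    As a rough illustration of the Kullback-Leibler divergence that the abstract names as the basis of CAIS, the sketch below compares an observed codon frequency distribution with an expected one. It is not the CAIS formula from the paper: the function name `kl_divergence`, the choice of glycine codons, and all frequencies are hypothetical, and the uniform expectation stands in for whatever %GC-based expectation the authors actually use.

```python
import math


def kl_divergence(observed, expected):
    """Return sum_i p_i * ln(p_i / q_i), in nats.

    `observed` (p) and `expected` (q) map codons to probabilities.
    Assumes q is positive wherever p is positive.
    """
    total = 0.0
    for codon, p in observed.items():
        if p == 0.0:
            continue  # a zero-probability term contributes nothing
        total += p * math.log(p / expected[codon])
    return total


# Hypothetical frequencies for the four glycine codons (illustrative only).
observed = {"GGA": 0.10, "GGC": 0.40, "GGG": 0.10, "GGT": 0.40}
# Assumed null expectation: no codon preference at all.
expected = {"GGA": 0.25, "GGC": 0.25, "GGG": 0.25, "GGT": 0.25}

print(kl_divergence(observed, expected))
```

    A divergence of zero means observed usage matches the expectation; larger values indicate stronger departure from it, which is the general sense in which such a quantity could reflect codon preference.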

    DOI: 10.1101/2023.03.02.530449

    PubMed ID: 38712167

    PubMed Central ID: PMC11071303

  • nQMaker: estimating time non-reversible amino acid substitution models. Systematic Biology. Dang, C. C., Minh, B. Q., McShea, H., Masel, J., James, J. E., Vinh, L. S., Lanfear, R. 2022

    Abstract

    Amino acid substitution models are a key component in phylogenetic analyses of protein sequences. All commonly-used amino acid models available to date are time-reversible, an assumption designed for computational convenience but not for biological reality. Another significant downside to time-reversible models is that they do not allow inference of rooted trees without outgroups. In this paper, we introduce a maximum likelihood approach nQMaker, an extension of the recently published QMaker method, that allows the estimation of time non-reversible amino acid substitution models and rooted phylogenetic trees from a set of protein sequence alignments. We show that the non-reversible models estimated with nQMaker are a much better fit to empirical alignments than pre-existing reversible models, across a wide range of datasets including mammals, birds, plants, fungi, and other taxa, and that the improvements in model fit scale with the size of the dataset. Notably, for the recently published plant and bird trees, these non-reversible models correctly recovered the commonly estimated root placements with very high statistical support without the need to use an outgroup. We provide nQMaker as an easy-to-use feature in the IQ-TREE software (http://www.iqtree.org), allowing users to estimate non-reversible models and rooted phylogenies from their own protein datasets. The datasets and scripts used in this paper are available at https://doi.org/10.6084/m9.figshare.14516712.

    DOI: 10.1093/sysbio/syac007

    PubMed ID: 35139203

  • Reconstructing the evolutionary history of nitrogenases: Evidence for ancestral molybdenum-cofactor utilization. Geobiology. Garcia, A. K., McShea, H., Kolaczkowski, B., Kacar, B. 2020

    Abstract

    The nitrogenase metalloenzyme family, essential for supplying fixed nitrogen to the biosphere, is one of life's key biogeochemical innovations. The three forms of nitrogenase differ in their metal dependence, each binding either a FeMo-, FeV-, or FeFe-cofactor where the reduction of dinitrogen takes place. The history of nitrogenase metal dependence has been of particular interest due to the possible implication that ancient marine metal availabilities have significantly constrained nitrogenase evolution over geologic time. Here, we reconstructed the evolutionary history of nitrogenases, and combined phylogenetic reconstruction, ancestral sequence inference, and structural homology modeling to evaluate the potential metal dependence of ancient nitrogenases. We find that active-site sequence features can reliably distinguish extant Mo-nitrogenases from V- and Fe-nitrogenases and that inferred ancestral sequences at the deepest nodes of the phylogeny suggest these ancient proteins most resemble modern Mo-nitrogenases. Taxa representing early-branching nitrogenase lineages lack one or more biosynthetic nifE and nifN genes that both contribute to the assembly of the FeMo-cofactor in studied organisms, suggesting that early Mo-nitrogenases may have utilized an alternate and/or simplified pathway for cofactor biosynthesis. Our results underscore the profound impacts that protein-level innovations likely had on shaping global biogeochemical cycles throughout the Precambrian, in contrast to organism-level innovations that characterize the Phanerozoic Eon.

    DOI: 10.1111/gbi.12381

    PubMed ID: 32065506