Aaron Behr
Ph.D. Student in Biology, admitted Autumn 2020
All Publications
-
Replicative selfish genetic elements are driving rapid pathogenic adaptation of Enterococcus faecium.
bioRxiv : the preprint server for biology
2025
Abstract
Understanding how healthcare-associated pathogens adapt in clinical environments can inform strategies to reduce their burden. Here, we investigate the hypothesis that insertion sequences (IS), prokaryotic transposable elements, are a dominant mediator of rapid genomic evolution in healthcare-associated pathogens. Among 28,207 publicly available pathogen genomes, we find high copy numbers of the replicative ISL3 family in healthcare-associated Enterococcus faecium, Streptococcus pneumoniae and Staphylococcus aureus. In E. faecium, the ESKAPE pathogen with the highest IS density, we find that ISL3 proliferation has increased in the last 30 years. To enable better identification of structural variants, we long read-sequenced a new, single hospital collection of 282 Enterococcal infection isolates collected over three years. In these samples, we observed extensive, ongoing structural variation of the E. faecium genome, largely mediated by active replicative ISL3 elements. To determine if ISL3 is actively replicating in clinical timescales in its natural, gut microbiome reservoir, we long read-sequenced a collection of 28 longitudinal stool samples from patients undergoing hematopoietic cell transplantation, whose gut microbiomes were dominated by E. faecium. We found up to six structural variants of a given E. faecium strain within a single stool sample. Examining longitudinal samples from one individual in further detail, we find ISL3 elements can replicate and move to specific positions with profound regulatory effects on neighboring gene expression. In particular, we identify an ISL3 element that upon insertion replaces an imperfect -35 promoter sequence at a folT gene locus with a perfect -35 sequence, which leads to substantial upregulation of expression of folT, driving highly effective folate scavenging. As a known folate auxotroph, E. faecium depends on other members of the microbiota or diet to supply folate. Enhanced folate scavenging may enable E. faecium to thrive in the setting of microbiome collapse that is common in HCT and other critically ill patients. Together, ISL3 expansion has enabled E. faecium to rapidly evolve in healthcare settings, and this likely contributes to its metabolic fitness and may strongly influence its ongoing trajectory of genomic evolution.
View details for DOI 10.1101/2025.03.16.643550
View details for PubMedID 40161577
View details for PubMedCentralID PMC11952509
-
pong: fast analysis and visualization of latent clusters in population genetic data.
Bioinformatics (Oxford, England)
2016; 32 (18): 2817-23
Abstract
A series of methods in population genetics use multilocus genotype data to assign individuals membership in latent clusters. These methods belong to a broad class of mixed-membership models, such as latent Dirichlet allocation used to analyze text corpora. Inference from mixed-membership models can produce different output matrices when repeatedly applied to the same inputs, and the number of latent clusters is a parameter that is often varied in the analysis pipeline. For these reasons, quantifying, visualizing, and annotating the output from mixed-membership models are bottlenecks for investigators across multiple disciplines from ecology to text data mining. We introduce pong, a network-graphical approach for analyzing and visualizing membership in latent clusters with a native interactive D3.js visualization. pong leverages efficient algorithms for solving the Assignment Problem to dramatically reduce runtime while increasing accuracy compared with other methods that process output from mixed-membership models. We apply pong to 225 705 unlinked genome-wide single-nucleotide variants from 2426 unrelated individuals in the 1000 Genomes Project, and identify previously overlooked aspects of global human population structure. We show that pong outpaces current solutions by more than an order of magnitude in runtime while providing a customizable and interactive visualization of population structure that is more accurate than those produced by current tools. pong is freely available and can be installed using the Python package management system pip. pong's source code is available at https://github.com/abehr/pong. Contact: aaron_behr@alumni.brown.edu or sramachandran@brown.edu. Supplementary data are available at Bioinformatics online.
View details for DOI 10.1093/bioinformatics/btw327
View details for PubMedID 27283948
View details for PubMedCentralID PMC5018373
https://orcid.org/0000-0001-6660-9546