Accurate Allele Frequencies from Ultra-low Coverage Pool-Seq Samples in Evolve-and-Resequence Experiments.
G3 (Bethesda, Md.)
Evolve-and-resequence (E+R) experiments leverage next-generation sequencing technology to track the allele frequency dynamics of populations as they evolve. While previous work has shown that adaptive alleles can be detected by comparing frequency trajectories from many replicate populations, this power comes at the expense of high-coverage (>100x) sequencing of many pooled samples, which can be cost-prohibitive. Here, we show that accurate estimates of allele frequencies can be achieved with very shallow sequencing depths (<5x) via inference of known founder haplotypes in small genomic windows. This technique can be used to efficiently estimate frequencies for any number of bi-allelic SNPs in populations of any model organism founded with sequenced homozygous strains. Using both experimentally pooled and simulated samples of Drosophila melanogaster, we show that haplotype inference can improve allele frequency accuracy by orders of magnitude for up to 50 generations of recombination, and is robust to moderate levels of missing data as well as to different selection regimes. Finally, we show that a simple linear model generated from these simulations can predict the accuracy of haplotype-derived allele frequencies in other model organisms and experimental designs. To make these results broadly accessible for use in E+R experiments, we introduce HAF-pipe, an open-source software tool for calculating haplotype-derived allele frequencies from raw sequencing data. Ultimately, by reducing sequencing costs without sacrificing accuracy, our method facilitates E+R designs with higher replication and resolution, and thereby increased power to detect adaptive alleles.
DOI: 10.1534/g3.119.400755
PubMedID: 31636085
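As a rough illustration of the idea summarized in the abstract above: once the frequencies of the known founder haplotypes have been estimated in a genomic window, the frequency of an allele at any SNP in that window can be computed as the haplotype-frequency-weighted sum of the founders' genotypes at that SNP. The sketch below is not taken from HAF-pipe. The function name `haplotype_derived_af`, the 0/1/`None` genotype encoding, and the choice to renormalize over founders with known genotypes are all assumptions made for illustration, and the harder step of inferring the haplotype frequencies from low-coverage reads is not shown.

```python
def haplotype_derived_af(hap_freqs, founder_genotypes):
    """Estimate the alternate-allele frequency at one SNP (illustrative only).

    hap_freqs: estimated frequency of each founder haplotype in the window
               containing the SNP (assumed to sum to roughly 1).
    founder_genotypes: for each founder, 1 if it carries the alternate allele,
               0 if it carries the reference allele, None if the call is missing.
    """
    weighted_alt = 0.0   # haplotype frequency carried by alternate-allele founders
    known_total = 0.0    # haplotype frequency carried by founders with a known call
    for freq, genotype in zip(hap_freqs, founder_genotypes):
        if genotype is None:
            # Assumption: skip missing founder calls and renormalize over the rest.
            continue
        weighted_alt += freq * genotype
        known_total += freq
    if known_total == 0:
        raise ValueError("no founder with a known genotype at this SNP")
    return weighted_alt / known_total


# Three hypothetical founders at frequencies 0.5, 0.25, 0.25; the first and
# third carry the alternate allele.
print(haplotype_derived_af([0.5, 0.25, 0.25], [1, 0, 1]))  # 0.75
```

Because a single set of window-level haplotype frequencies is reused for every SNP in the window, the estimate at each site can draw on reads from the whole window rather than only the few reads covering that site, which is the intuition behind the accuracy gain at low coverage.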
Clonal replacement and heterogeneity in breast tumors treated with neoadjuvant HER2-targeted therapy.
2019; 10 (1): 657
Genomic changes observed across treatment may result from either clonal evolution or geographically disparate sampling of heterogeneous tumors. Here we use computational modeling based on analysis of fifteen primary breast tumors and find that apparent clonal change between two tumor samples can frequently be explained by pre-treatment heterogeneity, such that at least two regions are necessary to detect treatment-induced clonal shifts. To assess for clonal replacement, we devise a summary statistic based on whole-exome sequencing of a pre-treatment biopsy and multi-region sampling of the post-treatment surgical specimen and apply this measure to five breast tumors treated with neoadjuvant HER2-targeted therapy. Two tumors underwent clonal replacement with treatment, and mathematical modeling indicates these two tumors had resistant subclones prior to treatment and rates of resistance-related genomic changes that were substantially larger than previous estimates. Our results provide a needed framework to incorporate primary tumor heterogeneity in investigating the evolution of resistance.
PubMedID: 30737380
Deep sequencing of natural and experimental populations of Drosophila melanogaster reveals biases in the spectrum of new mutations.
2017; 27 (12): 1988–2000
Mutations provide the raw material of evolution, and thus our ability to study evolution depends fundamentally on having precise measurements of mutational rates and patterns. We generate a data set for this purpose using (1) de novo mutations from mutation accumulation experiments and (2) extremely rare polymorphisms from natural populations. The first source, mutation accumulation (MA) lines, is the product of maintaining flies in tiny populations for many generations, which renders natural selection ineffective and allows new mutations to accrue in the genome. The second, rare genetic variation from natural populations, allows the study of mutation because extremely rare polymorphisms are relatively unaffected by the filter of natural selection. We use both methods in Drosophila melanogaster, first generating our own novel data set of sequenced MA lines and performing a meta-analysis of all published MA mutations (∼2000 events) and then identifying a high-quality set of ∼70,000 extremely rare (≤0.1%) polymorphisms that are fully validated with resequencing. We use these data sets to precisely measure mutational rates and patterns. Highlights of our results include a high rate of multinucleotide mutation events at both short (∼5 bp) and long (∼1 kb) genomic distances, evidence that mutation drives GC content lower in already GC-poor regions, and the use of our precise context-dependent mutation rates to predict long-term evolutionary patterns at synonymous sites. We also show that de novo mutations from independent MA experiments display similar patterns of single nucleotide mutation and closely match the patterns of mutation found in natural populations.
PubMedID: 29079675