Honors & Awards
Graduate Student Fellowship, Stanford CEHG (2017-2018)
Gerald J. Lieberman Fellowship, VPGE (2016-2017)
Graduate Research Fellowship, National Science Foundation (2013-2016)
Thouron Award, Thouron Award (2012-2013)
Education & Certifications
BA, University of Pennsylvania, Mathematics (2012)
MSc (Res), University of Oxford, Statistics (2013)
Dmitri Petrov, Doctoral (Program)
A spatio-temporal assessment of simian/human immunodeficiency virus (SHIV) evolution reveals a highly dynamic process within the host.
2017; 13 (5)
The process by which drug-resistant HIV-1 arises and spreads spatially within an infected individual is poorly understood. Studies have found variable results relating how HIV-1 in the blood differs from virus sampled in tissues, offering conflicting findings about whether HIV-1 throughout the body is homogeneously distributed. However, most of these studies sample only two compartments and few have data from multiple time points. To directly measure how drug resistance spreads within a host and to assess how spatial structure impacts its emergence, we examined serial sequences from four macaques infected with RT-SHIVmne027, a simian immunodeficiency virus encoding HIV-1 reverse transcriptase (RT), and treated with RT inhibitors. Both viral DNA and RNA (vDNA and vRNA) were isolated from the blood (including plasma and peripheral blood mononuclear cells), lymph nodes, gut, and vagina at a median of four time points and RT was characterized via single-genome sequencing. The resulting sequences reveal a dynamic system in which vRNA rapidly acquires drug resistance concomitantly across compartments through multiple independent mutations. Fast migration results in the same viral genotypes present across compartments, but not so fast as to equilibrate their frequencies immediately. The blood and lymph nodes were found to be compartmentalized rarely, while both the blood and lymph node were more frequently different from mucosal tissues. This study suggests that even oft-sampled blood does not fully capture the viral dynamics in other parts of the body, especially the gut where vRNA turnover was faster than the plasma and vDNA retained fewer wild-type viruses than other sampled compartments. Our findings of transient compartmentalization across multiple tissues may help explain the varied results of previous compartmentalization studies in HIV-1.
DOI: 10.1371/journal.ppat.1006358
PubMed ID: 28542550
The population genetics of drug resistance evolution in natural populations of viral, bacterial and eukaryotic pathogens.
2016; 25 (1): 42-66
Drug resistance is a costly consequence of pathogen evolution and a major concern in public health. In this review, we show how population genetics can be used to study the evolution of drug resistance and also how drug resistance evolution is informative as an evolutionary model system. We highlight five examples from diverse organisms with particular focus on: (i) identifying drug resistance loci in the malaria parasite Plasmodium falciparum using the genomic signatures of selective sweeps, (ii) determining the role of epistasis in drug resistance evolution in influenza, (iii) quantifying the role of standing genetic variation in the evolution of drug resistance in HIV, (iv) using drug resistance mutations to study clonal interference dynamics in tuberculosis and (v) analysing the population structure of the core and accessory genome of Staphylococcus aureus to understand the spread of methicillin resistance. Throughout this review, we discuss the uses of sequence data and population genetic theory in studying the evolution of drug resistance.
DOI: 10.1111/mec.13474
PubMed ID: 26578204
PubMed Central ID: PMC4943078
More effective drugs lead to harder selective sweeps in the evolution of drug resistance in HIV-1.
In the early days of HIV treatment, drug resistance occurred rapidly and predictably in all patients, but under modern treatments, resistance arises slowly, if at all. The probability of resistance should be controlled by the rate of generation of resistance mutations. If many adaptive mutations arise simultaneously, then adaptation proceeds by soft selective sweeps in which multiple adaptive mutations spread concomitantly, but if adaptive mutations occur rarely in the population, then a single adaptive mutation should spread alone in a hard selective sweep. Here, we use 6717 HIV-1 consensus sequences from patients treated with first-line therapies between 1989 and 2013 to confirm that the transition from fast to slow evolution of drug resistance was indeed accompanied with the expected transition from soft to hard selective sweeps. This suggests more generally that evolution proceeds via hard sweeps if resistance is unlikely and via soft sweeps if it is likely.
DOI: 10.7554/eLife.10670
PubMed ID: 26882502
PubMed Central ID: PMC4764592
Identifying signatures of selection in genetic time series.
2014; 196 (2): 509–22
Both genetic drift and natural selection cause the frequencies of alleles in a population to vary over time. Discriminating between these two evolutionary forces, based on a time series of samples from a population, remains an outstanding problem with increasing relevance to modern data sets. Even in the idealized situation when the sampled locus is independent of all other loci, this problem is difficult to solve, especially when the size of the population from which the samples are drawn is unknown. A standard χ²-based likelihood-ratio test was previously proposed to address this problem. Here we show that the χ²-test of selection substantially underestimates the probability of type I error, leading to more false positives than indicated by its P-value, especially at stringent P-values. We introduce two methods to correct this bias. The empirical likelihood-ratio test (ELRT) rejects neutrality when the likelihood-ratio statistic falls in the tail of the empirical distribution obtained under the most likely neutral population size. The frequency increment test (FIT) rejects neutrality if the distribution of normalized allele-frequency increments exhibits a mean that deviates significantly from zero. We characterize the statistical power of these two tests for selection, and we apply them to three experimental data sets. We demonstrate that both ELRT and FIT have power to detect selection in practical parameter regimes, such as those encountered in microbial evolution experiments. Our analysis applies to a single diallelic locus, assumed independent of all other loci, which is most relevant to full-genome selection scans in sexual organisms, and also to evolution experiments in asexual organisms as long as clonal interference is weak. Different techniques will be required to detect selection in time series of cosegregating linked loci.
DOI: 10.1534/genetics.113.158220
PubMed ID: 24318534
PubMed Central ID: PMC3914623
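The frequency increment test described in the abstract above can be illustrated with a minimal sketch. The abstract does not give the normalization, so the rescaling used here (dividing each increment by the square root of 2ν(1−ν)Δt, where ν is the earlier frequency and Δt the time between samples) is an assumption, as are the function name `fit_statistic` and its arguments. The sketch returns a one-sample t statistic for the mean of the rescaled increments; a p-value would come from comparing it with a Student's t distribution with the returned degrees of freedom.

```python
import math
from statistics import mean, stdev


def fit_statistic(freqs, times):
    """Return (t_statistic, degrees_of_freedom) for rescaled frequency increments.

    freqs: allele frequencies at successive sampling times (hypothetical input).
    times: sampling times in generations, strictly increasing.
    """
    if len(freqs) != len(times) or len(freqs) < 3:
        raise ValueError("need at least three matched (frequency, time) samples")

    increments = []
    for i in range(1, len(freqs)):
        v0, v1 = freqs[i - 1], freqs[i]
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        if not 0.0 < v0 < 1.0:
            raise ValueError("starting frequencies must lie strictly between 0 and 1")
        # Assumed normalization: under pure drift the rescaled increments
        # should have a mean near zero and a roughly constant variance.
        increments.append((v1 - v0) / math.sqrt(2.0 * v0 * (1.0 - v0) * dt))

    spread = stdev(increments)
    if spread == 0.0:
        raise ValueError("increments are identical; the t statistic is undefined")

    n = len(increments)
    t_stat = mean(increments) / (spread / math.sqrt(n))
    return t_stat, n - 1


if __name__ == "__main__":
    # A steadily rising allele gives a large positive statistic.
    print(fit_statistic([0.1, 0.2, 0.35, 0.5, 0.7], [0, 10, 20, 30, 40]))
```

A statistic far from zero is evidence against neutrality under these assumptions; a frequency that merely fluctuates around a constant value gives a statistic close to zero.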
LDx: Estimation of Linkage Disequilibrium from High-Throughput Pooled Resequencing Data
2012; 7 (11)
High-throughput pooled resequencing offers significant potential for whole genome population sequencing. However, its main drawback is the loss of haplotype information. In order to regain some of this information, we present LDx, a computational tool for estimating linkage disequilibrium (LD) from pooled resequencing data. LDx uses an approximate maximum likelihood approach to estimate LD (r²) between pairs of SNPs that can be observed within and among single reads. LDx also reports r² estimates derived solely from observed genotype counts. We demonstrate that the LDx estimates are highly correlated with r² estimated from individually resequenced strains. We discuss the performance of LDx using more stringent quality conditions and infer via simulation the degree to which performance can improve based on read depth. Finally we demonstrate two possible uses of LDx with real and simulated pooled resequencing data. First, we use LDx to infer genomewide patterns of decay of LD with physical distance in D. melanogaster population resequencing data. Second, we demonstrate that r² estimates from LDx are capable of distinguishing alternative demographic models representing plausible demographic histories of D. melanogaster.
DOI: 10.1371/journal.pone.0048588
Web of Science ID: 000312272600012
PubMed ID: 23152785
PubMed Central ID: PMC3494690
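To make the r² quantity in the LDx abstract above concrete, here is a minimal sketch of the standard count-based calculation for one pair of biallelic SNPs, using counts of reads that cover both sites. It is not the approximate maximum likelihood estimator used by LDx and is not taken from the LDx code; the function name `r_squared` and the count arguments are hypothetical.

```python
def r_squared(n_AB, n_Ab, n_aB, n_ab):
    """Count-based r^2 between two biallelic SNPs.

    Each argument is the number of reads carrying that two-site haplotype
    (hypothetical inputs; A/a are alleles at SNP 1, B/b at SNP 2).
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no reads cover both SNPs")

    p_AB = n_AB / total            # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total    # frequency of allele A at SNP 1
    p_B = (n_AB + n_aB) / total    # frequency of allele B at SNP 2

    denom = p_A * (1.0 - p_A) * p_B * (1.0 - p_B)
    if denom == 0.0:
        raise ValueError("both SNPs must be polymorphic among the covering reads")

    d = p_AB - p_A * p_B           # linkage disequilibrium coefficient D
    return d * d / denom


if __name__ == "__main__":
    print(r_squared(10, 0, 0, 10))  # alleles perfectly associated
    print(r_squared(5, 5, 5, 5))    # alleles independent
```

Only reads spanning both SNPs contribute, which is why pooled data restrict this kind of estimate to pairs of sites within roughly a read length (or read-pair span) of each other.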
Natural Selection Affects Multiple Aspects of Genetic Variation at Putatively Neutral Sites across the Human Genome
2011; 7 (10)
A major question in evolutionary biology is how natural selection has shaped patterns of genetic variation across the human genome. Previous work has documented a reduction in genetic diversity in regions of the genome with low recombination rates. However, it is unclear whether other summaries of genetic variation, like allele frequencies, are also correlated with recombination rate and whether these correlations can be explained solely by negative selection against deleterious mutations or whether positive selection acting on favorable alleles is also required. Here we attempt to address these questions by analyzing three different genome-wide resequencing datasets from European individuals. We document several significant correlations between different genomic features. In particular, we find that average minor allele frequency and diversity are reduced in regions of low recombination and that human diversity, human-chimp divergence, and average minor allele frequency are reduced near genes. Population genetic simulations show that either positive natural selection acting on favorable mutations or negative natural selection acting against deleterious mutations can explain these correlations. However, models with strong positive selection on nonsynonymous mutations and little negative selection predict a stronger negative correlation between neutral diversity and nonsynonymous divergence than observed in the actual data, supporting the importance of negative, rather than positive, selection throughout the genome. Further, we show that the widespread presence of weakly deleterious alleles, rather than a small number of strongly positively selected mutations, is responsible for the correlation between neutral genetic diversity and recombination rate. This work suggests that natural selection has affected multiple aspects of linked neutral variation throughout the human genome and that positive selection is not required to explain these observations.
DOI: 10.1371/journal.pgen.1002326
Web of Science ID: 000296665400027
PubMed ID: 22022285
PubMed Central ID: PMC3192825