Current Research and Scholarly Interests


We study the evolution of complex traits by developing new experimental and computational methods.

Our work brings together quantitative genetics, genomics, epigenetics, and evolutionary biology to achieve a deeper understanding of how genetic variation shapes the phenotypic diversity of life. Our main focus is on the evolution of gene expression, which is the primary fuel for natural selection. Our long-term goal is to be able to introduce complex traits into new species via genome editing.

Graduate and Fellowship Programs


  • Biology (School of Humanities and Sciences) (PhD Program)
  • Biomedical Data Science (PhD Program)

All Publications


  • Disentangling cell-intrinsic and extrinsic factors underlying gene expression evolution. bioRxiv : the preprint server for biology Starr, A. L., Nishimura, T., Igarashi, K. J., Funamoto, C., Nakauchi, H., Fraser, H. B. 2024

    Abstract

    Chimeras have played a foundational role in biology, for example by enabling the classification of developmental processes into those driven intrinsically by individual cells versus those driven extrinsically by their extracellular environment. Here, we extend this framework to decompose evolutionary divergence in gene expression and other quantitative traits into cell-intrinsic, extrinsic, and intrinsic-extrinsic interaction components. Applying this framework to reciprocal rat-mouse chimeras, we found that the majority of gene expression divergence is attributable to cell-intrinsic factors, though extrinsic factors also play an integral role. For example, a rat-like extracellular environment extrinsically up-regulates the expression of a key transcriptional regulator of the endoplasmic reticulum (ER) stress response in some but not all cell types, which in turn strongly predicts extrinsic up-regulation of its target genes and of the ER stress response pathway as a whole. This effect is also seen at the protein level, suggesting propagation through multiple regulatory levels. We also demonstrate that our framework is applicable to a cellular trait, neuronal differentiation, and estimate the intrinsic and extrinsic contributions to its divergence. Finally, we show that imprinted genes are dramatically mis-expressed in species-mismatched environments, suggesting that mismatch between rapidly evolving intrinsic and extrinsic mechanisms controlling gene imprinting may contribute to barriers to interspecies chimerism. Overall, our conceptual framework opens new avenues to investigate the mechanistic basis of evolutionary divergence in gene expression and other quantitative traits in any multicellular organism.

    View details for DOI 10.1101/2024.05.06.592777

    View details for PubMedID 38798687

    View details for PubMedCentralID PMC11118348

  • Primate cell fusion disentangles gene regulatory divergence in neurodevelopment. Nature Agoglia, R. M., Sun, D., Birey, F., Yoon, S. J., Miura, Y., Sabatini, K., Pașca, S. P., Fraser, H. B. 2021

    Abstract

    Among primates, humans display a unique trajectory of development that is responsible for the many traits specific to our species. However, the inaccessibility of primary human and chimpanzee tissues has limited our ability to study human evolution. Comparative in vitro approaches using primate-derived induced pluripotent stem cells have begun to reveal species differences on the cellular and molecular levels [1,2]. In particular, brain organoids have emerged as a promising platform to study primate neural development in vitro [3-5], although cross-species comparisons of organoids are complicated by differences in developmental timing and variability of differentiation [6,7]. Here we develop a new platform to address these limitations by fusing human and chimpanzee induced pluripotent stem cells to generate a panel of tetraploid hybrid stem cells. We applied this approach to study species divergence in cerebral cortical development by differentiating these cells into neural organoids. We found that hybrid organoids provide a controlled system for disentangling cis- and trans-acting gene-expression divergence across cell types and developmental stages, revealing a signature of selection on astrocyte-related genes. In addition, we identified an upregulation of the human somatostatin receptor 2 gene (SSTR2), which regulates neuronal calcium signalling and is associated with neuropsychiatric disorders [8,9]. We reveal a human-specific response to modulation of SSTR2 function in cortical neurons, underscoring the potential of this platform for elucidating the molecular basis of human evolution.

    View details for DOI 10.1038/s41586-021-03343-3

    View details for PubMedID 33731928

  • Human-chimpanzee fused cells reveal cis-regulatory divergence underlying skeletal evolution. Nature genetics Gokhman, D., Agoglia, R. M., Kinnebrew, M., Gordon, W., Sun, D., Bajpai, V. K., Naqvi, S., Chen, C., Chan, A., Chen, C., Petrov, D. A., Ahituv, N., Zhang, H., Mishina, Y., Wysocka, J., Rohatgi, R., Fraser, H. B. 2021

    Abstract

    Gene regulatory divergence is thought to play a central role in determining human-specific traits. However, our ability to link divergent regulation to divergent phenotypes is limited. Here, we utilized human-chimpanzee hybrid induced pluripotent stem cells to study gene expression separating these species. The tetraploid hybrid cells allowed us to separate cis- from trans-regulatory effects, and to control for nongenetic confounding factors. We differentiated these cells into cranial neural crest cells, the primary cell type giving rise to the face. We discovered evidence of lineage-specific selection on the hedgehog signaling pathway, including a human-specific sixfold down-regulation of EVC2 (LIMBIN), a key hedgehog gene. Inducing a similar down-regulation of EVC2 substantially reduced hedgehog signaling output. Mice and humans lacking functional EVC2 show striking phenotypic parallels to human-chimpanzee craniofacial differences, suggesting that the regulatory divergence of hedgehog signaling may have contributed to the unique craniofacial morphology of humans.

    View details for DOI 10.1038/s41588-021-00804-3

    View details for PubMedID 33731941

  • Detecting selection with a genetic cross. Proceedings of the National Academy of Sciences of the United States of America Fraser, H. B. 2020

    Abstract

    Distinguishing which traits have evolved under natural selection, as opposed to neutral evolution, is a major goal of evolutionary biology. Several tests have been proposed to accomplish this, but these either rely on false assumptions or suffer from low power. Here, I introduce an approach to detecting selection that makes minimal assumptions and only requires phenotypic data from 10 individuals. The test compares the phenotypic difference between two populations to what would be expected by chance under neutral evolution, which can be estimated from the phenotypic distribution of an F2 cross between those populations. Simulations show that the test is robust to variation in the number of loci affecting the trait, the distribution of locus effect sizes, heritability, dominance, and epistasis. Comparing its performance to the QTL sign test (an existing test of selection that requires both genotype and phenotype data), the new test achieves comparable power with 50- to 100-fold fewer individuals (and no genotype data). Applying the test to empirical data spanning over a century shows strong directional selection in many crops, as well as on naturally selected traits such as head shape in Hawaiian Drosophila and skin color in humans. Applied to gene expression data, the test reveals that the strength of stabilizing selection acting on mRNA levels in a species is strongly associated with that species' effective population size. In sum, this test is applicable to phenotypic data from almost any genetic cross, allowing selection to be detected more easily and powerfully than previously possible.

    View details for DOI 10.1073/pnas.2014277117

    View details for PubMedID 32848059

  • The somatic mutation landscape of the human body. Genome biology Garcia-Nieto, P. E., Morrison, A. J., Fraser, H. B. 2019; 20 (1): 298

    Abstract

    BACKGROUND: Somatic mutations in healthy tissues contribute to aging, neurodegeneration, and cancer initiation, yet they remain largely uncharacterized. RESULTS: To gain a better understanding of the genome-wide distribution and functional impact of somatic mutations, we leverage the genomic information contained in the transcriptome to uniformly call somatic mutations from over 7500 tissue samples, representing 36 distinct tissues. This catalog, containing over 280,000 mutations, reveals a wide diversity of tissue-specific mutation profiles associated with gene expression levels and chromatin states. For example, lung samples with low expression of the mismatch-repair gene MLH1 show a mutation signature of deficient mismatch repair. In addition, we find pervasive negative selection acting on missense and nonsense mutations, except for mutations previously observed in cancer samples, which are under positive selection and are highly enriched in many healthy tissues. CONCLUSIONS: These findings reveal fundamental patterns of tissue-specific somatic evolution and shed light on aging and the earliest stages of tumorigenesis.

    View details for DOI 10.1186/s13059-019-1919-5

    View details for PubMedID 31874648

  • Fine-mapping cis-regulatory variants in diverse human populations. eLife Tehranchi, A., Hie, B., Dacre, M., Kaplow, I., Pettie, K., Combs, P., Fraser, H. B. 2019; 8

    Abstract

    Genome-wide association studies (GWAS) are a powerful approach for connecting genotype to phenotype. Most GWAS hits are located in cis-regulatory regions, but the underlying causal variants and their molecular mechanisms remain unknown. To better understand human cis-regulatory variation, we mapped quantitative trait loci for chromatin accessibility (caQTLs), a key step in cis-regulation, in 1000 individuals from 10 diverse populations. Most caQTLs were shared across populations, allowing us to leverage the genetic diversity to fine-map candidate causal regulatory variants, several thousand of which have been previously implicated in GWAS. In addition, many caQTLs that affect the expression of distal genes also alter the landscape of long-range chromosomal interactions, suggesting a mechanism for long-range expression QTLs. In sum, our results show that molecular QTL mapping integrated across diverse populations provides a high-resolution view of how worldwide human genetic variation affects chromatin accessibility, gene expression, and phenotype. Editorial note: This article has been through an editorial process in which the authors decide how to respond to the issues raised during peer review. The Reviewing Editor's assessment is that minor issues remain unresolved (see decision letter).

    View details for PubMedID 30650056

  • Tissue-Specific cis-Regulatory Divergence Implicates eloF in Inhibiting Interspecies Mating in Drosophila. Current biology : CB Combs, P. A., Krupp, J. J., Khosla, N. M., Bua, D., Petrov, D. A., Levine, J. D., Fraser, H. B. 2018

    Abstract

    Reproductive isolation is a key component of speciation. In many insects, a major driver of this isolation is cuticular hydrocarbon pheromones, which help to identify potential intraspecific mates [1-3]. When the distributions of related species overlap, there may be strong selection on mate choice for intraspecific partners [4-9] because interspecific hybridization carries significant fitness costs [10]. Drosophila has been a key model for the investigation of reproductive isolation; although both male and female mate choices have been extensively investigated [6,11-16], the genes underlying species recognition remain largely unknown. To explore the molecular mechanisms underlying Drosophila speciation, we measured tissue-specific cis-regulatory divergence using RNA sequencing (RNA-seq) in D. simulans × D. sechellia hybrids. By focusing on cis-regulatory changes specific to female oenocytes, the tissue that produces cuticular hydrocarbons, we rapidly identified a small number of candidate genes. We found that one of these, the fatty acid elongase eloF, broadly affects the hydrocarbons present on D. sechellia and D. melanogaster females, as well as the propensity of D. simulans males to mate with them. Therefore, cis-regulatory changes in eloF may be a major driver in the sexual isolation of D. simulans from multiple other species. Our RNA-seq approach proved to be far more efficient than quantitative trait locus (QTL) mapping in identifying candidate genes; the same framework can be used to pinpoint candidate drivers of cis-regulatory divergence in traits differing between any interfertile species.

    View details for PubMedID 30503619

  • Functional Genetic Variants Revealed by Massively Parallel Precise Genome Editing. Cell Sharon, E., Chen, S. A., Khosla, N. M., Smith, J. D., Pritchard, J. K., Fraser, H. B. 2018; 175 (2): 544-+

    Abstract

    A major challenge in genetics is to identify genetic variants driving natural phenotypic variation. However, current methods of genetic mapping have limited resolution. To address this challenge, we developed a CRISPR-Cas9-based high-throughput genome editing approach that can introduce thousands of specific genetic variants in a single experiment. This enabled us to study the fitness consequences of 16,006 natural genetic variants in yeast. We identified 572 variants with significant fitness differences in glucose media; these are highly enriched in promoters, particularly in transcription factor binding sites, while only 19.2% affect amino acid sequences. Strikingly, nearby variants nearly always favor the same parent's alleles, suggesting that lineage-specific selection is often driven by multiple clustered variants. In sum, our genome editing approach reveals the genetic architecture of fitness variation at single-base resolution and could be adapted to measure the effects of genome-wide genetic variation in any screen for cell survival or cell-sortable markers.

    View details for PubMedID 30245013

  • Pooled ChIP-Seq Links Variation in Transcription Factor Binding to Complex Disease Risk. Cell Tehranchi, A. K., Myrthil, M., Martin, T., Hie, B. L., Golan, D., Fraser, H. B. 2016; 165 (3): 730-741

    Abstract

    Cis-regulatory elements such as transcription factor (TF) binding sites can be identified genome-wide, but it remains far more challenging to pinpoint genetic variants affecting TF binding. Here, we introduce a pooling-based approach to mapping quantitative trait loci (QTLs) for molecular-level traits. Applying this to five TFs and a histone modification, we mapped thousands of cis-acting QTLs, with over 25-fold lower cost compared to standard QTL mapping. We found that single genetic variants frequently affect binding of multiple TFs, and CTCF can recruit all five TFs to its binding sites. These QTLs often affect local chromatin and transcription but can also influence long-range chromosomal contacts, demonstrating a role for natural genetic variation in chromosomal architecture. Thousands of these QTLs have been implicated in genome-wide association studies, providing candidate molecular mechanisms for many disease risk loci and suggesting that TF binding variation may underlie a large fraction of human phenotypic variation.

    View details for DOI 10.1016/j.cell.2016.03.041

    View details for Web of Science ID 000374636800029

    View details for PubMedID 27087447

    View details for PubMedCentralID PMC4842172

  • Dissecting the Genetic Basis of a Complex cis-Regulatory Adaptation. PLoS genetics Naranjo, S., Smith, J. D., Artieri, C. G., Zhang, M., Zhou, Y., Palmer, M. E., Fraser, H. B. 2015; 11 (12)

    Abstract

    Although single genes underlying several evolutionary adaptations have been identified, the genetic basis of complex, polygenic adaptations has been far more challenging to pinpoint. Here we report that the budding yeast Saccharomyces paradoxus has recently evolved resistance to citrinin, a naturally occurring mycotoxin. Applying a genome-wide test for selection on cis-regulation, we identified five genes involved in the citrinin response that are constitutively up-regulated in S. paradoxus. Four of these genes are necessary for resistance, and are also sufficient to increase the resistance of a sensitive strain when over-expressed. Moreover, cis-regulatory divergence in the promoters of these genes contributes to resistance, while exacting a cost in the absence of citrinin. Our results demonstrate how the subtle effects of individual regulatory elements can be combined, via natural selection, into a complex adaptation. Our approach can be applied to dissect the genetic basis of polygenic adaptations in a wide range of species.

    View details for DOI 10.1371/journal.pgen.1005751

    View details for PubMedID 26713447

  • Genetic conflict reflected in tissue-specific maps of genomic imprinting in human and mouse. Nature genetics Babak, T., Deveale, B., Tsang, E. K., Zhou, Y., Li, X., Smith, K. S., Kukurba, K. R., Zhang, R., Li, J. B., van der Kooy, D., Montgomery, S. B., Fraser, H. B. 2015; 47 (5): 544-549

    Abstract

    Genomic imprinting is an epigenetic process that restricts gene expression to either the maternally or paternally inherited allele. Many theories have been proposed to explain its evolutionary origin, but understanding has been limited by a paucity of data mapping the breadth and dynamics of imprinting within any organism. We generated an atlas of imprinting spanning 33 mouse and 45 human developmental stages and tissues. Nearly all imprinted genes were imprinted in early development and either retained their parent-of-origin expression in adults or lost it completely. Consistent with an evolutionary signature of parental conflict, imprinted genes were enriched for coexpressed pairs of maternally and paternally expressed genes, showed accelerated expression divergence between human and mouse, and were more highly expressed than their non-imprinted orthologs in other species. Our approach demonstrates a general framework for the discovery of imprinting in any species and sheds light on the causes and consequences of genomic imprinting in mammals.

    View details for DOI 10.1038/ng.3274

    View details for PubMedID 25848752

  • Gene expression drives local adaptation in humans. Genome research Fraser, H. B. 2013; 23 (7): 1089-1096

    Abstract

    The molecular basis of adaptation, and in particular the relative roles of protein-coding versus gene expression changes, has long been the subject of speculation and debate. Recently, the genotyping of diverse human populations has led to the identification of many putative "local adaptations" that differ between populations. Here I show that these local adaptations are over 10-fold more likely to affect gene expression than amino acid sequence. In addition, a novel framework for identifying polygenic local adaptations detects recent positive selection on the expression levels of genes involved in UV radiation response, immune cell proliferation, and diabetes-related pathways. These results provide the first examples of polygenic gene expression adaptation in humans, as well as the first genome-scale support for the hypothesis that changes in gene expression have driven human adaptation.

    View details for DOI 10.1101/gr.152710.112

    View details for Web of Science ID 000321119900006

    View details for PubMedID 23539138

  • Polygenic cis-regulatory adaptation in the evolution of yeast pathogenicity. Genome research Fraser, H. B., Levy, S., Chavan, A., Shah, H. B., Perez, J. C., Zhou, Y., Siegal, M. L., Sinha, H. 2012; 22 (10): 1930-1939

    Abstract

    The acquisition of new genes, via horizontal transfer or gene duplication/diversification, has been the dominant mechanism thus far implicated in the evolution of microbial pathogenicity. In contrast, the role of many other modes of evolution, such as changes in gene expression regulation, remains unknown. A transition to a pathogenic lifestyle has recently taken place in some lineages of the budding yeast Saccharomyces cerevisiae. Here we identify a module of physically interacting proteins involved in endocytosis that has experienced selective sweeps for multiple cis-regulatory mutations that down-regulate gene expression levels in a pathogenic yeast. To test if these adaptations affect virulence, we created a panel of single-allele knockout strains whose hemizygous state mimics the genes' adaptive down-regulations, and measured their virulence in a mammalian host. Despite having no growth advantage in standard laboratory conditions, nearly all of the strains were more virulent than their wild-type progenitor, suggesting that these adaptations likely played a role in the evolution of pathogenicity. Furthermore, genetic variants at these loci were associated with clinical origin across 88 diverse yeast strains, suggesting the adaptations may have contributed to the virulence of a wide range of clinical isolates. We also detected pleiotropic effects of these adaptations on a wide range of morphological traits, which appear to have been mitigated by compensatory mutations at other loci. These results suggest that cis-regulatory adaptation can occur at the level of physically interacting modules and that one such polygenic adaptation led to increased virulence during the evolution of a pathogenic yeast.

    View details for DOI 10.1101/gr.134080.111

    View details for Web of Science ID 000309325900010

    View details for PubMedID 22645260

    View details for PubMedCentralID PMC3460188

  • Cell-type-specific cis-regulatory divergence in gene expression and chromatin accessibility revealed by human-chimpanzee hybrid cells. eLife Wang, B., Starr, A. L., Fraser, H. B. 2024; 12

    Abstract

    Although gene expression divergence has long been postulated to be the primary driver of human evolution, identifying the genes and genetic variants underlying uniquely human traits has proven to be quite challenging. Theory suggests that cell-type-specific cis-regulatory variants may fuel evolutionary adaptation due to the specificity of their effects. These variants can precisely tune the expression of a single gene in a single cell-type, avoiding the potentially deleterious consequences of trans-acting changes and non-cell type-specific changes that can impact many genes and cell types, respectively. It has recently become possible to quantify human-specific cis-acting regulatory divergence by measuring allele-specific expression in human-chimpanzee hybrid cells, the product of fusing induced pluripotent stem (iPS) cells of each species in vitro. However, these cis-regulatory changes have only been explored in a limited number of cell types. Here, we quantify human-chimpanzee cis-regulatory divergence in gene expression and chromatin accessibility across six cell types, enabling the identification of highly cell-type-specific cis-regulatory changes. We find that cell-type-specific genes and regulatory elements evolve faster than those shared across cell types, suggesting an important role for genes with cell-type-specific expression in human evolution. Furthermore, we identify several instances of lineage-specific natural selection that may have played key roles in specific cell types, such as coordinated changes in the cis-regulation of dozens of genes involved in neuronal firing in motor neurons. Finally, using novel metrics and a machine learning model, we identify genetic variants that likely alter chromatin accessibility and transcription factor binding, leading to neuron-specific changes in the expression of the neurodevelopmentally important genes FABP7 and GAD1. Overall, our results demonstrate that integrative analysis of cis-regulatory divergence in chromatin accessibility and gene expression across cell types is a promising approach to identify the specific genes and genetic variants that make us human.

    View details for DOI 10.7554/eLife.89594

    View details for PubMedID 38358392

  • Chromatin activity identifies differential gene regulation across human ancestries. Genome biology Pettie, K. P., Mumbach, M., Lea, A. J., Ayroles, J., Chang, H. Y., Kasowski, M., Fraser, H. B. 2024; 25 (1): 21

    Abstract

    Current evidence suggests that cis-regulatory elements controlling gene expression may be the predominant target of natural selection in humans and other species. Detecting selection acting on these elements is critical to understanding evolution but remains challenging because we do not know which mutations will affect gene regulation. To address this, we devise an approach to search for lineage-specific selection on three critical steps in transcriptional regulation: chromatin activity, transcription factor binding, and chromosomal looping. Applying this approach to lymphoblastoid cells from 831 individuals of either European or African descent, we find strong signals of differential chromatin activity linked to gene expression differences between ancestries in numerous contexts, but no evidence of functional differences in chromosomal looping. Moreover, we show that enhancers rather than promoters display the strongest signs of selection associated with sites of differential transcription factor binding. Overall, our study indicates that some cis-regulatory adaptation may be more easily detected at the level of chromatin than DNA sequence. This work provides a vast resource of genomic interaction data from diverse human populations and establishes a novel selection test that will benefit future study of regulatory evolution in humans and other species.

    View details for DOI 10.1186/s13059-024-03165-2

    View details for PubMedID 38225662

    View details for PubMedCentralID PMC10789071

  • Cell type-specific cis-regulatory divergence in gene expression and chromatin accessibility revealed by human-chimpanzee hybrid cells. bioRxiv : the preprint server for biology Wang, B., Starr, A. L., Fraser, H. B. 2023

    Abstract

    Although gene expression divergence has long been postulated to be the primary driver of human evolution, identifying the genes and genetic variants underlying uniquely human traits has proven to be quite challenging. Theory suggests that cell type-specific cis-regulatory variants may fuel evolutionary adaptation due to the specificity of their effects. These variants can precisely tune the expression of a single gene in a single cell type, avoiding the potentially deleterious consequences of trans-acting changes and non-cell type-specific changes that can impact many genes and cell types, respectively. It has recently become possible to quantify human-specific cis-acting regulatory divergence by measuring allele-specific expression in human-chimpanzee hybrid cells, the product of fusing induced pluripotent stem (iPS) cells of each species in vitro. However, these cis-regulatory changes have only been explored in a limited number of cell types. Here, we quantify human-chimpanzee cis-regulatory divergence in gene expression and chromatin accessibility across six cell types, enabling the identification of highly cell type-specific cis-regulatory changes. We find that cell type-specific genes and regulatory elements evolve faster than those shared across cell types, suggesting an important role for genes with cell type-specific expression in human evolution. Furthermore, we identify several instances of lineage-specific natural selection that may have played key roles in specific cell types, such as coordinated changes in the cis-regulation of dozens of genes involved in neuronal firing in motor neurons. Finally, using novel metrics and a machine learning model, we identify genetic variants that likely alter chromatin accessibility and transcription factor binding, leading to neuron-specific changes in the expression of the neurodevelopmentally important genes FABP7 and GAD1. Overall, our results demonstrate that integrative analysis of cis-regulatory divergence in chromatin accessibility and gene expression across cell types is a promising approach to identify the specific genes and genetic variants that make us human.

    View details for DOI 10.1101/2023.05.22.541747

    View details for PubMedID 37292820

    View details for PubMedCentralID PMC10245923

  • Allele-specific expression reveals genetic drivers of tissue regeneration in mice. Cell stem cell Mack, K. L., Talbott, H. E., Griffin, M. F., Parker, J. B., Guardino, N. J., Spielman, A. F., Davitt, M. F., Mascharak, S., Downer, M., Morgan, A., Valencia, C., Akras, D., Berger, M. J., Wan, D. C., Fraser, H. B., Longaker, M. T. 2023

    Abstract

    In adult mammals, skin wounds typically heal by scarring rather than through regeneration. In contrast, "super-healer" Murphy Roths Large (MRL) mice have the unusual ability to regenerate ear punch wounds; however, the molecular basis for this regeneration remains elusive. Here, in hybrid crosses between MRL and non-regenerating mice, we used allele-specific gene expression to identify cis-regulatory variation associated with ear regeneration. Analyzing three major cell populations (immune, fibroblast, and endothelial), we found that genes with cis-regulatory differences specifically in fibroblasts were associated with wound-healing pathways and also co-localized with quantitative trait loci for ear wound-healing. Ectopic treatment with one of these proteins, complement factor H (CFH), accelerated wound repair and induced regeneration in typically fibrotic wounds. Through single-cell RNA sequencing (RNA-seq), we observed that CFH treatment dramatically reduced immune cell recruitment to wounds, suggesting a potential mechanism for CFH's effect. Overall, our results provide insights into the molecular drivers of regeneration with potential clinical implications.

    View details for DOI 10.1016/j.stem.2023.08.010

    View details for PubMedID 37714154

  • Gene-by-environment interactions are pervasive among natural genetic variants. Cell genomics Chen, S. A., Kern, A. F., Ang, R. M., Xie, Y., Fraser, H. B. 2023; 3 (4): 100273

    Abstract

    Gene-by-environment (GxE) interactions, in which a genetic variant's phenotypic effect is condition specific, are fundamental for understanding fitness landscapes and evolution but have been difficult to identify at the single-nucleotide level. Although many condition-specific quantitative trait loci (QTLs) have been mapped, these typically contain numerous inconsequential variants in linkage, precluding understanding of the causal GxE variants. Here, we introduce BARcoded Cas9 retron precise parallel editing via homology (CRISPEY-BAR), a high-throughput precision genome editing strategy, and use it to map GxE interactions of naturally occurring genetic polymorphisms impacting yeast growth. We identified hundreds of GxE variants within condition-specific QTLs, revealing unexpected genetic complexity. Moreover, we found that 93.7% of non-neutral natural variants within ergosterol biosynthesis pathway genes showed GxE interactions, including many impacting antifungal drug resistance through diverse molecular mechanisms. In sum, our results suggest an extremely complex, context-dependent fitness landscape characterized by pervasive GxE interactions while also demonstrating massively parallel genome editing as an effective means for investigating this complexity.

    View details for DOI 10.1016/j.xgen.2023.100273

    View details for PubMedID 37082145

    View details for PubMedCentralID PMC10112290

  • Widespread epistasis among beneficial genetic variants revealed by high-throughput genome editing. Cell genomics Ang, R. M., Chen, S. A., Kern, A. F., Xie, Y., Fraser, H. B. 2023; 3 (4): 100260

    Abstract

    The phenotypic effect of any genetic variant can be altered by variation at other genomic loci. Known as epistasis, these genetic interactions shape the genotype-phenotype map of every species, yet their origins remain poorly understood. To investigate this, we employed high-throughput genome editing to measure the fitness effects of 1,826 naturally polymorphic variants in four strains of Saccharomyces cerevisiae. About 31% of variants affect fitness, of which 24% have strain-specific fitness effects indicative of epistasis. We found that beneficial variants are more likely to exhibit genetic interactions and that these interactions can be mediated by specific traits such as flocculation ability. This work suggests that adaptive evolution will often involve trade-offs where a variant is only beneficial in some genetic backgrounds, potentially explaining why many beneficial variants remain polymorphic. In sum, we provide a framework to understand the factors influencing epistasis with single-nucleotide resolution, revealing widespread epistasis among beneficial variants.

    View details for DOI 10.1016/j.xgen.2023.100260

    View details for PubMedID 37082144

    View details for PubMedCentralID PMC10112194

  • Evolution of spatial and temporal cis-regulatory divergence in sticklebacks. Molecular biology and evolution Mack, K. L., Square, T. A., Zhao, B., Miller, C. T., Fraser, H. B. 2023

    Abstract

    Cis-regulatory changes are thought to play a major role in adaptation. Threespine sticklebacks have repeatedly colonized freshwater habitats in the Northern Hemisphere, where they have evolved a suite of phenotypes that distinguish them from marine populations, including changes in physiology, behavior, and morphology. To understand the role of gene regulatory evolution in adaptive divergence, here we investigate cis-regulatory changes in gene expression between marine and freshwater ecotypes through allele-specific expression (ASE) in F1 hybrids. Surveying seven ecologically relevant tissues, including three sampled across two developmental stages, we identified cis-regulatory divergence affecting a third of genes, nearly half of which were tissue-specific. Next, we compared allele-specific expression in dental tissues at two timepoints to characterize cis-regulatory changes during development between marine and freshwater fish. Applying a genome-wide test for selection on cis-regulatory changes, we find evidence for lineage-specific selection on several processes between ecotypes, including the Wnt signaling pathway in dental tissues. Finally, we show that genes with ASE, particularly those that are tissue-specific, are strongly enriched in genomic regions of repeated marine-freshwater divergence, supporting an important role for these cis-regulatory differences in parallel adaptive evolution of sticklebacks to freshwater habitats. Altogether, our results provide insight into the cis-regulatory landscape of divergence between stickleback ecotypes across tissues and during development and support a fundamental role for tissue-specific cis-regulatory changes in rapid adaptation to new environments.

    View details for DOI 10.1093/molbev/msad034

    View details for PubMedID 36805962

  • Accounting for cis-regulatory constraint prioritizes genes likely to affect species-specific traits. Genome biology Starr, A. L., Gokhman, D., Fraser, H. B. 2023; 24 (1): 11

    Abstract

    Measuring allele-specific expression in interspecies hybrids is a powerful way to detect cis-regulatory changes underlying adaptation. However, it remains difficult to identify genes most likely to explain species-specific traits. Here, we outline a simple strategy that leverages population-scale allele-specific RNA-seq data to identify genes that show constrained cis-regulation within species yet show divergence between species. Applying this strategy to data from human-chimpanzee hybrid cortical organoids, we identify signatures of lineage-specific selection on genes related to saccharide metabolism, neurodegeneration, and primary cilia. We also highlight cis-regulatory divergence in CUX1 and EDNRB that may shape the trajectory of human brain development.

    View details for DOI 10.1186/s13059-023-02846-8

    View details for PubMedID 36658652

  • Existing methods are effective at measuring natural selection on gene expression. Nature ecology & evolution Fraser, H. B. 2022

    View details for DOI 10.1038/s41559-022-01889-7

    View details for PubMedID 36344679

  • Bacterial Retrons Enable Precise Gene Editing in Human Cells Zhao, B. CELL PRESS. 2022: 102
  • Transcriptome diversity is a systematic source of variation in RNA-sequencing data. PLoS computational biology García-Nieto, P. E., Wang, B., Fraser, H. B. 2022; 18 (3): e1009939

    Abstract

    RNA sequencing has been widely used as an essential tool to probe gene expression. While standard practices have been established to analyze RNA-seq data, it is still challenging to interpret and remove artifactual signals. Several biological and technical factors such as sex, age, batches, and sequencing technology have been found to bias these estimates. Probabilistic estimation of expression residuals (PEER), which infers broad variance components in gene expression measurements, has been used to account for some systematic effects, but it has remained challenging to interpret these PEER factors. Here we show that transcriptome diversity, a simple metric based on Shannon entropy, explains a large portion of variability in gene expression and is the strongest known factor encoded in PEER factors. We then show that transcriptome diversity has significant associations with multiple technical and biological variables across diverse organisms and datasets. In sum, transcriptome diversity provides a simple explanation for a major source of variation in both gene expression estimates and PEER covariates.

    View details for DOI 10.1371/journal.pcbi.1009939

    View details for PubMedID 35324895
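
    As an illustration of the metric named in the abstract above, here is a minimal Python sketch of a Shannon-entropy-based diversity score for one sample. The function name `transcriptome_diversity`, the use of raw per-gene counts as input, and the normalization by the logarithm of the number of genes (so that values fall between 0 and 1) are assumptions made for this sketch; the abstract does not specify the exact definition used in the paper.

```python
import math


def transcriptome_diversity(counts):
    """Normalized Shannon entropy of one sample's expression profile.

    counts: per-gene expression values (non-negative numbers).
    Returns a value near 1 when expression is spread evenly across genes
    and near 0 when a single gene dominates. The division by log(number
    of genes) is an assumption of this sketch, not taken from the source.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no expression signal")
    # Convert counts to proportions; zero-count genes contribute nothing
    # to the entropy sum, so they are skipped to avoid log(0).
    proportions = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in proportions)
    n_genes = len(counts)
    if n_genes < 2:
        return 0.0
    return entropy / math.log(n_genes)


even_sample = transcriptome_diversity([10, 10, 10, 10])
skewed_sample = transcriptome_diversity([100, 0, 0, 0])
```

    In this toy example the evenly expressed sample scores at the top of the range and the sample dominated by one gene scores at the bottom, which is the kind of per-sample covariate the abstract describes.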

  • Bacterial Retrons Enable Precise Gene Editing in Human Cells. The CRISPR journal Zhao, B., Chen, S. A., Lee, J., Fraser, H. B. 2022

    Abstract

    Retrons are bacterial genetic elements involved in anti-phage defense. They have the unique ability to reverse transcribe RNA into multicopy single-stranded DNA (msDNA) that remains covalently linked to their template RNA. Retrons coupled with CRISPR-Cas9 in yeast have been shown to improve the efficiency of precise genome editing via homology-directed repair (HDR). In human cells, HDR editing efficiency has been limited by challenges associated with delivering extracellular donor DNA encoding the desired mutation. In this study, we tested the ability of retrons to produce msDNA as donor DNA and facilitate HDR by tethering msDNA to guide RNA in HEK293T and K562 cells. Through heterologous reconstitution of retrons from multiple bacterial species with the CRISPR-Cas9 system, we demonstrated HDR rates of up to 11.4%. Overall, our findings represent the first step in extending retron-based precise gene editing to human cells.

    View details for DOI 10.1089/crispr.2021.0065

    View details for PubMedID 35076284

  • cis-Regulatory changes in locomotor genes are associated with the evolution of burrowing behavior. Cell reports Hu, C. K., York, R. A., Metz, H. C., Bedford, N. L., Fraser, H. B., Hoekstra, H. E. 2022; 38 (7): 110360

    Abstract

    How evolution modifies complex, innate behaviors is largely unknown. Divergence in many morphological traits, and some behaviors, is linked to cis-regulatory changes in gene expression. Given this, we compare brain gene expression of two interfertile sister species of Peromyscus mice that show large and heritable differences in burrowing behavior. Species-level differential expression and allele-specific expression in F1 hybrids indicate a preponderance of cis-regulatory divergence, including many genes whose cis-regulation is affected by burrowing behavior. Genes related to locomotor coordination show the strongest signals of lineage-specific selection on burrowing-induced cis-regulatory changes. Furthermore, genetic markers closest to these candidate genes associate with variation in burrow shape in a genetic cross, suggesting an enrichment for loci affecting burrowing behavior near these candidate locomotor genes. Our results provide insight into how cis-regulated gene expression can depend on behavioral context and how this dynamic regulatory divergence between species may contribute to behavioral evolution.

    View details for DOI 10.1016/j.celrep.2022.110360

    View details for PubMedID 35172153

  • Divergent patterns of selection on metabolite levels and gene expression. BMC ecology and evolution Kern, A. F., Yang, G. X., Khosla, N. M., Ang, R. M., Snyder, M. P., Fraser, H. B. 2021; 21 (1): 185

    Abstract

    BACKGROUND: Natural selection can act on multiple genes in the same pathway, leading to polygenic adaptation. For example, adaptive changes were found to down-regulate six genes involved in ergosterol biosynthesis (an essential pathway targeted by many antifungal drugs) in some strains of the yeast Saccharomyces cerevisiae. However, the impact of this polygenic adaptation on metabolite levels was unknown. Here, we performed targeted mass spectrometry to measure the levels of eight metabolites in this pathway in 74 yeast strains from a genetic cross. RESULTS: Through quantitative trait locus (QTL) mapping we identified 19 loci affecting ergosterol pathway metabolite levels, many of which overlap loci that also impact gene expression within the pathway. We then used the recently developed v-test, which identified selection acting upon three metabolite levels within the pathway, none of which were predictable from the gene expression adaptation. CONCLUSIONS: These data showed that effects of selection on metabolite levels were complex and not predictable from gene expression data. This suggests that a deeper understanding of metabolism is necessary before we can understand the impacts of even relatively straightforward gene expression adaptations on metabolic pathways.

    View details for DOI 10.1186/s12862-021-01915-5

    View details for PubMedID 34587900

  • GRINS: Genetic elements that recode assembly-line polyketide synthases and accelerate their diversification. Proceedings of the National Academy of Sciences of the United States of America Nivina, A., Herrera Paredes, S., Fraser, H. B., Khosla, C. 2021; 118 (26)

    Abstract

    Assembly-line polyketide synthases (PKSs) are large and complex enzymatic machineries with a multimodular architecture, typically encoded in bacterial genomes by biosynthetic gene clusters. Their modularity has led to an astounding diversity of biosynthesized molecules, many with medical relevance. Thus, understanding the mechanisms that drive PKS evolution is fundamental for both functional prediction of natural PKSs as well as for the engineering of novel PKSs. Here, we describe a repetitive genetic element in assembly-line PKS genes which appears to play a role in accelerating the diversification of closely related biosynthetic clusters. We named this element GRINS: genetic repeats of intense nucleotide skews. GRINS appear to recode PKS protein regions with a biased nucleotide composition and to promote gene conversion. GRINS are present in a large number of assembly-line PKS gene clusters and are particularly widespread in the actinobacterial genus Streptomyces. While the molecular mechanisms associated with GRINS appearance, dissemination, and maintenance are unknown, the presence of GRINS in a broad range of bacterial phyla and gene families indicates that these genetic elements could play a fundamental role in protein evolution.

    View details for DOI 10.1073/pnas.2100751118

    View details for PubMedID 34162709

  • The cis-regulatory effects of modern human-specific variants. eLife Weiss, C. V., Harshman, L., Inoue, F., Fraser, H. B., Petrov, D. A., Ahituv, N., Gokhman, D. 2021; 10

    Abstract

    The Neanderthal and Denisovan genomes enabled the discovery of sequences that differ between modern and archaic humans, the majority of which are noncoding. However, our understanding of the regulatory consequences of these differences remains limited, in part due to the decay of regulatory marks in ancient samples. Here, we used a massively parallel reporter assay in embryonic stem cells, neural progenitor cells, and bone osteoblasts to investigate the regulatory effects of the 14,042 single-nucleotide modern human-specific variants. Overall, 1791 (13%) of sequences containing these variants showed active regulatory activity, and 407 (23%) of these drove differential expression between human groups. Differentially active sequences were associated with divergent transcription factor binding motifs, and with genes enriched for vocal tract and brain anatomy and function. This work provides insight into the regulatory function of variants that emerged along the modern human lineage and the recent evolution of human gene expression.

    View details for DOI 10.7554/eLife.63713

    View details for PubMedID 33885362

  • Lineage-specific selection and the evolution of virulence in the Candida clade. Proceedings of the National Academy of Sciences of the United States of America Singh-Babak, S. D., Babak, T., Fraser, H. B., Johnson, A. D. 2021; 118 (12)

    Abstract

    Candida albicans is the most common cause of systemic fungal infections in humans and is considerably more virulent than its closest known relative, Candida dubliniensis. To investigate this difference, we constructed interspecies hybrids and quantified mRNA levels produced from each genome in the hybrid. This approach systematically identified expression differences in orthologous genes arising from cis-regulatory sequence changes that accumulated since the two species last shared a common ancestor, some 10 million y ago. We documented many orthologous gene-expression differences between the two species, and we pursued one striking observation: All 15 genes coding for the enzymes of glycolysis showed higher expression from the C. albicans genome than the C. dubliniensis genome in the interspecies hybrid. This pattern requires evolutionary changes to have occurred at each gene; the fact that they all act in the same direction strongly indicates lineage-specific natural selection as the underlying cause. To test whether these expression differences contribute to virulence, we created a C. dubliniensis strain in which all 15 glycolysis genes were produced at modestly elevated levels and found that this strain had significantly increased virulence in the standard mouse model of systemic infection. These results indicate that small expression differences across a deeply conserved set of metabolism enzymes can play a significant role in the evolution of virulence in fungal pathogens.

    View details for DOI 10.1073/pnas.2016818118

    View details for PubMedID 33723044
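
    The directional argument in the abstract above (all 15 glycolysis genes biased toward the same genome) can be illustrated with a simple two-sided sign test. This is a minimal Python sketch under a simplifying null model in which each gene is independently and equally likely to be biased toward either species; the function name `sign_test_two_sided` is hypothetical, and the abstract does not state which statistical test the authors used.

```python
from math import comb


def sign_test_two_sided(n_up, n_total):
    """Two-sided binomial sign test with success probability 0.5.

    n_up: number of genes biased toward one species' allele.
    n_total: number of genes examined.
    Assumes genes are independent, which is a simplification made
    for this sketch.
    """
    if not 0 <= n_up <= n_total:
        raise ValueError("n_up must be between 0 and n_total")
    # Use the more extreme direction, then double the one-sided tail.
    k = max(n_up, n_total - n_up)
    tail = sum(comb(n_total, i) for i in range(k, n_total + 1)) / 2 ** n_total
    return min(1.0, 2 * tail)


# 15 of 15 genes in the same direction, as described in the abstract.
p_all_same_direction = sign_test_two_sided(15, 15)
```

    Under this null model the chance that all 15 genes point the same way is 2 × 0.5^15, roughly 6 in 100,000, which is why a uniform direction of change is read as evidence for lineage-specific selection rather than neutral drift.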

  • Harnessing novel gene expression analyses to identify drivers of regenerative ear wound healing in MRL mice desJardins-Park, H. E., Mack, K. L., Davitt, M. F., Griffin, M., Fraser, H. B., Longaker, M. T. WILEY. 2020: S25
  • Molecular mechanisms of coronary disease revealed using quantitative trait loci for TCF21 binding, chromatin accessibility, and chromosomal looping. Genome biology Zhao, Q., Dacre, M., Nguyen, T., Pjanic, M., Liu, B., Iyer, D., Cheng, P., Wirka, R., Kim, J. B., Fraser, H. B., Quertermous, T. 2020; 21 (1): 135

    Abstract

    To investigate the epigenetic and transcriptional mechanisms of coronary artery disease (CAD) risk, as well as the functional regulation of chromatin structure and function, we create a catalog of genetic variants associated with three stages of transcriptional cis-regulation in primary human coronary artery vascular smooth muscle cells (HCASMCs). We use a pooling approach with HCASMC lines to map regulatory variants that mediate binding of the CAD-associated transcription factor TCF21 with ChIPseq studies (bQTLs), variants that regulate chromatin accessibility with ATACseq studies (caQTLs), and chromosomal looping with Hi-C methods (clQTLs). We examine the overlap of these QTLs and their relationship to smooth muscle-specific genes and transcription factors. Further, we use multiple analyses to show that these QTLs are highly associated with CAD GWAS loci and correlate to lead SNPs where they show allelic effects. By utilizing genome editing, we verify that identified functional variants can regulate both chromatin accessibility and chromosomal looping, providing new insights into functional mechanisms regulating chromatin state and chromosomal structure. Finally, we directly link the disease-associated TGFB1-SMAD3 pathway to the CAD-associated FN1 gene through a response QTL that modulates both chromatin accessibility and chromosomal looping. Together, these studies represent the most thorough mapping of multiple QTL types in a highly disease-relevant primary cultured cell type and provide novel insights into their functional overlap and mechanisms that underlie these genomic features and their relationship to disease risk.

    View details for DOI 10.1186/s13059-020-02049-5

    View details for PubMedID 32513244

  • Fine-mapping cis-regulatory variants in diverse human populations. eLife Tehranchi, A., Hie, B., Dacre, M., Kaplow, I., Pettie, K., Combs, P., Fraser, H. B. 2019; 8
  • Tissue-Specific cis-Regulatory Divergence Implicates eloF in Inhibiting Interspecies Mating in Drosophila. Current biology Combs, P. A., Krupp, J. J., Khosla, N. M., Bua, D., Petrov, D. A., Levine, J. D., Fraser, H. B. 2018; 28 (24): 3969-+
  • Comparative expression profiling reveals widespread coordinated evolution of gene expression across eukaryotes. Nature communications Martin, T., Fraser, H. B. 2018; 9 (1): 4963

    Abstract

    Comparative studies of gene expression across species have revealed many important insights, but have also been limited by the number of species represented. Here we develop an approach to identify orthologs between highly diverged transcriptome assemblies, and apply this to 657 RNA-seq gene expression profiles from 309 diverse unicellular eukaryotes. We analyzed the resulting data for coevolutionary patterns, and identify several hundred protein complexes and pathways whose expression levels have evolved in a coordinated fashion across the trillions of generations separating these species, including many gene sets with little or no within-species co-expression across environmental or genetic perturbations. We also detect examples of adaptive evolution, for example of tRNA ligase levels to match genome-wide codon usage. In sum, we find that comparative studies from extremely diverse organisms can reveal new insights into the evolution of gene expression, including coordinated evolution of some of the most conserved protein complexes in eukaryotes.

    View details for PubMedID 30470754

  • Behavior-dependent cis regulation reveals genes and pathways associated with bower building in cichlid fishes. Proceedings of the National Academy of Sciences of the United States of America York, R. A., Patil, C., Abdilleh, K., Johnson, Z. V., Conte, M. A., Genner, M. J., McGrath, P. T., Fraser, H. B., Fernald, R. D., Streelman, J. T. 2018; 115 (47): E11081-E11090

    Abstract

    Many behaviors are associated with heritable genetic variation [Kendler and Greenspan (2006) Am J Psychiatry 163:1683-1694]. Genetic mapping has revealed genomic regions or, in a few cases, specific genes explaining part of this variation [Bendesky and Bargmann (2011) Nat Rev Gen 12:809-820]. However, the genetic basis of behavioral evolution remains unclear. Here we investigate the evolution of an innate extended phenotype, bower building, among cichlid fishes of Lake Malawi. Males build bowers of two types, pits or castles, to attract females for mating. We performed comparative genome-wide analyses of 20 bower-building species and found that these phenotypes have evolved multiple times with thousands of genetic variants strongly associated with this behavior, suggesting a polygenic architecture. Remarkably, F1 hybrids of a pit-digging and a castle-building species perform sequential construction of first a pit and then a castle bower. Analysis of brain gene expression in these hybrids showed that genes near behavior-associated variants display behavior-dependent allele-specific expression with preferential expression of the pit-digging species allele during pit digging and of the castle-building species allele during castle building. These genes are highly enriched for functions related to neurodevelopment and neural plasticity. Our results suggest that natural behaviors are associated with complex genetic architectures that alter behavior via cis-regulatory differences whose effects on gene expression are specific to the behavior itself.

    View details for PubMedID 30397142

  • Spatially varying cis-regulatory divergence in Drosophila embryos elucidates cis-regulatory logic. PLoS genetics Combs, P. A., Fraser, H. B. 2018; 14 (11): e1007631

    Abstract

    Spatial patterning of gene expression is a key process in development, yet how it evolves is still poorly understood. Both cis- and trans-acting changes could participate in complex interactions, so to isolate the cis-regulatory component of patterning evolution, we measured allele-specific spatial gene expression patterns in D. melanogaster × simulans hybrid embryos. RNA-seq of cryo-sectioned slices revealed 66 genes with strong spatially varying allele-specific expression. We found that hunchback, a major regulator of developmental patterning, had reduced expression of the D. simulans allele specifically in the anterior tip of hybrid embryos. Mathematical modeling of hunchback cis-regulation suggested a candidate transcription factor binding site variant, which we verified as causal using CRISPR-Cas9 genome editing. In sum, even comparing morphologically near-identical species we identified surprisingly extensive spatial variation in gene expression, suggesting not only that development is robust to many such changes, but also that natural selection may have ample raw material for evolving new body plans via changes in spatial patterning.

    View details for PubMedID 30383747

  • Improving Estimates of Compensatory cis-trans Regulatory Divergence. Trends in genetics : TIG Fraser, H. B. 2019; 35 (1): 3-5

    Abstract

    Interspecific hybrids have played a key role in research on gene expression regulation. A growing number of studies have measured genome-wide allele-specific expression in hybrids and observed that cis-regulatory changes often oppose trans-acting changes affecting the same genes, suggesting stabilizing selection for compensatory changes. However, the most common method for estimating these effects is biased, producing artifactual patterns of compensatory evolution. Here I introduce a simple modification leveraging biological replicates that ameliorates the bias.

    View details for PubMedID 30270122

  • High-resolution mapping of cis-regulatory variation in budding yeast. Proceedings of the National Academy of Sciences of the United States of America Kita, R., Venkataram, S., Zhou, Y., Fraser, H. B. 2017; 114 (50): E10736-E10744

    Abstract

    Genetic variants affecting gene-expression levels are a major source of phenotypic variation. The approximate locations of these variants can be mapped as expression quantitative trait loci (eQTLs); however, a major limitation of eQTLs is their low resolution, which precludes investigation of the causal variants and their molecular mechanisms. Here we report RNA-seq and full genome sequences for 85 diverse isolates of the yeast Saccharomyces cerevisiae (including wild, domesticated, and human clinical strains), which allowed us to perform eQTL mapping with 50-fold higher resolution than previously possible. In addition to variants in promoters, we uncovered an important role for variants in 3'UTRs, especially those affecting binding of the PUF family of RNA-binding proteins. The eQTLs are predominantly under negative selection, particularly those affecting essential genes and conserved genes. However, applying the sign test for lineage-specific selection revealed the polygenic up-regulation of dozens of biofilm suppressor genes in strains isolated from human patients, consistent with the key role of biofilms in fungal pathogenicity. In addition, a single variant in the promoter of a biofilm suppressor, NIT3, showed the strongest genome-wide association with clinical origin. Altogether, our results demonstrate the power of high-resolution eQTL mapping in understanding the molecular mechanisms of regulatory variation, as well as the natural selection acting on this variation that drives adaptation to environments, ranging from laboratories to vineyards to the human body.

    View details for PubMedID 29183975

  • Worldwide patterns of human epigenetic variation. Nature ecology & evolution Carja, O., MacIsaac, J. L., Mah, S. M., Henn, B. M., Kobor, M. S., Feldman, M. W., Fraser, H. B. 2017; 1 (10): 1577-1583

    Abstract

    DNA methylation is an epigenetic modification, influenced by both genetic and environmental variation, that plays a key role in transcriptional regulation and many organismal phenotypes. Although patterns of DNA methylation have been shown to differ between human populations, it remains to be determined how epigenetic diversity relates to the patterns of genetic and gene expression variation at a global scale. Here we measured DNA methylation at 485,000 CpG sites in five diverse human populations, and analysed these data together with genome-wide genotype and gene expression data. We found that population-specific DNA methylation mirrors genetic variation, and has greater local genetic control than mRNA levels. We estimated the rate of epigenetic divergence between populations, which indicates far greater evolutionary stability of DNA methylation in humans than has been observed in plants. This study provides a deeper understanding of worldwide patterns of human epigenetic diversity, as well as initial estimates of the rate of epigenetic divergence in recent human evolution.

    View details for DOI 10.1038/s41559-017-0299-z

    View details for PubMedID 29185505

  • Cis-regulatory evolution in prokaryotes revealed by interspecific archaeal hybrids. Scientific reports Artieri, C. G., Naor, A., Turgeman-Grott, I., Zhou, Y., York, R., Gophna, U., Fraser, H. B. 2017; 7: 3986

    Abstract

    The study of allele-specific expression (ASE) in interspecific hybrids has played a central role in our understanding of a wide range of phenomena, including genomic imprinting, X-chromosome inactivation, and cis-regulatory evolution. However, across the hundreds of studies of hybrid ASE, all have been restricted to sexually reproducing eukaryotes, leaving a major gap in our understanding of the genomic patterns of cis-regulatory evolution in prokaryotes. Here we introduce a method to generate stable hybrids between two species of halophilic archaea, and measure genome-wide ASE in these hybrids with RNA-seq. We found that over half of all genes have significant ASE, and that genes encoding kinases show evidence of lineage-specific selection on their cis-regulation. This pattern of polygenic selection suggested species-specific adaptation to low phosphate conditions, which we confirmed with growth experiments. Altogether, our work extends the study of ASE to archaea, and suggests that cis-regulation can evolve under polygenic lineage-specific selection in prokaryotes.

    View details for PubMedID 28638059

  • Local Adaptation of Sun-Exposure-Dependent Gene Expression Regulation in Human Skin. PLoS genetics Kita, R., Fraser, H. B. 2016; 12 (10)

    Abstract

    Sun-exposure is a key environmental variable in the study of human evolution. Several skin-pigmentation genes serve as classical examples of positive selection, suggesting that sun-exposure has significantly shaped worldwide genomic variation. Here we investigate the interaction between genetic variation and sun-exposure, and how this impacts gene expression regulation. Using RNA-Seq data from 607 human skin samples, we identified thousands of transcripts that are differentially expressed between sun-exposed skin and non-sun-exposed skin. We then tested whether genetic variants may influence each individual's gene expression response to sun-exposure. Our analysis revealed 10 sun-exposure-dependent gene expression quantitative trait loci (se-eQTLs), including genes involved in skin pigmentation (SLC45A2) and epidermal differentiation (RASSF9). The allele frequencies of the RASSF9 se-eQTL across diverse populations correlate with the magnitude of solar radiation experienced by these populations, suggesting local adaptation to varying levels of sunlight. These results provide the first examples of sun-exposure-dependent regulatory variation and suggest that this variation has contributed to recent human adaptation.

    View details for DOI 10.1371/journal.pgen.1006382

    View details for PubMedID 27760139

    View details for PubMedCentralID PMC5070784

  • Genetic variation in MHC proteins is associated with T cell receptor expression biases. Nature genetics Sharon, E., Sibener, L. V., Battle, A., Fraser, H. B., Garcia, K. C., Pritchard, J. K. 2016; 48 (9): 995-1002

    Abstract

    In each individual, a highly diverse T cell receptor (TCR) repertoire interacts with peptides presented by major histocompatibility complex (MHC) molecules. Despite extensive research, it remains controversial whether germline-encoded TCR-MHC contacts promote TCR-MHC specificity and, if so, whether differences exist in TCR V gene compatibilities with different MHC alleles. We applied expression quantitative trait locus (eQTL) mapping to test for associations between genetic variation and TCR V gene usage in a large human cohort. We report strong trans associations between variation in the MHC locus and TCR V gene usage. Fine-mapping of the association signals identifies specific amino acids from MHC genes that bias V gene usage, many of which contact or are spatially proximal to the TCR or peptide in the TCR-peptide-MHC complex. Hence, these MHC variants, several of which are linked to autoimmune diseases, can directly affect TCR-MHC interaction. These results provide the first examples of trans-QTL effects mediated by protein-protein interactions and are consistent with intrinsic TCR-MHC specificity.

    View details for DOI 10.1038/ng.3625

    View details for PubMedID 27479906

    View details for PubMedCentralID PMC5010864

  • Disentangling Sources of Selection on Exonic Transcriptional Enhancers. Molecular biology and evolution Agoglia, R. M., Fraser, H. B. 2016; 33 (2): 585-590

    Abstract

    In addition to coding for proteins, exons can also impact transcription by encoding regulatory elements such as enhancers. It has been debated whether such features confer heightened selective constraint, or evolve neutrally. We have addressed this question by developing a new approach to disentangle the sources of selection acting on exonic enhancers, in which we model the evolutionary rates of every possible substitution as a function of their effects on both protein sequence and enhancer activity. In three exonic enhancers, we found no significant association between evolutionary rates and effects on enhancer activity. This suggests that despite having biochemical activity, these exonic enhancers have no detectable selective constraint, and thus are unlikely to play a major role in protein evolution.

    View details for DOI 10.1093/molbev/msv234

    View details for PubMedID 26500252

    View details for PubMedCentralID PMC4909131

  • A pooling-based approach to mapping genetic variants associated with DNA methylation GENOME RESEARCH Kaplow, I. M., MacIsaac, J. L., Mah, S. M., McEwen, L. M., Kobor, M. S., Fraser, H. B. 2015; 25 (6): 907-917

    Abstract

    DNA methylation is an epigenetic modification that plays a key role in gene regulation. Previous studies have investigated its genetic basis by mapping genetic variants that are associated with DNA methylation at specific sites, but these have been limited to microarrays that cover <2% of the genome and cannot account for allele-specific methylation (ASM). Other studies have performed whole-genome bisulfite sequencing on a few individuals, but these lack statistical power to identify variants associated with DNA methylation. We present a novel approach in which bisulfite-treated DNA from many individuals is sequenced together in a single pool, resulting in a truly genome-wide map of DNA methylation. Compared to methods that do not account for ASM, our approach increases statistical power to detect associations while sharply reducing cost, effort, and experimental variability. As a proof of concept, we generated deep sequencing data from a pool of 60 human cell lines; we evaluated almost twice as many CpGs as the largest microarray studies and identified more than 2000 genetic variants associated with DNA methylation. We found that these variants are highly enriched for associations with chromatin accessibility and CTCF binding but are less likely to be associated with traits indirectly linked to DNA, such as gene expression and disease phenotypes. In summary, our approach allows genome-wide mapping of genetic variants associated with DNA methylation in any tissue of any species, without the need for individual-level genotype or methylation data.

    View details for DOI 10.1101/gr.183749.114

    View details for Web of Science ID 000355565900012

    View details for PubMedID 25910490

    View details for PubMedCentralID PMC4448686

  • Common variants spanning PLK4 are associated with mitotic-origin aneuploidy in human embryos SCIENCE McCoy, R. C., Demko, Z., Ryan, A., Banjevic, M., Hill, M., Sigurjonsson, S., Rabinowitz, M., Fraser, H. B., Petrov, D. A. 2015; 348 (6231): 235-238

    Abstract

    Aneuploidy, the inheritance of an atypical chromosome complement, is common in early human development and is the primary cause of pregnancy loss. By screening day-3 embryos during in vitro fertilization cycles, we identified an association between aneuploidy of putative mitotic origin and linked genetic variants on chromosome 4 of maternal genomes. This associated region contains a candidate gene, Polo-like kinase 4 (PLK4), that plays a well-characterized role in centriole duplication and has the ability to alter mitotic fidelity upon minor dysregulation. Mothers with the high-risk genotypes contributed fewer embryos for testing at day 5, suggesting that their embryos are less likely to survive to blastocyst formation. The associated region coincides with a signature of a selective sweep in ancient humans, suggesting that the causal variant was either the target of selection or hitchhiked to substantial frequency.

    View details for DOI 10.1126/science.aaa3337

    View details for Web of Science ID 000352613700046

    View details for PubMedID 25859044

  • Discordance of DNA Methylation Variance Between two Accessible Human Tissues. Scientific reports Jiang, R., Jones, M. J., Chen, E., Neumann, S. M., Fraser, H. B., Miller, G. E., Kobor, M. S. 2015; 5: 8257

    Abstract

    Population epigenetic studies have been seeking to identify differences in DNA methylation between specific exposures, demographic factors, or diseases in accessible tissues, but relatively little is known about how inter-individual variability differs between these tissues. This study presents an analysis of DNA methylation differences between matched peripheral blood mononuclear cells (PBMCs) and buccal epithelial cells (BECs), the two most accessible tissues for population studies, in 998 promoter-located CpG sites. Specifically, we compared probe-wise DNA methylation variance, and how this variance related to demographic factors across the two tissues. PBMCs had overall higher DNA methylation than BECs, and the two tissues tended to differ most at genomic regions of low CpG density. Furthermore, although both tissues showed appreciable probe-wise variability, the specific regions and magnitude of variability differed strongly between tissues. Lastly, through exploratory association analysis, we found indication of differential association of BEC and PBMC with demographic variables. The work presented here offers insight into variability of DNA methylation between individuals and across tissues and helps guide decisions on the suitability of buccal epithelial or peripheral mononuclear cells for the biological questions explored by epigenetic studies in human populations.

    View details for DOI 10.1038/srep08257

    View details for PubMedID 25660083

    View details for PubMedCentralID PMC4321176

  • Accounting for biases in riboprofiling data indicates a major role for proline in stalling translation GENOME RESEARCH Artieri, C. G., Fraser, H. B. 2014; 24 (12): 2011-2021

    Abstract

    The recent advent of ribosome profiling (sequencing of short ribosome-bound fragments of mRNA) has offered an unprecedented opportunity to interrogate the sequence features responsible for modulating translational rates. Nevertheless, numerous analyses of the first riboprofiling data set have produced equivocal and often incompatible results. Here we analyze three independent yeast riboprofiling data sets, including two with much higher coverage than previously available, and find that all three show substantial technical sequence biases that confound interpretations of ribosomal occupancy. After accounting for these biases, we find no effect of previously implicated factors on ribosomal pausing. Rather, we find that incorporation of proline, whose unique side-chain stalls peptide synthesis in vitro, also slows the ribosome in vivo. We also reanalyze a method that implicated positively charged amino acids as the major determinant of ribosomal stalling and demonstrate that it produces false signals of stalling in low-coverage data. Our results suggest that any analysis of riboprofiling data should account for sequencing biases and sparse coverage. To this end, we establish a robust methodology that enables analysis of ribosome profiling data without prior assumptions regarding which positions spanned by the ribosome cause stalling.

    View details for DOI 10.1101/gr.175893.114

    View details for Web of Science ID 000345810600009

    View details for PubMedID 25294246

    View details for PubMedCentralID PMC4248317

  • Transcript Length Mediates Developmental Timing of Gene Expression Across Drosophila MOLECULAR BIOLOGY AND EVOLUTION Artieri, C. G., Fraser, H. B. 2014; 31 (11): 2879-2889

    Abstract

    The time required to transcribe genes with long primary transcripts may limit their ability to be expressed in cells with short mitotic cycles, a phenomenon termed intron delay. As such short cycles are a hallmark of the earliest stages of insect development, we tested the impact of intron delay on the Drosophila developmental transcriptome. We find that long zygotically expressed genes show substantial delay in expression relative to their shorter counterparts, which is not observed for maternally deposited transcripts. Patterns of RNA-seq coverage along transcripts show that this delay is consistent with their inability to completely transcribe long transcripts, but not with transcriptional initiation-based regulatory control. We further show that highly expressed zygotic genes maintain compact transcribed regions across the Drosophila phylogeny, allowing conservation of embryonic expression patterns. We propose that the physical constraints of intron delay affect patterns of expression and the evolution of gene structure of a substantial portion of the Drosophila transcriptome.

    View details for DOI 10.1093/molbev/msu226

    View details for Web of Science ID 000344622800005

    View details for PubMedID 25069653

    View details for PubMedCentralID PMC4209130

  • Evolution at two levels of gene expression in yeast. Genome research Artieri, C. G., Fraser, H. B. 2014; 24 (3): 411-421

    Abstract

    Despite the greater functional importance of protein levels, our knowledge of gene expression evolution is based almost entirely on studies of mRNA levels. In contrast, our understanding of how translational regulation evolves has lagged far behind. Here we have applied ribosome profiling, which measures both global mRNA levels and their translation rates, to two species of Saccharomyces yeast and their interspecific hybrid in order to assess the relative contributions of changes in mRNA abundance and translation to regulatory evolution. We report that both cis- and trans-acting regulatory divergence in translation are abundant, affecting at least 35% of genes. The majority of translational divergence acts to buffer changes in mRNA abundance, suggesting a widespread role for stabilizing selection acting across regulatory levels. Nevertheless, we observe evidence of lineage-specific selection acting on several yeast functional modules, including instances of reinforcing selection acting at both levels of regulation. Finally, we also uncover multiple instances of stop-codon readthrough that are conserved between species. Our analysis reveals the underappreciated complexity of post-transcriptional regulatory divergence and indicates that partitioning the search for the locus of selection into the binary categories of "coding" versus "regulatory" may overlook a significant source of selection, acting at multiple regulatory levels along the path from genotype to phenotype.

    View details for DOI 10.1101/gr.165522.113

    View details for PubMedID 24318729

  • A Novel Test for Selection on cis-Regulatory Elements Reveals Positive and Negative Selection Acting on Mammalian Transcriptional Enhancers. Molecular biology and evolution Smith, J. D., McManus, K. F., Fraser, H. B. 2013; 30 (11): 2509-2518

    Abstract

    Measuring natural selection on genomic elements involved in the cis-regulation of gene expression, such as transcriptional enhancers and promoters, is critical for understanding the evolution of genomes, yet it remains a major challenge. Many studies have attempted to detect positive or negative selection in these noncoding elements by searching for those with the fastest or slowest rates of evolution, but this can be problematic. Here, we introduce a new approach to this issue, and demonstrate its utility on three mammalian transcriptional enhancers. Using results from saturation mutagenesis studies of these enhancers, we classified all possible point mutations as upregulating, downregulating, or silent, and determined which of these mutations have occurred on each branch of a phylogeny. Applying a framework analogous to Ka/Ks in protein-coding genes, we measured the strength of selection on upregulating and downregulating mutations, in specific branches as well as entire phylogenies. We discovered distinct modes of selection acting on different enhancers: although all three have experienced negative selection against downregulating mutations, the selection pressures on upregulating mutations vary. In one case, we detected positive selection for upregulation, whereas the other two had no detectable selection on upregulating mutations. Our methodology is applicable to the growing number of saturation mutagenesis data sets, and provides a detailed picture of the mode and strength of natural selection acting on cis-regulatory elements.

    View details for DOI 10.1093/molbev/mst134

    View details for PubMedID 23904330

    View details for PubMedCentralID PMC3808868

  • Ancient cis-regulatory constraints and the evolution of genome architecture TRENDS IN GENETICS Irimia, M., Maeso, I., Roy, S. W., Fraser, H. B. 2013; 29 (9): 521-528

    Abstract

    The order of genes along metazoan chromosomes has generally been thought to be largely random, with few implications for organismal function. However, two recent studies, reporting hundreds of pairs of genes that have remained linked in diverse metazoan species over hundreds of millions of years of evolution, suggest widespread functional implications for gene order. These associations appear to largely reflect cis-regulatory constraints, with either (i) multiple genes sharing transcriptional regulatory elements, or (ii) regulatory elements for a developmental gene being found within a neighboring 'bystander' gene (known as a genomic regulatory block). We discuss implications, questions raised, and new research directions arising from these studies, as well as evidence for similar phenomena in other eukaryotic groups.

    View details for DOI 10.1016/j.tig.2013.05.008

    View details for Web of Science ID 000324284000006

    View details for PubMedID 23791467

  • The molecular mechanism of a cis-regulatory adaptation in yeast. PLoS genetics Chang, J., Zhou, Y., Hu, X., Lam, L., Henry, C., Green, E. M., Kita, R., Kobor, M. S., Fraser, H. B. 2013; 9 (9)

    Abstract

    Despite recent advances in our ability to detect adaptive evolution involving the cis-regulation of gene expression, our knowledge of the molecular mechanisms underlying these adaptations has lagged far behind. Across all model organisms, the causal mutations have been discovered for only a handful of gene expression adaptations, and even for these, mechanistic details (e.g. the trans-regulatory factors involved) have not been determined. We previously reported a polygenic gene expression adaptation involving down-regulation of the ergosterol biosynthesis pathway in the budding yeast Saccharomyces cerevisiae. Here we investigate the molecular mechanism of a cis-acting mutation affecting a member of this pathway, ERG28. We show that the causal mutation is a two-base deletion in the promoter of ERG28 that strongly reduces the binding of two transcription factors, Sok2 and Mot3, thus abolishing their regulation of ERG28. This down-regulation increases resistance to a widely used antifungal drug targeting ergosterol, similar to mutations disrupting this pathway in clinical yeast isolates. The identification of the causal genetic variant revealed that the selection likely occurred after the deletion was already present at high frequency in the population, rather than when it was a new mutation. These results provide a detailed view of the molecular mechanism of a cis-regulatory adaptation, and underscore the importance of this view to our understanding of evolution at the molecular level.

    View details for DOI 10.1371/journal.pgen.1003813

    View details for PubMedID 24068973

  • Cell-cycle regulated transcription associates with DNA replication timing in yeast and human GENOME BIOLOGY Fraser, H. B. 2013; 14 (10)

    Abstract

    Eukaryotic DNA replication follows a specific temporal program, with some genomic regions consistently replicating earlier than others, yet what determines this program is largely unknown. Highly transcribed regions have been observed to replicate in early S-phase in all plant and animal species studied to date, but this relationship is thought to be absent from both budding yeast and fission yeast. No association between cell-cycle regulated transcription and replication timing has been reported for any species. Here I show that in budding yeast, fission yeast, and human, the genes most highly transcribed during S-phase replicate early, whereas those repressed in S-phase replicate late. Transcription during other cell-cycle phases shows either the opposite correlation with replication timing, or no relation. The relationship is strongest near late-firing origins of replication, which is not consistent with a previously proposed model (that replication timing may affect transcription) and instead suggests a potential mechanism involving the recruitment of limiting replication initiation factors during S-phase. These results suggest that S-phase transcription may be an important determinant of DNA replication timing across eukaryotes, which may explain the well-established association between transcription and replication timing.

    View details for DOI 10.1186/gb-2013-14-10-r111

    View details for Web of Science ID 000329387500004

    View details for PubMedCentralID PMC3983658

  • Differences in enhancer activity in mouse and zebrafish reporter assays are often associated with changes in gene expression BMC GENOMICS Ariza-Cosano, A., Visel, A., Pennacchio, L. A., Fraser, H. B., Luis Gomez-Skarmeta, J., Irimia, M., Bessa, J. 2012; 13

    Abstract

    Phenotypic evolution in animals is thought to be driven in large part by differences in gene expression patterns, which can result from sequence changes in cis-regulatory elements (cis-changes) or from changes in the expression pattern or function of transcription factors (trans-changes). While isolated examples of trans-changes have been identified, the scale of their overall contribution to regulatory and phenotypic evolution remains unclear. Here, we attempt to examine the prevalence of trans-effects and their potential impact on gene expression patterns in vertebrate evolution by comparing the function of identical human tissue-specific enhancer sequences in two highly divergent vertebrate model systems, mouse and zebrafish. Among 47 human conserved non-coding elements (CNEs) tested in transgenic mouse embryos and in stable zebrafish lines, at least one species-specific expression domain was observed in the majority (83%) of cases, and 36% presented dramatically different expression patterns between the two species. Although some of these discrepancies may be due to the use of different transgenesis systems in mouse and zebrafish, in some instances we found an association between differences in enhancer activity and changes in the endogenous gene expression patterns between mouse and zebrafish, suggesting a potential role for trans-changes in the evolution of gene expression. In total, our results: (i) serve as a cautionary tale for studies investigating the role of human enhancers in different model organisms, and (ii) suggest that changes in the trans environment may play a significant role in the evolution of gene expression in vertebrates.

    View details for DOI 10.1186/1471-2164-13-713

    View details for Web of Science ID 000313248200001

    View details for PubMedID 23253453

    View details for PubMedCentralID PMC3541358

  • Extensive conservation of ancient microsynteny across metazoans due to cis-regulatory constraints GENOME RESEARCH Irimia, M., Tena, J. J., Alexis, M. S., Fernandez-Minan, A., Maeso, I., Bogdanovic, O., de la Calle-Mustienes, E., Roy, S. W., Gomez-Skarmeta, J. L., Fraser, H. B. 2012; 22 (12): 2356-2367

    Abstract

    The order of genes in eukaryotic genomes has generally been assumed to be neutral, since gene order is largely scrambled over evolutionary time. Only a handful of exceptional examples are known, typically involving deeply conserved clusters of tandemly duplicated genes (e.g., Hox genes and histones). Here we report the first systematic survey of microsynteny conservation across metazoans, utilizing 17 genome sequences. We identified nearly 600 pairs of unrelated genes that have remained tightly physically linked in diverse lineages across over 600 million years of evolution. Integrating sequence conservation, gene expression data, gene function, epigenetic marks, and other genomic features, we provide extensive evidence that many conserved ancient linkages involve (1) the coordinated transcription of neighboring genes, or (2) genomic regulatory blocks (GRBs) in which transcriptional enhancers controlling developmental genes are contained within nearby bystander genes. In addition, we generated ChIP-seq data for key histone modifications in zebrafish embryos, which provided further evidence of putative GRBs in embryonic development. Finally, using chromosome conformation capture (3C) assays and stable transgenic experiments, we demonstrate that enhancers within bystander genes drive the expression of genes such as Otx and Islet, critical regulators of central nervous system development across bilaterians. These results suggest that ancient genomic functional associations are far more common than previously thought (involving ∼12% of the ancestral bilaterian genome) and that cis-regulatory constraints are crucial in determining metazoan genome architecture.

    View details for DOI 10.1101/gr.139725.112

    View details for Web of Science ID 000311895500004

    View details for PubMedID 22722344

    View details for PubMedCentralID PMC3514665

  • Factors underlying variable DNA methylation in a human community cohort PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Lam, L. L., Emberly, E., Fraser, H. B., Neumann, S. M., Chen, E., Miller, G. E., Kobor, M. S. 2012; 109: 17253-17260

    Abstract

    Epigenetics is emerging as an attractive mechanism to explain the persistent genomic embedding of early-life experiences. Tightly linked to chromatin, which packages DNA into chromosomes, epigenetic marks primarily serve to regulate the activity of genes. DNA methylation is the most accessible and characterized component of the many chromatin marks that constitute the epigenome, making it an ideal target for epigenetic studies in human populations. Here, using peripheral blood mononuclear cells collected from a community-based cohort stratified for early-life socioeconomic status, we measured DNA methylation in the promoter regions of more than 14,000 human genes. Using this approach, we broadly assessed and characterized epigenetic variation, identified some of the factors that sculpt the epigenome, and determined its functional relation to gene expression. We found that the leukocyte composition of peripheral blood covaried with patterns of DNA methylation at many sites, as did demographic factors, such as sex, age, and ethnicity. Furthermore, psychosocial factors, such as perceived stress, and cortisol output were associated with DNA methylation, as was early-life socioeconomic status. Interestingly, we determined that DNA methylation was strongly correlated to the ex vivo inflammatory response of peripheral blood mononuclear cells to stimulation with microbial products that engage Toll-like receptors. In contrast, our work found limited effects of DNA methylation marks on the expression of associated genes across individuals, suggesting a more complex relationship than anticipated.

    View details for DOI 10.1073/pnas.1121249109

    View details for Web of Science ID 000310510500018

    View details for PubMedID 23045638

    View details for PubMedCentralID PMC3477380

  • Population-specificity of human DNA methylation GENOME BIOLOGY Fraser, H. B., Lam, L. L., Neumann, S. M., Kobor, M. S. 2012; 13 (2)

    Abstract

    Ethnic differences in human DNA methylation have been shown for a number of CpG sites, but the genome-wide patterns and extent of these differences are largely unknown. In addition, whether the genetic control of polymorphic DNA methylation is population-specific has not been investigated. Here we measure DNA methylation near the transcription start sites of over 14,000 genes in 180 cell lines derived from one African and one European population. We find population-specific patterns of DNA methylation at over a third of all genes. Furthermore, although the methylation at over a thousand CpG sites is heritable, these heritabilities also differ between populations, suggesting extensive divergence in the genetic control of DNA methylation. In support of this, genetic mapping of DNA methylation reveals that most of the population specificity can be explained by divergence in allele frequencies between populations, and that there is little overlap in genetic associations between populations. These population-specific genetic associations are supported by the patterns of DNA methylation in several hundred brain samples, suggesting that they hold in vivo and across tissues. These results suggest that DNA methylation is highly divergent between populations, and that this divergence may be due in large part to a combination of differences in allele frequencies and complex epistasis or gene × environment interactions.

    View details for DOI 10.1186/gb-2012-13-2-r8

    View details for Web of Science ID 000305391700001

    View details for PubMedID 22322129

    View details for PubMedCentralID PMC3334571

  • Genome-wide approaches to the study of adaptive gene expression evolution: Systematic studies of evolutionary adaptations involving gene expression will allow many fundamental questions in evolutionary biology to be addressed BIOESSAYS Fraser, H. B. 2011; 33 (6): 469-477

    Abstract

    The role of gene expression in evolutionary adaptation has been a subject of debate for over 40 years. cis-regulation of transcription has been proposed to be the primary source of morphological novelty in evolution, though this is based on only a handful of examples. Recently the first genome-wide studies of gene expression adaptation have been published, giving us an initial global view of this process. Systematic studies such as these will allow a number of key questions currently facing the field of gene expression evolution to be addressed.

    View details for DOI 10.1002/bies.201000094

    View details for Web of Science ID 000291548300012

    View details for PubMedID 21538412

  • Systematic Detection of Polygenic cis-Regulatory Evolution PLOS GENETICS Fraser, H. B., Babak, T., Tsang, J., Zhou, Y., Zhang, B., Mehrabian, M., Schadt, E. E. 2011; 7 (3)

    Abstract

    The idea that most morphological adaptations can be attributed to changes in the cis-regulation of gene expression levels has been gaining increasing acceptance, despite the fact that only a handful of such cases have so far been demonstrated. Moreover, because each of these cases involves only one gene, we lack any understanding of how natural selection may act on cis-regulation across entire pathways or networks. Here we apply a genome-wide test for selection on cis-regulation to two subspecies of the mouse Mus musculus. We find evidence for lineage-specific selection at over 100 genes involved in diverse processes such as growth, locomotion, and memory. These gene sets implicate candidate genes that are supported by both quantitative trait loci and a validated causality-testing framework, and they predict a number of phenotypic differences, which we confirm in all four cases tested. Our results suggest that gene expression adaptation is widespread and that these adaptations can be highly polygenic, involving cis-regulatory changes at numerous functionally related genes. These coordinated adaptations may contribute to divergence in a wide range of morphological, physiological, and behavioral phenotypes.

    View details for DOI 10.1371/journal.pgen.1002023

    View details for Web of Science ID 000288996600053

    View details for PubMedID 21483757

    View details for PubMedCentralID PMC3069120

  • Genetic validation of whole-transcriptome sequencing for mapping expression affected by cis-regulatory variation BMC GENOMICS Babak, T., Garrett-Engele, P., Armour, C. D., Raymond, C. K., Keller, M. P., Chen, R., Rohl, C. A., Johnson, J. M., Attie, A. D., Fraser, H. B., Schadt, E. E. 2010; 11

    Abstract

    Identifying associations between genotypes and gene expression levels using microarrays has enabled systematic interrogation of regulatory variation underlying complex phenotypes. This approach has vast potential for functional characterization of disease states, but its prohibitive cost, given hundreds to thousands of individual samples from populations have to be genotyped and expression profiled, has limited its widespread application. Here we demonstrate that genomic regions with allele-specific expression (ASE) detected by sequencing cDNA are highly enriched for cis-acting expression quantitative trait loci (cis-eQTL) identified by profiling of 500 animals in parallel, with up to 90% agreement on the allele that is preferentially expressed. We also observed widespread noncoding and antisense ASE and identified several allele-specific alternative splicing variants. Monitoring ASE by sequencing cDNA from as little as one sample is a practical alternative to expression genetics for mapping cis-acting variation that regulates RNA transcription and processing.

    View details for DOI 10.1186/1471-2164-11-473

    View details for Web of Science ID 000282789200002

    View details for PubMedID 20707912

    View details for PubMedCentralID PMC3091669

  • Evidence for widespread adaptive evolution of gene expression in budding yeast PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Fraser, H. B., Moses, A. M., Schadt, E. E. 2010; 107 (7): 2977-2982

    Abstract

    Changes in gene expression have been proposed to underlie many, or even most, adaptive differences between species. Despite the increasing acceptance of this view, only a handful of cases of adaptive gene expression evolution have been demonstrated. To address this discrepancy, we introduce a simple test for lineage-specific selection on gene expression. Applying the test to genome-wide gene expression data from the budding yeast Saccharomyces cerevisiae, we find that hundreds of gene expression levels have been subject to lineage-specific selection. Comparing these findings with independent population genetic evidence of selective sweeps suggests that this lineage-specific selection has resulted in recent sweeps at over a hundred genes, most of which led to increased transcript levels. Examination of the implicated genes revealed a specific biochemical pathway--ergosterol biosynthesis--where the expression of multiple genes has been subject to selection for reduced levels. In sum, these results suggest that adaptive evolution of gene expression is common in yeast, that regulatory adaptation can occur at the level of entire pathways, and that similar genome-wide scans may be possible in other species, including humans.

    View details for DOI 10.1073/pnas.0912245107

    View details for Web of Science ID 000274599500050

    View details for PubMedID 20133628

    View details for PubMedCentralID PMC2840270

  • The Quantitative Genetics of Phenotypic Robustness PLOS ONE Fraser, H. B., Schadt, E. E. 2010; 5 (1)

    Abstract

    Phenotypic robustness, or canalization, has been extensively investigated both experimentally and theoretically. However, it remains unknown to what extent robustness varies between individuals, and whether factors buffering environmental variation also buffer genetic variation. Here we introduce a quantitative genetic approach to these issues, and apply this approach to data from three species. In mice, we find suggestive evidence that for hundreds of gene expression traits, robustness is polymorphic and can be genetically mapped to discrete genomic loci. Moreover, we find that the polymorphisms buffering genetic variation are distinct from those buffering environmental variation. In fact, these two classes have quite distinct mechanistic bases: environmental buffers of gene expression are predominantly sex-specific and trans-acting, whereas genetic buffers are not sex-specific and often cis-acting. Data from studies of morphological and life-history traits in plants and yeast support the distinction between polymorphisms buffering genetic and environmental variation, and further suggest that loci buffering different types of environmental variation do overlap with one another. These preliminary results suggest that naturally occurring polymorphisms affecting phenotypic robustness could be abundant, and that these polymorphisms may generally buffer either genetic or environmental variation, but not both.

    View details for DOI 10.1371/journal.pone.0008635

    View details for Web of Science ID 000273414200013

    View details for PubMedID 20072615

    View details for PubMedCentralID PMC2799522

  • Common polymorphic transcript variation in human disease GENOME RESEARCH Fraser, H. B., Xie, X. 2009; 19 (4): 567-575

    Abstract

    Most human genes are thought to express different transcript isoforms in different cell types; however, the full extent and functional consequences of polymorphic transcript variation (PTV), which differs between individuals within the same cell type, are unknown. Here we show that PTV is widespread in B-cells from two human populations. Tens of thousands of exons were found to be polymorphically expressed in a heritable fashion, and over 1000 of these showed strong correlations with single nucleotide polymorphism (SNP) genotypes in cis. The SNPs associated with PTV display signs of having been subject to recent positive selection in humans, and they are also highly enriched for SNPs implicated by recent genome-wide association studies of four autoimmune diseases. From this disease-association overlap, we infer that PTV is the likely mechanism by which eight common polymorphisms contribute to disease risk. A catalog of PTV will be a valuable resource for interpreting results from future disease-association studies and understanding the spectrum of phenotypic differences among humans.

    View details for DOI 10.1101/gr.083477.108

    View details for Web of Science ID 000264781900005

    View details for PubMedID 19189928

  • Ab initio construction of a eukaryotic transcriptome by massively parallel mRNA sequencing PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Yassour, M., Kaplan, T., Fraser, H. B., Levin, J. Z., Pfiffner, J., Adiconis, X., Schroth, G., Luo, S., Khrebtukova, I., Gnirke, A., Nusbaum, C., Thompson, D., Friedman, N., Regev, A. 2009; 106 (9): 3264-3269

    Abstract

    Defining the transcriptome, the repertoire of transcribed regions encoded in the genome, is a challenging experimental task. Current approaches, relying on sequencing of ESTs or cDNA libraries, are expensive and labor-intensive. Here, we present a general approach for ab initio discovery of the complete transcriptome of the budding yeast, based only on the unannotated genome sequence and millions of short reads from a single massively parallel sequencing run. Using novel algorithms, we automatically construct a highly accurate transcript catalog. Our approach automatically and fully defines 86% of the genes expressed under the given conditions, and discovers 160 previously undescribed transcription units of 250 bp or longer. It correctly demarcates the 5' and 3' UTR boundaries of 86 and 77% of expressed genes, respectively. The method further identifies 83% of known splice junctions in expressed genes, and discovers 25 previously uncharacterized introns, including 2 cases of condition-dependent intron retention. Our framework is applicable to poorly understood organisms, and can lead to greater understanding of the transcribed elements in an explored genome.

    View details for DOI 10.1073/pnas.0812841106

    View details for Web of Science ID 000263844100053

    View details for PubMedID 19208812

  • Confirmation of organized modularity in the yeast interactome PLOS BIOLOGY Bertin, N., Simonis, N., Dupuy, D., Cusick, M. E., Han, J. J., Fraser, H. B., Roth, F. P., Vidal, M. 2007; 5 (6): 1206-1210

    View details for DOI 10.1371/journal.pbio.0050153

    View details for Web of Science ID 000247173200005

    View details for PubMedID 17564493

  • Assessing the determinants of evolutionary rates in the presence of noise MOLECULAR BIOLOGY AND EVOLUTION Plotkin, J. B., Fraser, H. B. 2007; 24 (5): 1113-1121

    Abstract

    Although protein sequences are known to evolve at vastly different rates, little is known about what determines their rate of evolution. However, a recent study using principal component regression (PCR) has concluded that evolutionary rates in yeast are primarily governed by a single determinant related to translation frequency. Here, we demonstrate that noise in biological data can confound PCRs, leading to spurious conclusions. When equalizing noise levels across 7 predictor variables used in previous studies, we find no evidence that protein evolution is dominated by a single determinant. Our results indicate that a variety of factors--including expression level, gene dispensability, and protein-protein interactions--may independently affect evolutionary rates in yeast. More accurate measurements or more sophisticated statistical techniques will be required to determine which one, if any, of these factors dominates protein evolution.

    View details for DOI 10.1093/molbev/msm044

    View details for Web of Science ID 000246802400004

    View details for PubMedID 17347158

  • Using protein complexes to predict phenotypic effects of gene mutation GENOME BIOLOGY Fraser, H. B., Plotkin, J. B. 2007; 8 (11)

    Abstract

    Predicting the phenotypic effects of mutations is a central goal of genetics research; it has important applications in elucidating how genotype determines phenotype and in identifying human disease genes. Using a wide range of functional genomic data from the yeast Saccharomyces cerevisiae, we show that the best predictor of a protein's knockout phenotype is the knockout phenotype of other proteins that are present in a protein complex with it. Even the addition of multiple datasets does not improve upon the predictions made from protein complex membership. Similarly, we find that a proxy for protein complexes is a powerful predictor of disease phenotypes in humans. We propose that identifying human protein complexes containing known disease genes will be an efficient method for large-scale disease gene discovery, and that yeast may prove to be an informative model system for investigating, and even predicting, the genetic basis of both Mendelian and complex disease phenotypes.

    View details for DOI 10.1186/gb-2007-8-11-r252

    View details for Web of Science ID 000252101100026

    View details for PubMedID 18042286

  • Coevolution, modularity and human disease CURRENT OPINION IN GENETICS & DEVELOPMENT Fraser, H. B. 2006; 16 (6): 637-644

    Abstract

    The concepts of coevolution and modularity have been studied separately for decades. Recent advances in genomics have led to the first systematic studies in each of these fields at the molecular level, resulting in several important discoveries. Both coevolution and modularity appear to be pervasive features of genomic data from all species studied to date, and their presence can be detected in many types of datasets, including genome sequences, gene expression data, and protein-protein interaction data. Moreover, the combination of these two ideas might have implications for our understanding of many aspects of biology, ranging from the general architecture of living systems to the causes of various human diseases.

    View details for DOI 10.1016/j.gde.2006.09.001

    View details for Web of Science ID 000242647400016

    View details for PubMedID 17005391

  • Codon usage and selection on proteins JOURNAL OF MOLECULAR EVOLUTION Plotkin, J. B., Dushoff, J., Desai, M. M., Fraser, H. B. 2006; 63 (5): 635-653

    Abstract

    Selection pressures on proteins are usually measured by comparing homologous nucleotide sequences (Zuckerkandl and Pauling 1965). Recently we introduced a novel method, termed volatility, to estimate selection pressures on proteins on the basis of their synonymous codon usage (Plotkin and Dushoff 2003; Plotkin et al. 2004). Here we provide a theoretical foundation for this approach. Under the Fisher-Wright model, we derive the expected frequencies of synonymous codons as a function of the strength of selection on amino acids, the mutation rate, and the effective population size. We analyze the conditions under which we can expect to draw inferences from biased codon usage, and we estimate the time scales required to establish and maintain such a signal. We find that synonymous codon usage can reliably distinguish between negative selection and neutrality only for organisms, such as some microbes, that experience large effective population sizes or periods of elevated mutation rates. The power of volatility to detect positive selection is also modest--requiring approximately 100 selected sites--but it depends less strongly on population size. We show that phenomena such as transient hyper-mutators can improve the power of volatility to detect selection, even when the neutral site heterozygosity is low. We also discuss several confounding factors, neglected by the Fisher-Wright model, that may limit the applicability of volatility in practice.

    View details for DOI 10.1007/s00239-005-0233-x

    View details for Web of Science ID 000242014800006

    View details for PubMedID 17043750

  • Estimating selection pressures from limited comparative data MOLECULAR BIOLOGY AND EVOLUTION Plotkin, J. B., Dushoff, J., Desai, M. M., Fraser, H. B. 2006; 23 (8): 1457-1459

    Abstract

    We recently introduced a novel method for estimating selection pressures on proteins, termed "volatility," which requires only a single genome sequence. Some criticisms that have been levied against this approach are valid, but many others are based on misconceptions of volatility, or they apply equally to comparative methods of estimating selection. Here, we introduce a simple regression technique for estimating selection pressures on all proteins in a genome, on the basis of limited comparative data. The regression technique does not depend on an underlying population-genetic mechanism. This new approach to estimating selection across a genome should be more powerful and more widely applicable than volatility itself.

    View details for DOI 10.1093/molbev/msl021

    View details for Web of Science ID 000239281200001

    View details for PubMedID 16754640

  • Aging and gene expression in the primate brain PLOS BIOLOGY Fraser, H. B., Khaitovich, P., Plotkin, J. B., Paabo, S., Eisen, M. B. 2005; 3 (9): 1653-1661

    Abstract

    It is well established that gene expression levels in many organisms change during the aging process, and the advent of DNA microarrays has allowed genome-wide patterns of transcriptional changes associated with aging to be studied in both model organisms and various human tissues. Understanding the effects of aging on gene expression in the human brain is of particular interest, because of its relation to both normal and pathological neurodegeneration. Here we show that human cerebral cortex, human cerebellum, and chimpanzee cortex each undergo different patterns of age-related gene expression alterations. In humans, many more genes undergo consistent expression changes in the cortex than in the cerebellum; in chimpanzees, many genes change expression with age in cortex, but the pattern of changes in expression bears almost no resemblance to that of human cortex. These results demonstrate the diversity of aging patterns present within the human brain, as well as how rapidly genome-wide patterns of aging can evolve between species; they may also have implications for the oxidative free radical theory of aging, and help to improve our understanding of human neurodegenerative diseases.

    View details for DOI 10.1371/journal.pbio.0030274

    View details for Web of Science ID 000231820900016

    View details for PubMedID 16048372

  • Sum1p, the origin recognition complex, and the spreading of a promoter-specific repressor in Saccharomyces cerevisiae MOLECULAR AND CELLULAR BIOLOGY Lynch, P. J., Fraser, H. B., Sevastopoulos, E., Rine, J., Rusche, L. N. 2005; 25 (14): 5920-5932

    Abstract

    In Saccharomyces cerevisiae, Sum1p is a promoter-specific repressor. A single amino acid change generates the mutant Sum1-1p, which causes regional silencing at new loci where wild-type Sum1p does not act. Thus, Sum1-1p is a model for understanding how the spreading of repressive chromatin is regulated. When wild-type Sum1p was targeted to a locus where mutant Sum1-1p spreads, wild-type Sum1p did not spread as efficiently as mutant Sum1-1p did, despite being in the same genomic context. Thus, the SUM1-1 mutation altered the ability of the protein to spread. The spreading of Sum1-1p required both an enzymatically active deacetylase, Hst1p, and the N-terminal tail of histone H4, consistent with the spreading of Sum1-1p involving sequential modification of and binding to histone tails, as observed for other silencing proteins. Furthermore, deletion of the N-terminal tail of H4 caused Sum1-1p to return to loci where wild-type Sum1p acts, consistent with the SUM1-1 mutation increasing the affinity of the protein for H4 tails. These results imply that the spreading of repressive chromatin proteins is regulated by their affinities for histone tails. Finally, this study uncovered a functional connection between wild-type Sum1p and the origin recognition complex, and this relationship also contributes to mutant Sum1-1p localization.

    View details for DOI 10.1128/MCB.25.14.5920-5932.2005

    View details for Web of Science ID 000230267000012

    View details for PubMedID 15988008

  • Functional genomic analysis of the rates of protein evolution PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Wall, D. P., Hirsh, A. E., Fraser, H. B., Kumm, J., Giaever, G., Eisen, M. B., Feldman, M. W. 2005; 102 (15): 5483-5488

    Abstract

    The evolutionary rates of proteins vary over several orders of magnitude. Recent work suggests that analysis of large data sets of evolutionary rates in conjunction with the results from high-throughput functional genomic experiments can identify the factors that cause proteins to evolve at such dramatically different rates. To this end, we estimated the evolutionary rates of >3,000 proteins in four species of the yeast genus Saccharomyces and investigated their relationship with levels of expression and protein dispensability. Each protein's dispensability was estimated by the growth rate of mutants deficient for the protein. Our analyses of these improved evolutionary and functional genomic data sets yield three main results. First, dispensability and expression have independent, significant effects on the rate of protein evolution. Second, measurements of expression levels in the laboratory can be used to filter data sets of dispensability estimates, removing variates that are unlikely to reflect real biological effects. Third, structural equation models show that although we may reasonably infer that dispensability and expression have significant effects on protein evolutionary rate, we cannot yet accurately estimate the relative strengths of these effects.

    View details for DOI 10.1073/pnas.0501761102

    View details for Web of Science ID 000228376600036

    View details for PubMedID 15800036

    View details for PubMedCentralID PMC555735

  • Modularity and evolutionary constraint on proteins NATURE GENETICS Fraser, H. B. 2005; 37 (4): 351-352

    Abstract

    Modularity, which has been found in the functional and physical protein interaction networks of many organisms, has been postulated to affect both the mode and tempo of evolution. Here I show that in the yeast Saccharomyces cerevisiae, protein interaction hubs situated in single modules are highly constrained, whereas those connecting different modules are more plastic. This pattern of change could reflect a tendency for evolutionary innovations to occur by altering the proteins and interactions between rather than within modules, in a manner somewhat similar to the evolution of new proteins through the shuffling of conserved protein domains.

    View details for DOI 10.1038/ng1530

    View details for Web of Science ID 000228040000016

    View details for PubMedID 15750592

  • Adjusting for selection on synonymous sites in estimates of evolutionary distance MOLECULAR BIOLOGY AND EVOLUTION Hirsh, A. E., Fraser, H. B., Wall, D. P. 2005; 22 (1): 174-177

    Abstract

    Evolution at silent sites is often used to estimate the pace of selectively neutral processes or to infer differences in divergence times of genes. However, silent sites are subject to selection in favor of preferred codons, and the strength of such selection varies dramatically across genes. Here, we use the relationship between codon bias and synonymous divergence observed in four species of the genus Saccharomyces to provide a simple correction for selection on silent sites.

    View details for DOI 10.1093/molbev/msh265

    View details for Web of Science ID 000225730100018

    View details for PubMedID 15371530

  • Conservation and evolution of cis-regulatory systems in ascomycete fungi PLOS BIOLOGY Gasch, A. P., Moses, A. M., Chiang, D. Y., Fraser, H. B., Berardini, M., Eisen, M. B. 2004; 2 (12): 2202-2219

    Abstract

    Relatively little is known about the mechanisms through which gene expression regulation evolves. To investigate this, we systematically explored the conservation of regulatory networks in fungi by examining the cis-regulatory elements that govern the expression of coregulated genes. We first identified groups of coregulated Saccharomyces cerevisiae genes enriched for genes with known upstream or downstream cis-regulatory sequences. Reasoning that many of these gene groups are coregulated in related species as well, we performed similar analyses on orthologs of coregulated S. cerevisiae genes in 13 other ascomycete species. We find that many species-specific gene groups are enriched for the same flanking regulatory sequences as those found in the orthologous gene groups from S. cerevisiae, indicating that those regulatory systems have been conserved in multiple ascomycete species. In addition to these clear cases of regulatory conservation, we find examples of cis-element evolution that suggest multiple modes of regulatory diversification, including alterations in transcription factor-binding specificity, incorporation of new gene targets into an existing regulatory system, and cooption of regulatory systems to control a different set of genes. We investigated one example in greater detail by measuring the in vitro activity of the S. cerevisiae transcription factor Rpn4p and its orthologs from Candida albicans and Neurospora crassa. Our results suggest that the DNA binding specificity of these proteins has coevolved with the sequences found upstream of the Rpn4p target genes and suggest that Rpn4p has a different function in N. crassa.

    View details for DOI 10.1371/journal.pbio.0020398

    View details for Web of Science ID 000226099600021

    View details for PubMedID 15534694

  • Coevolution of gene expression among interacting proteins PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Fraser, H. B., Hirsh, A. E., Wall, D. P., Eisen, M. B. 2004; 101 (24): 9033-9038

    Abstract

    Physically interacting proteins or parts of proteins are expected to evolve in a coordinated manner that preserves proper interactions. Such coevolution at the amino acid-sequence level is well documented and has been used to predict interacting proteins, domains, and amino acids. Interacting proteins are also often precisely coexpressed with one another, presumably to maintain proper stoichiometry among interacting components. Here, we show that the expression levels of physically interacting proteins coevolve. We estimate average expression levels of genes from four closely related fungi of the genus Saccharomyces using the codon adaptation index and show that expression levels of interacting proteins exhibit coordinated changes in these different species. We find that this coevolution of expression is a more powerful predictor of physical interaction than is coevolution of amino acid sequence. These results demonstrate that gene expression levels can coevolve, adding another dimension to the study of the coevolution of interacting proteins and underscoring the importance of maintaining coexpression of interacting proteins over evolutionary time. Our results also suggest that expression coevolution can be used for computational prediction of protein-protein interactions.

    View details for DOI 10.1073/pnas.0402591101

    View details for Web of Science ID 000222104900038

    View details for PubMedID 15175431

    View details for PubMedCentralID PMC439012

  • Noise minimization in eukaryotic gene expression PLOS BIOLOGY Fraser, H. B., Hirsh, A. E., Giaever, G., Kumm, J., Eisen, M. B. 2004; 2 (6): 834-838

    Abstract

    All organisms have elaborate mechanisms to control rates of protein production. However, protein production is also subject to stochastic fluctuations, or "noise." Several recent studies in Saccharomyces cerevisiae and Escherichia coli have investigated the relationship between transcription and translation rates and stochastic fluctuations in protein levels, or more generally, how such randomness is a function of intrinsic and extrinsic factors. However, the fundamental question of whether stochasticity in protein expression is generally biologically relevant has not been addressed, and it remains unknown whether random noise in the protein production rate of most genes significantly affects the fitness of any organism. We propose that organisms should be particularly sensitive to variation in the protein levels of two classes of genes: genes whose deletion is lethal to the organism and genes that encode subunits of multiprotein complexes. Using an experimentally verified model of stochastic gene expression in S. cerevisiae, we estimate the noise in protein production for nearly every yeast gene, and confirm our prediction that the production of essential and complex-forming proteins involves lower levels of noise than does the production of most other genes. Our results support the hypothesis that noise in gene expression is a biologically important variable, is generally detrimental to organismal fitness, and is subject to natural selection.

    View details for DOI 10.1371/journal.pbio.0020137

    View details for Web of Science ID 000222380400022

    View details for PubMedID 15124029

    View details for PubMedCentralID PMC400249

  • Evolutionary rate depends on number of protein-protein interactions independently of gene expression level BMC EVOLUTIONARY BIOLOGY Fraser, H. B., Hirsh, A. E. 2004; 4

    Abstract

    Whether or not a protein's number of physical interactions with other proteins plays a role in determining its rate of evolution has been a contentious issue. A recent analysis suggested that the observed correlation between number of interactions and evolutionary rate may be due to experimental biases in high-throughput protein interaction data sets. The number of interactions per protein, as measured by some protein interaction data sets, shows no correlation with evolutionary rate. Other data sets, however, do reveal a relationship. Furthermore, even when experimental biases of these data sets are taken into account, a real correlation between number of interactions and evolutionary rate appears to exist. A strong and significant correlation between a protein's number of interactions and evolutionary rate is apparent for interaction data from some studies. The extremely low agreement between different protein interaction data sets indicates that interaction data are still of low coverage and/or quality. These limitations may explain why some data sets reveal no correlation with evolutionary rates.

    View details for Web of Science ID 000222014000001

    View details for PubMedID 15165289

    View details for PubMedCentralID PMC420460

  • Detecting selection using a single genome sequence of M. tuberculosis and P. falciparum NATURE Plotkin, J. B., Dushoff, J., Fraser, H. B. 2004; 428 (6986): 942-945

    Abstract

    Selective pressures on proteins are usually measured by comparing nucleotide sequences. Here we introduce a method to detect selection on the basis of a single genome sequence. We catalogue the relative strength of selection on each gene in the entire genomes of Mycobacterium tuberculosis and Plasmodium falciparum. Our analysis confirms that most antigens are under strong selection for amino-acid substitutions, particularly the PE/PPE family of putative surface proteins in M. tuberculosis and the EMP1 family of cytoadhering surface proteins in P. falciparum. We also identify many uncharacterized proteins that are under strong selection in each pathogen. We provide a genome-wide analysis of natural selection acting on different stages of an organism's life cycle: genes expressed in the ring stage of P. falciparum are under stronger positive selection than those expressed in other stages of the parasite's life cycle. Our method of estimating selective pressures requires far fewer data than comparative sequence analysis, and it measures selection across an entire genome; the method can readily be applied to a large range of sequenced organisms.

    View details for DOI 10.1038/nature02458

    View details for Web of Science ID 000221083000041

    View details for PubMedID 15118727

  • Detecting putative orthologs BIOINFORMATICS Wall, D. P., Fraser, H. B., Hirsh, A. E. 2003; 19 (13): 1710-1711

    Abstract

    We developed an algorithm that improves upon the common procedure of taking reciprocal best BLAST hits (rbh) in the identification of orthologs. The method, the reciprocal smallest distance algorithm (rsd), relies on global sequence alignment and maximum likelihood estimation of evolutionary distances to detect orthologs between two genomes. rsd finds many putative orthologs missed by rbh because it is less likely than rbh to be misled by the presence of a close paralog.

    View details for DOI 10.1093/bioinformatics/btg213

    View details for Web of Science ID 000185310600016

    View details for PubMedID 15593400

  • A simple dependence between protein evolution rate and the number of protein-protein interactions BMC EVOLUTIONARY BIOLOGY Fraser, H. B., Wall, D. P., Hirsh, A. E. 2003; 3

    Abstract

    It has been shown for an evolutionarily distant genomic comparison that the number of protein-protein interactions a protein has correlates negatively with its rate of evolution. However, the generality of this observation has recently been challenged. Here we examine the problem using protein-protein interaction data from the yeast Saccharomyces cerevisiae and genome sequences from two other yeast species. In contrast to a previous study that used an incomplete set of protein-protein interactions, we observed a highly significant correlation between number of interactions and evolutionary distance to either Candida albicans or Schizosaccharomyces pombe. This study differs from the previous one in that it includes all known protein interactions from S. cerevisiae, and a larger set of protein evolutionary rates. In both evolutionary comparisons, a simple monotonic relationship was found across the entire range of the number of protein-protein interactions. In agreement with our earlier findings, this relationship cannot be explained by the fact that proteins with many interactions tend to be important to yeast. The generality of these correlations in other kingdoms of life unfortunately cannot be addressed at this time, due to the incompleteness of protein-protein interaction data from organisms other than S. cerevisiae. Protein-protein interactions tend to slow the rate at which proteins evolve. This may be due to structural constraints that must be met to maintain interactions, but more work is needed to definitively establish the mechanism(s) behind the correlations we have observed.

    View details for Web of Science ID 000188122100011

    View details for PubMedID 12769820

  • Evolutionary rate in the protein interaction network SCIENCE Fraser, H. B., Hirsh, A. E., Steinmetz, L. M., Scharfe, C., Feldman, M. W. 2002; 296 (5568): 750-752

    Abstract

    High-throughput screens have begun to reveal the protein interaction network that underpins most cellular functions in the yeast Saccharomyces cerevisiae. How the organization of this network affects the evolution of the proteins that compose it is a fundamental question in molecular evolution. We show that the connectivity of well-conserved proteins in the network is negatively correlated with their rate of evolution. Proteins with more interactors evolve more slowly not because they are more important to the organism, but because a greater proportion of the protein is directly involved in its function. At sites important for interaction between proteins, evolutionary changes may occur largely by coevolution, in which substitutions in one protein result in selection pressure for reciprocal changes in interacting partners. We confirm one predicted outcome of this process, namely that interacting proteins evolve at similar rates.

    View details for Web of Science ID 000175281700060

    View details for PubMedID 11976460

  • Explaining mortality rate plateaus PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Weitz, J. S., Fraser, H. B. 2001; 98 (26): 15383-15386

    Abstract

    We propose a stochastic model of aging to explain deviations from exponential growth in mortality rates commonly observed in empirical studies. Mortality rate plateaus are explained as a generic consequence of considering death in terms of first passage times for processes undergoing a random walk with drift. Simulations of populations with age-dependent distributions of viabilities agree with a wide array of experimental results. The influence of cohort size is well accounted for by the stochastic nature of the model.

    View details for Web of Science ID 000172848800114

    View details for PubMedID 11752476
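
The first-passage idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' model or code. It assumes a hypothetical discrete-time setup in which each individual has a "viability" that starts at a fixed value, drifts downward by a constant amount per time step with added Gaussian noise, and the individual dies the first time viability reaches zero. The function names, parameters, and the discrete-time formulation are all assumptions introduced here for illustration.

```python
import random


def simulate_death_ages(n_individuals, start_viability, drift, noise_sd,
                        max_age, seed=0):
    """Simulate first-passage times of a random walk with drift.

    Hypothetical model: viability decreases by `drift` per step plus
    Gaussian noise with standard deviation `noise_sd`; death occurs at the
    first step where viability is <= 0. Returns one entry per individual:
    the age at death, or None if still alive at `max_age`.
    """
    rng = random.Random(seed)
    death_ages = []
    for _ in range(n_individuals):
        viability = start_viability
        death_age = None
        for age in range(1, max_age + 1):
            viability += -drift + rng.gauss(0.0, noise_sd)
            if viability <= 0.0:
                death_age = age
                break
        death_ages.append(death_age)
    return death_ages


def mortality_by_age(death_ages, max_age):
    """Per-age mortality rate: deaths at each age divided by the number
    of individuals alive at the start of that age (NaN once nobody is left)."""
    deaths = [0] * (max_age + 1)
    for age in death_ages:
        if age is not None:
            deaths[age] += 1
    alive = len(death_ages)
    rates = []
    for age in range(1, max_age + 1):
        rates.append(deaths[age] / alive if alive > 0 else float("nan"))
        alive -= deaths[age]
    return rates


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the paper.
    ages = simulate_death_ages(n_individuals=5000, start_viability=10.0,
                               drift=0.1, noise_sd=1.0, max_age=400)
    rates = mortality_by_age(ages, max_age=400)
    for age in (25, 50, 100, 200, 300):
        print(age, rates[age - 1])
```

Printing the per-age rates for a large simulated cohort is one way to check whether mortality levels off at late ages under a given parameter choice. Estimates at the oldest ages rest on few survivors and are therefore noisy, which is consistent with the abstract's remark that cohort size matters.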

  • Protein dispensability and rate of evolution NATURE Hirsh, A. E., Fraser, H. B. 2001; 411 (6841): 1046-1049

    Abstract

    If protein evolution is due in large part to slightly deleterious amino acid substitutions, then the rate of evolution should be greater in proteins that contribute less to individual fitness. The rationale for this prediction is that relatively dispensable proteins should be subject to weaker purifying selection, and should therefore accumulate mildly deleterious substitutions more rapidly. Although this argument was presented over twenty years ago, and is fundamental to many applications of evolutionary theory, the prediction has proved difficult to confirm. In fact, a recent study showed that essential mouse genes do not evolve more slowly than non-essential ones. Thus, although a variety of factors influencing the rate of protein evolution have been supported by extensive sequence analysis, the relationship between protein dispensability and evolutionary rate has remained unconfirmed. Here we use the results from a highly parallel growth assay of single gene deletions in yeast to assess protein dispensability, which we relate to evolutionary rate estimates that are based on comparisons of sequences drawn from twenty-one fully annotated genomes. Our analysis reveals a highly significant relationship between protein dispensability and evolutionary rate, and explains why this relationship is not detectable by categorical comparison of essential versus non-essential proteins. The relationship is highly conserved, so that protein dispensability in yeast is also predictive of evolutionary rate in a nematode worm.

    View details for Web of Science ID 000169528500047

    View details for PubMedID 11429604