Current Research and Scholarly Interests
We are interested in a broad range of problems at the interface of genomics and evolutionary biology. One current focus of the lab is understanding how genetic variation impacts gene regulation and complex traits. We also have long-term interests in using genetic data to learn about population structure, history and adaptation, especially in humans.
FOR UP-TO-DATE DETAILS ON MY LAB AND RESEARCH, PLEASE SEE: http://pritchardlab.stanford.edu
- Advanced Genetics
GENE 205 (Win)
Independent Studies (10)
- Biomedical Informatics Teaching Methods
BIOMEDIN 290 (Aut, Win, Spr)
- Directed Reading and Research
BIOMEDIN 299 (Aut, Win, Spr)
- Directed Reading in Genetics
GENE 299 (Aut, Win, Spr)
- Graduate Research
BIO 300 (Aut, Win, Spr, Sum)
- Graduate Research
GENE 399 (Aut, Win, Spr)
- Medical Scholars Research
BIOMEDIN 370 (Aut, Win, Spr)
- Medical Scholars Research
GENE 370 (Aut, Win, Spr)
- Out-of-Department Directed Reading
BIO 198X (Spr, Sum)
- Supervised Study
GENE 260 (Aut, Win, Spr)
- Undergraduate Research
GENE 199 (Aut, Win, Spr)
Prior Year Courses
- Advanced Genetics
GENE 205 (Win)
- Statistical and Machine Learning Methods for Genomics
BIOMEDIN 245, CS 373, GENE 245, STATS 345 (Spr)
- Advanced Genetics
GENE 205 (Win)
Postdoctoral Faculty Sponsor
Anand Bhaskar, David Golan, Kelley Harris, Yang Li, Anil Raj, Eilon Sharon
Doctoral Dissertation Reader (AC)
Alicia Schep, Ashley Tehranchi, Zachary Zappala
Doctoral Dissertation Co-Advisor (AC)
Doctoral Dissertation Advisor (AC)
Diego Calderon, Natalie Telis
Graduate and Fellowship Programs
Biology (School of Humanities and Sciences) (PhD Program)
Biomedical Informatics (PhD Program)
- Genetic Control of Chromatin States in Humans Involves Local and Distal Chromosomal Interactions CELL 2015; 162 (5): 1051-1065
- The Genotype-Tissue Expression (GTEx) pilot analysis: Multitissue gene regulation in humans SCIENCE 2015; 348 (6235): 648-660
Reprogramming LCLs to iPSCs Results in Recovery of Donor-Specific Gene Expression Signature
PLOS GENETICS 2015; 11 (5)
Renewable in vitro cell cultures, such as lymphoblastoid cell lines (LCLs), have facilitated studies that contributed to our understanding of genetic influence on human traits. However, the degree to which cell lines faithfully maintain differences in donor-specific phenotypes is still debated. We have previously reported that standard cell line maintenance practice results in a loss of donor-specific gene expression signatures in LCLs. An alternative to the LCL model is the induced pluripotent stem cell (iPSC) system, which carries the potential to model tissue-specific physiology through the use of differentiation protocols. Still, existing LCL banks represent an important source of starting material for iPSC generation, and it is possible that the disruptions in gene regulation associated with long-term LCL maintenance could persist through the reprogramming process. To address this concern, we studied the effect of reprogramming mature LCL cultures from six unrelated donors to iPSCs on the ensuing gene expression patterns within and between individuals. We show that the reprogramming process results in a recovery of donor-specific gene regulatory signatures, increasing the number of genes with a detectable donor effect by an order of magnitude. The proportion of variation in gene expression statistically attributed to donor increases from 6.9% in LCLs to 24.5% in iPSCs (P < 10^-15). Since environmental contributions are unlikely to be a source of individual variation in our system of highly passaged cultured cell lines, our observations suggest that the effect of genotype on gene regulation is more pronounced in iPSCs than in LCLs. Our findings indicate that iPSCs can be a powerful model system for studies of phenotypic variation across individuals in general, and the genetic association with variation in gene regulation in particular. We further conclude that LCLs are an appropriate starting material for iPSC generation.
View details for DOI 10.1371/journal.pgen.1005216
View details for Web of Science ID 000355305200032
View details for PubMedID 25950834
Genomic variation. Impact of regulatory variation from RNA to protein.
SCIENCE 2015; 347 (6222): 664-667
The phenotypic consequences of expression quantitative trait loci (eQTLs) are presumably due to their effects on protein expression levels. Yet the impact of genetic variation, including eQTLs, on protein levels remains poorly understood. To address this, we mapped genetic variants that are associated with eQTLs, ribosome occupancy (rQTLs), or protein abundance (pQTLs). We found that most QTLs are associated with transcript expression levels, with consequent effects on ribosome and protein levels. However, eQTLs tend to have significantly reduced effect sizes on protein levels, which suggests that their potential impact on downstream phenotypes is often attenuated or buffered. Additionally, we identified a class of cis QTLs that affect protein abundance with little or no effect on messenger RNA or ribosome levels, which suggests that they may arise from differences in posttranslational regulation.
View details for DOI 10.1126/science.1260793
View details for PubMedID 25657249
msCentipede: Modeling Heterogeneity across Genomic Sites and Replicates Improves Accuracy in the Inference of Transcription Factor Binding.
PLOS ONE 2015; 10 (9)
Understanding global gene regulation depends critically on accurate annotation of regulatory elements that are functional in a given cell type. CENTIPEDE, a powerful, probabilistic framework for identifying transcription factor binding sites from tissue-specific DNase I cleavage patterns and genomic sequence content, leverages the hypersensitivity of factor-bound chromatin and the information in the DNase I spatial cleavage profile characteristic of each DNA binding protein to accurately infer functional factor binding sites. However, the model for the spatial profile in this framework fails to account for the substantial variation in the DNase I cleavage profiles across different binding sites. Neither does it account for variation in the profiles at the same binding site across multiple replicate DNase I experiments, which are increasingly available. In this work, we introduce new methods, based on multi-scale models for inhomogeneous Poisson processes, to account for such variation in DNase I cleavage patterns both within and across binding sites. These models account for the spatial structure in the heterogeneity in DNase I cleavage patterns for each factor. Using DNase-seq measurements assayed in a lymphoblastoid cell line, we demonstrate the improved performance of this model for several transcription factors by comparing against the ChIP-seq peaks for those factors. Finally, we explore the effects of DNase I sequence bias on inference of factor binding using a simple extension to our framework that allows for a more flexible background model. The proposed model can also be easily applied to paired-end ATAC-seq and DNase-seq data. msCentipede, a Python implementation of our algorithm, is available at http://rajanil.github.io/msCentipede.
View details for DOI 10.1371/journal.pone.0138030
View details for PubMedID 26406244
The Genetic and Mechanistic Basis for Variation in Gene Regulation
PLOS GENETICS 2015; 11 (1)
It is now well established that noncoding regulatory variants play a central role in the genetics of common diseases and in evolution. However, until recently, we have known little about the mechanisms by which most regulatory variants act. For instance, what types of functional elements in DNA, RNA, or proteins are most often affected by regulatory variants? Which stages of gene regulation are typically altered? How can we predict which variants are most likely to impact regulation in a given cell type? Recent studies, in many cases using quantitative trait loci (QTL)-mapping approaches in cell lines or tissue samples, have provided us with considerable insight into the properties of genetic loci that have regulatory roles. Such studies have uncovered novel biochemical regulatory interactions and led to the identification of previously unrecognized regulatory mechanisms. We have learned that genetic variation is often directly associated with variation in regulatory activities (namely, we can map regulatory QTLs, not just expression QTLs [eQTLs]), and we have taken the first steps towards understanding the causal order of regulatory events (for example, the role of pioneer transcription factors). Yet, in most cases, we still do not know how to interpret overlapping combinations of regulatory interactions, and we are still far from being able to predict how variation in regulatory mechanisms is propagated through a chain of interactions to eventually result in changes in gene expression profiles.
View details for DOI 10.1371/journal.pgen.1004857
View details for Web of Science ID 000349314600009
View details for PubMedID 25569255
- Methylation QTLs Are Associated with Coordinated Changes in Transcription Factor Binding, Histone Modifications, and Gene Expression Levels PLOS GENETICS 2014; 10 (9)
- fastSTRUCTURE: Variational Inference of Population Structure in Large SNP Data Sets GENETICS 2014; 197 (2): 573-U207
The deleterious mutation load is insensitive to recent population history
NATURE GENETICS 2014; 46 (3): 220-?
Human populations have undergone major changes in population size in the past 100,000 years, including recent rapid growth. How these demographic events have affected the burden of deleterious mutations in individuals and the frequencies of disease mutations in populations remains unclear. We use population genetic models to show that recent human demography has probably had little impact on the average burden of deleterious mutations. This prediction is supported by two exome sequence data sets showing that individuals of west African and European ancestry carry very similar burdens of damaging mutations. We further show that for many diseases, rare alleles are unlikely to contribute a large fraction of the heritable variation, and therefore the impact of recent growth is likely to be modest. However, for those diseases that have a direct impact on fitness, strongly deleterious rare mutations probably do have an important role, and recent growth will have increased their impact.
View details for DOI 10.1038/ng.2896
View details for Web of Science ID 000332036700005
View details for PubMedID 24509481
The Functional Consequences of Variation in Transcription Factor Binding
PLOS GENETICS 2014; 10 (3)
One goal of human genetics is to understand how the information for precise and dynamic gene expression programs is encoded in the genome. The interactions of transcription factors (TFs) with DNA regulatory elements clearly play an important role in determining gene expression outputs, yet the regulatory logic underlying functional transcription factor binding is poorly understood. Many studies have focused on characterizing the genomic locations of TF binding, yet it is unclear to what extent TF binding at any specific locus has functional consequences with respect to gene expression output. To evaluate the context of functional TF binding we knocked down 59 TFs and chromatin modifiers in one HapMap lymphoblastoid cell line. We then identified genes whose expression was affected by the knockdowns. We intersected the gene expression data with transcription factor binding data (based on ChIP-seq and DNase-seq) within 10 kb of the transcription start sites of expressed genes. This combination of data allowed us to infer functional TF binding. Using this approach, we found that only a small subset of genes bound by a factor were differentially expressed following the knockdown of that factor, suggesting that most interactions between TF and chromatin do not result in measurable changes in gene expression levels of putative target genes. We found that functional TF binding is enriched in regulatory elements that harbor a large number of TF binding sites, at sites with predicted higher binding affinity, and at sites that are enriched in genomic regions annotated as "active enhancers."
View details for DOI 10.1371/journal.pgen.1004226
View details for Web of Science ID 000337144700050
View details for PubMedID 24603674
The chromatin architectural proteins HMGD1 and H1 bind reciprocally and have opposite effects on chromatin structure and gene regulation
BMC GENOMICS 2014; 15
Chromatin architectural proteins interact with nucleosomes to modulate chromatin accessibility and higher-order chromatin structure. While these proteins are almost certainly important for gene regulation, they have been studied far less than the core histone proteins. Here we describe the genomic distributions and functional roles of two chromatin architectural proteins: histone H1 and the high mobility group protein HMGD1 in Drosophila S2 cells. Using ChIP-seq, biochemical and gene-specific approaches, we find that HMGD1 binds to highly accessible regulatory chromatin and active promoters. In contrast, H1 is primarily associated with heterochromatic regions marked with repressive histone marks. We find that the ratio of HMGD1 to H1 binding is a better predictor of gene activity than either protein by itself, which suggests that reciprocal binding between these proteins is important for gene regulation. Using knockdown experiments, we show that HMGD1 and H1 affect the occupancy of the other protein, change nucleosome repeat length and modulate gene expression. Collectively, our data suggest that dynamic and mutually exclusive binding of H1 and HMGD1 to nucleosomes and their linker sequences may control the fluid chromatin structure that is required for transcriptional regulation. This study provides a framework to further study the interplay between chromatin architectural proteins and epigenetics in gene regulation.
View details for DOI 10.1186/1471-2164-15-92
View details for Web of Science ID 000332575900002
View details for PubMedID 24484546
The effect of freeze-thaw cycles on gene expression levels in lymphoblastoid cell lines.
PLOS ONE 2014; 9 (9)
Epstein-Barr virus (EBV) transformed lymphoblastoid cell lines (LCLs) are a widely used renewable resource for functional genomic studies in humans. The ability to accumulate multidimensional data pertaining to the same individual cell lines, from complete genomic sequences to detailed gene regulatory profiles, further enhances the utility of LCLs as a model system. However, the extent to which LCLs are a faithful model system is relatively unknown. We have previously shown that gene expression profiles of newly established LCLs maintain a strong individual component. Here, we extend our study to investigate the effect of freeze-thaw cycles on gene expression patterns in mature LCLs, especially in the context of inter-individual variation in gene expression. We report a profound difference in the gene expression profiles of newly established and mature LCLs. Once newly established LCLs undergo a freeze-thaw cycle, the individual specific gene expression signatures become much less pronounced as the gene expression levels in LCLs from different individuals converge to a more uniform profile, which reflects a mature transformed B cell phenotype. We found that previously identified eQTLs are enriched among the relatively few genes whose regulations in mature LCLs maintain marked individual signatures. We thus conclude that while insight drawn from gene regulatory studies in mature LCLs may generally not be affected by the artificial nature of the LCL model system, many aspects of primary B cell biology cannot be observed and studied in mature LCL cultures.
View details for DOI 10.1371/journal.pone.0107166
View details for PubMedID 25192014
- Epigenetic modifications are associated with inter-species gene expression variation in primates GENOME BIOLOGY 2014; 15 (12)
Primate Transcript and Protein Expression Levels Evolve Under Compensatory Selection Pressures
SCIENCE 2013; 342 (6162): 1100-1104
Changes in gene regulation have likely played an important role in the evolution of primates. Differences in messenger RNA (mRNA) expression levels across primates have often been documented; however, it is not yet known to what extent measurements of divergence in mRNA levels reflect divergence in protein expression levels, which are probably more important in determining phenotypic differences. We used high-resolution, quantitative mass spectrometry to collect protein expression measurements from human, chimpanzee, and rhesus macaque lymphoblastoid cell lines and compared them to transcript expression data from the same samples. We found dozens of genes with significant expression differences between species at the mRNA level yet little or no difference in protein expression. Overall, our data suggest that protein expression levels evolve under stronger evolutionary constraint than mRNA levels.
View details for DOI 10.1126/science.1242379
View details for Web of Science ID 000327518600059
View details for PubMedID 24136357
Identification of Genetic Variants That Affect Histone Modifications in Human Cells
SCIENCE 2013; 342 (6159): 747-749
Histone modifications are important markers of function and chromatin state, yet the DNA sequence elements that direct them to specific genomic locations are poorly understood. Here, we identify hundreds of quantitative trait loci, genome-wide, that affect histone modification or RNA polymerase II (Pol II) occupancy in Yoruba lymphoblastoid cell lines (LCLs). In many cases, the same variant is associated with quantitative changes in multiple histone marks and Pol II, as well as in deoxyribonuclease I sensitivity and nucleosome positioning. Transcription factor binding site polymorphisms are correlated overall with differences in local histone modification, and we identify specific transcription factors whose binding leads to histone modification in LCLs. Furthermore, variants that affect chromatin at distal regulatory sites frequently also direct changes in chromatin and gene expression at associated promoters.
View details for DOI 10.1126/science.1242429
View details for Web of Science ID 000326647600046
View details for PubMedID 24136359