Jonathan Pritchard, Postdoctoral Faculty Sponsor
Bayesian model comparison for rare-variant association studies.
American journal of human genetics
Whole-genome sequencing studies applied to large populations or biobanks with extensive phenotyping raise new analytic challenges. The need to consider many variants at a locus or group of genes simultaneously and the potential to study many correlated phenotypes with shared genetic architecture provide opportunities for discovery not addressed by the traditional one variant, one phenotype association study. Here, we introduce a Bayesian model comparison approach called MRP (multiple rare variants and phenotypes) for rare-variant association studies that considers correlation, scale, and direction of genetic effects across a group of genetic variants, phenotypes, and studies, requiring only summary statistic data. We apply our method to exome sequencing data (n = 184,698) across 2,019 traits from the UK Biobank, aggregating signals in genes. MRP demonstrates an ability to recover signals such as associations between PCSK9 and LDL cholesterol levels. We additionally find MRP effective in conducting meta-analyses in exome data. Non-biomarker findings include associations between MC1R and red hair color and skin color, IL17RA and monocyte count, and IQGAP2 and mean platelet volume. Finally, we apply MRP in a multi-phenotype setting; after clustering the 35 biomarker phenotypes based on genetic correlation estimates, we find that joint analysis of these phenotypes results in substantial power gains for gene-trait associations, such as in TNFRSF13B in one of the clusters containing diabetes- and lipid-related traits. Overall, we show that the MRP model comparison approach improves upon useful features from widely used meta-analysis approaches for rare-variant association analyses and prioritizes protective modifiers of disease risk.
View details for DOI 10.1016/j.ajhg.2021.11.005
View details for PubMedID 34822764
Variable prediction accuracy of polygenic scores within an ancestry group.
Fields as diverse as human genetics and sociology are increasingly using polygenic scores based on genome-wide association studies (GWAS) for phenotypic prediction. However, recent work has shown that polygenic scores have limited portability across groups of different genetic ancestries, restricting the contexts in which they can be used reliably and potentially creating serious inequities in future clinical applications. Using the UK Biobank data, we demonstrate that even within a single ancestry group (i.e., when there are negligible differences in linkage disequilibrium or in causal allele frequencies), the prediction accuracy of polygenic scores can depend on characteristics such as the socio-economic status, age or sex of the individuals in which the GWAS and the prediction were conducted, as well as on the GWAS design. Our findings highlight both the complexities of interpreting polygenic scores and underappreciated obstacles to their broad use.
View details for DOI 10.7554/eLife.48376
View details for PubMedID 31999256
Measuring intolerance to mutation in human genetics
2019; 51 (5): 772-+
In numerous applications, from working with animal models to mapping the genetic basis of human disease susceptibility, knowing whether a single disrupting mutation in a gene is likely to be deleterious is useful. With this goal in mind, a number of measures have been developed to identify genes in which protein-truncating variants (PTVs), or other types of mutations, are absent or kept at very low frequency in large population samples, that is, genes that appear 'intolerant' to mutation. One measure in particular, the probability of being loss-of-function intolerant (pLI), has been widely adopted. This measure was designed to classify genes into three categories, null, recessive and haploinsufficient, on the basis of the contrast between observed and expected numbers of PTVs. Such population-genetic approaches can be useful in many applications. As we clarify, however, they reflect the strength of selection acting on heterozygotes and not dominance or haploinsufficiency.
View details for DOI 10.1038/s41588-019-0383-1
View details for Web of Science ID 000466842000004
View details for PubMedID 30962618
View details for PubMedCentralID PMC6615471
Reduced signal for polygenic adaptation of height in UK Biobank.
Several recent papers have reported strong signals of selection on European polygenic height scores. These analyses used height effect estimates from the GIANT consortium and replication studies. Here, we describe a new analysis based on the UK Biobank (UKB), a large, independent dataset. We find that the signals of selection using UKB effect estimates are strongly attenuated or absent. We also provide evidence that previous analyses were confounded by population stratification. Therefore, the conclusion of strong polygenic adaptation now lacks support. Moreover, these discrepancies highlight (1) that methods for correcting for population stratification in GWAS may not always be sufficient for polygenic trait analyses, and (2) that claims of differences in polygenic scores between populations should be treated with caution until these issues are better understood. Editorial note: This article has been through an editorial process in which the authors decide how to respond to the issues raised during peer review. The Reviewing Editor's assessment is that all the issues have been addressed (see decision letter).
View details for PubMedID 30895923
- Reduced signal for polygenic adaptation of height in UK Biobank ELIFE 2019; 8
Identifying genetic variants that affect viability in large cohorts
2017; 15 (9): e2002458
A number of open questions in human evolutionary genetics would become tractable if we were able to directly measure evolutionary fitness. As a step towards this goal, we developed a method to examine whether individual genetic variants, or sets of genetic variants, currently influence viability. The approach consists of testing whether the frequency of an allele varies across ages, accounting for variation in ancestry. We applied it to the Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort and to the parents of participants in the UK Biobank. Across the genome, we found only a few common variants with large effects on age-specific mortality: variants tagging the APOE ε4 allele and variants near CHRNA3. These results suggest that variants with large effects, even late-onset ones, are kept at low frequency by purifying selection. Testing viability effects of sets of genetic variants that jointly influence 1 of 42 traits, we detected a number of strong signals. In participants of the UK Biobank of British ancestry, we found that variants that delay puberty timing are associated with a longer parental life span (P ~ 6.2 × 10⁻⁶ for fathers and P ~ 2.0 × 10⁻³ for mothers), consistent with epidemiological studies. Similarly, variants associated with later age at first birth are associated with a longer maternal life span (P ~ 1.4 × 10⁻³). Signals are also observed for variants influencing cholesterol levels, risk of coronary artery disease (CAD), body mass index, as well as risk of asthma. These signals exhibit consistent effects in the GERA cohort and among participants of the UK Biobank of non-British ancestry. We also found marked differences between males and females, most notably at the CHRNA3 locus, and variants associated with risk of CAD and cholesterol levels. Beyond our findings, the analysis serves as a proof of principle for how upcoming biomedical data sets can be used to learn about selection effects in contemporary humans.
View details for DOI 10.1371/journal.pbio.2002458
View details for Web of Science ID 000411978200007
View details for PubMedID 28873088
View details for PubMedCentralID PMC5584811
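The core idea in the abstract above, testing whether an allele's frequency varies across ages, can be illustrated with a minimal sketch. This is not the authors' method: it is a plain Pearson chi-square test of homogeneity across age bins, it does not account for ancestry as the paper does, and the function name `allele_age_chi2` and its inputs are hypothetical, chosen only for illustration.

```python
def allele_age_chi2(alt_counts, total_counts):
    """Pearson chi-square statistic for equal allele frequency across age bins.

    alt_counts[i]   -- copies of the allele of interest observed in age bin i
    total_counts[i] -- total allele copies (chromosomes) sampled in age bin i
    Returns (statistic, degrees_of_freedom). Illustrative only: no ancestry
    adjustment, unlike the published analysis.
    """
    if len(alt_counts) != len(total_counts) or len(alt_counts) < 2:
        raise ValueError("need matching counts for at least two age bins")

    total_alt = sum(alt_counts)
    total = sum(total_counts)
    pooled = total_alt / total  # allele frequency under the null (no age effect)
    df = len(alt_counts) - 1

    # A monomorphic sample carries no information about frequency change.
    if total_alt == 0 or total_alt == total:
        return 0.0, df

    stat = 0.0
    for alt, n in zip(alt_counts, total_counts):
        ref = n - alt
        exp_alt = n * pooled
        exp_ref = n * (1.0 - pooled)
        stat += (alt - exp_alt) ** 2 / exp_alt + (ref - exp_ref) ** 2 / exp_ref
    return stat, df


# Hypothetical counts: the allele becomes rarer in older age bins.
stat, df = allele_age_chi2([30, 24, 15], [200, 200, 200])
print(f"chi2 = {stat:.3f} on {df} degrees of freedom")
```

A large statistic relative to the chi-square distribution with the returned degrees of freedom would suggest that the allele frequency differs between age bins; in real data such a difference could also reflect ancestry or cohort structure, which is why the published approach adjusts for ancestry.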
Entropic forces drive self-organization and membrane fusion by SNARE proteins
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
2017; 114 (21): 5455–60
SNARE proteins are the core of the cell's fusion machinery and mediate virtually all known intracellular membrane fusion reactions on which exocytosis and trafficking depend. Fusion is catalyzed when vesicle-associated v-SNAREs form trans-SNARE complexes ("SNAREpins") with target membrane-associated t-SNAREs, a zippering-like process releasing ∼65 kT per SNAREpin. Fusion requires several SNAREpins, but how they cooperate is unknown and reports of the number required vary widely. To capture the collective behavior on the long timescales of fusion, we developed a highly coarse-grained model that retains key biophysical SNARE properties such as the zippering energy landscape and the surface charge distribution. In simulations the ∼65-kT zippering energy was almost entirely dissipated, with fully assembled SNARE motifs but uncomplexed linker domains. The SNAREpins self-organized into a circular cluster at the fusion site, driven by entropic forces that originate in steric-electrostatic interactions among SNAREpins and membranes. Cooperative entropic forces expanded the cluster and pulled the membranes together at the center point with high force. We find that there is no critical number of SNAREs required for fusion, but instead the fusion rate increases rapidly with the number of SNAREpins due to increasing entropic forces. We hypothesize that this principle finds physiological use to boost fusion rates to meet the demanding timescales of neurotransmission, exploiting the large number of v-SNAREs available in synaptic vesicles. Once in an unfettered cluster, we estimate ≥15 SNAREpins are required for fusion within the ∼1-ms timescale of neurotransmitter release.
View details for DOI 10.1073/pnas.1611506114
View details for Web of Science ID 000401797800052
View details for PubMedID 28490503
View details for PubMedCentralID PMC5448213
- Simulation of semi-crystalline polyethylene: Effect of short-chain branching on tie chains and trapped entanglements POLYMER 2015; 72: 177–84