Administrative Appointments


  • Director of Genome Informatics, Department of Pathology (2011 - Present)

Professional Education


  • B.A.Sc., University of British Columbia, Engineering Physics (2002)
  • Ph.D., University of British Columbia, Genetics (2006)

Current Research and Scholarly Interests


We focus on understanding the effects of genome variation on cellular phenotypes and cellular modeling of disease through genomic approaches such as next generation RNA sequencing in combination with developing and utilizing state-of-the-art bioinformatics and statistical genetics approaches. See our website at http://montgomerylab.stanford.edu/

All Publications


  • RNA Sequencing in Disease Diagnosis. Annual review of genomics and human genetics Smail, C., Montgomery, S. B. 2024

    Abstract

    RNA sequencing (RNA-seq) enables the accurate measurement of multiple transcriptomic phenotypes for modeling the impacts of disease variants. Advances in technologies, experimental protocols, and analysis strategies are rapidly expanding the application of RNA-seq to identify disease biomarkers, tissue- and cell-type-specific impacts, and the spatial localization of disease-associated mechanisms. Ongoing international efforts to construct biobank-scale transcriptomic repositories with matched genomic data across diverse population groups are further increasing the utility of RNA-seq approaches by providing large-scale normative reference resources. The availability of these resources, combined with improved computational analysis pipelines, has enabled the detection of aberrant transcriptomic phenotypes underlying rare diseases. Further expansion of these resources, across both somatic and developmental tissues, is expected to soon provide unprecedented insights to resolve disease origin, mechanism of action, and causal gene contributions, suggesting the continued high utility of RNA-seq in disease diagnosis. Expected final online publication date for the Annual Review of Genomics and Human Genetics, Volume 25 is August 2024. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.

    View details for DOI 10.1146/annurev-genom-021623-121812

    View details for PubMedID 38360541

  • Impact of genome build on RNA-seq interpretation and diagnostics. medRxiv : the preprint server for health sciences Ungar, R. A., Goddard, P. C., Jensen, T. D., Degalez, F., Smith, K. S., Jin, C. A., Bonner, D. E., Bernstein, J. A., Wheeler, M. T., Montgomery, S. B. 2024

    Abstract

    Transcriptomics is a powerful tool for unraveling the molecular effects of genetic variants and disease diagnosis. Prior studies have demonstrated that choice of genome build impacts variant interpretation and diagnostic yield for genomic analyses. To identify the extent to which genome build also impacts transcriptomics analyses, we studied the effect of the hg19, hg38, and CHM13 genome builds on expression quantification and outlier detection in 386 rare disease and familial control samples from both the Undiagnosed Diseases Network (UDN) and Genomics Research to Elucidate the Genetics of Rare Disease (GREGoR) Consortium. We identified 2,800 genes with build-dependent quantification across six routinely-collected biospecimens, including 1,391 protein-coding genes and 341 known rare disease genes. We further observed multiple genes that only have detectable expression in a subset of genome builds. Finally, we characterized how genome build impacts the detection of outlier transcriptomic events. Combined, we provide a database of genes impacted by build choice, and recommend that transcriptomics-guided analyses and diagnoses are cross-referenced with these data for robustness.

    View details for DOI 10.1101/2024.01.11.24301165

    View details for PubMedID 38260490

    View details for PubMedCentralID PMC10802764

  • Genetic architecture of cardiac dynamic flow volumes. Nature genetics Gomes, B., Singh, A., O'Sullivan, J. W., Schnurr, T. M., Goddard, P. C., Loong, S., Amar, D., Hughes, J. W., Kostur, M., Haddad, F., Salerno, M., Foo, R., Montgomery, S. B., Parikh, V. N., Meder, B., Ashley, E. A. 2023

    Abstract

    Cardiac blood flow is a critical determinant of human health. However, the definition of its genetic architecture is limited by the technical challenge of capturing dynamic flow volumes from cardiac imaging at scale. We present DeepFlow, a deep-learning system to extract cardiac flow and volumes from phase-contrast cardiac magnetic resonance imaging. A mixed-linear model applied to 37,653 individuals from the UK Biobank reveals genome-wide significant associations across cardiac dynamic flow volumes spanning from aortic forward velocity to aortic regurgitation fraction. Mendelian randomization reveals a causal role for aortic root size in aortic valve regurgitation. Among the most significant contributing variants, localizing genes (near ELN, PRDM6 and ADAMTS7) are implicated in connective tissue and blood pressure pathways. Here we show that DeepFlow cardiac flow phenotyping at scale, combined with genotyping data, reinforces the contribution of connective tissue genes, blood pressure and root size to aortic valve function.

    View details for DOI 10.1038/s41588-023-01587-5

    View details for PubMedID 38082205

    View details for PubMedCentralID 7612636

  • Organ aging signatures in the plasma proteome track health and disease. Nature Oh, H. S., Rutledge, J., Nachun, D., Pálovics, R., Abiose, O., Moran-Losada, P., Channappa, D., Urey, D. Y., Kim, K., Sung, Y. J., Wang, L., Timsina, J., Western, D., Liu, M., Kohlfeld, P., Budde, J., Wilson, E. N., Guen, Y., Maurer, T. M., Haney, M., Yang, A. C., He, Z., Greicius, M. D., Andreasson, K. I., Sathyan, S., Weiss, E. F., Milman, S., Barzilai, N., Cruchaga, C., Wagner, A. D., Mormino, E., Lehallier, B., Henderson, V. W., Longo, F. M., Montgomery, S. B., Wyss-Coray, T. 2023; 624 (7990): 164-172

    Abstract

    Animal studies show aging varies between individuals as well as between organs within an individual (refs. 1-4), but whether this is true in humans and its effect on age-related diseases is unknown. We utilized levels of human blood plasma proteins originating from specific organs to measure organ-specific aging differences in living individuals. Using machine learning models, we analysed aging in 11 major organs and estimated organ age reproducibly in five independent cohorts encompassing 5,676 adults across the human lifespan. We discovered nearly 20% of the population show strongly accelerated age in one organ and 1.7% are multi-organ agers. Accelerated organ aging confers 20-50% higher mortality risk, and organ-specific diseases relate to faster aging of those organs. We find individuals with accelerated heart aging have a 250% increased heart failure risk and accelerated brain and vascular aging predict Alzheimer's disease (AD) progression independently from and as strongly as plasma pTau-181 (ref. 5), the current best blood-based biomarker for AD. Our models link vascular calcification, extracellular matrix alterations and synaptic protein shedding to early cognitive decline. We introduce a simple and interpretable method to study organ aging using plasma proteomics data, predicting diseases and aging effects.

    View details for DOI 10.1038/s41586-023-06802-1

    View details for PubMedID 38057571

    View details for PubMedCentralID PMC10700136

  • Transcriptomics and chromatin accessibility in multiple African population samples. bioRxiv : the preprint server for biology DeGorter, M. K., Goddard, P. C., Karakoc, E., Kundu, S., Yan, S. M., Nachun, D., Abell, N., Aguirre, M., Carstensen, T., Chen, Z., Durrant, M., Dwaracherla, V. R., Feng, K., Gloudemans, M. J., Hunter, N., Moorthy, M. P., Pomilla, C., Rodrigues, K. B., Smith, C. J., Smith, K. S., Ungar, R. A., Balliu, B., Fellay, J., Flicek, P., McLaren, P. J., Henn, B., McCoy, R. C., Sugden, L., Kundaje, A., Sandhu, M. S., Gurdasani, D., Montgomery, S. B. 2023

    Abstract

    Mapping the functional human genome and impact of genetic variants is often limited to European-descendent population samples. To aid in overcoming this limitation, we measured gene expression using RNA sequencing in lymphoblastoid cell lines (LCLs) from 599 individuals from six African populations to identify novel transcripts including those not represented in the hg38 reference genome. We used whole genomes from the 1000 Genomes Project and 164 Maasai individuals to identify 8,881 expression and 6,949 splicing quantitative trait loci (eQTLs/sQTLs), and 2,611 structural variants associated with gene expression (SV-eQTLs). We further profiled chromatin accessibility using ATAC-Seq in a subset of 100 representative individuals, to identify chromatin accessibility quantitative trait loci (caQTLs) and allele-specific chromatin accessibility, and provide predictions for the functional effect of 78.9 million variants on chromatin accessibility. Using this map of eQTLs and caQTLs we fine-mapped GWAS signals for a range of complex diseases. Combined, this work expands global functional genomic data to identify novel transcripts, functional elements and variants, understand population genetic history of molecular quantitative trait loci, and further resolve the genetic basis of multiple human traits and disease.

    View details for DOI 10.1101/2023.11.04.564839

    View details for PubMedID 37986808

    View details for PubMedCentralID PMC10659267

  • Multi-Omic Profiling of Macrophages Lacking Tet2 or Dnmt3a Reveals Mechanisms of Hyper-Inflammation in Clonal Hematopoiesis Rodrigues, K. B., Gopakumar, J., Weng, Z., Mitchell, S., Maurer, M., Nachun, D., Eulalio, T., Estrada, D., Mazumder, T., Ma, L., Montgomery, S., Jaiswal, S. AMER SOC HEMATOLOGY. 2023
  • Integrative analyses highlight functional regulatory variants associated with neuropsychiatric diseases. Nature genetics Guo, M. G., Reynolds, D. L., Ang, C. E., Liu, Y., Zhao, Y., Donohue, L. K., Siprashvili, Z., Yang, X., Yoo, Y., Mondal, S., Hong, A., Kain, J., Meservey, L., Fabo, T., Elfaki, I., Kellman, L. N., Abell, N. S., Pershad, Y., Bayat, V., Etminani, P., Holodniy, M., Geschwind, D. H., Montgomery, S. B., Duncan, L. E., Urban, A. E., Altman, R. B., Wernig, M., Khavari, P. A. 2023

    Abstract

    Noncoding variants of presumed regulatory function contribute to the heritability of neuropsychiatric disease. A total of 2,221 noncoding variants connected to risk for ten neuropsychiatric disorders, including autism spectrum disorder, attention deficit hyperactivity disorder, bipolar disorder, borderline personality disorder, major depression, generalized anxiety disorder, panic disorder, post-traumatic stress disorder, obsessive-compulsive disorder and schizophrenia, were studied in developing human neural cells. Integrating epigenomic and transcriptomic data with massively parallel reporter assays identified differentially-active single-nucleotide variants (daSNVs) in specific neural cell types. Expression-gene mapping, network analyses and chromatin looping nominated candidate disease-relevant target genes modulated by these daSNVs. Follow-up integration of daSNV gene editing with clinical cohort analyses suggested that magnesium transport dysfunction may increase neuropsychiatric disease risk and indicated that common genetic pathomechanisms may mediate specific symptoms that are shared across multiple neuropsychiatric diseases.

    View details for DOI 10.1038/s41588-023-01533-5

    View details for PubMedID 37857935

    View details for PubMedCentralID 4112379

  • The functional impact of rare variation across the regulatory cascade. Cell genomics Li, T., Ferraro, N., Strober, B. J., Aguet, F., Kasela, S., Arvanitis, M., Ni, B., Wiel, L., Hershberg, E., Ardlie, K., Arking, D. E., Beer, R. L., Brody, J., Blackwell, T. W., Clish, C., Gabriel, S., Gerszten, R., Guo, X., Gupta, N., Johnson, W. C., Lappalainen, T., Lin, H. J., Liu, Y., Nickerson, D. A., Papanicolaou, G., Pritchard, J. K., Qasba, P., Shojaie, A., Smith, J., Sotoodehnia, N., Taylor, K. D., Tracy, R. P., Van Den Berg, D., Wheeler, M. T., Rich, S. S., Rotter, J. I., Battle, A., Montgomery, S. B. 2023; 3 (10): 100401

    Abstract

    Each human genome has tens of thousands of rare genetic variants; however, identifying impactful rare variants remains a major challenge. We demonstrate how use of personal multi-omics can enable identification of impactful rare variants by using the Multi-Ethnic Study of Atherosclerosis, which included several hundred individuals, with whole-genome sequencing, transcriptomes, methylomes, and proteomes collected across two time points, 10 years apart. We evaluated each multi-omics phenotype's ability to separately and jointly inform functional rare variation. By combining expression and protein data, we observed rare stop variants 62 times and rare frameshift variants 216 times as frequently as controls, compared to 13-27 times as frequently for expression or protein effects alone. We extended a Bayesian hierarchical model, "Watershed," to prioritize specific rare variants underlying multi-omics signals across the regulatory cascade. With this approach, we identified rare variants that exhibited large effect sizes on multiple complex traits including height, schizophrenia, and Alzheimer's disease.

    View details for DOI 10.1016/j.xgen.2023.100401

    View details for PubMedID 37868038

    View details for PubMedCentralID PMC10589633

  • Integrated single-cell multiome analysis reveals muscle fiber-type gene regulatory circuitry modulated by endurance exercise. bioRxiv : the preprint server for biology Rubenstein, A. B., Smith, G. R., Zhang, Z., Chen, X., Chambers, T. L., Ruf-Zamojski, F., Mendelev, N., Cheng, W. S., Zamojski, M., Amper, M. A., Nair, V. D., Marderstein, A. R., Montgomery, S. B., Troyanskaya, O. G., Zaslavsky, E., Trappe, T., Trappe, S., Sealfon, S. C. 2023

    Abstract

    Endurance exercise is an important health modifier. We studied cell-type specific adaptations of human skeletal muscle to acute endurance exercise using single-nucleus (sn) multiome sequencing in human vastus lateralis samples collected before and 3.5 hours after 40 min exercise at 70% VO2max in four subjects, as well as in matched time of day samples from two supine resting circadian controls. High quality same-cell RNA-seq and ATAC-seq data were obtained from 37,154 nuclei comprising 14 cell types. Among muscle fiber types, both shared and fiber-type specific regulatory programs were identified. Single-cell circuit analysis identified distinct adaptations in fast, slow and intermediate fibers as well as LUM-expressing FAP cells, involving a total of 328 transcription factors (TFs) acting at altered accessibility sites regulating 2,025 genes. These data and circuit mapping provide single-cell insight into the processes underlying tissue and metabolic remodeling responses to exercise.

    View details for DOI 10.1101/2023.09.26.558914

    View details for PubMedID 37808658

    View details for PubMedCentralID PMC10557702

  • Author Correction: Africa-specific human genetic variation near CHD1L associates with HIV-1 load. Nature McLaren, P. J., Porreca, I., Iaconis, G., Mok, H. P., Mukhopadhyay, S., Karakoc, E., Cristinelli, S., Pomilla, C., Bartha, I., Thorball, C. W., Tough, R. H., Angelino, P., Kiar, C. S., Carstensen, T., Fatumo, S., Porter, T., Jarvis, I., Skarnes, W. C., Bassett, A., DeGorter, M. K., Sathya Moorthy, M. P., Tuff, J. F., Kim, E. Y., Walter, M., Simons, L. M., Bashirova, A., Buchbinder, S., Carrington, M., Cossarizza, A., De Luca, A., Goedert, J. J., Goldstein, D. B., Haas, D. W., Herbeck, J. T., Johnson, E. O., Kaleebu, P., Kilembe, W., Kirk, G. D., Kootstra, N. A., Kral, A. H., Lambotte, O., Luo, M., Mallal, S., Martinez-Picado, J., Meyer, L., Miro, J. M., Moodley, P., Motala, A. A., Mullins, J. I., Nam, K., Obel, N., Pirie, F., Plummer, F. A., Poli, G., Price, M. A., Rauch, A., Theodorou, I., Trkola, A., Walker, B. D., Winkler, C. A., Zagury, J. F., Montgomery, S. B., Ciuffi, A., Hultquist, J. F., Wolinsky, S. M., Dougan, G., Lever, A. M., Gurdasani, D., Groom, H., Sandhu, M. S., Fellay, J. 2023

    View details for DOI 10.1038/s41586-023-06591-7

    View details for PubMedID 37670157

  • Beyond the exome: What's next in diagnostic testing for Mendelian conditions. American journal of human genetics Wojcik, M. H., Reuter, C. M., Marwaha, S., Mahmoud, M., Duyzend, M. H., Barseghyan, H., Yuan, B., Boone, P. M., Groopman, E. E., Délot, E. C., Jain, D., Sanchis-Juan, A., Starita, L. M., Talkowski, M., Montgomery, S. B., Bamshad, M. J., Chong, J. X., Wheeler, M. T., Berger, S. I., O'Donnell-Luria, A., Sedlazeck, F. J., Miller, D. E. 2023; 110 (8): 1229-1248

    Abstract

    Despite advances in clinical genetic testing, including the introduction of exome sequencing (ES), more than 50% of individuals with a suspected Mendelian condition lack a precise molecular diagnosis. Clinical evaluation is increasingly undertaken by specialists outside of clinical genetics, often occurring in a tiered fashion and typically ending after ES. The current diagnostic rate reflects multiple factors, including technical limitations, incomplete understanding of variant pathogenicity, missing genotype-phenotype associations, complex gene-environment interactions, and reporting differences between clinical labs. Maintaining a clear understanding of the rapidly evolving landscape of diagnostic tests beyond ES, and their limitations, presents a challenge for non-genetics professionals. Newer tests, such as short-read genome or RNA sequencing, can be challenging to order, and emerging technologies, such as optical genome mapping and long-read DNA sequencing, are not available clinically. Furthermore, there is no clear guidance on the next best steps after inconclusive evaluation. Here, we review why a clinical genetic evaluation may be negative, discuss questions to be asked in this setting, and provide a framework for further investigation, including the advantages and disadvantages of new approaches that are nascent in the clinical sphere. We present a guide for the next best steps after inconclusive molecular testing based upon phenotype and prior evaluation, including when to consider referral to research consortia focused on elucidating the underlying cause of rare unsolved genetic disorders.

    View details for DOI 10.1016/j.ajhg.2023.06.009

    View details for PubMedID 37541186

  • Africa-specific human genetic variation near CHD1L associates with HIV-1 load. Nature McLaren, P. J., Porreca, I., Iaconis, G., Mok, H. P., Mukhopadhyay, S., Karakoc, E., Cristinelli, S., Pomilla, C., Bartha, I., Thorball, C. W., Tough, R. H., Angelino, P., Kiar, C. S., Carstensen, T., Fatumo, S., Porter, T., Jarvis, I., Skarnes, W. C., Bassett, A., DeGorter, M. K., Sathya Moorthy, M. P., Tuff, J. F., Kim, E. Y., Walter, M., Simons, L. M., Bashirova, A., Buchbinder, S., Carrington, M., Cossarizza, A., De Luca, A., Goedert, J. J., Goldstein, D. B., Haas, D. W., Herbeck, J. T., Johnson, E. O., Kaleebu, P., Kilembe, W., Kirk, G. D., Kootstra, N. A., Kral, A. H., Lambotte, O., Luo, M., Mallal, S., Martinez-Picado, J., Meyer, L., Miro, J. M., Moodley, P., Motala, A. A., Mullins, J. I., Nam, K., Obel, N., Pirie, F., Plummer, F. A., Poli, G., Price, M. A., Rauch, A., Theodorou, I., Trkola, A., Walker, B. D., Winkler, C. A., Zagury, J. F., Montgomery, S. B., Ciuffi, A., Hultquist, J. F., Wolinsky, S. M., Dougan, G., Lever, A. M., Gurdasani, D., Groom, H., Sandhu, M. S., Fellay, J. 2023

    Abstract

    HIV-1 remains a global health crisis (ref. 1), highlighting the need to identify new targets for therapies. Here, given the disproportionate HIV-1 burden and marked human genome diversity in Africa (ref. 2), we assessed the genetic determinants of control of set-point viral load in 3,879 people of African ancestries living with HIV-1 participating in the international collaboration for the genomics of HIV (ref. 3). We identify a previously undescribed association signal on chromosome 1 where the peak variant associates with an approximately 0.3 log10-transformed copies per ml lower set-point viral load per minor allele copy and is specific to populations of African descent. The top associated variant is intergenic and lies between a long intergenic non-coding RNA (LINC00624) and the coding gene CHD1L, which encodes a helicase that is involved in DNA repair (ref. 4). Infection assays in iPS cell-derived macrophages and other immortalized cell lines showed increased HIV-1 replication in CHD1L-knockdown and CHD1L-knockout cells. We provide evidence from population genetic studies that Africa-specific genetic variation near CHD1L associates with HIV replication in vivo. Although experimental studies suggest that CHD1L is able to limit HIV infection in some cell types in vitro, further investigation is required to understand the mechanisms underlying our observations, including any potential indirect effects of CHD1L on HIV spread in vivo that our cell-based assays cannot recapitulate.

    View details for DOI 10.1038/s41586-023-06370-4

    View details for PubMedID 37532928

    View details for PubMedCentralID 3723635

  • Molecular quantitative trait loci NATURE REVIEWS METHODS PRIMERS Aguet, F., Alasoo, K., Li, Y., Battle, A., Im, H., Montgomery, S. B., Lappalainen, T. 2023; 3 (1)
  • Beyond the exome: what's next in diagnostic testing for Mendelian conditions. ArXiv Wojcik, M. H., Reuter, C. M., Marwaha, S., Mahmoud, M., Duyzend, M. H., Barseghyan, H., Yuan, B., Boone, P. M., Groopman, E. E., Délot, E. C., Jain, D., Sanchis-Juan, A., Starita, L. M., Talkowski, M., Montgomery, S. B., Bamshad, M. J., Chong, J. X., Wheeler, M. T., Berger, S. I., O'Donnell-Luria, A., Sedlazeck, F. J., Miller, D. E. 2023

    Abstract

    Despite advances in clinical genetic testing, including the introduction of exome sequencing (ES), more than 50% of individuals with a suspected Mendelian condition lack a precise molecular diagnosis. Clinical evaluation is increasingly undertaken by specialists outside of clinical genetics, often occurring in a tiered fashion and typically ending after ES. The current diagnostic rate reflects multiple factors, including technical limitations, incomplete understanding of variant pathogenicity, missing genotype-phenotype associations, complex gene-environment interactions, and reporting differences between clinical labs. Maintaining a clear understanding of the rapidly evolving landscape of diagnostic tests beyond ES, and their limitations, presents a challenge for non-genetics professionals. Newer tests, such as short-read genome or RNA sequencing, can be challenging to order and emerging technologies, such as optical genome mapping and long-read DNA or RNA sequencing, are not available clinically. Furthermore, there is no clear guidance on the next best steps after inconclusive evaluation. Here, we review why a clinical genetic evaluation may be negative, discuss questions to be asked in this setting, and provide a framework for further investigation, including the advantages and disadvantages of new approaches that are nascent in the clinical sphere. We present a guide for the next best steps after inconclusive molecular testing based upon phenotype and prior evaluation, including when to consider referral to a consortium such as GREGoR, which is focused on elucidating the underlying cause of rare unsolved genetic disorders.

    View details for DOI 10.1002/ajmg.a.63053

    View details for PubMedID 36713248

    View details for PubMedCentralID PMC9882576

  • The mitochondrial multi-omic response to exercise training across tissues. bioRxiv : the preprint server for biology Amar, D., Gay, N. R., Jimenez-Morales, D., Beltran, P. M., Ramaker, M. E., Raja, A. N., Zhao, B., Sun, Y., Marwaha, S., Gaul, D., Hershman, S. G., Xia, A., Lanza, I., Fernandez, F. M., Montgomery, S. B., Hevener, A. L., Ashley, E. A., Walsh, M. J., Sparks, L. M., Burant, C. F., Rector, R. S., Thyfault, J., Wheeler, M. T., Goodpaster, B. H., Coen, P. M., Schenk, S., Bodine, S. C., Lindholm, M. E. 2023

    Abstract

    Mitochondria are adaptable organelles with diverse cellular functions critical to whole-body metabolic homeostasis. While chronic endurance exercise training is known to alter mitochondrial activity, these adaptations have not yet been systematically characterized. Here, the Molecular Transducers of Physical Activity Consortium (MoTrPAC) mapped the longitudinal, multi-omic changes in mitochondrial analytes across 19 tissues in male and female rats endurance trained for 1, 2, 4 or 8 weeks. Training elicited substantial changes in the adrenal gland, brown adipose, colon, heart and skeletal muscle, while we detected mild responses in the brain, lung, small intestine and testes. The colon response was characterized by non-linear dynamics that resulted in upregulation of mitochondrial function that was more prominent in females. Brown adipose and adrenal tissues were characterized by substantial downregulation of mitochondrial pathways. Training induced a previously unrecognized robust upregulation of mitochondrial protein abundance and acetylation in the liver, and a concomitant shift in lipid metabolism. The striated muscles demonstrated a highly coordinated response to increase oxidative capacity, with the majority of changes occurring in protein abundance and post-translational modifications. We identified exercise upregulated networks that are downregulated in human type 2 diabetes and liver cirrhosis. In both cases HSD17B10, a central dehydrogenase in multiple metabolic pathways and mitochondrial tRNA maturation, was the main hub. In summary, we provide a multi-omic, cross-tissue atlas of the mitochondrial response to training and identify candidates for prevention of disease-associated mitochondrial dysfunction.

    View details for DOI 10.1101/2023.01.13.523698

    View details for PubMedID 36711881

    View details for PubMedCentralID PMC9882193

  • Multiomic identification of key transcriptional regulatory programs during endurance exercise training. bioRxiv : the preprint server for biology Smith, G. R., Zhao, B., Lindholm, M. E., Raja, A., Viggars, M., Pincas, H., Gay, N. R., Sun, Y., Ge, Y., Nair, V. D., Sanford, J. A., S Amper, M. A., Vasoya, M., Smith, K. S., Montgomery, S., Zaslavsky, E., Bodine, S. C., Esser, K. A., Walsh, M. J., Snyder, M. P., Sealfon, S. C., MoTrPAC Study Group 2023

    Abstract

    Transcription factors (TFs) play a key role in regulating gene expression and responses to stimuli. We conducted an integrated analysis of chromatin accessibility and RNA expression across various rat tissues following endurance exercise training (EET) to map epigenomic changes to transcriptional changes and determine key TFs involved. We uncovered tissue-specific changes across both omic layers, including highly correlated differentially accessible regions (DARs) and differentially expressed genes (DEGs). We identified open chromatin regions associated with DEGs (DEGaPs) and found tissue-specific and genomic feature-specific TF motif enrichment patterns among both DARs and DEGaPs. Accessible promoters of up- vs. down-regulated DEGs per tissue showed distinct TF enrichment patterns. Further, some EET-induced TFs in skeletal muscle were either validated at the proteomic level (MEF2C and NUR77) or correlated with exercise-related phenotypic changes. We provide an in-depth analysis of the epigenetic and trans-factor-dependent processes governing gene expression during EET.

    View details for DOI 10.1101/2023.01.10.523450

    View details for PubMedID 36711841

  • RNAget: an API to securely retrieve RNA quantifications. Bioinformatics (Oxford, England) Upchurch, S., Palumbo, E., Adams, J., Bujold, D., Bourque, G., Nedzel, J., Graham, K., Kagda, M. S., Assis, P., Hitz, B., Righi, E., Guigo, R., Wold, B. J., GA4GH RNA-Seq Task Team, Adams, J., Brazma, A., Bujold, D., Burchard, J., Capka, J., Cherry, M., Clarke, L., Craft, B., Dermitzakis, M., Diekhans, M., Dursi, J., Fitzsimons, M. S., Flaming, Z., Garrido, R., Gil, A., Godden, P., Green, M., Guigo, R., Guttman, M., Haas, B., Haeussler, M., Hitz, B., Li, B., Linnarsson, S., Lipski, A., Liu, D., Longerich, S., Lougheed, D., Manning, J., Marioni, J., Meyer, C., Montgomery, S., Morrow, A., Munoz-Power Fuentes, A., Nedzel, J., Nguyen, D., Osborn, K., Ouellette, F., Palumbo, E., Papatheodorou, I., Pervouchine, D., Ramani, A., Rambla, J., Sadjad, B., Steinberg, D., Talkar, J., Tickle, T., Tzeng, K., Upchurch, S., Vaisipour, S., Watford, S., Wold, B., Zhang, Z., Zhu, J. 2023; 39 (4)

    Abstract

    SUMMARY: Large-scale sharing of genomic quantification data requires standardized access interfaces. In this Global Alliance for Genomics and Health project, we developed RNAget, an API for secure access to genomic quantification data in matrix form. RNAget provides for slicing matrices to extract desired subsets of data and is applicable to all expression matrix-format data, including RNA sequencing and microarrays. Further, it generalizes to quantification matrices of other sequence-based genomics such as ATAC-seq and ChIP-seq. AVAILABILITY AND IMPLEMENTATION: https://ga4gh-rnaseq.github.io/schema/docs/index.html.

    View details for DOI 10.1093/bioinformatics/btad126

    View details for PubMedID 36897015

  • Methylation differences in Alzheimer's disease neuropathologic change in the aged human brain. Acta neuropathologica communications Lang, A. L., Eulalio, T., Fox, E., Yakabi, K., Bukhari, S. A., Kawas, C. H., Corrada, M. M., Montgomery, S. B., Heppner, F. L., Capper, D., Nachun, D., Montine, T. J. 2022; 10 (1): 174

    Abstract

    Alzheimer's disease (AD) is the most common cause of dementia with advancing age as its strongest risk factor. AD neuropathologic change (ADNC) is known to be associated with numerous DNA methylation changes in the human brain, but the oldest old (> 90 years) have so far been underrepresented in epigenetic studies of ADNC. Our study participants were individuals aged over 90 years (n = 47) from The 90+ Study. We analyzed DNA methylation from bulk samples in eight precisely dissected regions of the human brain: middle frontal gyrus, cingulate gyrus, entorhinal cortex, dentate gyrus, CA1, substantia nigra, locus coeruleus and cerebellar cortex. We deconvolved our bulk data into cell-type-specific (CTS) signals using computational methods. CTS methylation differences were analyzed across different levels of ADNC. The highest amount of ADNC related methylation differences was found in the dentate gyrus, a region that has so far been underrepresented in large scale multi-omic studies. In neurons of the dentate gyrus, DNA methylation significantly differed with increased burden of amyloid beta (Aβ) plaques at 5897 promoter regions of protein-coding genes. Amongst these, higher Aβ plaque burden was associated with promoter hypomethylation of the Presenilin enhancer 2 (PEN-2) gene, one of the rate limiting genes in the formation of gamma-secretase, a multicomponent complex that is responsible in part for the endoproteolytic cleavage of amyloid precursor protein into Aβ peptides. In addition to novel ADNC related DNA methylation changes, we present the most detailed array-based methylation survey of the old aged human brain to date. Our open-sourced dataset can serve as a brain region reference panel for future studies and help advance research in aging and neurodegenerative diseases.

    View details for DOI 10.1186/s40478-022-01470-0

    View details for PubMedID 36447297

    View details for PubMedCentralID PMC9710143

  • Deep learning-assisted genome-wide characterization of massively parallel reporter assays. Nucleic acids research Lu, F., Sossin, A., Abell, N., Montgomery, S. B., He, Z. 2022

    Abstract

    Massively parallel reporter assay (MPRA) is a high-throughput method that enables the study of the regulatory activities of tens of thousands of DNA oligonucleotides in a single experiment. While MPRA experiments have grown in popularity, their small sample sizes compared to the scale of the human genome limit our understanding of the regulatory effects they detect. To address this, we develop a deep learning model, MpraNet, to distinguish potential MPRA targets from the background genome. This model achieves high discriminative performance (AUROC=0.85) at differentiating MPRA positives from a set of control variants that mimic the background genome when applied to the lymphoblastoid cell line. We observe that existing functional scores represent very distinct functional effects, and most of them fail to characterize the regulatory effect that MPRA detects. Using MpraNet, we predict potential MPRA functional variants across the genome and identify the distributions of MPRA effect relative to other characteristics of genetic variation, including allele frequency, alternative functional annotations specified by FAVOR, and phenome-wide associations. We also observed that the predicted MPRA positives are not uniformly distributed across the genome; instead, they are clumped together in active regions comprising 9.95% of the genome and inactive regions comprising 89.07% of the genome. Furthermore, we propose our model as a screen to filter MPRA experiment candidates at genome-wide scale, enabling future experiments to be more cost-efficient by increasing precision relative to that observed from previous MPRAs.

    View details for DOI 10.1093/nar/gkac990

    View details for PubMedID 36350674

  • RNA editing underlies genetic risk of common inflammatory diseases. Nature Li, Q., Gloudemans, M. J., Geisinger, J. M., Fan, B., Aguet, F., Sun, T., Ramaswami, G., Li, Y. I., Ma, J. B., Pritchard, J. K., Montgomery, S. B., Li, J. B. 2022

    Abstract

    A major challenge in human genetics is to identify the molecular mechanisms of trait-associated and disease-associated variants. To achieve this, quantitative trait locus (QTL) mapping of genetic variants with intermediate molecular phenotypes such as gene expression and splicing have been widely adopted [1,2]. However, despite successes, the molecular basis for a considerable fraction of trait-associated and disease-associated variants remains unclear [3,4]. Here we show that ADAR-mediated adenosine-to-inosine RNA editing, a post-transcriptional event vital for suppressing cellular double-stranded RNA (dsRNA)-mediated innate immune interferon responses [5-11], is an important potential mechanism underlying genetic variants associated with common inflammatory diseases. We identified and characterized 30,319 cis-RNA editing QTLs (edQTLs) across 49 human tissues. These edQTLs were significantly enriched in genome-wide association study signals for autoimmune and immune-mediated diseases. Colocalization analysis of edQTLs with disease risk loci further pinpointed key, putatively immunogenic dsRNAs formed by expected inverted repeat Alu elements as well as unexpected, highly over-represented cis-natural antisense transcripts. Furthermore, inflammatory disease risk variants, in aggregate, were associated with reduced editing of nearby dsRNAs and induced interferon responses in inflammatory diseases. This unique directional effect agrees with the established mechanism that lack of RNA editing by ADAR1 leads to the specific activation of the dsRNA sensor MDA5 and subsequent interferon responses and inflammation [7-9]. Our findings implicate cellular dsRNA editing and sensing as a previously underappreciated mechanism of common inflammatory diseases.

    View details for DOI 10.1038/s41586-022-05052-x

    View details for PubMedID 35922514

  • Temporal dynamics of the multi-omic response to endurance exercise training across tissues Gay, N. R., Beltran, P., Amar, D., Montgomery, S. B., Carr, S. A., Motrpac Study Grp ELSEVIER. 2022: S31
  • Integration of rare expression outlier-associated variants improves polygenic risk prediction. American journal of human genetics Smail, C., Ferraro, N. M., Hui, Q., Durrant, M. G., Aguirre, M., Tanigawa, Y., Keever-Keigher, M. R., Rao, A. S., Justesen, J. M., Li, X., Gloudemans, M. J., Assimes, T. L., Kooperberg, C., Reiner, A. P., Huang, J., O'Donnell, C. J., Sun, Y. V., Million Veteran Program, Rivas, M. A., Montgomery, S. B. 2022

    Abstract

    Polygenic risk scores (PRSs) quantify the contribution of multiple genetic loci to an individual's likelihood of a complex trait or disease. However, existing PRSs estimate this likelihood with common genetic variants, excluding the impact of rare variants. Here, we report on a method to identify rare variants associated with outlier gene expression and integrate their impact into PRS predictions for body mass index (BMI), obesity, and bariatric surgery. Between the top and bottom 10%, we observed a 20.8% increase in risk for obesity (p = 3 × 10^-14), 62.3% increase in risk for severe obesity (p = 1 × 10^-6), and median 5.29 years earlier onset for bariatric surgery (p = 0.008), as a function of expression outlier-associated rare variant burden when controlling for common variant PRS. We show that these predictions were more significant than integrating the effects of rare protein-truncating variants (PTVs), observing a mean 19% increase in phenotypic variance explained with expression outlier-associated rare variants when compared with PTVs (p = 2 × 10^-15). We replicated these findings by using data from the Million Veteran Program and demonstrated that PRSs across multiple traits and diseases can benefit from the inclusion of expression outlier-associated rare variants identified through population-scale transcriptome sequencing.

    View details for DOI 10.1016/j.ajhg.2022.04.015

    View details for PubMedID 35588732

  • Multiple causal variants underlie genetic associations in humans. Science (New York, N.Y.) Abell, N. S., DeGorter, M. K., Gloudemans, M. J., Greenwald, E., Smith, K. S., He, Z., Montgomery, S. B. 2022; 375 (6586): 1247-1254

    Abstract

    Associations between genetic variation and traits are often in noncoding regions with strong linkage disequilibrium (LD), where a single causal variant is assumed to underlie the association. We applied a massively parallel reporter assay (MPRA) to functionally evaluate genetic variants in high, local LD for independent cis-expression quantitative trait loci (eQTL). We found that 17.7% of eQTLs exhibit more than one major allelic effect in tight LD. The detected regulatory variants were highly and specifically enriched for activating chromatin structures and allelic transcription factor binding. Integration of MPRA profiles with eQTL/complex trait colocalizations across 114 human traits and diseases identified causal variant sets demonstrating how genetic association signals can manifest through multiple, tightly linked causal variants.

    View details for DOI 10.1126/science.abj5117

    View details for PubMedID 35298243

  • Integration of genetic colocalizations with physiological and pharmacological perturbations identifies cardiometabolic disease genes. Genome medicine Gloudemans, M. J., Balliu, B., Nachun, D., Schnurr, T. M., Durrant, M. G., Ingelsson, E., Wabitsch, M., Quertermous, T., Montgomery, S. B., Knowles, J. W., Carcamo-Orive, I. 2022; 14 (1): 31

    Abstract

    BACKGROUND: Identification of causal genes for polygenic human diseases has been extremely challenging, and our understanding of how physiological and pharmacological stimuli modulate genetic risk at disease-associated loci is limited. Specifically, insulin resistance (IR), a common feature of cardiometabolic disease, including type 2 diabetes, obesity, and dyslipidemia, lacks well-powered genome-wide association studies (GWAS), and therefore, few associated loci and causal genes have been identified. METHODS: Here, we perform and integrate linkage disequilibrium (LD)-adjusted colocalization analyses across nine cardiometabolic traits (fasting insulin, fasting glucose, insulin sensitivity, insulin sensitivity index, type 2 diabetes, triglycerides, high-density lipoprotein, body mass index, and waist-hip ratio) combined with expression and splicing quantitative trait loci (eQTLs and sQTLs) from five metabolically relevant human tissues (subcutaneous and visceral adipose, skeletal muscle, liver, and pancreas). To elucidate the upstream regulators and functional mechanisms for these genes, we integrate their transcriptional responses to 21 relevant physiological and pharmacological perturbations in human adipocytes, hepatocytes, and skeletal muscle cells and map their protein-protein interactions. RESULTS: We identify 470 colocalized loci and prioritize 207 loci with a single colocalized gene. Patterns of shared colocalizations across traits and tissues highlight different potential roles for colocalized genes in cardiometabolic disease and distinguish several genes involved in pancreatic beta-cell function from others with a more direct role in skeletal muscle, liver, and adipose tissues. At the loci with a single colocalized gene, 42 of these genes were regulated by insulin and 35 by glucose in perturbation experiments, including 17 regulated by both. Other metabolic perturbations regulated the expression of 30 more genes not regulated by glucose or insulin, pointing to other potential upstream regulators of candidate causal genes. CONCLUSIONS: Our use of transcriptional responses under metabolic perturbations to contextualize genetic associations from our custom colocalization approach provides a list of likely causal genes and their upstream regulators in the context of IR-associated cardiometabolic risk.

    View details for DOI 10.1186/s13073-022-01036-8

    View details for PubMedID 35292083

  • Integration of genetic colocalizations with physiological and pharmacological perturbations identifies cardiometabolic disease genes Gloudemans, M. J., Balliu, B., Nachun, D., Durrant, M. G., Ingelsson, E., Wabitsch, M., Quertermous, T., Montgomery, S. B., Knowles, J., Carcamo-Orive, I. W B SAUNDERS CO-ELSEVIER INC. 2022: S24-S25
  • Towards transcriptomics as a primary tool for rare disease investigation. Cold Spring Harbor molecular case studies Montgomery, S. B., Bernstein, J. A., Wheeler, M. T. 2022

    Abstract

    In the past five years transcriptome or RNA-sequencing (RNA-seq) has steadily emerged as a complementary assay for rare disease diagnosis and discovery. In this perspective, we summarize several recent developments and challenges in use of RNA-seq for rare disease investigation. Using an accessible patient sample, such as blood, skin, or muscle, RNA-seq enables the assay of expressed RNA transcripts. Analysis of RNA-seq allows the identification of aberrant or outlier gene expression and alternative splicing as functional evidence to support rare disease study and diagnosis. Further, many types of variant effects can be profiled beyond coding variants, as the consequences of non-coding variants that impact gene expression and splicing can be directly observed. This is particularly apparent for structural variants which disproportionately underlie outlier gene expression and for splicing variants where RNA-seq can both measure aberrant canonical splicing and detect deep intronic effects. However, a major potential limitation of RNA-seq in rare disease investigation is the developmental and cell type-specificity of gene expression as a pathogenic variant's effect may be limited to a specific spatiotemporal context and access to a patient's tissue sample from the relevant tissue and timing of disease expression may not be possible. We speculate that as advances in computational methods and emerging experimental techniques overcome both developmental and cell type-specificity, there will be broadening use of RNA sequencing and multi-omics in rare disease diagnosis and delivery of precision health.

    View details for DOI 10.1101/mcs.a006198

    View details for PubMedID 35217565

  • Lymphoid blast transformation in an MPN with BCR-JAK2 treated with ruxolitinib: putative mechanisms of resistance. Blood advances Chen, J. A., Hou, Y., Roskin, K. M., Arber, D. A., Bangs, C. D., Baughn, L. B., Cherry, A. M., Ewalt, M. D., Fire, A. Z., Fresard, L., Kearney, H. M., Montgomery, S. B., Ohgami, R. S., Pearce, K. E., Pitel, B. A., Merker, J. D., Gotlib, J. 2021; 5 (17): 3492-3496

    Abstract

    The basis for acquired resistance to JAK inhibition in patients with JAK2-driven hematologic malignancies is not well understood. We report a patient with a myeloproliferative neoplasm (MPN) with a BCR activator of RhoGEF and GTPase (BCR)-JAK2 fusion with initial hematologic response to ruxolitinib who rapidly developed B-lymphoid blast transformation. We analyzed pre-ruxolitinib and blast transformation samples using genome sequencing, DNA mate-pair sequencing (MPseq), RNA sequencing (RNA-seq), and chromosomal microarray to characterize possible mechanisms of resistance. No resistance mutations in the BCR-JAK2 fusion gene or transcript were identified, and fusion transcript expression levels remained stable. However, at the time of blast transformation, MPseq detected a new IKZF1 copy-number loss, which is predicted to result in loss of normal IKZF1 protein translation. RNA-seq revealed significant upregulation of genes negatively regulated by IKZF1, including IL7R and CRLF2. Disease progression was also characterized by adaptation to an activated B-cell receptor (BCR)-like signaling phenotype, with marked upregulation of genes such as CD79A, CD79B, IGLL1, VPREB1, BLNK, ZAP70, RAG1, and RAG2. In summary, IKZF1 deletion and a switch from cytokine dependence to activated BCR-like signaling phenotype represent putative mechanisms of ruxolitinib resistance in this case, recapitulating preclinical data on resistance to JAK inhibition in CRLF2-rearranged Philadelphia chromosome-like acute lymphoblastic leukemia.

    View details for DOI 10.1182/bloodadvances.2020004174

    View details for PubMedID 34505882

  • Genome-wide functional screen of 3'UTR variants uncovers causal variants for human disease and evolution. Cell Griesemer, D., Xue, J. R., Reilly, S. K., Ulirsch, J. C., Kukreja, K., Davis, J. R., Kanai, M., Yang, D. K., Butts, J. C., Guney, M. H., Luban, J., Montgomery, S. B., Finucane, H. K., Novina, C. D., Tewhey, R., Sabeti, P. C. 2021

    Abstract

    3' untranslated region (3'UTR) variants are strongly associated with human traits and diseases, yet few have been causally identified. We developed the massively parallel reporter assay for 3'UTRs (MPRAu) to sensitively assay 12,173 3'UTR variants. We applied MPRAu to six human cell lines, focusing on genetic variants associated with genome-wide association studies (GWAS) and human evolutionary adaptation. MPRAu expands our understanding of 3'UTR function, suggesting that simple sequences predominately explain 3'UTR regulatory activity. We adapt MPRAu to uncover diverse molecular mechanisms at base pair resolution, including an adenylate-uridylate (AU)-rich element of LEPR linked to potential metabolic evolutionary adaptations in East Asians. We nominate hundreds of 3'UTR causal variants with genetically fine-mapped phenotype associations. Using endogenous allelic replacements, we characterize one variant that disrupts a miRNA site regulating the viral defense gene TRIM14 and one that alters PILRB abundance, nominating a causal variant underlying transcriptional changes in age-related macular degeneration.

    View details for DOI 10.1016/j.cell.2021.08.025

    View details for PubMedID 34534445

  • The role of Sp140 revealed in IgE and mast cell responses in Collaborative Cross mice. JCI insight Matsushita, K., Li, X., Nakamura, Y., Dong, D., Mukai, K., Tsai, M., Montgomery, S. B., Galli, S. J. 2021; 6 (12)

    Abstract

    Mouse IgE and mast cell (MC) functions have been studied primarily using inbred strains. Here, we (a) identified effects of genetic background on mouse IgE and MC phenotypes, (b) defined the suitability of various strains for studying IgE and MC functions, and (c) began to study potentially novel genes involved in such functions. We screened 47 Collaborative Cross (CC) strains, as well as C57BL/6J and BALB/cJ mice, for strength of passive cutaneous anaphylaxis (PCA) and responses to the intestinal parasite Strongyloides venezuelensis (S.v.). CC mice exhibited a diversity in PCA strength and S.v. responses. Among strains tested, C57BL/6J and CC027 mice showed, respectively, moderate and uniquely potent MC activity. Quantitative trait locus analysis and RNA sequencing of BM-derived cultured MCs (BMCMCs) from CC027 mice suggested Sp140 as a candidate gene for MC activation. siRNA-mediated knock-down of Sp140 in BMCMCs decreased IgE-dependent histamine release and cytokine production. Our results demonstrated marked variations in IgE and MC activity in vivo, and in responses to S.v., across CC strains. C57BL/6J and CC027 represent useful models for studying MC functions. Additionally, we identified Sp140 as a gene that contributes to IgE-dependent MC activation.

    View details for DOI 10.1172/jci.insight.146572

    View details for PubMedID 34156030

  • Identification of putative causal loci in whole-genome sequencing data via knockoff statistics. Nature communications He, Z., Liu, L., Wang, C., Le Guen, Y., Lee, J., Gogarten, S., Lu, F., Montgomery, S., Tang, H., Silverman, E. K., Cho, M. H., Greicius, M., Ionita-Laza, I. 2021; 12 (1): 3152

    Abstract

    The analysis of whole-genome sequencing studies is challenging due to the large number of rare variants in noncoding regions and the lack of natural units for testing. We propose a statistical method to detect and localize rare and common risk variants in whole-genome sequencing studies based on a recently developed knockoff framework. It can (1) prioritize causal variants over associations due to linkage disequilibrium thereby improving interpretability; (2) help distinguish the signal due to rare variants from shadow effects of significant common variants nearby; (3) integrate multiple knockoffs for improved power, stability, and reproducibility; and (4) flexibly incorporate state-of-the-art and future association tests to achieve the benefits proposed here. In applications to whole-genome sequencing data from the Alzheimer's Disease Sequencing Project (ADSP) and COPDGene samples from NHLBI Trans-Omics for Precision Medicine (TOPMed) Program we show that our method compared with conventional association tests can lead to substantially more discoveries.

    View details for DOI 10.1038/s41467-021-22889-4

    View details for PubMedID 34035245

  • Compound heterozygous KCTD7 variants in progressive myoclonus epilepsy. Journal of neurogenetics Burke, E. A., Sturgeon, M., Zastrow, D. B., Fernandez, L., Prybol, C., Marwaha, S., Frothingham, E. P., Ward, P. A., Eng, C. M., Fresard, L., Montgomery, S. B., Enns, G. M., Fisher, P. G., Wolfe, L. A., Harding, B., Carrington, B., Bishop, K., Sood, R., Huang, Y., Elkahloun, A., Toro, C., Bassuk, A. G., Wheeler, M. T., Markello, T. C., Gahl, W. A., Malicdan, M. C. 2021: 1–10

    Abstract

    KCTD7 is a member of the potassium channel tetramerization domain-containing protein family and has been associated with progressive myoclonic epilepsy (PME), characterized by myoclonus, epilepsy, and neurological deterioration. Here we report four affected individuals from two unrelated families in which we identified KCTD7 compound heterozygous single nucleotide variants through exome sequencing. RNAseq was used to detect a non-annotated splicing junction created by a synonymous variant in the second family. Whole-cell patch-clamp analysis of neuroblastoma cells overexpressing the patients' variant alleles demonstrated aberrant potassium regulation. While all four patients experienced many of the common clinical features of PME, they also showed variable phenotypes not previously reported, including dysautonomia, brain pathology findings including a significantly reduced thalamus, and the lack of myoclonic seizures. To gain further insight into the pathogenesis of the disorder, zinc finger nucleases were used to generate kctd7 knockout zebrafish. Kctd7 homozygous mutants showed global dysregulation of gene expression and increased transcription of c-fos, which has previously been correlated with seizure activity in animal models. Together these findings expand the known phenotypic spectrum of KCTD7-associated PME, report a new animal model for future studies, and contribute valuable insights into the disease.

    View details for DOI 10.1080/01677063.2021.1892095

    View details for PubMedID 33970744

  • Population-scale tissue transcriptomics maps long non-coding RNAs to complex disease. Cell de Goede, O. M., Nachun, D. C., Ferraro, N. M., Gloudemans, M. J., Rao, A. S., Smail, C., Eulalio, T. Y., Aguet, F., Ng, B., Xu, J., Barbeira, A. N., Castel, S. E., Kim-Hellmuth, S., Park, Y., Scott, A. J., Strober, B. J., GTEx Consortium, Brown, C. D., Wen, X., Hall, I. M., Battle, A., Lappalainen, T., Im, H. K., Ardlie, K. G., Mostafavi, S., Quertermous, T., Kirkegaard, K., Montgomery, S. B., Anand, S., Gabriel, S., Getz, G. A., Graubert, A., Hadley, K., Handsaker, R. E., Huang, K. H., Li, X., MacArthur, D. G., Meier, S. R., Nedzel, J. L., Nguyen, D. T., Segre, A. V., Todres, E., Balliu, B., Bonazzola, R., Brown, A., Conrad, D. F., Cotter, D. J., Cox, N., Das, S., Dermitzakis, E. T., Einson, J., Engelhardt, B. E., Eskin, E., Flynn, E. D., Fresard, L., Gamazon, E. R., Garrido-Martin, D., Gay, N. R., Guigo, R., Hamel, A. R., He, Y., Hoffman, P. J., Hormozdiari, F., Hou, L., Jo, B., Kasela, S., Kashin, S., Kellis, M., Kwong, A., Li, X., Liang, Y., Mangul, S., Mohammadi, P., Munoz-Aguirre, M., Nobel, A. B., Oliva, M., Park, Y., Parsana, P., Reverter, F., Rouhana, J. M., Sabatti, C., Saha, A., Stephens, M., Stranger, B. E., Teran, N. A., Vinuela, A., Wang, G., Wright, F., Wucher, V., Zou, Y., Ferreira, P. G., Li, G., Mele, M., Yeger-Lotem, E., Bradbury, D., Krubit, T., McLean, J. A., Qi, L., Robinson, K., Roche, N. V., Smith, A. M., Tabor, D. E., Undale, A., Bridge, J., Brigham, L. E., Foster, B. A., Gillard, B. M., Hasz, R., Hunter, M., Johns, C., Johnson, M., Karasik, E., Kopen, G., Leinweber, W. F., McDonald, A., Moser, M. T., Myer, K., Ramsey, K. D., Roe, B., Shad, S., Thomas, J. A., Walters, G., Washington, M., Wheeler, J., Jewell, S. D., Rohrer, D. C., Valley, D. R., Davis, D. A., Mash, D. C., Barcus, M. E., Branton, P. A., Sobin, L., Barker, L. K., Gardiner, H. M., Mosavel, M., Siminoff, L. A., Flicek, P., Haeussler, M., Juettemann, T., Kent, W. J., Lee, C. M., Powell, C. C., Rosenbloom, K. R., Ruffier, M., Sheppard, D., Taylor, K., Trevanion, S. J., Zerbino, D. R., Abell, N. S., Akey, J., Chen, L., Demanelis, K., Doherty, J. A., Feinberg, A. P., Hansen, K. D., Hickey, P. F., Jasmine, F., Jiang, L., Kaul, R., Kibriya, M. G., Li, J. B., Li, Q., Lin, S., Linder, S. E., Pierce, B. L., Rizzardi, L. F., Skol, A. D., Smith, K. S., Snyder, M., Stamatoyannopoulos, J., Tang, H., Wang, M., Carithers, L. J., Guan, P., Koester, S. E., Little, A. R., Moore, H. M., Nierras, C. R., Rao, A. K., Vaught, J. B., Volpi, S. 2021

    Abstract

    Long non-coding RNA (lncRNA) genes have well-established and important impacts on molecular and cellular functions. However, among the thousands of lncRNA genes, it is still a major challenge to identify the subset with disease or trait relevance. To systematically characterize these lncRNA genes, we used Genotype Tissue Expression (GTEx) project v8 genetic and multi-tissue transcriptomic data to profile the expression, genetic regulation, cellular contexts, and trait associations of 14,100 lncRNA genes across 49 tissues for 101 distinct complex genetic traits. Using these approaches, we identified 1,432 lncRNA gene-trait associations, 800 of which were not explained by stronger effects of neighboring protein-coding genes. This included associations between lncRNA quantitative trait loci and inflammatory bowel disease, type 1 and type 2 diabetes, and coronary artery disease, as well as rare variant associations to body mass index.

    View details for DOI 10.1016/j.cell.2021.03.050

    View details for PubMedID 33864768

  • Functional and structural analysis of cytokine selective IL6ST defects that cause recessive hyper-IgE syndrome. The Journal of allergy and clinical immunology Chen, Y., Zastrow, D. B., Metcalfe, R. D., Gartner, L., Krause, F., Morton, C. J., Marwaha, S., Fresard, L., Huang, Y., Zhao, C., McCormack, C., Bick, D., Worthey, E. A., Eng, C. M., Gold, J., Undiagnosed Diseases Network, Montgomery, S. B., Fisher, P. G., Ashley, E. A., Wheeler, M. T., Parker, M. W., Shanmugasundaram, V., Putoczki, T. L., Schmidt-Arras, D., Laurence, A., Bernstein, J. A., Griffin, M. D., Uhlig, H. H. 2021

    Abstract

    BACKGROUND: Biallelic variants in IL6ST cause a recessive form of hyper-IgE syndrome (HIES) characterized by high IgE, eosinophilia, defective acute phase response, susceptibility to bacterial infections and skeletal abnormalities due to cytokine selective loss-of-function in GP130 with defective IL-6 and IL-11, variable OSM and IL-27 but sparing LIF signaling. OBJECTIVE: To understand the functional and structural impact of recessive HIES-associated IL6ST variants. METHODS: We investigated a patient with HIES using exome, genome and RNA sequencing. Functional assays assessed IL-6, IL-11, IL-27, OSM, LIF, CT-1, CLC, and CNTF signaling. Molecular dynamic simulations and structural modeling of GP130 cytokine receptor complexes were performed. RESULTS: We identify a patient with compound heterozygous novel missense variants in IL6ST (p.Ala517Pro, and exon-skipping null variant p.Gly484_Pro518delinsArg). The p.Ala517Pro variant results in a more profound IL-6 and IL-11 dominated signaling defect compared to the previously identified recessive IL6ST variants p.Asn404Tyr, and p.Pro498Leu. Molecular dynamics simulations suggest that the p.Ala517Pro and p.Asn404Tyr variants result in increased flexibility of the extracellular membrane-proximal domains of GP130. We propose a structural model that explains the cytokine selectivity of pathogenic IL6ST variants that result in recessive HIES. The variants destabilize the hexameric cytokine receptor complexes whereas the trimeric LIF-GP130-LIFR complex remains stable by an additional membrane-proximal interaction. Deletion of this membrane-proximal interaction site in GP130 consequently causes additional defective LIF signaling and Stuve-Wiedemann syndrome. CONCLUSION: Our data provide a structural basis to understand clinical phenotypes in patients with IL6ST variants.

    View details for DOI 10.1016/j.jaci.2021.02.044

    View details for PubMedID 33771552

  • Identification of rare and common regulatory variants in pluripotent cells using population-scale transcriptomics. Nature genetics Bonder, M. J., Smail, C., Gloudemans, M. J., Fresard, L., Jakubosky, D., D'Antonio, M., Li, X., Ferraro, N. M., Carcamo-Orive, I., Mirauta, B., Seaton, D. D., Cai, N., Vakili, D., Horta, D., Zhao, C., Zastrow, D. B., Bonner, D. E., HipSci Consortium, iPSCORE consortium, Undiagnosed Diseases Network, PhLiPS consortium, Wheeler, M. T., Kilpinen, H., Knowles, J. W., Smith, E. N., Frazer, K. A., Montgomery, S. B., Stegle, O., Jan Bonder, M., Seaton, D., Jakubosky, D. A., Brown, C. D., Park, Y. 2021

    Abstract

    Induced pluripotent stem cells (iPSCs) are an established cellular system to study the impact of genetic variants in derived cell types and developmental contexts. However, in their pluripotent state, the disease impact of genetic variants is less well known. Here, we integrate data from 1,367 human iPSC lines to comprehensively map common and rare regulatory variants in human pluripotent cells. Using this population-scale resource, we report hundreds of new colocalization events for human traits specific to iPSCs, and find increased power to identify rare regulatory variants compared with somatic tissues. Finally, we demonstrate how iPSCs enable the identification of causal genes for rare diseases.

    View details for DOI 10.1038/s41588-021-00800-7

    View details for PubMedID 33664507

  • Evaluating the Genomic Parameters Governing rAAV-Mediated Homologous Recombination MOLECULAR THERAPY Spector, L. P., Tiffany, M., Ferraro, N. M., Abell, N. S., Montgomery, S. B., Kay, M. A. 2021; 29 (3): 1028–46
  • Exploiting the GTEx resources to decipher the mechanisms at GWAS loci. Genome biology Barbeira, A. N., Bonazzola, R., Gamazon, E. R., Liang, Y., Park, Y., Kim-Hellmuth, S., Wang, G., Jiang, Z., Zhou, D., Hormozdiari, F., Liu, B., Rao, A., Hamel, A. R., Pividori, M. D., Aguet, F., GTEx GWAS Working Group, Bastarache, L., Jordan, D. M., Verbanck, M., Do, R., GTEx Consortium, Stephens, M., Ardlie, K., McCarthy, M., Montgomery, S. B., Segre, A. V., Brown, C. D., Lappalainen, T., Wen, X., Im, H. K. 2021; 22 (1): 49

    Abstract

    The resources generated by the GTEx consortium offer unprecedented opportunities to advance our understanding of the biology of human diseases. Here, we present an in-depth examination of the phenotypic consequences of transcriptome regulation and a blueprint for the functional interpretation of genome-wide association study-discovered loci. Across a broad set of complex traits and diseases, we demonstrate widespread dose-dependent effects of RNA expression and splicing. We develop a data-driven framework to benchmark methods that prioritize causal genes and find no single approach outperforms the combination of multiple approaches. Using colocalization and association approaches that take into account the observed allelic heterogeneity of gene expression, we propose potential target genes for 47% (2519 out of 5385) of the GWAS loci examined.

    View details for DOI 10.1186/s13059-020-02252-4

    View details for PubMedID 33499903

  • Nonsense-mediated decay is highly stable across individuals and tissues. American journal of human genetics Teran, N. A., Nachun, D. C., Eulalio, T., Ferraro, N. M., Smail, C., Rivas, M. A., Montgomery, S. B. 2021

    Abstract

    Precise interpretation of the effects of rare protein-truncating variants (PTVs) is important for accurate determination of variant impact. Current methods for assessing the ability of PTVs to induce nonsense-mediated decay (NMD) focus primarily on the position of the variant in the transcript. We used RNA sequencing of the Genotype Tissue Expression v.8 cohort to compute the efficiency of NMD using allelic imbalance for 2,320 rare (genome aggregation database minor allele frequency ≤ 1%) PTVs across 809 individuals in 49 tissues. We created an interpretable predictive model using penalized logistic regression in order to evaluate the comprehensive influence of variant annotation, tissue, and inter-individual variation on NMD. We found that variant position, allele frequency, the inclusion of ultra-rare and singleton variants, and conservation were predictive of allelic imbalance. Furthermore, we found that NMD effects were highly concordant across tissues and individuals. Due to this high consistency, we demonstrate in silico that utilizing peripheral tissues or cell lines provides accurate prediction of NMD for PTVs.

    View details for DOI 10.1016/j.ajhg.2021.06.008

    View details for PubMedID 34216550

  • An integrated approach to identify environmental modulators of genetic risk factors for complex traits. American journal of human genetics Balliu, B., Carcamo-Orive, I., Gloudemans, M. J., Nachun, D. C., Durrant, M. G., Gazal, S., Park, C. Y., Knowles, D. A., Wabitsch, M., Quertermous, T., Knowles, J. W., Montgomery, S. B. 2021

    Abstract

    Complex traits and diseases can be influenced by both genetics and environment. However, given the large number of environmental stimuli and power challenges for gene-by-environment testing, it remains a critical challenge to identify and prioritize specific disease-relevant environmental exposures. We propose a framework for leveraging signals from transcriptional responses to environmental perturbations to identify disease-relevant perturbations that can modulate genetic risk for complex traits and inform the functions of genetic variants associated with complex traits. We perturbed human skeletal-muscle-, fat-, and liver-relevant cell lines with 21 perturbations affecting insulin resistance, glucose homeostasis, and metabolic regulation in humans and identified thousands of environmentally responsive genes. By combining these data with GWASs from 31 distinct polygenic traits, we show that the heritability of multiple traits is enriched in regions surrounding genes responsive to specific perturbations and, further, that environmentally responsive genes are enriched for associations with specific diseases and phenotypes from the GWAS Catalog. Overall, we demonstrate the advantages of large-scale characterization of transcriptional changes in diversely stimulated and pathologically relevant cells to identify disease-relevant perturbations.

    View details for DOI 10.1016/j.ajhg.2021.08.014

    View details for PubMedID 34582792

  • Single-cell epigenomic analyses implicate candidate causal variants at inherited risk loci for Alzheimer's and Parkinson's diseases. Nature genetics Corces, M. R., Shcherbina, A., Kundu, S., Gloudemans, M. J., Fresard, L., Granja, J. M., Louie, B. H., Eulalio, T., Shams, S., Bagdatli, S. T., Mumbach, M. R., Liu, B., Montine, K. S., Greenleaf, W. J., Kundaje, A., Montgomery, S. B., Chang, H. Y., Montine, T. J. 2020

    Abstract

    Genome-wide association studies of neurological diseases have identified thousands of variants associated with disease phenotypes. However, most of these variants do not alter coding sequences, making it difficult to assign their function. Here, we present a multi-omic epigenetic atlas of the adult human brain through profiling of single-cell chromatin accessibility landscapes and three-dimensional chromatin interactions of diverse adult brain regions across a cohort of cognitively healthy individuals. We developed a machine-learning classifier to integrate this multi-omic framework and predict dozens of functional SNPs for Alzheimer's and Parkinson's diseases, nominating target genes and cell types for previously orphaned loci from genome-wide association studies. Moreover, we dissected the complex inverted haplotype of the MAPT (encoding tau) Parkinson's disease risk locus, identifying putative ectopic regulatory interactions in neurons that may mediate this disease association. This work expands understanding of inherited variation and provides a roadmap for the epigenomic dissection of causal regulatory variation in disease.

    View details for DOI 10.1038/s41588-020-00721-x

    View details for PubMedID 33106633

  • The GTEx Consortium atlas of genetic regulatory effects across human tissues SCIENCE Aguet, F., Barbeira, A. N., Bonazzola, R., Brown, A., Castel, S. E., Jo, B., Kasela, S., Kim-Hellmuth, S., Liang, Y., Parsana, P., Flynn, E., Fresard, L., Gamazon, E. R., Hamel, A. R., He, Y., Hormozdiari, F., Mohammadi, P., Munoz-Aguirre, M., Ardlie, K. G., Battle, A., Bonazzola, R., Brown, C. D., Cox, N., Dermitzakis, E. T., Engelhardt, B. E., Garrido-Martin, D., Gay, N. R., Getz, G., Guigo, R., Hamel, A. R., Handsaker, R. E., He, Y., Hoffman, P. J., Hormozdiari, F., Im, H., Jo, B., Kasela, S., Kashin, S., Kim-Hellmuth, S., Kwong, A., Lappalainen, T., Li, X., Liang, Y., MacArthur, D. G., Mohammadi, P., Montgomery, S. B., Munoz-Aguirre, M., Rouhana, J. M., Hormozdiari, F., Im, H., Kim-Hellmuth, S., Ardlie, K. G., Getz, G., Guigo, R., Im, H., Lappalainen, T., Montgomery, S. B., Im, H., Lappalainen, T., Lappalainen, T., Anand, S., Gabriel, S., Getz, G., Graubert, A., Hadley, K., Handsaker, R. E., Huang, K. H., Kashin, S., Li, X., MacArthur, D. G., Meier, S. R., Nedzel, J. L., Balliu, B., Conrad, D., Cotter, D. J., Das, S., de Goede, O. M., Eskin, E., Eulalio, T. Y., Ferraro, N. M., Garrido-Martin, D., Gay, N. R., Getz, G., Graubert, A., Guigo, R., Hadley, K., Hamel, A. R., Handsaker, R. E., He, Y., Hoffman, P. J., Hormozdiari, F., Hou, L., Huang, K. H., Im, H., Jo, B., Kasela, S., Kashin, S., Kellis, M., Kim-Hellmuth, S., Kwong, A., Lappalainen, T., Li, X., Li, X., Liang, Y., MacArthur, D. G., Mangul, S., Meier, S. R., Mohammadi, P., Montgomery, S. B., Munoz-Aguirre, M., Nachun, D. C., Nedzel, J. L., Nguyen, D. Y., Nobel, A. B., Park, Y., Reverter, F., Sabatti, C., Saha, A., Segre, A., Stephens, M., Strober, B. J., Teran, N. A., Todres, E., Vinuela, A., Wang, G., Wen, X., Wright, F., Wucher, V., Zou, Y., Ferreira, P. G., Li, G., Mele, M., Yeger-Lotem, E., Barcus, M. E., Bradbury, D., Krubit, T., McLean, J. A., Qi, L., Robinson, K., Roche, N., Smith, A. M., Tabor, D. 
E., Undale, A., Bridge, J., Brigham, L. E., Foster, B. A., Gillard, B. M., Hasz, R., Hunter, M., Johns, C., Johnson, M., Karasik, E., Kopen, G., Leinweber, W. F., McDonald, A., Moser, M. T., Myer, K., Ramsey, K. D., Roe, B., Shad, S., Thomas, J. A., Walters, G., Washington, M., Wheeler, J., Jewell, S. D., Rohrer, D. C., Valley, D. R., Davis, D. A., Mash, D. C., Branton, P. A., Sobin, L., Barker, L. K., Gardiner, H. M., Mosavel, M., Siminoff, L. A., Flicek, P., Haeussler, M., Juettemann, T., Kent, W., Lee, C. M., Powell, C. C., Rosenbloom, K. R., Ruffier, M., Sheppard, D., Taylor, K., Trevanion, S. J., Zerbino, D. R., Abell, N. S., Akey, J., Chen, L., Demanelis, K., Doherty, J. A., Feinberg, A. P., Hansen, K. D., Hickey, P. F., Hou, L., Jasmine, F., Jiang, L., Kaul, R., Kellis, M., Kibriya, M. G., Li, J., Li, Q., Lin, S., Linder, S. E., Montgomery, S. B., Oliva, M., Park, Y., Pierce, B. L., Rizzardi, L. F., Skol, A. D., Smith, K. S., Snyder, M., Stamatoyannopoulos, J., Tang, H., Wang, M., Carithers, L. J., Guan, P., Koester, S. E., Little, A., Moore, H. M., Nierras, C. R., Rao, A. K., Vaught, J. B., Volpi, S., GTEx Consortium 2020; 369 (6509): 1318-+
  • Molecular Transducers of Physical Activity Consortium (MoTrPAC): Mapping the Dynamic Responses to Exercise. Cell Sanford, J. A., Nogiec, C. D., Lindholm, M. E., Adkins, J. N., Amar, D., Dasari, S., Drugan, J. K., Fernandez, F. M., Radom-Aizik, S., Schenk, S., Snyder, M. P., Tracy, R. P., Vanderboom, P., Trappe, S., Walsh, M. J., Molecular Transducers of Physical Activity Consortium, Adkins, J. N., Amar, D., Dasari, S., Drugan, J. K., Evans, C. R., Fernandez, F. M., Li, Y., Lindholm, M. E., Nogiec, C. D., Radom-Aizik, S., Sanford, J. A., Schenk, S., Snyder, M. P., Tomlinson, L., Tracy, R. P., Trappe, S., Vanderboom, P., Walsh, M. J., Alekel, D. L., Bekirov, I., Boyce, A. T., Boyington, J., Fleg, J. L., Joseph, L. J., Laughlin, M. R., Maruvada, P., Morris, S. A., McGowan, J. A., Nierras, C., Pai, V., Peterson, C., Ramos, E., Roary, M. C., Williams, J. P., Xia, A., Cornell, E., Rooney, J., Miller, M. E., Ambrosius, W. T., Rushing, S., Stowe, C. L., Rejeski, W. J., Nicklas, B. J., Pahor, M., Lu, C., Trappe, T., Chambers, T., Raue, U., Lester, B., Bergman, B. C., Bessesen, D. H., Jankowski, C. M., Kohrt, W. M., Melanson, E. L., Moreau, K. L., Schauer, I. E., Schwartz, R. S., Kraus, W. E., Slentz, C. A., Huffman, K. M., Johnson, J. L., Willis, L. H., Kelly, L., Houmard, J. A., Dubis, G., Broskey, N., Goodpaster, B. H., Sparks, L. M., Coen, P. M., Cooper, D. M., Haddad, F., Rankinen, T., Ravussin, E., Johannsen, N., Harris, M., Jakicic, J. M., Newman, A. B., Forman, D. D., Kershaw, E., Rogers, R. J., Nindl, B. C., Page, L. C., Stefanovic-Racic, M., Barr, S. L., Rasmussen, B. B., Moro, T., Paddon-Jones, D., Volpi, E., Spratt, H., Musi, N., Espinoza, S., Patel, D., Serra, M., Gelfond, J., Burns, A., Bamman, M. M., Buford, T. W., Cutter, G. R., Bodine, S. C., Esser, K., Farrar, R. P., Goodyear, L. J., Hirshman, M. F., Albertson, B. G., Qian, W., Piehowski, P., Gritsenko, M. A., Monore, M. E., Petyuk, V. A., McDermott, J. E., Hansen, J. 
N., Hutchison, C., Moore, S., Gaul, D. A., Clish, C. B., Avila-Pacheco, J., Dennis, C., Kellis, M., Carr, S., Jean-Beltran, P. M., Keshishian, H., Mani, D. R., Clauser, K., Krug, K., Mundorff, C., Pearce, C., Ivanova, A. A., Ortlund, E. A., Maner-Smith, K., Uppal, K., Zhang, T., Sealfon, S. C., Zavlasky, E., Nair, V., Li, S., Jain, N., Ge, Y., Sun, Y., Nudelman, G., Ruf-Zamojski, F., Smith, G., Pincas, N., Rubenstein, A., Amper, M. A., Seenarine, N., Lappalainen, T., Lanza, I. R., Nair, K. S., Klaus, K., Montgomery, S. B., Smith, K. S., Gay, N. R., Zhao, B., Hung, C. J., Zebarjadi, N., Balliu, B., Fresard, L., Burant, C. F., Li, J. Z., Kachman, M., Soni, T., Raskind, A. B., Gerszten, R., Robbins, J., Ilkayeva, O., Muehlbauer, M. J., Newgard, C. B., Ashley, E. A., Wheeler, M. T., Jimenez-Morales, D., Raja, A., Dalton, K. P., Zhen, J., Kim, Y. S., Christle, J. W., Marwaha, S., Chin, E. T., Hershman, S. G., Hastie, T., Tibshirani, R., Rivas, M. A. 2020; 181 (7): 1464–74

    Abstract

    Exercise provides a robust physiological stimulus that evokes cross-talk among multiple tissues that when repeated regularly (i.e., training) improves physiological capacity, benefits numerous organ systems, and decreases the risk for premature mortality. However, a gap remains in identifying the detailed molecular signals induced by exercise that benefits health and prevents disease. The Molecular Transducers of Physical Activity Consortium (MoTrPAC) was established to address this gap and generate a molecular map of exercise. Preclinical and clinical studies will examine the systemic effects of endurance and resistance exercise across a range of ages and fitness levels by molecular probing of multiple tissues before and after acute and chronic exercise. From this multi-omic and bioinformatic analysis, a molecular map of exercise will be established. Altogether, MoTrPAC will provide a public database that is expected to enhance our understanding of the health benefits of exercise and to provide insight into how physical activity mitigates disease.

    View details for DOI 10.1016/j.cell.2020.06.004

    View details for PubMedID 32589957

  • Discovery and quality analysis of a comprehensive set of structural variants and short tandem repeats. Nature communications Jakubosky, D., Smith, E. N., D'Antonio, M., Jan Bonder, M., Young Greenwald, W. W., D'Antonio-Chronowska, A., Matsui, H., i2QTL Consortium, Stegle, O., Montgomery, S. B., DeBoever, C., Frazer, K. A., Bonder, M. J., Cai, N., Carcamo-Orive, I., D'Antonio, M., Frazer, K. A., Young Greenwald, W. W., Jakubosky, D., Knowles, J. W., Matsui, H., McCarthy, D. J., Mirauta, B. A., Montgomery, S. B., Quertermous, T., Seaton, D. D., Smail, C., Smith, E. N., Stegle, O. 2020; 11 (1): 2928

    Abstract

    Structural variants (SVs) and short tandem repeats (STRs) are important sources of genetic diversity but are not routinely analyzed in genetic studies because they are difficult to accurately identify and genotype. Because SVs and STRs range in size and type, it is necessary to apply multiple algorithms that incorporate different types of evidence from sequencing data and employ complex filtering strategies to discover a comprehensive set of high-quality and reproducible variants. Here we assemble a set of 719 deep whole genome sequencing (WGS) samples (mean 42×) from 477 distinct individuals which we use to discover and genotype a wide spectrum of SV and STR variants using five algorithms. We use 177 unique pairs of genetic replicates to identify factors that affect variant call reproducibility and develop a systematic filtering strategy to create one of the most complete and well characterized maps of SVs and STRs to date.

    View details for DOI 10.1038/s41467-020-16481-5

    View details for PubMedID 32522985

  • Properties of structural variants and short tandem repeats associated with gene expression and complex traits. Nature communications Jakubosky, D., D'Antonio, M., Bonder, M. J., Smail, C., Donovan, M. K., Young Greenwald, W. W., Matsui, H., i2QTL Consortium, D'Antonio-Chronowska, A., Stegle, O., Smith, E. N., Montgomery, S. B., DeBoever, C., Frazer, K. A., Bonder, M. J., Cai, N., Carcamo-Orive, I., D'Antonio, M., Frazer, K. A., Young Greenwald, W. W., Jakubosky, D., Knowles, J. W., Matsui, H., McCarthy, D. J., Mirauta, B. A., Montgomery, S. B., Quertermous, T., Seaton, D. D., Smail, C., Smith, E. N., Stegle, O. 2020; 11 (1): 2927

    Abstract

    Structural variants (SVs) and short tandem repeats (STRs) comprise a broad group of diverse DNA variants which vastly differ in their sizes and distributions across the genome. Here, we identify genomic features of SV classes and STRs that are associated with gene expression and complex traits, including their locations relative to eGenes, likelihood of being associated with multiple eGenes, associated eGene types (e.g., coding, noncoding, level of evolutionary constraint), effect sizes, linkage disequilibrium with tagging single nucleotide variants used in GWAS, and likelihood of being associated with GWAS traits. We identify a set of high-impact SVs/STRs associated with the expression of three or more eGenes via chromatin loops and show that they are highly enriched for being associated with GWAS traits. Our study provides insights into the genomic properties of structural variant classes and short tandem repeats that are associated with gene expression and human traits.

    View details for DOI 10.1038/s41467-020-16482-4

    View details for PubMedID 32522982

  • Transcriptional and Position Effect Contributions to rAAV-Mediated Gene Targeting Spector, L. P., Tiffany, M., Ferraro, N. M., Abell, N. S., Montgomery, S. B., Kay, M. A. CELL PRESS. 2020: 290
  • Molecular Choreography of Acute Exercise. Cell Contrepois, K., Wu, S., Moneghetti, K. J., Hornburg, D., Ahadi, S., Tsai, M. S., Metwally, A. A., Wei, E., Lee-McMullen, B., Quijada, J. V., Chen, S., Christle, J. W., Ellenberger, M., Balliu, B., Taylor, S., Durrant, M. G., Knowles, D. A., Choudhry, H., Ashland, M., Bahmani, A., Enslen, B., Amsallem, M., Kobayashi, Y., Avina, M., Perelman, D., Schüssler-Fiorenza Rose, S. M., Zhou, W., Ashley, E. A., Montgomery, S. B., Chaib, H., Haddad, F., Snyder, M. P. 2020; 181 (5): 1112–30.e16

    Abstract

    Acute physical activity leads to several changes in metabolic, cardiovascular, and immune pathways. Although studies have examined selected changes in these pathways, the system-wide molecular response to an acute bout of exercise has not been fully characterized. We performed longitudinal multi-omic profiling of plasma and peripheral blood mononuclear cells including metabolome, lipidome, immunome, proteome, and transcriptome from 36 well-characterized volunteers, before and after a controlled bout of symptom-limited exercise. Time-series analysis revealed thousands of molecular changes and an orchestrated choreography of biological processes involving energy metabolism, oxidative stress, inflammation, tissue repair, and growth factor response, as well as regulatory pathways. Most of these processes were dampened and some were reversed in insulin-resistant participants. Finally, we discovered biological pathways involved in cardiopulmonary exercise response and developed prediction models revealing potential resting blood-based biomarkers of peak oxygen consumption.

    View details for DOI 10.1016/j.cell.2020.04.043

    View details for PubMedID 32470399

  • Evaluating the genomic parameters governing rAAV-mediated homologous recombination. Molecular therapy : the journal of the American Society of Gene Therapy Spector, L. P., Tiffany, M., Ferraro, N. M., Abell, N. S., Montgomery, S. B., Kay, M. A. 2020

    Abstract

    Recombinant AAV vectors have the unique ability to promote targeted integration of transgenes via homologous recombination at specified genomic sites reaching frequencies of 0.1-1%. We studied genomic parameters that influence targeting efficiencies on a large scale. To do this, we generated more than 1000 engineered, doxycycline-inducible target sites in the human HAP1 cell line and infected this polyclonal population with a library of AAV-DJ targeting vectors each carrying a unique barcode. The heterogeneity of barcode integration at each target site provided an assessment of targeting efficiency at that locus. We compared targeting efficiency with and without target site transcription for identical chromosomal positions. Targeting efficiency was enhanced by target site transcription, while chromatin accessibility was associated with an increased likelihood of targeting. ChromHMM chromatin states characterizing transcription and enhancers in wildtype K562 cells were also associated with increased AAV-HR efficiency with and without target site transcription, respectively. Furthermore, the amenability of a site to targeting was influenced by the endogenous transcriptional level of intersecting genes. These results define important parameters that may not only assist in designing optimal targeting vectors for genome editing, but also provide new insights into the mechanism of AAV-mediated homologous recombination.

    View details for DOI 10.1016/j.ymthe.2020.11.025

    View details for PubMedID 33248247

  • The impact of sex on gene expression across human tissues. Science (New York, N.Y.) Oliva, M., Muñoz-Aguirre, M., Kim-Hellmuth, S., Wucher, V., Gewirtz, A. D., Cotter, D. J., Parsana, P., Kasela, S., Balliu, B., Viñuela, A., Castel, S. E., Mohammadi, P., Aguet, F., Zou, Y., Khramtsova, E. A., Skol, A. D., Garrido-Martín, D., Reverter, F., Brown, A., Evans, P., Gamazon, E. R., Payne, A., Bonazzola, R., Barbeira, A. N., Hamel, A. R., Martinez-Perez, A., Soria, J. M., Pierce, B. L., Stephens, M., Eskin, E., Dermitzakis, E. T., Segrè, A. V., Im, H. K., Engelhardt, B. E., Ardlie, K. G., Montgomery, S. B., Battle, A. J., Lappalainen, T., Guigó, R., Stranger, B. E. 2020; 369 (6509)

    Abstract

    Many complex human phenotypes exhibit sex-differentiated characteristics. However, the molecular mechanisms underlying these differences remain largely unknown. We generated a catalog of sex differences in gene expression and in the genetic regulation of gene expression across 44 human tissue sources surveyed by the Genotype-Tissue Expression project (GTEx, v8 release). We demonstrate that sex influences gene expression levels and cellular composition of tissue samples across the human body. A total of 37% of all genes exhibit sex-biased expression in at least one tissue. We identify cis expression quantitative trait loci (eQTLs) with sex-differentiated effects and characterize their cellular origin. By integrating sex-biased eQTLs with genome-wide association study data, we identify 58 gene-trait associations that are driven by genetic regulation of gene expression in a single sex. These findings provide an extensive characterization of sex differences in the human transcriptome and its genetic regulation.

    View details for DOI 10.1126/science.aba3066

    View details for PubMedID 32913072

  • Impact of admixture and ancestry on eQTL analysis and GWAS colocalization in GTEx. Genome biology Gay, N. R., Gloudemans, M., Antonio, M. L., Abell, N. S., Balliu, B., Park, Y., Martin, A. R., Musharoff, S., Rao, A. S., Aguet, F., Barbeira, A. N., Bonazzola, R., Hormozdiari, F., Ardlie, K. G., Brown, C. D., Im, H. K., Lappalainen, T., Wen, X., Montgomery, S. B. 2020; 21 (1): 233

    Abstract

    Population structure among study subjects may confound genetic association studies, and lack of proper correction can lead to spurious findings. The Genotype-Tissue Expression (GTEx) project largely contains individuals of European ancestry, but the v8 release also includes up to 15% of individuals of non-European ancestry. Assessing ancestry-based adjustments in GTEx improves portability of this research across populations and further characterizes the impact of population structure on GWAS colocalization. Here, we identify a subset of 117 individuals in GTEx (v8) with a high degree of population admixture and estimate genome-wide local ancestry. We perform genome-wide cis-eQTL mapping using admixed samples in seven tissues, adjusted by either global or local ancestry. Consistent with previous work, we observe improved power with local ancestry adjustment. At loci where the two adjustments produce different lead variants, we observe 31 loci (0.02%) where a significant colocalization is called only with one eQTL ancestry adjustment method. Notably, both adjustments produce similar numbers of significant colocalizations within each of two different colocalization methods, COLOC and FINEMAP. Finally, we identify a small subset of eQTL-associated variants highly correlated with local ancestry, providing a resource to enhance functional follow-up. We provide a local ancestry map for admixed individuals in the GTEx v8 release and describe the impact of ancestry and admixture on gene expression, eQTLs, and GWAS colocalization. While the majority of the results are concordant between local and global ancestry-based adjustments, we identify distinct advantages and disadvantages to each approach.

    View details for DOI 10.1186/s13059-020-02113-0

    View details for PubMedID 32912333

  • Transcriptomic signatures across human tissues identify functional rare genetic variation. Science (New York, N.Y.) Ferraro, N. M., Strober, B. J., Einson, J., Abell, N. S., Aguet, F., Barbeira, A. N., Brandt, M., Bucan, M., Castel, S. E., Davis, J. R., Greenwald, E., Hess, G. T., Hilliard, A. T., Kember, R. L., Kotis, B., Park, Y., Peloso, G., Ramdas, S., Scott, A. J., Smail, C., Tsang, E. K., Zekavat, S. M., Ziosi, M., Aradhana, Ardlie, K. G., Assimes, T. L., Bassik, M. C., Brown, C. D., Correa, A., Hall, I., Im, H. K., Li, X., Natarajan, P., Lappalainen, T., Mohammadi, P., Montgomery, S. B., Battle, A. 2020; 369 (6509)

    Abstract

    Rare genetic variants are abundant across the human genome, and identifying their function and phenotypic impact is a major challenge. Measuring aberrant gene expression has aided in identifying functional, large-effect rare variants (RVs). Here, we expanded detection of genetically driven transcriptome abnormalities by analyzing gene expression, allele-specific expression, and alternative splicing from multitissue RNA-sequencing data, and demonstrate that each signal informs unique classes of RVs. We developed Watershed, a probabilistic model that integrates multiple genomic and transcriptomic signals to predict variant function, validated these predictions in additional cohorts and through experimental assays, and used them to assess RVs in the UK Biobank, the Million Veterans Program, and the Jackson Heart Study. Our results link thousands of RVs to diverse molecular effects and provide evidence to associate RVs affecting the transcriptome with human traits.

    View details for DOI 10.1126/science.aaz5900

    View details for PubMedID 32913073

  • FAM13A affects body fat distribution and adipocyte function. Nature communications Fathzadeh, M., Li, J., Rao, A., Cook, N., Chennamsetty, I., Seldin, M., Zhou, X., Sangwung, P., Gloudemans, M. J., Keller, M., Attie, A., Yang, J., Wabitsch, M., Carcamo-Orive, I., Tada, Y., Lusis, A. J., Shin, M. K., Molony, C. M., McLaughlin, T., Reaven, G., Montgomery, S. B., Reilly, D., Quertermous, T., Ingelsson, E., Knowles, J. W. 2020; 11 (1): 1465

    Abstract

    Genetic variation in the FAM13A (Family with Sequence Similarity 13 Member A) locus has been associated with several glycemic and metabolic traits in genome-wide association studies (GWAS). Here, we demonstrate that in humans, FAM13A alleles are associated with increased FAM13A expression in subcutaneous adipose tissue (SAT) and an insulin resistance-related phenotype (e.g. higher waist-to-hip ratio and fasting insulin levels, but lower body fat). In human adipocyte models, knockdown of FAM13A in preadipocytes accelerates adipocyte differentiation. In mice, Fam13a knockout (KO) animals have a lower visceral to subcutaneous fat (VAT/SAT) ratio after high-fat diet challenge, in comparison to their wild-type counterparts. Subcutaneous adipocytes in KO mice show a size distribution shift toward an increased number of smaller adipocytes, along with an improved adipogenic potential. Our results indicate that GWAS-associated variants within the FAM13A locus alter adipose FAM13A expression, which in turn regulates adipocyte differentiation and contributes to changes in body fat distribution.

    View details for DOI 10.1038/s41467-020-15291-z

    View details for PubMedID 32193374

  • A Bioinformatic Analysis of Integrative Mobile Genetic Elements Highlights Their Role in Bacterial Adaptation. Cell host & microbe Durrant, M. G., Li, M. M., Siranosian, B. A., Montgomery, S. B., Bhatt, A. S. 2019

    Abstract

    Mobile genetic elements (MGEs) contribute to bacterial adaptation and evolution; however, high-throughput, unbiased MGE detection remains challenging. We describe MGEfinder, a bioinformatic toolbox that identifies integrative MGEs and their insertion sites by using short-read sequencing data. MGEfinder identifies the genomic site of each MGE insertion and infers the identity of the inserted sequence. We apply MGEfinder to 12,374 sequenced isolates of 9 prevalent bacterial pathogens, including Mycobacterium tuberculosis, Staphylococcus aureus, and Escherichia coli, and identify thousands of MGEs, including candidate insertion sequences, conjugative transposons, and prophage elements. The MGE repertoire and insertion rates vary across species, and integration sites often cluster near genes related to antibiotic resistance, virulence, and pathogenicity. MGE insertions likely contribute to antibiotic resistance in laboratory experiments and clinical isolates. Additionally, we identified thousands of mobility genes, a subset of which have unknown function, opening avenues for exploration. Future application of MGEfinder to commensal bacteria will further illuminate bacterial adaptation and evolution.

    View details for DOI 10.1016/j.chom.2019.10.022

    View details for PubMedID 31862382

  • Genetic regulation of gene expression and splicing during a 10-year period of human aging. Genome biology Balliu, B., Durrant, M., Goede, O. d., Abell, N., Li, X., Liu, B., Gloudemans, M. J., Cook, N. L., Smith, K. S., Knowles, D. A., Pala, M., Cucca, F., Schlessinger, D., Jaiswal, S., Sabatti, C., Lind, L., Ingelsson, E., Montgomery, S. B. 2019; 20 (1): 230

    Abstract

    BACKGROUND: Molecular and cellular changes are intrinsic to aging and age-related diseases. Prior cross-sectional studies have investigated the combined effects of age and genetics on gene expression and alternative splicing; however, there has been no long-term, longitudinal characterization of these molecular changes, especially in older age. RESULTS: We perform RNA sequencing in whole blood from the same individuals at ages 70 and 80 to quantify how gene expression, alternative splicing, and their genetic regulation are altered during this 10-year period of advanced aging at a population and individual level. We observe that individuals are more similar to their own expression profiles later in life than profiles of other individuals their own age. We identify 1291 and 294 genes differentially expressed and alternatively spliced with age, as well as 529 genes with outlying individual trajectories. Further, we observe a strong correlation of genetic effects on expression and splicing between the two ages, with a small subset of tested genes showing a reduction in genetic associations with expression and splicing in older age. CONCLUSIONS: These findings demonstrate that, although the transcriptome and its genetic regulation are mostly stable late in life, a small subset of genes is dynamic and is characterized by a reduction in genetic regulation, most likely due to increasing environmental variance with age.

    View details for DOI 10.1186/s13059-019-1840-y

    View details for PubMedID 31684996

  • COMPREHENSIVE RNA ANALYSIS OF CEREBROSPINAL FLUID FROM LEPTOMENINGEAL METASTASES Polyak, D., Li, Y., Liu, B., Connolly, I., Khoeur, L., Kakusa, B., Johnson, E., Andersen, S., Pan, W., Nagpal, S., Montgomery, S. B., Gephart, M. OXFORD UNIV PRESS INC. 2019: 62
  • Uganda Genome Resource Enables Insights into Population History and Genomic Discovery in Africa. Cell Gurdasani, D., Carstensen, T., Fatumo, S., Chen, G., Franklin, C. S., Prado-Martinez, J., Bouman, H., Abascal, F., Haber, M., Tachmazidou, I., Mathieson, I., Ekoru, K., DeGorter, M. K., Nsubuga, R. N., Finan, C., Wheeler, E., Chen, L., Cooper, D. N., Schiffels, S., Chen, Y., Ritchie, G. R., Pollard, M. O., Fortune, M. D., Mentzer, A. J., Garrison, E., Bergstrom, A., Hatzikotoulas, K., Adeyemo, A., Doumatey, A., Elding, H., Wain, L. V., Ehret, G., Auer, P. L., Kooperberg, C. L., Reiner, A. P., Franceschini, N., Maher, D. P., Montgomery, S. B., Kadie, C., Widmer, C., Xue, Y., Seeley, J., Asiki, G., Kamali, A., Young, E. H., Pomilla, C., Soranzo, N., Zeggini, E., Pirie, F., Morris, A. P., Heckerman, D., Tyler-Smith, C., Motala, A., Rotimi, C., Kaleebu, P., Barroso, I., Sandhu, M. S. 2019; 179 (4): 984

    Abstract

    Genomic studies in African populations provide unique opportunities to understand disease etiology, human diversity, and population history. In the largest study of its kind, comprising genome-wide data from 6,400 individuals and whole-genome sequences from 1,978 individuals from rural Uganda, we find evidence of geographically correlated fine-scale population substructure. Historically, the ancestry of modern Ugandans was best represented by a mixture of ancient East African pastoralists. We demonstrate the value of the largest sequence panel from Africa to date as an imputation resource. Examining 34 cardiometabolic traits, we show systematic differences in trait heritability between European and African populations, probably reflecting the differential impact of genes and environment. In a multi-trait pan-African GWAS of up to 14,126 individuals, we identify novel loci associated with anthropometric, hematological, lipid, and glycemic traits. We find that several functionally important signals are driven by Africa-specific variants, highlighting the value of studying diverse populations across the region.

    View details for DOI 10.1016/j.cell.2019.10.004

    View details for PubMedID 31675503

  • Atheroprotective roles of smooth muscle cell phenotypic modulation and the TCF21 disease gene as revealed by single-cell analysis. Nature medicine Wirka, R. C., Wagh, D., Paik, D. T., Pjanic, M., Nguyen, T., Miller, C. L., Kundu, R., Nagao, M., Coller, J., Koyano, T. K., Fong, R., Woo, Y. J., Liu, B., Montgomery, S. B., Wu, J. C., Zhu, K., Chang, R., Alamprese, M., Tallquist, M. D., Kim, J. B., Quertermous, T. 2019

    Abstract

    In response to various stimuli, vascular smooth muscle cells (SMCs) can de-differentiate, proliferate and migrate in a process known as phenotypic modulation. However, the phenotype of modulated SMCs in vivo during atherosclerosis and the influence of this process on coronary artery disease (CAD) risk have not been clearly established. Using single-cell RNA sequencing, we comprehensively characterized the transcriptomic phenotype of modulated SMCs in vivo in atherosclerotic lesions of both mouse and human arteries and found that these cells transform into unique fibroblast-like cells, termed 'fibromyocytes', rather than into a classical macrophage phenotype. SMC-specific knockout of TCF21, a causal CAD gene, markedly inhibited SMC phenotypic modulation in mice, leading to the presence of fewer fibromyocytes within lesions as well as within the protective fibrous cap of the lesions. Moreover, TCF21 expression was strongly associated with SMC phenotypic modulation in diseased human coronary arteries, and higher levels of TCF21 expression were associated with decreased CAD risk in human CAD-relevant tissues. These results establish a protective role for both TCF21 and SMC phenotypic modulation in this disease.

    View details for DOI 10.1038/s41591-019-0512-5

    View details for PubMedID 31359001

  • Identifying causal variants and genes using functional genomics in specialized cell types and contexts. Human genetics Liu, B., Montgomery, S. B. 2019

    Abstract

    A central goal in human genetics is the identification of variants and genes that influence the risk of polygenic diseases. In the past decade, genome-wide association studies (GWAS) have identified tens of thousands of genetic loci associated with various diseases. Since the majority of such loci lie within non-coding regions and have many candidate variants in linkage disequilibrium, it has been challenging to accurately identify specific causal variants and genes. To aid in their discovery a variety of statistical and experimental approaches have been developed. These approaches often borrow information from functional genomics assays such as ATAC-seq, ChIP-seq and RNA-seq to annotate functional variants and identify regulatory relationships between variants and genes. While such approaches are powerful, given the diversity of cell types and environments, it is paramount to select disease-relevant contexts for follow-up analyses. In this review, we discuss the latest developments, challenges, and best practices for determining the causal mechanisms of polygenic disease risk variants with functional genomics data from specialized cell types.

    View details for DOI 10.1007/s00439-019-02044-2

    View details for PubMedID 31317254

  • Disease mechanisms elucidated by genetic regulation of human RPE gene expression Vollrath, D., Liu, B., Calton, M. A., Abell, N. S., Benchorin, G., Gloudemans, M. J., Chen, M., Hu, J., Li, X., Balliu, B., Bok, D., Montgomery, S. B. ASSOC RESEARCH VISION OPHTHALMOLOGY INC. 2019
  • Genetic analyses of human fetal retinal pigment epithelium gene expression suggest ocular disease mechanisms. Communications biology Liu, B., Calton, M. A., Abell, N. S., Benchorin, G., Gloudemans, M. J., Chen, M., Hu, J., Li, X., Balliu, B., Bok, D., Montgomery, S. B., Vollrath, D. 2019; 2 (1): 186

    Abstract

    The retinal pigment epithelium (RPE) serves vital roles in ocular development and retinal homeostasis but has limited representation in large-scale functional genomics datasets. Understanding how common human genetic variants affect RPE gene expression could elucidate the sources of phenotypic variability in selected monogenic ocular diseases and pinpoint causal genes at genome-wide association study (GWAS) loci. We interrogated the genetics of gene expression of cultured human fetal RPE (fRPE) cells under two metabolic conditions and discovered hundreds of shared or condition-specific expression or splice quantitative trait loci (e/sQTLs). Co-localizations of fRPE e/sQTLs with age-related macular degeneration (AMD) and myopia GWAS data suggest new candidate genes, and mechanisms by which a common RDH5 allele contributes to both increased AMD risk and decreased myopia risk. Our study highlights the unique transcriptomic characteristics of fRPE and provides a resource to connect e/sQTLs in a critical ocular cell type to monogenic and complex eye disorders.

    View details for DOI 10.1038/s42003-019-0430-6

    View details for PubMedID 31925026

  • Identification of 22 novel loci associated with urinary biomarkers of albumin, sodium, and potassium excretion KIDNEY INTERNATIONAL Zanetti, D., Rao, A., Gustafsson, S., Assimes, T. L., Montgomery, S. B., Ingelsson, E. 2019; 95 (5): 1197–1208
  • Abundant associations with gene expression complicate GWAS follow-up NATURE GENETICS Liu, B., Gloudemans, M. J., Rao, A. S., Ingelsson, E., Montgomery, S. B. 2019; 51 (5): 768–69
  • Transcriptional and Position Effect Contributions to rAAV-Mediated Gene Targeting Spector, L. P., Tiffany, M., Ferraro, N. M., Abell, N. S., Montgomery, S. B., Kay, M. A. CELL PRESS. 2019: 294
  • A toolkit for genetics providers in follow-up of patients with non-diagnostic exome sequencing JOURNAL OF GENETIC COUNSELING Zastrow, D. B., Kohler, J. N., Bonner, D., Reuter, C. M., Fernandez, L., Grove, M. E., Fisk, D. G., Yang, Y., Eng, C. M., Ward, P. A., Bick, D., Worthey, E. A., Fisher, P. G., Ashley, E. A., Bernstein, J. A., Wheeler, M. T., Adams, D. R., Aday, A., Alejandro, M. E., Allard, P., Ashley, E. A., Azamian, M. S., Bacino, C. A., Baker, E., Balasubramanyam, A., Barseghyan, H., Batzli, G. F., Beggs, A. H., Behnam, B., Bellen, H. J., Bernstein, J. A., Bican, A., Bick, D. P., Birch, C. L., Boone, B. E., Bostwick, B. L., Briere, L. C., Brokamp, E., Brown, D. M., Brush, M., Burke, E. A., Burrage, L. C., Butte, M. J., Chen, S., Clark, G. D., Coakley, T. R., Cogan, J. D., Colley, H. A., Cooper, C. M., Cope, H., Craigen, W. J., D'Souza, P., Davids, M., Dayal, J. G., Dell'Angelica, E. C., Dhar, S. U., Dipple, K. M., Donnell-Fink, L. A., Dorrani, N., Dorset, D. C., Douine, E. D., Draper, D. D., Dries, A. M., Eckstein, D. J., Emrick, L. T., Eng, C. M., Enns, G. M., Eskin, A., Esteves, C., Estwick, T., Fairbrother, L., Ferreira, C., Fieg, E. L., Fisher, P. G., Fogel, B. L., Gahl, W. A., Glanton, E., Godfrey, R. A., Goldman, A. M., Goldstein, D. B., Gould, S. E., Gourdine, J. F., Groden, C. A., Gropman, A. L., Haendel, M., Hamid, R., Hanchard, N. A., High, F., Holm, I. A., Hom, J., Howerton, E. M., Huang, Y., Jamal, F., Jiang, Y., Johnston, J. M., Jones, A. L., Karaviti, L., Koeller, D. M., Kohane, I. S., Krasnewich, D. M., Korrick, S., Koziura, M., Krier, J. B., Kyle, J. E., Lalani, S. R., Lau, C., Lazar, J., LeBlanc, K., Lee, B. H., Lee, H., Levy, S. E., Lewis, R. A., Lincoln, S. A., Loo, S. K., Loscalzo, J., Maas, R. L., Macnamara, E. F., MacRae, C. A., Maduro, V. V., Majcherska, M. M., Malicdan, M. V., Mamounas, L. A., Manolio, T. A., Markello, T. C., Marom, R., Martin, G., Martinez-Agosto, J. A., Marwaha, S., May, T., McConkie-Rosell, A., McCormack, C. E., McCray, A. T., Merker, J. D., Metz, T. O., Might, M., Moretti, P. M., Morimoto, M., Nehrebecky, M. E., Nelson, S. F., Newberry, J., Newman, J. H., Nicholas, S. K., Novacic, D., Orange, J. S., Orengo, J. P., Pallais, J., Palmer, C. S., Papp, J. C., Postlethwait, J. H., Potocki, L., Pusey, B. N., Rives, L., Robertson, A. K., Rodan, L. H., Rosenfeld, J. A., Sampson, J. B., Samson, S. L., Schoch, K., Scott, D. A., Shakachite, L., Sharma, P., Shashi, V., Signer, R., Silverman, E. K., Sinsheimer, J. S., Smith, K. S., Spillmann, R. C., Stoler, J. M., Stong, N., Sullivan, J. A., Sweetser, D. A., Tan, Q., Tifft, C. J., Toro, C., Tran, A. A., Urv, T. K., Vilain, E., Vogel, T. P., Waggott, D. M., Wahl, C. E., Walley, N. M., Walsh, C. A., Walker, M., Wan, J., Wangler, M. F., Ward, P. A., Waters, K. M., Webb-Robertson, B. M., Westerfield, M., Wheeler, M. T., Wise, A. L., Wolfe, L. A., Worthey, E. A., Yamamoto, S., Yang, J., Yang, Y., Yoon, A. J., Yu, G., Zhao, C., Zheng, A., Undiagnosed Dis Network 2019; 28 (2): 213–28

    View details for DOI 10.1002/jgc4.1119

    View details for Web of Science ID 000463993600005

  • Proficiency Testing of Standardized Samples Shows Very High Interlaboratory Agreement for Clinical Next-Generation Sequencing-Based Oncology Assays ARCHIVES OF PATHOLOGY & LABORATORY MEDICINE Merker, J. D., Devereaux, K., Iafrate, A., Kamel-Reid, S., Kim, A. S., Moncur, J. T., Montgomery, S. B., Nagarajan, R., Portier, B. P., Routbort, M. J., Smail, C., Surrey, L. F., Vasalos, P., Lazar, A. J., Lindeman, N. 2019; 143 (4): 463–71
  • Identification of 22 novel loci associated with urinary biomarkers of albumin, sodium, and potassium excretion. Kidney international Zanetti, D., Rao, A., Gustafsson, S., Assimes, T. L., Montgomery, S. B., Ingelsson, E. 2019

    Abstract

    Urine biomarkers reflecting kidney function and handling of dietary sodium and potassium are strongly associated with several common diseases including chronic kidney disease, cardiovascular disease, and diabetes mellitus. Knowledge about the genetic determinants of these biomarkers may shed light on pathophysiological mechanisms underlying the development of these diseases. We performed genome-wide association studies of urinary albumin:creatinine ratio (UACR), urinary potassium:creatinine ratio (UK/UCr), urinary sodium:creatinine ratio (UNa/UCr) and urinary sodium:potassium ratio (UNa/UK) in up to 218,450 (discovery) and 109,166 (replication) unrelated individuals of European ancestry from the UK Biobank. Further, we explored genetic correlations, tissue-specific gene expression, and possible genes implicated in the regulation of these biomarkers. After replication, we identified 19 genome-wide significant independent loci associated with UACR, 6 each with UK/UCr and UNa/UCr, and 4 with UNa/UK. In addition to 22 novel associations, we confirmed several established associations, including between the CUBN locus and microalbuminuria. We detected high pairwise genetic correlation across the urinary biomarkers, and between their levels and several physiological measurements. We highlight GIPR, a potential diabetes drug target, as possibly implicated in the genetic control of urinary potassium excretion, and NRBP1, a locus associated with gout, as plausibly involved in sodium and albumin excretion. Overall, we identified 22 novel genome-wide significant associations with urinary biomarkers and confirmed several previously established associations, providing new insights into the genetic basis of these traits and their connection to chronic diseases.

    View details for PubMedID 30910378

  • Identification of rare-disease genes using blood transcriptome sequencing and large control cohorts. Nature medicine Frésard, L., Smail, C., Ferraro, N. M., Teran, N. A., Li, X., Smith, K. S., Bonner, D., Kernohan, K. D., Marwaha, S., Zappala, Z., Balliu, B., Davis, J. R., Liu, B., Prybol, C. J., Kohler, J. N., Zastrow, D. B., Reuter, C. M., Fisk, D. G., Grove, M. E., Davidson, J. M., Hartley, T., Joshi, R., Strober, B. J., Utiramerur, S., Lind, L., Ingelsson, E., Battle, A., Bejerano, G., Bernstein, J. A., Ashley, E. A., Boycott, K. M., Merker, J. D., Wheeler, M. T., Montgomery, S. B. 2019

    Abstract

    It is estimated that 350 million individuals worldwide suffer from rare diseases, which are predominantly caused by mutation in a single gene [1]. The current molecular diagnostic rate is estimated at 50%, with whole-exome sequencing (WES) among the most successful approaches [2-5]. For patients in whom WES is uninformative, RNA sequencing (RNA-seq) has shown diagnostic utility in specific tissues and diseases [6-8]. This includes muscle biopsies from patients with undiagnosed rare muscle disorders [6,9], and cultured fibroblasts from patients with mitochondrial disorders [7]. However, for many individuals, biopsies are not performed for clinical care, and tissues are difficult to access. We sought to assess the utility of RNA-seq from blood as a diagnostic tool for rare diseases of different pathophysiologies. We generated whole-blood RNA-seq from 94 individuals with undiagnosed rare diseases spanning 16 diverse disease categories. We developed a robust approach to compare data from these individuals with large sets of RNA-seq data for controls (n = 1,594 unrelated controls and n = 49 family members) and demonstrated the impacts of expression, splicing, gene and variant filtering strategies on disease gene identification. Across our cohort, we observed that RNA-seq yields a 7.5% diagnostic rate, and an additional 16.7% with improved candidate gene resolution.

    View details for DOI 10.1038/s41591-019-0457-8

    View details for PubMedID 31160820

  • SEX DIFFERENCES AT THE MOLECULAR LEVEL: LESSONS FROM THE HUMAN TRANSCRIPTOME Stranger, B., Oliva, M., Gamazon, E., Reverter, F., Wucher, V., Balliu, B., Dumitrascu, B., Parsana, P., Payne, A., Jo, B., Montgomery, S., Battle, A., Ardlie, K., Guigo, R., Engelhardt, B. ELSEVIER. 2019: 1034
  • Abundant associations with gene expression complicate GWAS follow-up. Nature genetics Liu, B., Gloudemans, M. J., Rao, A. S., Ingelsson, E., Montgomery, S. B. 2019; 51 (5): 768–69

    View details for PubMedID 31043754

  • A toolkit for genetics providers in follow-up of patients with non-diagnostic exome sequencing. Journal of genetic counseling Zastrow, D. B., Kohler, J. N., Bonner, D., Reuter, C. M., Fernandez, L., Grove, M. E., Fisk, D. G., Yang, Y., Eng, C. M., Ward, P. A., Bick, D., Worthey, E. A., Fisher, P. G., Ashley, E. A., Bernstein, J. A., Wheeler, M. T. 2019; 28 (2): 213–28

    Abstract

    There are approximately 7,000 rare diseases affecting 25-30 million Americans, with 80% estimated to have a genetic basis. This presents a challenge for genetics practitioners to determine appropriate testing, make accurate diagnoses, and conduct up-to-date patient management. Exome sequencing (ES) is a comprehensive diagnostic approach, but only 25%-41% of the patients receive a molecular diagnosis. The remaining three-fifths to three-quarters of patients undergoing ES remain undiagnosed. The Stanford Center for Undiagnosed Diseases (CUD), a clinical site of the Undiagnosed Diseases Network, evaluates patients with undiagnosed and rare diseases using a combination of methods including ES. Frequently these patients have non-diagnostic ES results, but strategic follow-up techniques identify diagnoses in a subset. We present techniques used at the CUD that can be adopted by genetics providers in clinical follow-up of cases where ES is non-diagnostic. Solved case examples illustrate different types of non-diagnostic results and the additional techniques that led to a diagnosis. Frequent approaches include segregation analysis, data reanalysis, genome sequencing, additional variant identification, careful phenotype-disease correlation, confirmatory testing, and case matching. We also discuss prioritization of cases for additional analyses.

    View details for PubMedID 30964584

  • Genetic analyses of human fetal retinal pigment epithelium gene expression suggest ocular disease mechanisms. Communications biology Liu, B., Calton, M. A., Abell, N. S., Benchorin, G., Gloudemans, M. J., Chen, M., Hu, J., Li, X., Balliu, B., Bok, D., Montgomery, S. B., Vollrath, D. 2019; 2: 186

    View details for DOI 10.1038/s42003-019-0430-6

    View details for PubMedID 31123710

  • Pathologic gene network rewiring implicates PPP1R3A as a central regulator in pressure overload heart failure. Nature communications Cordero, P., Parikh, V. N., Chin, E. T., Erbilgin, A., Gloudemans, M. J., Shang, C., Huang, Y., Chang, A. C., Smith, K. S., Dewey, F., Zaleta, K., Morley, M., Brandimarto, J., Glazer, N., Waggott, D., Pavlovic, A., Zhao, M., Moravec, C. S., Tang, W. H., Skreen, J., Malloy, C., Hannenhalli, S., Li, H., Ritter, S., Li, M., Bernstein, D., Connolly, A., Hakonarson, H., Lusis, A. J., Margulies, K. B., Depaoli-Roach, A. A., Montgomery, S. B., Wheeler, M. T., Cappola, T., Ashley, E. A. 2019; 10 (1): 2760

    Abstract

    Heart failure is a leading cause of mortality, yet our understanding of the genetic interactions underlying this disease remains incomplete. Here, we harvest 1352 healthy and failing human hearts directly from transplant center operating rooms, and obtain genome-wide genotyping and gene expression measurements for a subset of 313. We build failing and non-failing cardiac regulatory gene networks, revealing important regulators and cardiac expression quantitative trait loci (eQTLs). PPP1R3A emerges as a regulator whose network connectivity changes significantly between health and disease. RNA sequencing after PPP1R3A knockdown validates network-based predictions, and highlights metabolic pathway regulation associated with increased cardiomyocyte size and perturbed respiratory metabolism. Mice lacking PPP1R3A are protected against pressure-overload heart failure. We present a global gene interaction map of the human heart failure transition, identify previously unreported cardiac eQTLs, and demonstrate the discovery potential of disease-specific networks through the description of PPP1R3A as a central regulator in heart failure.

    View details for DOI 10.1038/s41467-019-10591-5

    View details for PubMedID 31235787

  • Diagnosing rare diseases after the exome. Cold Spring Harbor molecular case studies Fresard, L., Montgomery, S. B. 2018; 4 (6)

    Abstract

    High-throughput sequencing has ushered in a diversity of approaches for identifying genetic variants and understanding genome structure and function. When applied to individuals with rare genetic diseases, these approaches have greatly accelerated gene discovery and patient diagnosis. Over the past decade, exome sequencing has emerged as a comprehensive and cost-effective approach to identify pathogenic variants in the protein-coding regions of the genome. However, for individuals in whom exome-sequencing fails to identify a pathogenic variant, we discuss recent advances that are helping to reduce the diagnostic gap.

    View details for PubMedID 30559314

  • Proficiency Testing of Standardized Samples Shows Very High Interlaboratory Agreement for Clinical Next-Generation Sequencing-Based Oncology Assays. Archives of pathology & laboratory medicine Merker, J. D., Devereaux, K., Iafrate, A. J., Kamel-Reid, S., Kim, A. S., Moncur, J. T., Montgomery, S. B., Nagarajan, R., Portier, B. P., Routbort, M. J., Smail, C., Surrey, L. F., Vasalos, P., Lazar, A. J., Lindeman, N. I. 2018

    Abstract

    CONTEXT: Next-generation sequencing-based assays are being increasingly used in the clinical setting for the detection of somatic variants in solid tumors, but limited data are available regarding the interlaboratory performance of these assays. OBJECTIVE: To examine proficiency testing data from the initial College of American Pathologists (CAP) Next-Generation Sequencing Solid Tumor survey to report on laboratory performance. DESIGN: CAP proficiency testing results from 111 laboratories were analyzed for accuracy and associated assay performance characteristics. RESULTS: The overall accuracy observed for all variants was 98.3%. Rare false-negative results could not be attributed to sequencing platform, selection method, or other assay characteristics. The median and average of the variant allele fractions reported by the laboratories were within 10% of those orthogonally determined by digital polymerase chain reaction for each variant. The median coverage reported at the variant sites ranged from 1922 to 3297. CONCLUSIONS: Laboratories demonstrated an overall accuracy of greater than 98% with high specificity when examining 10 clinically relevant somatic single-nucleotide variants with a variant allele fraction of 15% or greater. These initial data suggest excellent performance, but further ongoing studies are needed to evaluate the performance of lower variant allele fractions and additional variant types.

    View details for PubMedID 30376374

  • Large-Scale Phenome-Wide Association Study of PCSK9 Variants Demonstrates Protection Against Ischemic Stroke CIRCULATION-GENOMIC AND PRECISION MEDICINE Rao, A. S., Lindholm, D., Rivas, M. A., Knowles, J. W., Montgomery, S. B., Ingelsson, E. 2018; 11 (7): e002162

    Abstract

    PCSK9 inhibition is a potent new therapy for hypercholesterolemia and cardiovascular disease. Although short-term clinical trial results have not demonstrated major adverse effects, long-term data will not be available for some time. Genetic studies in large biobanks offer a unique opportunity to predict drug effects and provide context for the evaluation of future clinical trial outcomes. We tested the association of the PCSK9 missense variant rs11591147 with predefined phenotypes and phenome-wide, in 337,536 individuals of British ancestry in the UK Biobank, with independent discovery and replication. Using a Bayesian statistical method, we leveraged phenotype correlations to evaluate the phenome-wide impact of PCSK9 inhibition with higher power at a finer resolution. The T allele of rs11591147 showed a protective effect on hyperlipidemia (odds ratio, 0.63±0.04; P=2.32×10⁻³⁸), coronary heart disease (odds ratio, 0.73±0.09; P=1.05×10⁻⁶), and ischemic stroke (odds ratio, 0.61±0.18; P=2.40×10⁻³) and was associated with increased type 2 diabetes mellitus risk adjusted for lipid-lowering medication status (odds ratio, 1.24±0.10; P=1.98×10⁻⁷). We did not observe associations with cataracts, heart failure, atrial fibrillation, and cognitive dysfunction. Leveraging phenotype correlations, we observed evidence of a protective association with cerebral infarction and vascular occlusion. These results explore the effects of direct PCSK9 inhibition; off-target effects cannot be predicted using this approach. This result represents the first genetic evidence in a large cohort for the protective effect of PCSK9 inhibition on ischemic stroke and corroborates exploratory evidence from clinical trials. PCSK9 inhibition was not associated with variables other than those related to LDL (low-density lipoprotein) cholesterol, atherosclerosis, and type 2 diabetes mellitus, suggesting that other effects are either small or absent.

    View details for PubMedID 29997226

  • Ubiquitination of ABCE1 by NOT4 in Response to Mitochondrial Damage Links Co-translational Quality Control to PINK1-Directed Mitophagy. Cell metabolism Wu, Z., Wang, Y., Lim, J., Liu, B., Li, Y., Vartak, R., Stankiewicz, T., Montgomery, S., Lu, B. 2018

    Abstract

    Translation of mRNAs is tightly regulated and constantly surveyed for errors. Aberrant translation can trigger co-translational protein and RNA quality control processes, impairments of which cause neurodegeneration by still poorly understood mechanism(s). Here we show that quality control of translation of mitochondrial outer membrane (MOM)-localized mRNA intersects with the turnover of damaged mitochondria, both orchestrated by the mitochondrial kinase PINK1. Mitochondrial damage causes stalled translation of complex-I 30 kDa subunit (C-I30) mRNA on MOM, triggering the recruitment of co-translational quality control factors Pelo, ABCE1, and NOT4 to the ribosome/mRNA-ribonucleoprotein complex. Damage-induced ubiquitination of ABCE1 by NOT4 generates poly-ubiquitin signals that attract autophagy receptors to MOM to initiate mitophagy. In the Drosophila PINK1 model, these factors act synergistically to restore mitophagy and neuromuscular tissue integrity. Thus ribosome-associated co-translational quality control generates an early signal to trigger mitophagy. Our results have broad therapeutic implications for the understanding and treatment of neurodegenerative diseases.

    View details for PubMedID 29861391

  • Recurrently Mutated Genes Differ between Leptomeningeal and Solid Lung Cancer Brain Metastases. Journal of thoracic oncology : official publication of the International Association for the Study of Lung Cancer Li, Y., Liu, B., Connolly, I. D., Kakusa, B. W., Pan, W., Nagpal, S., Montgomery, S. B., Hayden Gephart, M. 2018

    Abstract

    When compared with solid brain metastases from NSCLC, leptomeningeal disease (LMD) has unique growth patterns and is rapidly fatal. Patients with LMD do not undergo surgical resection, limiting the tissue available for scientific research. In this study we performed whole exome sequencing on eight samples of LMD to identify somatic mutations and compared the results with those for 26 solid brain metastases. We found that taste 2 receptor member 31 gene (TAS2R31) and phosphodiesterase 4D interacting protein gene (PDE4DIP) were recurrently mutated among LMD samples, suggesting involvement in LMD progression. Together with a retrospective review of the charts of an additional 44 patients with NSCLC LMD, we discovered a surprisingly low number of KRAS mutations (n = 4 [7.7%]) but a high number of EGFR mutations (n = 33 [63.5%]). The median interval for development of LMD from NSCLC was shorter in patients with mutant EGFR (16.3 months) than in patients with wild-type EGFR (23.9 months) (p = 0.017). Targeted analysis of recurrent mutations thus presents a useful complement to the existing diagnostic tool kit, and correlations of EGFR in LMD and KRAS in solid metastases suggest that molecular distinctions or systemic treatment pressure underpin the differences in growth patterns within the brain.

    View details for PubMedID 29604399

  • Biallelic Mutations in ATP5F1D, which Encodes a Subunit of ATP Synthase, Cause a Metabolic Disorder AMERICAN JOURNAL OF HUMAN GENETICS Olahova, M., Yoon, W., Thompson, K., Jangam, S., Fernandez, L., Davidson, J. M., Kyle, J. E., Grove, M. E., Fisk, D. G., Kohler, J. N., Holmes, M., Dries, A. M., Huang, Y., Zhao, C., Contrepois, K., Zappala, Z., Fresard, L., Waggott, D., Zink, E. M., Kim, Y., Heyman, H. M., Stratton, K. G., Webb-Robertson, B. M., Snyder, M., Merker, J. D., Montgomery, S. B., Fisher, P. G., Feichtinger, R. G., Mayr, J. A., Hall, J., Barbosa, I. A., Simpson, M. A., Deshpande, C., Waters, K. M., Koeller, D. M., Metz, T. O., Morris, A. A., Schelley, S., Cowan, T., Friederich, M. W., McFarland, R., Van Hove, J. K., Enns, G. M., Yamamoto, S., Ashley, E. A., Wangler, M. F., Taylor, R. W., Bellen, H. J., Bernstein, J. A., Wheeler, M. T., Undiagnosed Diseases Network 2018; 102 (3): 494–504

    Abstract

    ATP synthase, H+ transporting, mitochondrial F1 complex, δ subunit (ATP5F1D; formerly ATP5D) is a subunit of mitochondrial ATP synthase and plays an important role in coupling proton translocation and ATP production. Here, we describe two individuals, each with homozygous missense variants in ATP5F1D, who presented with episodic lethargy, metabolic acidosis, 3-methylglutaconic aciduria, and hyperammonemia. Subject 1, homozygous for c.245C>T (p.Pro82Leu), presented with recurrent metabolic decompensation starting in the neonatal period, and subject 2, homozygous for c.317T>G (p.Val106Gly), presented with acute encephalopathy in childhood. Cultured skin fibroblasts from these individuals exhibited impaired assembly of F1FO ATP synthase and subsequent reduced complex V activity. Cells from subject 1 also exhibited a significant decrease in mitochondrial cristae. Knockdown of Drosophila ATPsynδ, the ATP5F1D homolog, in developing eyes and brains caused a near complete loss of the fly head, a phenotype that was fully rescued by wild-type human ATP5F1D. In contrast, expression of the ATP5F1D c.245C>T and c.317T>G variants rescued the head-size phenotype but recapitulated the eye and antennae defects seen in other genetic models of mitochondrial oxidative phosphorylation deficiency. Our data establish c.245C>T (p.Pro82Leu) and c.317T>G (p.Val106Gly) in ATP5F1D as pathogenic variants leading to a Mendelian mitochondrial disease featuring episodic metabolic decompensation.

    View details for PubMedID 29478781

  • Genetic Regulatory Mechanisms of Smooth Muscle Cells Map to Coronary Artery Disease Risk Loci. American journal of human genetics Liu, B., Pjanic, M., Wang, T., Nguyen, T., Gloudemans, M., Rao, A., Castano, V. G., Nurnberg, S., Rader, D. J., Elwyn, S., Ingelsson, E., Montgomery, S. B., Miller, C. L., Quertermous, T. 2018

    Abstract

    Coronary artery disease (CAD) is the leading cause of death globally. Genome-wide association studies (GWASs) have identified more than 95 independent loci that influence CAD risk, most of which reside in non-coding regions of the genome. To interpret these loci, we generated transcriptome and whole-genome datasets using human coronary artery smooth muscle cells (HCASMCs) from 52 unrelated donors, as well as epigenomic datasets using ATAC-seq on a subset of 8 donors. Through systematic comparison with publicly available datasets from GTEx and ENCODE projects, we identified transcriptomic, epigenetic, and genetic regulatory mechanisms specific to HCASMCs. We assessed the relevance of HCASMCs to CAD risk using transcriptomic and epigenomic level analyses. By jointly modeling eQTL and GWAS datasets, we identified five genes (SIPA1, TCF21, SMAD3, FES, and PDGFRA) that may modulate CAD risk through HCASMCs, all of which have relevant functional roles in vascular remodeling. Comparison with GTEx data suggests that SIPA1 and PDGFRA influence CAD risk predominantly through HCASMCs, while other annotated genes may have multiple cell and tissue targets. Together, these results provide tissue-specific and mechanistic insights into the regulation of a critical vascular cell type associated with CAD in human populations.

    View details for PubMedID 30146127

  • Functional regulatory mechanism of smooth muscle cell-restricted LMOD1 coronary artery disease locus. PLoS genetics Nanda, V., Wang, T., Pjanic, M., Liu, B., Nguyen, T., Matic, L. P., Hedin, U., Koplev, S., Ma, L., Franzén, O., Ruusalepp, A., Schadt, E. E., Björkegren, J. L., Montgomery, S. B., Snyder, M. P., Quertermous, T., Leeper, N. J., Miller, C. L. 2018; 14 (11): e1007755

    Abstract

    Recent genome-wide association studies (GWAS) have identified multiple new loci which appear to alter coronary artery disease (CAD) risk via arterial wall-specific mechanisms. One of the annotated genes encodes LMOD1 (Leiomodin 1), a member of the actin filament nucleator family that is highly enriched in smooth muscle-containing tissues such as the artery wall. However, it is still unknown whether LMOD1 is the causal gene at this locus and also how the associated variants alter LMOD1 expression/function and CAD risk. Using epigenomic profiling we recently identified a non-coding regulatory variant, rs34091558, which is in tight linkage disequilibrium (LD) with the lead CAD GWAS variant, rs2820315. Herein we demonstrate, through expression quantitative trait loci (eQTL) and statistical fine-mapping in GTEx, STARNET, and human coronary artery smooth muscle cell (HCASMC) datasets, that rs34091558 is the top regulatory variant for LMOD1 in vascular tissues. Position weight matrix (PWM) analyses identify the protective allele rs34091558-TA to form a conserved Forkhead box O3 (FOXO3) binding motif, which is disrupted by the risk allele rs34091558-A. FOXO3 chromatin immunoprecipitation and reporter assays show reduced FOXO3 binding and LMOD1 transcriptional activity by the risk allele, consistent with effects of FOXO3 downregulation on LMOD1. LMOD1 knockdown results in increased proliferation and migration and decreased cell contraction in HCASMC, and immunostaining in atherosclerotic lesions in the SMC lineage tracing reporter mouse support a key role for LMOD1 in maintaining the differentiated SMC phenotype. These results provide compelling functional evidence that genetic variation is associated with dysregulated LMOD1 expression/function in SMCs, together contributing to the heritable risk for CAD.

    View details for PubMedID 30444878

  • Allele-specific expression reveals interactions between genetic variation and environment. Nature methods Knowles, D. A., Davis, J. R., Edgington, H., Raj, A., Favé, M., Zhu, X., Potash, J. B., Weissman, M. M., Shi, J., Levinson, D. F., Awadalla, P., Mostafavi, S., Montgomery, S. B., Battle, A. 2017

    Abstract

    Identifying interactions between genetics and the environment (GxE) remains challenging. We have developed EAGLE, a hierarchical Bayesian model for identifying GxE interactions based on associations between environmental variables and allele-specific expression. Combining whole-blood RNA-seq with extensive environmental annotations collected from 922 human individuals, we identified 35 GxE interactions, compared with only four using standard GxE interaction testing. EAGLE provides new opportunities for researchers to identify GxE interactions using functional genomic data.

    View details for DOI 10.1038/nmeth.4298

    View details for PubMedID 28530654

  • Population- and individual-specific regulatory variation in Sardinia NATURE GENETICS Pala, M., Zappala, Z., Marongiu, M., Li, X., Davis, J. R., Cusano, R., Crobu, F., Kukurba, K. R., Gloudemans, M. J., Reinier, F., Berutti, R., Piras, M. G., Mulas, A., Zoledziewska, M., Marongiu, M., Sorokin, E. P., Hess, G. T., Smith, K. S., Busonero, F., Maschio, A., Steri, M., Sidore, C., Sanna, S., Fiorillo, E., Bassik, M. C., Sawcer, S. J., Battle, A., Novembre, J., Jones, C., Angius, A., Abecasis, G. R., Schlessinger, D., Cucca, F., Montgomery, S. B. 2017; 49 (5): 700-?

    Abstract

    Genetic studies of complex traits have mainly identified associations with noncoding variants. To further determine the contribution of regulatory variation, we combined whole-genome and transcriptome data for 624 individuals from Sardinia to identify common and rare variants that influence gene expression and splicing. We identified 21,183 expression quantitative trait loci (eQTLs) and 6,768 splicing quantitative trait loci (sQTLs), including 619 new QTLs. We identified high-frequency QTLs and found evidence of selection near genes involved in malarial resistance and increased multiple sclerosis risk, reflecting the epidemiological history of Sardinia. Using family relationships, we identified 809 segregating expression outliers (median z score of 2.97), averaging 13.3 genes per individual. Outlier genes were enriched for proximal rare variants, providing a new approach to study large-effect regulatory variants and their relevance to traits. Our results provide insight into the effects of regulatory variants and their relationship to population history and individual genetic risk.

    View details for DOI 10.1038/ng.3840

    View details for Web of Science ID 000400051400010

    View details for PubMedID 28394350

  • The impact of structural variation on human gene expression NATURE GENETICS Chiang, C., Scott, A. J., Davis, J. R., Tsang, E. K., Li, X., Kim, Y., Hadzic, T., Damani, F. N., Ganel, L., Montgomery, S. B., Battle, A., Conrad, D. F., Hall, I. M. 2017; 49 (5): 692-?

    Abstract

    Structural variants (SVs) are an important source of human genetic diversity, but their contribution to traits, disease and gene regulation remains unclear. We mapped cis expression quantitative trait loci (eQTLs) in 13 tissues via joint analysis of SVs, single-nucleotide variants (SNVs) and short insertion/deletion (indel) variants from deep whole-genome sequencing (WGS). We estimated that SVs are causal at 3.5-6.8% of eQTLs, a substantially higher fraction than prior estimates, and that expression-altering SVs have larger effect sizes than do SNVs and indels. We identified 789 putative causal SVs predicted to directly alter gene expression: most (88.3%) were noncoding variants enriched at enhancers and other regulatory elements, and 52 were linked to genome-wide association study loci. We observed a notable abundance of rare high-impact SVs associated with aberrant expression of nearby genes. These results suggest that comprehensive WGS-based SV analyses will increase the power of common- and rare-variant association studies.

    View details for DOI 10.1038/ng.3834

    View details for PubMedID 28369037

  • Overexpression of the Cytokine BAFF and Autoimmunity Risk NEW ENGLAND JOURNAL OF MEDICINE Steri, M., Orrù, V., Idda, M. L., Pitzalis, M., Pala, M., Zara, I., Sidore, C., Faà, V., Floris, M., Deiana, M., Asunis, I., Porcu, E., Mulas, A., Piras, M. G., Lobina, M., Lai, S., Marongiu, M., Serra, V., Marongiu, M., Sole, G., Busonero, F., Maschio, A., Cusano, R., Cuccuru, G., Deidda, F., Poddie, F., Farina, G., Dei, M., Virdis, F., Olla, S., Satta, M. A., Pani, M., Delitala, A., Cocco, E., Frau, J., Coghe, G., Lorefice, L., Fenu, G., Ferrigno, P., Ban, M., Barizzone, N., Leone, M., Guerini, F. R., Piga, M., Firinu, D., Kockum, I., Bomfim, I. L., Olsson, T., Alfredsson, L., Suarez, A., Carreira, P. E., Castillo-Palma, M. J., Marcus, J. H., Congia, M., Angius, A., Melis, M., Gonzalez, A., Riquelme, M. E., da Silva, B. M., Marchini, M., Danieli, M. G., Del Giacco, S., Mathieu, A., Pani, A., Montgomery, S. B., Rosati, G., Hillert, J., Sawcer, S., D'Alfonso, S., Todd, J. A., Novembre, J., Abecasis, G. R., Whalen, M. B., Marrosu, M. G., Meloni, A., Sanna, S., Gorospe, M., Schlessinger, D., Fiorillo, E., Zoledziewska, M., Cucca, F. 2017; 376 (17): 1615-1626

    Abstract

    Genomewide association studies of autoimmune diseases have mapped hundreds of susceptibility regions in the genome. However, only for a few association signals has the causal gene been identified, and for even fewer have the causal variant and underlying mechanism been defined. Coincident associations of DNA variants affecting both the risk of autoimmune disease and quantitative immune variables provide an informative route to explore disease mechanisms and drug-targetable pathways. Using case-control samples from Sardinia, Italy, we performed a genomewide association study in multiple sclerosis followed by TNFSF13B locus-specific association testing in systemic lupus erythematosus (SLE). Extensive phenotyping of quantitative immune variables, sequence-based fine mapping, cross-population and cross-phenotype analyses, and gene-expression studies were used to identify the causal variant and elucidate its mechanism of action. Signatures of positive selection were also investigated. A variant in TNFSF13B, encoding the cytokine and drug target B-cell activating factor (BAFF), was associated with multiple sclerosis as well as SLE. The disease-risk allele was also associated with up-regulated humoral immunity through increased levels of soluble BAFF, B lymphocytes, and immunoglobulins. The causal variant was identified: an insertion-deletion variant, GCTGT→A (in which A is the risk allele), yielded a shorter transcript that escaped microRNA inhibition and increased production of soluble BAFF, which in turn up-regulated humoral immunity. Population genetic signatures indicated that this autoimmunity variant has been evolutionarily advantageous, most likely by augmenting resistance to malaria. A TNFSF13B variant was associated with multiple sclerosis and SLE, and its effects were clarified at the population, cellular, and molecular levels. (Funded by the Italian Foundation for Multiple Sclerosis and others.)

    View details for DOI 10.1056/NEJMoa1610528

    View details for Web of Science ID 000400071900005

    View details for PubMedID 28445677

  • PML nuclear bodies contribute to the basal expression of the mTOR inhibitor DDIT4 SCIENTIFIC REPORTS Salsman, J., Stathakis, A., Parker, E., Chung, D., Anthes, L. E., Koskowich, K. L., Lahsaee, S., Gaston, D., Kukurba, K. R., Smith, K. S., Chute, I. C., Leger, D., Frost, L. D., Montgomery, S. B., Lewis, S. M., Eskiw, C., Dellaire, G. 2017; 7

    Abstract

    The promyelocytic leukemia (PML) protein is an essential component of PML nuclear bodies (PML NBs) frequently lost in cancer. PML NBs coordinate chromosomal regions via modification of nuclear proteins that in turn may regulate genes in the vicinity of these bodies. However, few PML NB-associated genes have been identified. PML and PML NBs can also regulate mTOR and cell fate decisions in response to cellular stresses. We now demonstrate that PML depletion in U2OS cells or TERT-immortalized normal human diploid fibroblasts results in decreased expression of the mTOR inhibitor DDIT4 (REDD1). DNA and RNA immuno-FISH reveal that PML NBs are closely associated with actively transcribed DDIT4 loci, implicating these bodies in regulation of basal DDIT4 expression. Although PML silencing did reduce the sensitivity of U2OS cells to metabolic stress induced by metformin, PML loss did not inhibit the upregulation of DDIT4 in response to metformin, hypoxia-like (CoCl2) or genotoxic stress. Analysis of publicly available cancer data also revealed a significant correlation between PML and DDIT4 expression in several cancer types (e.g. lung, breast, prostate). Thus, these findings uncover a novel mechanism by which PML loss may contribute to mTOR activation and cancer progression via dysregulation of basal DDIT4 gene expression.

    View details for DOI 10.1038/srep45038

    View details for Web of Science ID 000397135000001

    View details for PubMedID 28332630

  • Whole transcriptome sequencing in blood provides a diagnosis of spinal muscular atrophy with progressive myoclonic epilepsy (SMA-PME). Human mutation Kernohan, K. D., Frésard, L., Zappala, Z., Hartley, T., Smith, K. S., Wagner, J., Xu, H., McBride, A., Bourque, P. R., Consortium, C. R., Bennett, S. A., Dyment, D. A., Boycott, K. M., Montgomery, S. B., Warman-Chardon, J. 2017

    Abstract

    At least 15% of disease-causing mutations affect mRNA splicing. Many splicing mutations are missed in a clinical setting due to limitations of in silico prediction algorithms or their location in noncoding regions. Whole-transcriptome sequencing is a promising new tool to identify these mutations; however, it will be a challenge to obtain disease-relevant tissue for RNA. Here, we describe an individual with a sporadic atypical spinal muscular atrophy, in whom clinical DNA sequencing reported one pathogenic ASAH1 mutation (c.458A>G; p.Tyr153Cys). Transcriptome sequencing on patient leukocytes identified a highly significant and atypical ASAH1 isoform not explained by c.458A>G (p < 10^-16). Subsequent Sanger sequencing identified the splice mutation responsible for the isoform (c.504A>C; p.Lys168Asn) and provided a molecular diagnosis of autosomal-recessive spinal muscular atrophy with progressive myoclonic epilepsy. Our findings demonstrate the utility of RNA sequencing from blood to identify splice-impacting disease mutations for nonhematological conditions, providing a diagnosis for these otherwise unsolved patients.

    View details for DOI 10.1002/humu.23211

    View details for PubMedID 28251733

  • Long-read genome sequencing identifies causal structural variation in a Mendelian disease. Genetics in medicine : official journal of the American College of Medical Genetics Merker, J. D., Wenger, A. M., Sneddon, T., Grove, M., Zappala, Z., Fresard, L., Waggott, D., Utiramerur, S., Hou, Y., Smith, K. S., Montgomery, S. B., Wheeler, M., Buchan, J. G., Lambert, C. C., Eng, K. S., Hickey, L., Korlach, J., Ford, J., Ashley, E. A. 2017

    Abstract

    Purpose: Current clinical genomics assays primarily utilize short-read sequencing (SRS), but SRS has limited ability to evaluate repetitive regions and structural variants. Long-read sequencing (LRS) has complementary strengths, and we aimed to determine whether LRS could offer a means to identify overlooked genetic variation in patients undiagnosed by SRS. Methods: We performed low-coverage genome LRS to identify structural variants in a patient who presented with multiple neoplasia and cardiac myxomata, in whom the results of targeted clinical testing and genome SRS were negative. Results: This LRS approach yielded 6,971 deletions and 6,821 insertions > 50 bp. Filtering for variants that are absent in an unrelated control and overlap a disease gene coding exon identified three deletions and three insertions. One of these, a heterozygous 2,184 bp deletion, overlaps the first coding exon of PRKAR1A, which is implicated in autosomal dominant Carney complex. RNA sequencing demonstrated decreased PRKAR1A expression. The deletion was classified as pathogenic based on guidelines for interpretation of sequence variants. Conclusion: This first successful application of genome LRS to identify a pathogenic variant in a patient suggests that LRS has significant potential for the identification of disease-causing structural variation. Larger studies will ultimately be required to evaluate the potential clinical utility of LRS. Genetics in Medicine advance online publication, 22 June 2017; doi:10.1038/gim.2017.86.

    View details for PubMedID 28640241

  • Cohort-specific imputation of gene expression improves prediction of warfarin dose for African Americans. Genome medicine Gottlieb, A., Daneshjou, R., DeGorter, M., Bourgeois, S., Svensson, P. J., Wadelius, M., Deloukas, P., Montgomery, S. B., Altman, R. B. 2017; 9 (1): 98

    Abstract

    Genome-wide association studies are useful for discovering genotype-phenotype associations but are limited because they require large cohorts to identify a signal, which can be population-specific. Mapping genetic variation to genes improves power and allows the effects of both protein-coding variation as well as variation in expression to be combined into "gene level" effects. Previous work has shown that warfarin dose can be predicted using information from genetic variation that affects protein-coding regions. Here, we introduce a method that improves dose prediction by integrating tissue-specific gene expression. In particular, we use drug pathways and expression quantitative trait loci knowledge to impute gene expression, on the assumption that differential expression of key pathway genes may impact dose requirement. We focus on 116 genes from the pharmacokinetic and pharmacodynamic pathways of warfarin within training and validation sets comprising both European and African-descent individuals. We build gene-tissue signatures associated with warfarin dose in a cohort-specific manner and identify a signature of 11 gene-tissue pairs that significantly augments the International Warfarin Pharmacogenetics Consortium dosage-prediction algorithm in both populations. Our results demonstrate that imputed expression can improve dose prediction and bridge population-specific compositions. MATLAB code is available at https://github.com/assafgo/warfarin-cohort.

    View details for PubMedID 29178968

  • Incorporation of Biological Knowledge Into the Study of Gene-Environment Interactions. American journal of epidemiology Ritchie, M. D., Davis, J. R., Aschard, H., Battle, A., Conti, D., Du, M., Eskin, E., Fallin, M. D., Hsu, L., Kraft, P., Moore, J. H., Pierce, B. L., Bien, S. A., Thomas, D. C., Wei, P., Montgomery, S. B. 2017; 186 (7): 771–77

    Abstract

    A growing knowledge base of genetic and environmental information has greatly enabled the study of disease risk factors. However, the computational complexity and statistical burden of testing all variants by all environments has required novel study designs and hypothesis-driven approaches. We discuss how incorporating biological knowledge from model organisms, functional genomics, and integrative approaches can empower the discovery of novel gene-environment interactions and discuss specific methodological considerations with each approach. We consider specific examples where the application of these approaches has uncovered effects of gene-environment interactions relevant to drug response and immunity, and we highlight how such improvements enable a greater understanding of the pathogenesis of disease and the realization of precision medicine.

    View details for DOI 10.1093/aje/kwx229

    View details for PubMedID 28978191

  • Current Challenges and New Opportunities for Gene-Environment Interaction Studies of Complex Diseases. American journal of epidemiology McAllister, K., Mechanic, L. E., Amos, C., Aschard, H., Blair, I. A., Chatterjee, N., Conti, D., Gauderman, W. J., Hsu, L., Hutter, C. M., Jankowska, M. M., Kerr, J., Kraft, P., Montgomery, S. B., Mukherjee, B., Papanicolaou, G. J., Patel, C. J., Ritchie, M. D., Ritz, B. R., Thomas, D. C., Wei, P., Witte, J. S. 2017; 186 (7): 753–61

    Abstract

    Recently, many new approaches, study designs, and statistical and analytical methods have emerged for studying gene-environment interactions (G×Es) in large-scale studies of human populations. There are opportunities in this field, particularly with respect to the incorporation of -omics and next-generation sequencing data and continual improvement in measures of environmental exposures implicated in complex disease outcomes. In a workshop called "Current Challenges and New Opportunities for Gene-Environment Interaction Studies of Complex Diseases," held October 17-18, 2014, by the National Institute of Environmental Health Sciences and the National Cancer Institute in conjunction with the annual American Society of Human Genetics meeting, participants explored new approaches and tools that have been developed in recent years for G×E discovery. This paper highlights current and critical issues and themes in G×E research that need additional consideration, including the improved data analytical methods, environmental exposure assessment, and incorporation of functional data and annotations.

    View details for DOI 10.1093/aje/kwx227

    View details for PubMedID 28978193

  • Enhancing GTEx by bridging the gaps between genotype, gene expression, and disease. Nature genetics 2017; 49 (12): 1664–70

    Abstract

    Genetic variants have been associated with myriad molecular phenotypes that provide new insight into the range of mechanisms underlying genetic traits and diseases. Identifying any particular genetic variant's cascade of effects, from molecule to individual, requires assaying multiple layers of molecular complexity. We introduce the Enhancing GTEx (eGTEx) project that extends the GTEx project to combine gene expression with additional intermediate molecular measurements on the same tissues to provide a resource for studying how genetic differences cascade through molecular phenotypes to impact human health.

    View details for DOI 10.1038/ng.3969

    View details for PubMedID 29019975

  • The impact of rare variation on gene expression across tissues. Nature Li, X., Kim, Y., Tsang, E. K., Davis, J. R., Damani, F. N., Chiang, C., Hess, G. T., Zappala, Z., Strober, B. J., Scott, A. J., Li, A., Ganna, A., Bassik, M. C., Merker, J. D., Hall, I. M., Battle, A., Montgomery, S. B. 2017; 550 (7675): 239–43

    Abstract

    Rare genetic variants are abundant in humans and are expected to contribute to individual disease risk. While genetic association studies have successfully identified common genetic variants associated with susceptibility, these studies are not practical for identifying rare variants. Efforts to distinguish pathogenic variants from benign rare variants have leveraged the genetic code to identify deleterious protein-coding alleles, but no analogous code exists for non-coding variants. Therefore, ascertaining which rare variants have phenotypic effects remains a major challenge. Rare non-coding variants have been associated with extreme gene expression in studies using single tissues, but their effects across tissues are unknown. Here we identify gene expression outliers, or individuals showing extreme expression levels for a particular gene, across 44 human tissues by using combined analyses of whole genomes and multi-tissue RNA-sequencing data from the Genotype-Tissue Expression (GTEx) project v6p release. We find that 58% of underexpression and 28% of overexpression outliers have nearby conserved rare variants compared to 8% of non-outliers. Additionally, we developed RIVER (RNA-informed variant effect on regulation), a Bayesian statistical model that incorporates expression data to predict a regulatory effect for rare variants with higher accuracy than models using genomic annotations alone. Overall, we demonstrate that rare variants contribute to large gene expression changes across tissues and provide an integrative method for interpretation of rare variants in individual genomes.

    View details for PubMedID 29022581

  • Genetic effects on gene expression across human tissues. Nature Battle, A., Brown, C. D., Engelhardt, B. E., Montgomery, S. B. 2017; 550 (7675): 204–13

    Abstract

    Characterization of the molecular function of the human genome and its variation across individuals is essential for identifying the cellular mechanisms that underlie human genetic traits and diseases. The Genotype-Tissue Expression (GTEx) project aims to characterize variation in gene expression levels across individuals and diverse tissues of the human body, many of which are not easily accessible. Here we describe genetic effects on gene expression levels across 44 human tissues. We find that local genetic variation affects gene expression levels for the majority of genes, and we further identify inter-chromosomal genetic effects for 93 genes and 112 loci. On the basis of the identified genetic effects, we characterize patterns of tissue specificity, compare local and distal effects, and evaluate the functional properties of the genetic effects. We also demonstrate that multi-tissue, multi-individual data can be used to identify genes and pathways affected by human disease-associated variation, enabling a mechanistic interpretation of gene regulation and the genetic basis of disease.

    View details for PubMedID 29022597

  • Small RNA Sequencing in Cells and Exosomes Identifies eQTLs and 14q32 as a Region of Active Export G3-GENES GENOMES GENETICS Tsang, E. K., Abell, N. S., Li, X., Anaya, V., Karczewski, K. J., Knowles, D. A., Sierra, R. G., Smith, K. S., Montgomery, S. B. 2017; 7 (1): 31-39

    Abstract

    Exosomes are small extracellular vesicles that carry heterogeneous cargo, including RNA, between cells. Increasing evidence suggests that exosomes are important mediators of intercellular communication and biomarkers of disease. Despite this, the variability of exosomal RNA between individuals has not been well quantified. To assess this variability, we sequenced the small RNA of cells and exosomes from a 17-member family. Across individuals, we show that selective export of miRNAs occurs not only at the level of specific transcripts, but that a cluster of 74 mature miRNAs on chromosome 14q32 is massively exported in exosomes while mostly absent from cells. We also observe more interindividual variability between exosomal samples than between cellular ones and identify four miRNA expression quantitative trait loci shared between cells and exosomes. Our findings indicate that genomically colocated miRNAs can be exported together and highlight the variability in exosomal miRNA levels between individuals as relevant for exosome use as diagnostics.

    View details for DOI 10.1534/g3.116.036137

    View details for Web of Science ID 000392200800003

    View details for PubMedID 27799337

    View details for PubMedCentralID PMC5217120

  • FIRE: functional inference of genetic variants that regulate gene expression. Bioinformatics (Oxford, England) Ioannidis, N. M., Davis, J. R., DeGorter, M. K., Larson, N. B., McDonnell, S. K., French, A. J., Battle, A. J., Hastie, T. J., Thibodeau, S. N., Montgomery, S. B., Bustamante, C. D., Sieh, W., Whittemore, A. S. 2017; 33 (24): 3895–3901

    Abstract

    Interpreting genetic variation in noncoding regions of the genome is an important challenge for personal genome analysis. One mechanism by which noncoding single nucleotide variants (SNVs) influence downstream phenotypes is through the regulation of gene expression. Methods to predict whether or not individual SNVs are likely to regulate gene expression would aid interpretation of variants of unknown significance identified in whole-genome sequencing studies. We developed FIRE (Functional Inference of Regulators of Expression), a tool to score both noncoding and coding SNVs based on their potential to regulate the expression levels of nearby genes. FIRE consists of 23 random forests trained to recognize SNVs in cis-expression quantitative trait loci (cis-eQTLs) using a set of 92 genomic annotations as predictive features. FIRE scores discriminate cis-eQTL SNVs from non-eQTL SNVs in the training set with a cross-validated area under the receiver operating characteristic curve (AUC) of 0.807, and discriminate cis-eQTL SNVs shared across six populations of different ancestry from non-eQTL SNVs with an AUC of 0.939. FIRE scores are also predictive of cis-eQTL SNVs across a variety of tissue types. FIRE scores for genome-wide SNVs in hg19/GRCh37 are available for download at https://sites.google.com/site/fireregulatoryvariation/. Contact: nilah@stanford.edu. Supplementary data are available at Bioinformatics online.

    View details for PubMedID 28961785

  • A TNFRSF14-FcεRI-mast cell pathway contributes to development of multiple features of asthma pathology in mice NATURE COMMUNICATIONS Sibilano, R., Gaudenzio, N., DeGorter, M. K., Reber, L. L., Hernandez, J. D., Starkl, P. M., Zurek, O. W., Tsai, M., Zahner, S., Montgomery, S. B., Roers, A., Kronenberg, M., Yu, M., Galli, S. J. 2016; 7

    Abstract

    Asthma has multiple features, including airway hyperreactivity, inflammation and remodelling. The TNF superfamily member TNFSF14 (LIGHT), via interactions with the receptor TNFRSF14 (HVEM), can support TH2 cell generation and longevity and promote airway remodelling in mouse models of asthma, but the mechanisms by which TNFSF14 functions in this setting are incompletely understood. Here we find that mouse and human mast cells (MCs) express TNFRSF14 and that TNFSF14:TNFRSF14 interactions can enhance IgE-mediated MC signalling and mediator production. In mouse models of asthma, TNFRSF14 blockade with a neutralizing antibody administered after antigen sensitization, or genetic deletion of Tnfrsf14, diminishes plasma levels of antigen-specific IgG1 and IgE antibodies, airway hyperreactivity, airway inflammation and airway remodelling. Finally, by analysing two types of genetically MC-deficient mice after engrafting MCs that either do or do not express TNFRSF14, we show that TNFRSF14 expression on MCs significantly contributes to the development of multiple features of asthma pathology.

    View details for DOI 10.1038/ncomms13696

    View details for Web of Science ID 000389853400001

    View details for PubMedID 27982078

    View details for PubMedCentralID PMC5171877

  • Directed evolution using dCas9-targeted somatic hypermutation in mammalian cells. Nature methods Hess, G. T., Frésard, L., Han, K., Lee, C. H., Li, A., Cimprich, K. A., Montgomery, S. B., Bassik, M. C. 2016

    Abstract

    Engineering and study of protein function by directed evolution has been limited by the technical requirement to use global mutagenesis or introduce DNA libraries. Here, we develop CRISPR-X, a strategy to repurpose the somatic hypermutation machinery for protein engineering in situ. Using catalytically inactive dCas9 to recruit variants of cytidine deaminase (AID) with MS2-modified sgRNAs, we can specifically mutagenize endogenous targets with limited off-target damage. This generates diverse libraries of localized point mutations and can target multiple genomic locations simultaneously. We mutagenize GFP and select for spectrum-shifted variants, including EGFP. Additionally, we mutate the target of the cancer therapeutic bortezomib, PSMB5, and identify known and novel mutations that confer bortezomib resistance. Finally, using a hyperactive AID variant, we mutagenize loci both upstream and downstream of transcriptional start sites. These experiments illustrate a powerful approach to create complex libraries of genetic variants in native context, which is broadly applicable to investigate and improve protein function.

    View details for DOI 10.1038/nmeth.4038

    View details for PubMedID 27798611

  • DNA Methylation Profiling of Uniparental Disomy Subjects Provides a Map of Parental Epigenetic Bias in the Human Genome. American journal of human genetics Joshi, R. S., Garg, P., Zaitlen, N., Lappalainen, T., Watson, C. T., Azam, N., Ho, D., Li, X., Antonarakis, S. E., Brunner, H. G., Buiting, K., Cheung, S. W., Coffee, B., Eggermann, T., Francis, D., Geraedts, J. P., Gimelli, G., Jacobson, S. G., Le Caignec, C., de Leeuw, N., Liehr, T., Mackay, D. J., Montgomery, S. B., Pagnamenta, A. T., Papenhausen, P., Robinson, D. O., Ruivenkamp, C., Schwartz, C., Steiner, B., Stevenson, D. A., Surti, U., Wassink, T., Sharp, A. J. 2016; 99 (3): 555-566

    Abstract

    Genomic imprinting is a mechanism in which gene expression varies depending on parental origin. Imprinting occurs through differential epigenetic marks on the two parental alleles, with most imprinted loci marked by the presence of differentially methylated regions (DMRs). To identify sites of parental epigenetic bias, here we have profiled DNA methylation patterns in a cohort of 57 individuals with uniparental disomy (UPD) for 19 different chromosomes, defining imprinted DMRs as sites where the maternal and paternal methylation levels diverge significantly from the biparental mean. Using this approach we identified 77 DMRs, including nearly all those described in previous studies, in addition to 34 DMRs not previously reported. These include a DMR at TUBGCP5 within the recurrent 15q11.2 microdeletion region, suggesting potential parent-of-origin effects associated with this genomic disorder. We also observed a modest parental bias in DNA methylation levels at every CpG analyzed across ∼1.9 Mb of the 15q11-q13 Prader-Willi/Angelman syndrome region, demonstrating that the influence of imprinting is not limited to individual regulatory elements such as CpG islands, but can extend across entire chromosomal domains. Using RNA-seq data, we detected signatures consistent with imprinted expression associated with nine novel DMRs. Finally, using a population sample of 4,004 blood methylomes, we define patterns of epigenetic variation at DMRs, identifying rare individuals with global gain or loss of methylation across multiple imprinted loci. Our data provide a detailed map of parental epigenetic bias in the human genome, providing insights into potential parent-of-origin effects.

    View details for DOI 10.1016/j.ajhg.2016.06.032

    View details for PubMedID 27569549

  • Impact of the X Chromosome and sex on regulatory variation. Genome research Kukurba, K. R., Parsana, P., Balliu, B., Smith, K. S., Zappala, Z., Knowles, D. A., Fave, M., Davis, J. R., Li, X., Zhu, X., Potash, J. B., Weissman, M. M., Shi, J., Kundaje, A., Levinson, D. F., Awadalla, P., Mostafavi, S., Battle, A., Montgomery, S. B. 2016; 26 (6): 768-777

    Abstract

    The X Chromosome, with its unique mode of inheritance, contributes to differences between the sexes at a molecular level, including sex-specific gene expression and sex-specific impact of genetic variation. Improving our understanding of these differences offers to elucidate the molecular mechanisms underlying sex-specific traits and diseases. However, to date, most studies have either ignored the X Chromosome or had insufficient power to test for the sex-specific impact of genetic variation. By analyzing whole blood transcriptomes of 922 individuals, we have conducted the first large-scale, genome-wide analysis of the impact of both sex and genetic variation on patterns of gene expression, including comparison between the X Chromosome and autosomes. We identified a depletion of expression quantitative trait loci (eQTL) on the X Chromosome, especially among genes under high selective constraint. In contrast, we discovered an enrichment of sex-specific regulatory variants on the X Chromosome. To resolve the molecular mechanisms underlying such effects, we generated chromatin accessibility data through ATAC-sequencing to connect sex-specific chromatin accessibility to sex-specific patterns of expression and regulatory variation. As sex-specific regulatory variants discovered in our study can inform sex differences in heritable disease prevalence, we integrated our data with genome-wide association study data for multiple immune traits identifying several traits with significant sex biases in genetic susceptibilities. Together, our study provides genome-wide insight into how genetic variation, the X Chromosome, and sex shape human gene regulation and disease.

    View details for DOI 10.1101/gr.197897.115

    View details for PubMedID 27197214

  • An Efficient Multiple-Testing Adjustment for eQTL Studies that Accounts for Linkage Disequilibrium between Variants. American journal of human genetics Davis, J. R., Fresard, L., Knowles, D. A., Pala, M., Bustamante, C. D., Battle, A., Montgomery, S. B. 2016; 98 (1): 216-24

    Abstract

    Methods for multiple-testing correction in local expression quantitative trait locus (cis-eQTL) studies are a trade-off between statistical power and computational efficiency. Bonferroni correction, though computationally trivial, is overly conservative and fails to account for linkage disequilibrium between variants. Permutation-based methods are more powerful, though computationally far more intensive. We present an alternative correction method called eigenMT, which runs over 500 times faster than permutations and has adjusted p values that closely approximate empirical ones. To achieve this speed while also maintaining the accuracy of permutation-based methods, we estimate the effective number of independent variants tested for association with a particular gene, termed Meff, by using the eigenvalue decomposition of the genotype correlation matrix. We employ a regularized estimator of the correlation matrix to ensure Meff is robust and yields adjusted p values that closely approximate p values from permutations. Finally, using a common genotype matrix, we show that eigenMT can be applied with even greater efficiency to studies across tissues or conditions. Our method provides a simpler, more efficient approach to multiple-testing correction than existing methods and fits within existing pipelines for eQTL discovery.

    View details for DOI 10.1016/j.ajhg.2015.11.021

    View details for PubMedID 26749306

    View details for PubMedCentralID PMC4716687

  • ORegAnno 3.0: a community-driven resource for curated regulatory annotation. Nucleic acids research Lesurf, R., Cotto, K. C., Wang, G., Griffith, M., Kasaian, K., Jones, S. J., Montgomery, S. B., Griffith, O. L. 2016; 44 (D1): D126-32

    Abstract

    The Open Regulatory Annotation database (ORegAnno) is a resource for curated regulatory annotation. It contains information about regulatory regions, transcription factor binding sites, RNA binding sites, regulatory variants, haplotypes, and other regulatory elements. ORegAnno differentiates itself from other regulatory resources by facilitating crowd-sourced interpretation and annotation of regulatory observations from the literature and highly curated resources. It contains a comprehensive annotation scheme that aims to describe both the elements and outcomes of regulatory events. Moreover, ORegAnno assembles these disparate data sources and annotations into a single, high quality catalogue of curated regulatory information. The current release is an update of the database previously featured in the NAR Database Issue, and now contains 1 948 307 records, across 18 species, with a combined coverage of 334 215 080 bp. Complete records, annotation, and other associated data are available for browsing and download at http://www.oreganno.org/.

    View details for DOI 10.1093/nar/gkv1203

    View details for PubMedID 26578589

    View details for PubMedCentralID PMC4702855

  • Integrative functional genomics identifies regulatory mechanisms at coronary artery disease loci. Nature communications Miller, C. L., Pjanic, M., Wang, T., Nguyen, T., Cohain, A., Lee, J. D., Perisic, L., Hedin, U., Kundu, R. K., Majmudar, D., Kim, J. B., Wang, O., Betsholtz, C., Ruusalepp, A., Franzén, O., Assimes, T. L., Montgomery, S. B., Schadt, E. E., Björkegren, J. L., Quertermous, T. 2016; 7: 12092

    Abstract

    Coronary artery disease (CAD) is the leading cause of mortality and morbidity, driven by both genetic and environmental risk factors. Meta-analyses of genome-wide association studies have identified >150 loci associated with CAD and myocardial infarction susceptibility in humans. A majority of these variants reside in non-coding regions and are co-inherited with hundreds of candidate regulatory variants, presenting a challenge to elucidate their functions. Herein, we use integrative genomic, epigenomic and transcriptomic profiling of perturbed human coronary artery smooth muscle cells and tissues to begin to identify causal regulatory variation and mechanisms responsible for CAD associations. Using these genome-wide maps, we prioritize 64 candidate variants and perform allele-specific binding and expression analyses at seven top candidate loci: 9p21.3, SMAD3, PDGFD, IL6R, BMP1, CCDC97/TGFB1 and LMOD1. We validate our findings in expression quantitative trait loci cohorts, which together reveal new links between CAD associations and regulatory function in the appropriate disease context.

    View details for DOI 10.1038/ncomms12092

    View details for PubMedID 27386823

  • Non-Coding Loss-of-Function Variation in Human Genomes. Human heredity Zappala, Z., Montgomery, S. B. 2016; 81 (2): 78-87

    Abstract

    Whole-genome and exome sequencing in human populations has revealed the tolerance of each gene for loss-of-function variation. By understanding this tolerance, it has become increasingly possible to identify genes that would make safe therapeutic targets and to identify rare genetic risk factors and phenotypes at the scale of individual genomes. To date, the vast majority of surveyed loss-of-function variants are in protein-coding regions of the genome mainly due to the focus on these regions by exome-based sequencing projects and their relative ease of interpretability. As whole-genome sequencing becomes more prevalent, new strategies will be required to uncover impactful variation in non-coding regions of the genome where the architecture of genome function is more complex. In this review, we investigate recent studies of loss-of-function variation and emerging approaches for interpreting whole-genome sequencing data to identify rare and impactful non-coding loss-of-function variants.

    View details for DOI 10.1159/000447453

    View details for Web of Science ID 000392559600029

    View details for PubMedID 28076858

  • A global reference for human genetic variation. Nature Altshuler, D. M., Durbin, R. M., Abecasis, G. R., Bentley, D. R., Chakravarti, A., Clark, A. G., Donnelly, P., Eichler, E. E., Flicek, P., Gabriel, S. B., Gibbs, R. A., Green, E. D., Hurles, M. E., Knoppers, B. M., Korbel, J. O., Lander, E. S., Lee, C., Lehrach, H., Mardis, E. R., Marth, G. T., McVean, G. A., Nickerson, D. A., Schmidt, J. P., Sherry, S. T., Wang, J., Wilson, R. K., Gibbs, R. A., Boerwinkle, E., Doddapaneni, H., Han, Y., Korchina, V., Kovar, C., Lee, S., Muzny, D., Reid, J. G., Zhu, Y., Wang, J., Chang, Y., Feng, Q., Fang, X., Guo, X., Jian, M., Jiang, H., Jin, X., Lan, T., Li, G., Li, J., Li, Y., Liu, S., Liu, X., Lu, Y., Ma, X., Tang, M., Wang, B., Wang, G., Wu, H., Wu, R., Xu, X., Yin, Y., Zhang, D., Zhang, W., Zhao, J., Zhao, M., Zheng, X., Lander, E. S., Altshuler, D. M., Gabriel, S. B., Gupta, N., Gharani, N., Toji, L. H., Gerry, N. P., Resch, A. M., Flicek, P., Barker, J., Clarke, L., Gil, L., Hunt, S. E., Kelman, G., Kulesha, E., Leinonen, R., McLaren, W. M., Radhakrishnan, R., Roa, A., Smirnov, D., Smith, R. E., Streeter, I., Thormann, A., Toneva, I., Vaughan, B., Zheng-Bradley, X., Bentley, D. R., Grocock, R., Humphray, S., James, T., Kingsbury, Z., Lehrach, H., Sudbrak, R., Albrecht, M. W., Amstislavskiy, V. S., Borodina, T. A., Lienhard, M., Mertes, F., Sultan, M., Timmermann, B., Yaspo, M., Mardis, E. R., Wilson, R. K., Fulton, L., Fulton, R., Sherry, S. T., Ananiev, V., Belaia, Z., Beloslyudtsev, D., Bouk, N., Chen, C., Church, D., Cohen, R., Cook, C., Garner, J., Hefferon, T., Kimelman, M., Liu, C., Lopez, J., Meric, P., O'Sullivan, C., Ostapchuk, Y., Phan, L., Ponomarov, S., Schneider, V., Shekhtman, E., Sirotkin, K., Slotta, D., Zhang, H., McVean, G. A., Durbin, R. M., Balasubramaniam, S., Burton, J., Danecek, P., Keane, T. M., Kolb-Kokocinski, A., McCarthy, S., Stalker, J., Quail, M., Schmidt, J. P., Davies, C. J., Gollub, J., Webster, T., Wong, B., Zhan, Y., Auton, A., Campbell, C. L., Kong, Y., Marcketta, A., Gibbs, R. A., Yu, F., Antunes, L., Bainbridge, M., Muzny, D., Sabo, A., Huang, Z., Wang, J., Coin, L. J., Fang, L., Guo, X., Jin, X., Li, G., Li, Q., Li, Y., Li, Z., Lin, H., Liu, B., Luo, R., Shao, H., Xie, Y., Ye, C., Yu, C., Zhang, F., Zheng, H., Zhu, H., Alkan, C., Dal, E., Kahveci, F., Marth, G. T., Garrison, E. P., Kural, D., Lee, W., Leong, W. F., Stromberg, M., Ward, A. N., Wu, J., Zhang, M., Daly, M. J., DePristo, M. A., Handsaker, R. E., Altshuler, D. M., Banks, E., Bhatia, G., del Angel, G., Gabriel, S. B., Genovese, G., Gupta, N., Li, H., Kashin, S., Lander, E. S., McCarroll, S. A., Nemesh, J. C., Poplin, R. E., Yoon, S. C., Lihm, J., Makarov, V., Clark, A. G., Gottipati, S., Keinan, A., Rodriguez-Flores, J. L., Korbel, J. O., Rausch, T., Fritz, M. H., Stuetz, A. M., Flicek, P., Beal, K., Clarke, L., Datta, A., Herrero, J., McLaren, W. M., Ritchie, G. R., Smith, R. E., Zerbino, D., Zheng-Bradley, X., Sabeti, P. C., Shlyakhter, I., Schaffner, S. F., Vitti, J., Cooper, D. N., Ball, E. V., Stenson, P. D., Bentley, D. R., Barnes, B., Bauer, M., Cheetham, R. K., Cox, A., Eberle, M., Humphray, S., Kahn, S., Murray, L., Peden, J., Shaw, R., Kenny, E. E., Batzer, M. A., Konkel, M. K., Walker, J. A., MacArthur, D. G., Lek, M., Sudbrak, R., Amstislavskiy, V. S., Herwig, R., Mardis, E. R., Ding, L., Koboldt, D. C., Larson, D., Ye, K., Gravel, S., Swaroop, A., Chew, E., Lappalainen, T., Erlich, Y., Gymrek, M., Willems, T. F., Simpson, J. T., Shriver, M. D., Rosenfeld, J. A., Bustamante, C. D., Montgomery, S. B., De La Vega, F. M., Byrnes, J. K., Carroll, A. W., DeGorter, M. K., Lacroute, P., Maples, B. K., Martin, A. R., Moreno-Estrada, A., Shringarpure, S. S., Zakharia, F., Halperin, E., Baran, Y., Lee, C., Cerveira, E., Hwang, J., Malhotra, A., Plewczynski, D., Radew, K., Romanovitch, M., Zhang, C., Hyland, F. C., Craig, D. W., Christoforides, A., Homer, N., Izatt, T., Kurdoglu, A. A., Sinari, S. A., Squire, K., Sherry, S. T., Xiao, C., Sebat, J., Antaki, D., Gujral, M., Noor, A., Ye, K., Burchard, E. G., Hernandez, R. D., Gignoux, C. R., Haussler, D., Katzman, S. J., Kent, W. J., Howie, B., Ruiz-Linares, A., Dermitzakis, E. T., Devine, S. E., Goncalo, R. A., Kang, H. M., Kidd, J. M., Blackwell, T., Caron, S., Chen, W., Emery, S., Fritsche, L., Fuchsberger, C., Jun, G., Li, B., Lyons, R., Scheller, C., Sidore, C., Song, S., Sliwerska, E., Taliun, D., Tan, A., Welch, R., Wing, M. K., Zhan, X., Awadalla, P., Hodgkinson, A., Li, Y., Shi, X., Quitadamo, A., Lunter, G., McVean, G. A., Marchini, J. L., Myers, S., Churchhouse, C., Delaneau, O., Gupta-Hinch, A., Kretzschmar, W., Iqbal, Z., Mathieson, I., Menelaou, A., Rimmer, A., Xifara, D. K., Oleksyk, T. K., Fu, Y., Liu, X., Xiong, M., Jorde, L., Witherspoon, D., Xing, J., Eichler, E. E., Browning, B. L., Browning, S. R., Hormozdiari, F., Sudmant, P. H., Khurana, E., Durbin, R. M., Hurles, M. E., Tyler-Smith, C., Albers, C. A., Ayub, Q., Balasubramaniam, S., Chen, Y., Colonna, V., Danecek, P., Jostins, L., Keane, T. M., McCarthy, S., Walter, K., Xue, Y., Gerstein, M. B., Abyzov, A., Balasubramanian, S., Chen, J., Clarke, D., Fu, Y., Harmanci, A. O., Jin, M., Lee, D., Liu, J., Mu, X. J., Zhang, J., Zhang, Y., Li, Y., Luo, R., Zhu, H., Alkan, C., Dal, E., Kahveci, F., Marth, G. T., Garrison, E. P., Kural, D., Lee, W., Ward, A. N., Wu, J., Zhang, M., McCarroll, S. A., Handsaker, R. E., Altshuler, D. M., Banks, E., del Angel, G., Genovese, G., Hartl, C., Li, H., Kashin, S., Nemesh, J. C., Shakir, K., Yoon, S. C., Lihm, J., Makarov, V., Degenhardt, J., Korbel, J. O., Fritz, M. H., Meiers, S., Raeder, B., Rausch, T., Stuetz, A. M., Flicek, P., Casale, F. P., Clarke, L., Smith, R. E., Stegle, O., Zheng-Bradley, X., Bentley, D. R., Barnes, B., Cheetham, R. K., Eberle, M., Humphray, S., Kahn, S., Murray, L., Shaw, R., Lameijer, E., Batzer, M. A., Konkel, M. K., Walker, J. A., Ding, L., Hall, I., Ye, K., Lacroute, P., Lee, C., Cerveira, E., Malhotra, A., Hwang, J., Plewczynski, D., Radew, K., Romanovitch, M., Zhang, C., Craig, D. W., Homer, N., Church, D., Xiao, C., Sebat, J., Antaki, D., Bafna, V., Michaelson, J., Ye, K., Devine, S. E., Gardner, E. J., Abecasis, G. R., Kidd, J. M., Mills, R. E., Dayama, G., Emery, S., Jun, G., Shi, X., Quitadamo, A., Lunter, G., McVean, G. A., Chen, K., Fan, X., Chong, Z., Chen, T., Witherspoon, D., Xing, J., Eichler, E. E., Chaisson, M. J., Hormozdiari, F., Huddleston, J., Malig, M., Nelson, B. J., Sudmant, P. H., Parrish, N. F., Khurana, E., Hurles, M. E., Blackburne, B., Lindsay, S. J., Ning, Z., Walter, K., Zhang, Y., Gerstein, M. B., Abyzov, A., Chen, J., Clarke, D., Lam, H., Mu, X. J., Sisu, C., Zhang, J., Zhang, Y., Gibbs, R. A., Yu, F., Bainbridge, M., Challis, D., Evani, U. S., Kovar, C., Lu, J., Muzny, D., Nagaswamy, U., Reid, J. G., Sabo, A., Yu, J., Guo, X., Li, W., Li, Y., Wu, R., Marth, G. T., Garrison, E. P., Leong, W. F., Ward, A. N., del Angel, G., DePristo, M. A., Gabriel, S. B., Gupta, N., Hartl, C., Poplin, R. E., Clark, A. G., Rodriguez-Flores, J. L., Flicek, P., Clarke, L., Smith, R. E., Zheng-Bradley, X., MacArthur, D. G., Mardis, E. R., Fulton, R., Koboldt, D. C., Gravel, S., Bustamante, C. D., Craig, D. W., Christoforides, A., Homer, N., Izatt, T., Sherry, S. T., Xiao, C., Dermitzakis, E. T., Abecasis, G. R., Kang, H. M., McVean, G. A., Gerstein, M. B., Balasubramanian, S., Habegger, L., Yu, H., Flicek, P., Clarke, L., Cunningham, F., Dunham, I., Zerbino, D., Zheng-Bradley, X., Lage, K., Jespersen, J. B., Horn, H., Montgomery, S. B., DeGorter, M. K., Khurana, E., Tyler-Smith, C., Chen, Y., Colonna, V., Xue, Y., Gerstein, M. B., Balasubramanian, S., Fu, Y., Kim, D., Auton, A., Marcketta, A., DeSalle, R., Narechania, A., Sayres, M. A., Garrison, E. P., Handsaker, R. E., Kashin, S., McCarroll, S. A., Rodriguez-Flores, J. L., Flicek, P., Clarke, L., Zheng-Bradley, X., Erlich, Y., Gymrek, M., Willems, T. F., Bustamante, C. D., Mendez, F. L., Poznik, G. D., Underhill, P. A., Lee, C., Cerveira, E., Malhotra, A., Romanovitch, M., Zhang, C., Abecasis, G. R., Coin, L., Shao, H., Mittelman, D., Tyler-Smith, C., Ayub, Q., Banerjee, R., Cerezo, M., Chen, Y., Fitzgerald, T., Louzada, S., Massaia, A., McCarthy, S., Ritchie, G. R., Xue, Y., Yang, F., Gibbs, R. A., Kovar, C., Kalra, D., Hale, W., Muzny, D., Reid, J. G., Wang, J., Dan, X., Guo, X., Li, G., Li, Y., Ye, C., Zheng, X., Altshuler, D. M., Flicek, P., Clarke, L., Zheng-Bradley, X., Bentley, D. R., Cox, A., Humphray, S., Kahn, S., Sudbrak, R., Albrecht, M. W., Lienhard, M., Larson, D., Craig, D. W., Izatt, T., Kurdoglu, A. A., Sherry, S. T., Xiao, C., Haussler, D., Abecasis, G. R., McVean, G. A., Durbin, R. M., Balasubramaniam, S., Keane, T. M., McCarthy, S., Stalker, J., Chakravarti, A., Knoppers, B. M., Abecasis, G. R., Barnes, K. C., Beiswanger, C., Burchard, E. G., Bustamante, C. D., Cai, H., Cao, H., Durbin, R. M., Gerry, N. P., Gharani, N., Gibbs, R. A., Gignoux, C. R., Gravel, S., Henn, B., Jones, D., Jorde, L., Kaye, J. S., Keinan, A., Kent, A., Kerasidou, A., Li, Y., Mathias, R., McVean, G. A., Moreno-Estrada, A., Ossorio, P. N., Parker, M., Resch, A. M., Rotimi, C. N., Royal, C. D., Sandoval, K., Su, Y., Sudbrak, R., Tian, Z., Tishkoff, S., Toji, L. H., Tyler-Smith, C., Via, M., Wang, Y., Yang, H., Yang, L., Zhu, J., Bodmer, W., Bedoya, G., Ruiz-Linares, A., Cai, Z., Gao, Y., Chu, J., Peltonen, L., Garcia-Montero, A., Orfao, A., Dutil, J., Martinez-Cruzado, J. C., Oleksyk, T. K., Barnes, K. C., Mathias, R. A., Hennis, A., Watson, H., McKenzie, C., Qadri, F., LaRocque, R., Sabeti, P. C., Zhu, J., Deng, X., Sabeti, P. C., Asogun, D., Folarin, O., Happi, C., Omoniwa, O., Stremlau, M., Tariyal, R., Jallow, M., Joof, F. S., Corrah, T., Rockett, K., Kwiatkowski, D., Kooner, J., Tran Tinh Hien, T. T., Dunstan, S. J., Nguyen Thuy Hang, N. T., Fonnie, R., Garry, R., Kanneh, L., Moses, L., Sabeti, P. C., Schieffelin, J., Grant, D. S., Gallo, C., Poletti, G., Saleheen, D., Rasheed, A., Brook, L. D., Felsenfeld, A., McEwen, J. E., Vaydylevich, Y., Green, E. D., Duncanson, A., Dunn, M., Schloss, J. A., Wang, J., Yang, H., Auton, A., Brooks, L. D., Durbin, R. M., Garrison, E. P., Kang, H. M., Korbel, J. O., Marchini, J. L., McCarthy, S., McVean, G. A., Abecasis, G. R. 2015; 526 (7571): 68-?

    Abstract

    The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

    View details for DOI 10.1038/nature15393

    View details for Web of Science ID 000362095100036

  • The landscape of genomic imprinting across diverse adult human tissues. Genome research Baran, Y., Subramaniam, M., Biton, A., Tukiainen, T., Tsang, E. K., Rivas, M. A., Pirinen, M., Gutierrez-Arcelus, M., Smith, K. S., Kukurba, K. R., Zhang, R., Eng, C., Torgerson, D. G., Urbanek, C., Li, J. B., Rodriguez-Santana, J. R., Burchard, E. G., Seibold, M. A., MacArthur, D. G., Montgomery, S. B., Zaitlen, N. A., Lappalainen, T. 2015; 25 (7): 927-936

    Abstract

    Genomic imprinting is an important regulatory mechanism that silences one of the parental copies of a gene. To systematically characterize this phenomenon, we analyze tissue specificity of imprinting from allelic expression data in 1582 primary tissue samples from 178 individuals from the Genotype-Tissue Expression (GTEx) project. We characterize imprinting in 42 genes, including both novel and previously identified genes. Tissue specificity of imprinting is widespread, and gender-specific effects are revealed in a small number of genes in muscle with stronger imprinting in males. IGF2 shows maternal expression in the brain instead of the canonical paternal expression elsewhere. Imprinting appears to have only a subtle impact on tissue-specific expression levels, with genes lacking a systematic expression difference between tissues with imprinted and biallelic expression. In summary, our systematic characterization of imprinting in adult tissues highlights variation in imprinting between genes, individuals, and tissues.

    View details for DOI 10.1101/gr.192278.115

    View details for Web of Science ID 000357356900001

    View details for PubMedID 25953952

    View details for PubMedCentralID PMC4484390

  • Human genomics. Effect of predicted protein-truncating genetic variants on the human transcriptome. Science Rivas, M. A., Pirinen, M., Conrad, D. F., Lek, M., Tsang, E. K., Karczewski, K. J., Maller, J. B., Kukurba, K. R., DeLuca, D. S., Fromer, M., Ferreira, P. G., Smith, K. S., Zhang, R., Zhao, F., Banks, E., Poplin, R., Ruderfer, D. M., Purcell, S. M., Tukiainen, T., Minikel, E. V., Stenson, P. D., Cooper, D. N., Huang, K. H., Sullivan, T. J., Nedzel, J., Bustamante, C. D., Li, J. B., Daly, M. J., Guigo, R., Donnelly, P., Ardlie, K., Sammeth, M., Dermitzakis, E. T., McCarthy, M. I., Montgomery, S. B., Lappalainen, T., MacArthur, D. G. 2015; 348 (6235): 666-669

    Abstract

    Accurate prediction of the functional effect of genetic variation is critical for clinical genome interpretation. We systematically characterized the transcriptome effects of protein-truncating variants, a class of variants expected to have profound effects on gene function, using data from the Genotype-Tissue Expression (GTEx) and Geuvadis projects. We quantitated tissue-specific and positional effects on nonsense-mediated transcript decay and present an improved predictive model for this decay. We directly measured the effect of variants both proximal and distal to splice junctions. Furthermore, we found that robustness to heterozygous gene inactivation is not due to dosage compensation. Our results illustrate the value of transcriptome data in the functional interpretation of genetic variants.

    View details for DOI 10.1126/science.1261877

    View details for PubMedID 25954003

    View details for Web of Science ID 000354045700038

    View details for PubMedCentralID PMC4537935

  • Genetic conflict reflected in tissue-specific maps of genomic imprinting in human and mouse. Nature genetics Babak, T., Deveale, B., Tsang, E. K., Zhou, Y., Li, X., Smith, K. S., Kukurba, K. R., Zhang, R., Li, J. B., van der Kooy, D., Montgomery, S. B., Fraser, H. B. 2015; 47 (5): 544-549

    Abstract

    Genomic imprinting is an epigenetic process that restricts gene expression to either the maternally or paternally inherited allele. Many theories have been proposed to explain its evolutionary origin, but understanding has been limited by a paucity of data mapping the breadth and dynamics of imprinting within any organism. We generated an atlas of imprinting spanning 33 mouse and 45 human developmental stages and tissues. Nearly all imprinted genes were imprinted in early development and either retained their parent-of-origin expression in adults or lost it completely. Consistent with an evolutionary signature of parental conflict, imprinted genes were enriched for coexpressed pairs of maternally and paternally expressed genes, showed accelerated expression divergence between human and mouse, and were more highly expressed than their non-imprinted orthologs in other species. Our approach demonstrates a general framework for the discovery of imprinting in any species and sheds light on the causes and consequences of genomic imprinting in mammals.

    View details for DOI 10.1038/ng.3274

    View details for PubMedID 25848752

    View details for PubMedCentralID PMC4414907

  • Tissue-specific effects of genetic and epigenetic variation on gene regulation and splicing. PLoS genetics Gutierrez-Arcelus, M., Ongen, H., Lappalainen, T., Montgomery, S. B., Buil, A., Yurovsky, A., Bryois, J., Padioleau, I., Romano, L., Planchon, A., Falconnet, E., Bielser, D., Gagnebin, M., Giger, T., Borel, C., Letourneau, A., Makrythanasis, P., Guipponi, M., Gehrig, C., Antonarakis, S. E., Dermitzakis, E. T. 2015; 11 (1)

    Abstract

    Understanding how genetic variation affects distinct cellular phenotypes, such as gene expression levels, alternative splicing and DNA methylation levels, is essential for better understanding of complex diseases and traits. Furthermore, how inter-individual variation of DNA methylation is associated to gene expression is just starting to be studied. In this study, we use the GenCord cohort of 204 newborn Europeans' lymphoblastoid cell lines, T-cells and fibroblasts derived from umbilical cords. The samples were previously genotyped for 2.5 million SNPs, mRNA-sequenced, and assayed for methylation levels in 482,421 CpG sites. We observe that methylation sites associated to expression levels are enriched in enhancers, gene bodies and CpG island shores. We show that while the correlation between DNA methylation and gene expression can be positive or negative, it is very consistent across cell-types. However, this epigenetic association to gene expression appears more tissue-specific than the genetic effects on gene expression or DNA methylation (observed in both sharing estimations based on P-values and effect size correlations between cell-types). This predominance of genetic effects can also be reflected by the observation that allele specific expression differences between individuals dominate over tissue-specific effects. Additionally, we discover genetic effects on alternative splicing and interestingly, a large amount of DNA methylation correlating to alternative splicing, both in a tissue-specific manner. The locations of the SNPs and methylation sites involved in these associations highlight the participation of promoter proximal and distant regulatory regions on alternative splicing. Overall, our results provide high-resolution analyses showing how genome sequence variation has a broad effect on cellular phenotypes across cell-types, whereas epigenetic factors provide a secondary layer of variation that is more tissue-specific. Furthermore, the details of how this tissue-specificity may vary across inter-relations of molecular traits, and where these are occurring, can yield further insights into gene regulation and cellular biology as a whole.

    View details for DOI 10.1371/journal.pgen.1004958

    View details for PubMedID 25634236

    View details for PubMedCentralID PMC4310612

  • RNA Sequencing and Analysis. Cold Spring Harbor protocols Kukurba, K. R., Montgomery, S. B. 2015; 2015 (11): pdb.top084970

    Abstract

    RNA sequencing (RNA-Seq) uses the capabilities of high-throughput sequencing methods to provide insight into the transcriptome of a cell. Compared to previous Sanger sequencing- and microarray-based methods, RNA-Seq provides far higher coverage and greater resolution of the dynamic nature of the transcriptome. Beyond quantifying gene expression, the data generated by RNA-Seq facilitate the discovery of novel transcripts, identification of alternatively spliced genes, and detection of allele-specific expression. Recent advances in the RNA-Seq workflow, from sample preparation to library construction to data analysis, have enabled researchers to further elucidate the functional complexity of the transcription. In addition to polyadenylated messenger RNA (mRNA) transcripts, RNA-Seq can be applied to investigate different populations of RNA, including total RNA, pre-mRNA, and noncoding RNA, such as microRNA and long ncRNA. This article provides an introduction to RNA-Seq methods, including applications, experimental design, and technical challenges.

    View details for DOI 10.1101/pdb.top084970

    View details for PubMedID 25870306

  • Type I interferon signaling genes in recurrent major depression: increased expression detected by whole-blood RNA sequencing. Molecular psychiatry Mostafavi, S., Battle, A., Zhu, X., Potash, J. B., Weissman, M. M., Shi, J., Beckman, K., Haudenschild, C., McCormick, C., Mei, R., Gameroff, M. J., Gindes, H., Adams, P., Goes, F. S., Mondimore, F. M., MacKinnon, D. F., Notes, L., Schweizer, B., Furman, D., Montgomery, S. B., Urban, A. E., Koller, D., Levinson, D. F. 2014; 19 (12): 1267-1274

    Abstract

    A study of genome-wide gene expression in major depressive disorder (MDD) was undertaken in a large population-based sample to determine whether altered expression levels of genes and pathways could provide insights into biological mechanisms that are relevant to this disorder. Gene expression studies have the potential to detect changes that may be because of differences in common or rare genomic sequence variation, environmental factors or their interaction. We recruited a European ancestry sample of 463 individuals with recurrent MDD and 459 controls, obtained self-report and semi-structured interview data about psychiatric and medical history and other environmental variables, sequenced RNA from whole blood and genotyped a genome-wide panel of common single-nucleotide polymorphisms. We used analytical methods to identify MDD-related genes and pathways using all of these sources of information. In analyses of association between MDD and expression levels of 13 857 single autosomal genes, accounting for multiple technical, physiological and environmental covariates, a significant excess of low P-values was observed, but there was no significant single-gene association after genome-wide correction. Pathway-based analyses of expression data detected significant association of MDD with increased expression of genes in the interferon α/β signaling pathway. This finding could not be explained by potentially confounding diseases and medications (including antidepressants) or by computationally estimated proportions of white blood cell types. Although cause-effect relationships cannot be determined from these data, the results support the hypothesis that altered immune signaling has a role in the pathogenesis, manifestation, and/or the persistence and progression of MDD. Molecular Psychiatry advance online publication, 3 December 2013; doi:10.1038/mp.2013.161.

    View details for DOI 10.1038/mp.2013.161

    View details for PubMedID 24296977

  • High-Resolution Transcriptome Analysis with Long-Read RNA Sequencing. PloS one Cho, H., Davis, J., Li, X., Smith, K. S., Battle, A., Montgomery, S. B. 2014; 9 (9)

    Abstract

    RNA sequencing (RNA-seq) enables characterization and quantification of individual transcriptomes as well as detection of patterns of allelic expression and alternative splicing. Current RNA-seq protocols depend on high-throughput short-read sequencing of cDNA. However, as ongoing advances are rapidly yielding increasing read lengths, a technical hurdle remains in identifying the degree to which differences in read length influence various transcriptome analyses. In this study, we generated two paired-end RNA-seq datasets of differing read lengths (2×75 bp and 2×262 bp) for lymphoblastoid cell line GM12878 and compared the effect of read length on transcriptome analyses, including read-mapping performance, gene and transcript quantification, and detection of allele-specific expression (ASE) and allele-specific alternative splicing (ASAS) patterns. Our results indicate that, while the current long-read protocol is considerably more expensive than short-read sequencing, there are important benefits that can only be achieved with longer read length, including lower mapping bias and reduced ambiguity in assigning reads to genomic elements, such as mRNA transcript. We show that these benefits ultimately lead to improved detection of cis-acting regulatory and splicing variation effects within individuals.

    View details for DOI 10.1371/journal.pone.0108095

    View details for PubMedID 25251678

    View details for Web of Science ID 000342492700076

    View details for PubMedCentralID PMC4176000

  • Transcriptome sequencing of a large human family identifies the impact of rare noncoding variants. American journal of human genetics Li, X., Battle, A., Karczewski, K. J., Zappala, Z., Knowles, D. A., Smith, K. S., Kukurba, K. R., Wu, E., Simon, N., Montgomery, S. B. 2014; 95 (3): 245-256

    Abstract

    Recent and rapid human population growth has led to an excess of rare genetic variants that are expected to contribute to an individual's genetic burden of disease risk. To date, much of the focus has been on rare protein-coding variants, for which potential impact can be estimated from the genetic code, but determining the impact of rare noncoding variants has been more challenging. To improve our understanding of such variants, we combined high-quality genome sequencing and RNA sequencing data from a 17-individual, three-generation family to contrast expression quantitative trait loci (eQTLs) and splicing quantitative trait loci (sQTLs) within this family to eQTLs and sQTLs within a population sample. Using this design, we found that eQTLs and sQTLs with large effects in the family were enriched with rare regulatory and splicing variants (minor allele frequency < 0.01). They were also more likely to influence essential genes and genes involved in complex disease. In addition, we tested the capacity of diverse noncoding annotation to predict the impact of rare noncoding variants. We found that distance to the transcription start site, evolutionary constraint, and epigenetic annotation were considerably more informative for predicting the impact of rare variants than for predicting the impact of common variants. These results highlight that rare noncoding variants are important contributors to individual gene-expression profiles and further demonstrate a significant capability for genomic annotation to predict the impact of rare noncoding variants.

    View details for DOI 10.1016/j.ajhg.2014.08.004

    View details for PubMedID 25192044

    View details for PubMedCentralID PMC4157143

  • Transcriptome sequencing from diverse human populations reveals differentiated regulatory architecture. PLoS genetics Martin, A. R., Costa, H. A., Lappalainen, T., Henn, B. M., Kidd, J. M., Yee, M., Grubert, F., Cann, H. M., Snyder, M., Montgomery, S. B., Bustamante, C. D. 2014; 10 (8)

    Abstract

    Large-scale sequencing efforts have documented extensive genetic variation within the human genome. However, our understanding of the origins, global distribution, and functional consequences of this variation is far from complete. While regulatory variation influencing gene expression has been studied within a handful of populations, the breadth of transcriptome differences across diverse human populations has not been systematically analyzed. To better understand the spectrum of gene expression variation, alternative splicing, and the population genetics of regulatory variation in humans, we have sequenced the genomes, exomes, and transcriptomes of EBV transformed lymphoblastoid cell lines derived from 45 individuals in the Human Genome Diversity Panel (HGDP). The populations sampled span the geographic breadth of human migration history and include Namibian San, Mbuti Pygmies of the Democratic Republic of Congo, Algerian Mozabites, Pathan of Pakistan, Cambodians of East Asia, Yakut of Siberia, and Mayans of Mexico. We discover that approximately 25.0% of the variation in gene expression found amongst individuals can be attributed to population differences. However, we find few genes that are systematically differentially expressed among populations. Of this population-specific variation, 75.5% is due to expression rather than splicing variability, and we find few genes with strong evidence for differential splicing across populations. Allelic expression analyses indicate that previously mapped common regulatory variants identified in eight populations from the International Haplotype Map Phase 3 project have similar effects in our seven sampled HGDP populations, suggesting that the cellular effects of common variants are shared across diverse populations. Together, these results provide a resource for studies analyzing functional differences across populations by estimating the degree of shared gene expression, alternative splicing, and regulatory genetics across populations from the broadest points of human migration history yet sampled.

    View details for DOI 10.1371/journal.pgen.1004549

    View details for PubMedID 25121757

  • Cis and trans effects of human genomic variants on gene expression. PLoS genetics Bryois, J., Buil, A., Evans, D. M., Kemp, J. P., Montgomery, S. B., Conrad, D. F., Ho, K. M., Ring, S., Hurles, M., Deloukas, P., Davey Smith, G., Dermitzakis, E. T. 2014; 10 (7): e1004461

    Abstract

    Gene expression is a heritable cellular phenotype that defines the function of a cell and can lead to diseases in case of misregulation. In order to detect genetic variations affecting gene expression, we performed association analysis of single nucleotide polymorphisms (SNPs) and copy number variants (CNVs) with gene expression measured in 869 lymphoblastoid cell lines of the Avon Longitudinal Study of Parents and Children (ALSPAC) cohort in cis and in trans. We discovered that 3,534 genes (false discovery rate (FDR) = 5%) are affected by an expression quantitative trait locus (eQTL) in cis and 48 genes are affected in trans. We observed that CNVs are more likely to be eQTLs than SNPs. In addition, we found that variants associated to complex traits and diseases are enriched for trans-eQTLs and that trans-eQTLs are enriched for cis-eQTLs. As a variant affecting both a gene in cis and in trans suggests that the cis gene is functionally linked to the trans gene expression, we looked specifically for trans effects of cis-eQTLs. We discovered that 26 cis-eQTLs are associated to 92 genes in trans with the cis-eQTLs of the transcription factors BATF3 and HMX2 affecting the most genes. We then explored if the variation of the level of expression of the cis genes were causally affecting the level of expression of the trans genes and discovered several causal relationships between variation in the level of expression of the cis gene and variation of the level of expression of the trans gene. This analysis shows that a large sample size allows the discovery of secondary effects of human variations on gene expression that can be used to construct short directed gene regulatory networks.

    View details for DOI 10.1371/journal.pgen.1004461

    View details for PubMedID 25010687

    View details for PubMedCentralID PMC4091791

  • Determining causality and consequence of expression quantitative trait loci. Human genetics Battle, A., Montgomery, S. B. 2014; 133 (6): 727-735

    Abstract

    Expression quantitative trait loci (eQTLs) are currently the most abundant and systematically-surveyed class of functional consequence for genetic variation. Recent genetic studies of gene expression have identified thousands of eQTLs in diverse tissue types for the majority of human genes. Application of this large eQTL catalog provides an important resource for understanding the molecular basis of common genetic diseases. However, only now has both the availability of individuals with full genomes and corresponding advances in functional genomics provided the opportunity to dissect eQTLs to identify causal regulatory variants. Resolving the properties of such causal regulatory variants is improving understanding of the molecular mechanisms that influence traits and guiding the development of new genome-scale approaches to variant interpretation. In this review, we provide an overview of current computational and experimental methods for identifying causal regulatory variants and predicting their phenotypic consequences.

    View details for DOI 10.1007/s00439-014-1446-0

    View details for Web of Science ID 000336317000005

    View details for PubMedID 24770875

  • Allelic Expression of Deleterious Protein-Coding Variants across Human Tissues. PLoS genetics Kukurba, K. R., Zhang, R., Li, X., Smith, K. S., Knowles, D. A., How Tan, M., Piskol, R., Lek, M., Snyder, M., MacArthur, D. G., Li, J. B., Montgomery, S. B. 2014; 10 (5)

    Abstract

    Personal exome and genome sequencing provides access to loss-of-function and rare deleterious alleles whose interpretation is expected to provide insight into individual disease burden. However, for each allele, accurate interpretation of its effect will depend on both its penetrance and the trait's expressivity. In this regard, an important factor that can modify the effect of a pathogenic coding allele is its level of expression; a factor which itself characteristically changes across tissues. To better inform the degree to which pathogenic alleles can be modified by expression level across multiple tissues, we have conducted exome, RNA and deep, targeted allele-specific expression (ASE) sequencing in ten tissues obtained from a single individual. By combining such data, we report the impact of rare and common loss-of-function variants on allelic expression exposing stronger allelic bias for rare stop-gain variants and informing the extent to which rare deleterious coding alleles are consistently expressed across tissues. This study demonstrates the potential importance of transcriptome data to the interpretation of pathogenic protein-coding variants.

    View details for DOI 10.1371/journal.pgen.1004304

    View details for PubMedID 24786518

  • Dissecting the causal genetic mechanisms of coronary heart disease. Current atherosclerosis reports Miller, C. L., Assimes, T. L., Montgomery, S. B., Quertermous, T. 2014; 16 (5): 406

    Abstract

    Large-scale genome-wide association studies (GWAS) have identified 46 loci that are associated with coronary heart disease (CHD). Additionally, 104 independent candidate variants (false discovery rate of 5 %) have been identified (Schunkert H, Konig IR, Kathiresan S, Reilly MP, Assimes TL, Holm H et al. Nat Genet 43:333-8, 2011; Deloukas P, Kanoni S, Willenborg C, Farrall M, Assimes TL, Thompson JR et al. Nat Genet 45:25-33, 2012; C4D Genetics Consortium. Nat Genet 43:339-44, 2011). The majority of the causal genes in these loci function independently of conventional risk factors. It is postulated that a number of the CHD-associated genes regulate basic processes in the vascular cells involved in atherosclerosis, and that study of the signaling pathways that are modulated in this cell type by causal regulatory variation will provide critical new insights for targeting the initiation and progression of disease. In this review, we will discuss the types of experimental approaches and data that are critical to understanding the molecular processes that underlie the disease risk at 9p21.3, TCF21, SORT1, and other CHD-associated loci.

    View details for DOI 10.1007/s11883-014-0406-4

    View details for PubMedID 24623178

  • SplicePlot: a utility for visualizing splicing quantitative trait loci. Bioinformatics Wu, E., Nance, T., Montgomery, S. B. 2014; 30 (7): 1025-1026

    Abstract

    RNA-Sequencing has provided unprecedented resolution of alternative splicing and splicing-quantitative trait loci (sQTL). However, there are few tools available for visualizing the genotype-dependent effects of splicing at a population level. SplicePlot is a simple command line utility that produces intuitive visualization of sQTLs and their effects. SplicePlot takes mapped RNA-seq reads in BAM format and genotype data in VCF format as input and outputs publication quality sashimi plots, hive plots, and structure plots enabling better investigation and understanding of the role of genetics on alternative splicing and transcript structure. Availability and Implementation: Source code and detailed documentation are available at http://montgomerylab.stanford.edu/spliceplot/index.html under Resources and at Github. SplicePlot is implemented in Python and is supported on Linux and Mac OS. A VirtualBox virtual machine running Ubuntu with SplicePlot already installed is also available. Contact: wu.eric.g@gmail.com or smontgom@stanford.edu.

    View details for DOI 10.1093/bioinformatics/btt733

    View details for PubMedID 24363378

  • Path-scan: a reporting tool for identifying clinically actionable variants. Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing Daneshjou, R., Zappala, Z., Kukurba, K., Boyle, S. M., Ormond, K. E., Klein, T. E., Snyder, M., Bustamante, C. D., Altman, R. B., Montgomery, S. B. 2014; 19: 229-240

    Abstract

    The American College of Medical Genetics and Genomics (ACMG) recently released guidelines regarding the reporting of incidental findings in sequencing data. Given the availability of Direct to Consumer (DTC) genetic testing and the falling cost of whole exome and genome sequencing, individuals will increasingly have the opportunity to analyze their own genomic data. We have developed a web-based tool, PATH-SCAN, which annotates individual genomes and exomes for ClinVar designated pathogenic variants found within the genes from the ACMG guidelines. Because mutations in these genes predispose individuals to conditions with actionable outcomes, our tool will allow individuals or researchers to identify potential risk variants in order to consult physicians or genetic counselors for further evaluation. Moreover, our tool allows individuals to anonymously submit their pathogenic burden, so that we can crowd source the collection of quantitative information regarding the frequency of these variants. We tested our tool on 1092 publicly available genomes from the 1000 Genomes project, 163 genomes from the Personal Genome Project, and 15 genomes from a clinical genome sequencing research project. Excluding the most commonly seen variant in 1000 Genomes, about 20% of all genomes analyzed had a ClinVar designated pathogenic variant that required further evaluation.

    View details for PubMedID 24297550

  • Transcriptome analysis reveals differential splicing events in IPF lung tissue. PloS one Nance, T., Smith, K. S., Anaya, V., Richardson, R., Ho, L., Pala, M., Mostafavi, S., Battle, A., Feghali-Bostwick, C., Rosen, G., Montgomery, S. B. 2014; 9 (5)

    Abstract

    Idiopathic pulmonary fibrosis (IPF) is a complex disease in which a multitude of proteins and networks are disrupted. Interrogation of the transcriptome through RNA sequencing (RNA-Seq) enables the determination of genes whose differential expression is most significant in IPF, as well as the detection of alternative splicing events which are not easily observed with traditional microarray experiments. We sequenced messenger RNA from 8 IPF lung samples and 7 healthy controls on an Illumina HiSeq 2000, and found evidence for substantial differential gene expression and differential splicing. 873 genes were differentially expressed in IPF (FDR<5%), and 440 unique genes had significant differential splicing events in at least one exonic region (FDR<5%). We used qPCR to validate the differential exon usage in the second and third most significant exonic regions, in the genes COL6A3 (RNA-Seq adjusted pval = 7.18e-10) and POSTN (RNA-Seq adjusted pval = 2.06e-09), which encode the extracellular matrix proteins collagen alpha-3(VI) and periostin. The increased gene-level expression of periostin has been associated with IPF and its clinical progression, but its differential splicing has not been studied in the context of this disease. Our results suggest that alternative splicing of these and other genes may be involved in the pathogenesis of IPF. We have developed an interactive web application which allows users to explore the results of our RNA-Seq experiment, as well as those of two previously published microarray experiments, and we hope that this will serve as a resource for future investigations of gene regulation in IPF.

    View details for DOI 10.1371/journal.pone.0097550

    View details for PubMedID 24805851

  • Transcriptome Analysis Reveals Differential Splicing Events in IPF Lung Tissue. PloS one Nance, T., Smith, K. S., Anaya, V., Richardson, R., Ho, L., Pala, M., Mostafavi, S., Battle, A., Feghali-Bostwick, C., Rosen, G., Montgomery, S. B. 2014; 9 (3): e92111

    Abstract

    Idiopathic pulmonary fibrosis (IPF) is a complex disease in which a multitude of proteins and networks are disrupted. Interrogation of the transcriptome through RNA sequencing (RNA-Seq) enables the determination of genes whose differential expression is most significant in IPF, as well as the detection of alternative splicing events which are not easily observed with traditional microarray experiments. We sequenced messenger RNA from 8 IPF lung samples and 7 healthy controls on an Illumina HiSeq 2000, and found evidence for substantial differential gene expression and differential splicing. 873 genes were differentially expressed in IPF (FDR<5%), and 440 unique genes had significant differential splicing events in at least one exonic region (FDR<5%). We used qPCR to validate the differential exon usage in the second and third most significant exonic regions, in the genes COL6A3 (RNA-Seq adjusted pval = 7.18e-10) and POSTN (RNA-Seq adjusted pval = 2.06e-09), which encode the extracellular matrix proteins collagen alpha-3(VI) and periostin. The increased gene-level expression of periostin has been associated with IPF and its clinical progression, but its differential splicing has not been studied in the context of this disease. Our results suggest that alternative splicing of these and other genes may be involved in the pathogenesis of IPF. We have developed an interactive web application which allows users to explore the results of our RNA-Seq experiment, as well as those of two previously published microarray experiments, and we hope that this will serve as a resource for future investigations of gene regulation in IPF.

    View details for DOI 10.1371/journal.pone.0092111

    View details for PubMedID 24647608

    View details for PubMedCentralID PMC3960165

  • Quantifying RNA allelic ratios by microfluidic multiplex PCR and sequencing. Nature methods Zhang, R., Li, X., Ramaswami, G., Smith, K. S., Turecki, G., Montgomery, S. B., Li, J. B. 2014; 11 (1): 51-54

    Abstract

    We developed a targeted RNA sequencing method that couples microfluidics-based multiplex PCR and deep sequencing (mmPCR-seq) to uniformly and simultaneously amplify up to 960 loci in 48 samples independently of their gene expression levels and to accurately and cost-effectively measure allelic ratios even for low-quantity or low-quality RNA samples. We applied mmPCR-seq to RNA editing and allele-specific expression studies. mmPCR-seq complements RNA-seq for studying allelic variations in the transcriptome.

    View details for DOI 10.1038/nmeth.2736

    View details for PubMedID 24270603

    View details for PubMedCentralID PMC3877737

  • Characterizing the genetic basis of transcriptome diversity through RNA-sequencing of 922 individuals. Genome research Battle, A., Mostafavi, S., Zhu, X., Potash, J. B., Weissman, M. M., McCormick, C., Haudenschild, C. D., Beckman, K. B., Shi, J., Mei, R., Urban, A. E., Montgomery, S. B., Levinson, D. F., Koller, D. 2014; 24 (1): 14-24

    Abstract

    Understanding the consequences of regulatory variation in the human genome remains a major challenge, with important implications for understanding gene regulation and interpreting the many disease-risk variants that fall outside of protein-coding regions. Here, we provide a direct window into the regulatory consequences of genetic variation by sequencing RNA from 922 genotyped individuals. We present a comprehensive description of the distribution of regulatory variation, by the specific expression phenotypes altered, the properties of affected genes, and the genomic characteristics of regulatory variants. We detect variants influencing expression of over ten thousand genes, and through the enhanced resolution offered by RNA-sequencing, for the first time we identify thousands of variants associated with specific phenotypes including splicing and allelic expression. Evaluating the effects of both long-range intra-chromosomal and trans (cross-chromosomal) regulation, we observe modularity in the regulatory network, with three-dimensional chromosomal configuration playing a particular role in regulatory modules within each chromosome. We also observe a significant depletion of regulatory variants affecting central and critical genes, along with a trend of reduced effect sizes as variant frequency increases, providing evidence that purifying selection and buffering have limited the deleterious impact of regulatory variation on the cell. Further, generalizing beyond observed variants, we have analyzed the genomic properties of variants associated with expression and splicing and developed a Bayesian model to predict regulatory consequences of genetic variants, applicable to the interpretation of individual genomes and disease studies. Together, these results represent a critical step toward characterizing the complete landscape of human regulatory variation.

    View details for DOI 10.1101/gr.155192.113

    View details for PubMedID 24092820

  • Performance of genomic medicine. Genome biology Karczewski, K. J., Montgomery, S. B. 2013; 14 (12): 316

    Abstract

    A report on the Cold Spring Harbor Laboratory meeting on Precision Medicine: Personal Genomes and Pharmacogenomics, held in Cold Spring Harbor, New York, USA, November 13-16, 2013.

    View details for DOI 10.1186/gb4146

    View details for PubMedID 24359965

  • Transcriptome and genome sequencing uncovers functional variation in humans. Nature Lappalainen, T., Sammeth, M., Friedländer, M. R., 't Hoen, P. A., Monlong, J., Rivas, M. A., Gonzàlez-Porta, M., Kurbatova, N., Griebel, T., Ferreira, P. G., Barann, M., Wieland, T., Greger, L., van Iterson, M., Almlöf, J., Ribeca, P., Pulyakhina, I., Esser, D., Giger, T., Tikhonov, A., Sultan, M., Bertier, G., MacArthur, D. G., Lek, M., Lizano, E., Buermans, H. P., Padioleau, I., Schwarzmayr, T., Karlberg, O., Ongen, H., Kilpinen, H., Beltran, S., Gut, M., Kahlem, K., Amstislavskiy, V., Stegle, O., Pirinen, M., Montgomery, S. B., Donnelly, P., McCarthy, M. I., Flicek, P., Strom, T. M., Lehrach, H., Schreiber, S., Sudbrak, R., Carracedo, A., Antonarakis, S. E., Häsler, R., Syvänen, A., van Ommen, G., Brazma, A., Meitinger, T., Rosenstiel, P., Guigó, R., Gut, I. G., Estivill, X., Dermitzakis, E. T. 2013; 501 (7468): 506-511

    Abstract

    Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of messenger RNA and microRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project--the first uniformly processed high-throughput RNA-sequencing data from multiple human populations with high-quality genome sequences. We discover extremely widespread genetic variation affecting the regulation of most genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on the cellular mechanisms of regulatory and loss-of-function variation, and allows us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome.

    View details for DOI 10.1038/nature12531

    View details for PubMedID 24037378

  • Systematic functional regulatory assessment of disease-associated variants. Proceedings of the National Academy of Sciences of the United States of America Karczewski, K. J., Dudley, J. T., Kukurba, K. R., Chen, R., Butte, A. J., Montgomery, S. B., Snyder, M. 2013; 110 (23): 9607-9612

    Abstract

    Genome-wide association studies have discovered many genetic loci associated with disease traits, but the functional molecular basis of these associations is often unresolved. Genome-wide regulatory and gene expression profiles measured across individuals and diseases reflect downstream effects of genetic variation and may allow for functional assessment of disease-associated loci. Here, we present a unique approach for systematic integration of genetic disease associations, transcription factor binding among individuals, and gene expression data to assess the functional consequences of variants associated with hundreds of human diseases. In an analysis of genome-wide binding profiles of NFκB, we find that disease-associated SNPs are enriched in NFκB binding regions overall, and specifically for inflammatory-mediated diseases, such as asthma, rheumatoid arthritis, and coronary artery disease. Using genome-wide variation in transcription factor-binding data, we find that NFκB binding is often correlated with disease-associated variants in a genotype-specific and allele-specific manner. Furthermore, we show that this binding variation is often related to expression of nearby genes, which are also found to have altered expression in independent profiling of the variant-associated disease condition. Thus, using this integrative approach, we provide a unique means to assign putative function to many disease-associated SNPs.

    View details for DOI 10.1073/pnas.1219099110

    View details for PubMedID 23690573

  • Desktop transcriptome sequencing from archival tissue to identify clinically relevant translocations. American journal of surgical pathology Sweeney, R. T., Zhang, B., Zhu, S. X., Varma, S., Smith, K. S., Montgomery, S. B., van de Rijn, M., Zehnder, J., West, R. B. 2013; 37 (6): 796-803

    Abstract

    Somatic mutations, often translocations or single nucleotide variations, are pathognomonic for certain types of cancers and are increasingly of clinical importance for diagnosis and prediction of response to therapy. Conventional clinical assays only evaluate 1 mutation at a time, and targeted tests are often constrained to identify only the most common mutations. Genome-wide or transcriptome-wide high-throughput sequencing (HTS) of clinical samples offers an opportunity to evaluate for all clinically significant mutations with a single test. Recently a "desktop version" of HTS has become available, but most of the experience to date is based on data obtained from high-quality DNA from frozen specimens. In this study, we demonstrate, as a proof of principle, that translocations in sarcomas can be diagnosed from formalin-fixed paraffin-embedded (FFPE) tissue with desktop HTS. Using the first generation MiSeq platform, full transcriptome sequencing was performed on FFPE material from archival blocks of 3 synovial sarcomas, 3 myxoid liposarcomas, 2 Ewing sarcomas, and 1 clear cell sarcoma. Mapping the reads to the "sarcomatome" (all known 83 genes involved in translocations and mutations in sarcoma) and using a novel algorithm for ranking fusion candidates, the pathognomonic fusions and the exact breakpoints were identified in all cases of synovial sarcoma, myxoid liposarcoma, and clear cell sarcoma. The Ewing sarcoma fusion gene was detectable in FFPE material only with a sequencing platform that generates greater sequencing depth. The results show that a single transcriptome HTS assay, from FFPE, has the potential to replace conventional molecular diagnostic techniques for the evaluation of clinically relevant mutations in cancer.

    View details for DOI 10.1097/PAS.0b013e31827ad9b2

    View details for PubMedID 23598961

  • The origin, evolution, and functional impact of short insertion-deletion variants identified in 179 human genomes. Genome research Montgomery, S. B., Goode, D. L., Kvikstad, E., Albers, C. A., Zhang, Z. D., Mu, X. J., Ananda, G., Howie, B., Karczewski, K. J., Smith, K. S., Anaya, V., Richardson, R., Davis, J., MacArthur, D. G., Sidow, A., Duret, L., Gerstein, M., Makova, K. D., Marchini, J., McVean, G., Lunter, G. 2013; 23 (5): 749-761

    Abstract

    Short insertions and deletions (indels) are the second most abundant form of human genetic variation, but our understanding of their origins and functional effects lags behind that of other types of variants. Using population-scale sequencing, we have identified a high-quality set of 1.6 million indels from 179 individuals representing three diverse human populations. We show that rates of indel mutagenesis are highly heterogeneous, with 43%-48% of indels occurring in 4.03% of the genome, whereas in the remaining 96% their prevalence is 16 times lower than SNPs. Polymerase slippage can explain upwards of three-fourths of all indels, with the remainder being mostly simple deletions in complex sequence. However, insertions do occur and are significantly associated with pseudo-palindromic sequence features compatible with the fork stalling and template switching (FoSTeS) mechanism more commonly associated with large structural variations. We introduce a quantitative model of polymerase slippage, which enables us to identify indel-hypermutagenic protein-coding genes, some of which are associated with recurrent mutations leading to disease. Accounting for mutational rate heterogeneity due to sequence context, we find that indels across functional sequence are generally subject to stronger purifying selection than SNPs. We find that indel length modulates selection strength, and that indels affecting multiple functionally constrained nucleotides undergo stronger purifying selection. We further find that indels are enriched in associations with gene expression and find evidence for a contribution of nonsense-mediated decay. Finally, we show that indels can be integrated in existing genome-wide association studies (GWAS); although we do not find direct evidence that potentially causal protein-coding indels are enriched with associations to known disease-associated SNPs, our findings suggest that the causal variant underlying some of these associations may be indels.

    View details for DOI 10.1101/gr.148718.112

    View details for PubMedID 23478400

    View details for PubMedCentralID PMC3638132

  • Examination of the relationship between variation at 17q21 and childhood wheeze phenotypes JOURNAL OF ALLERGY AND CLINICAL IMMUNOLOGY Granell, R., Henderson, A. J., Timpson, N., St Pourcain, B., Kemp, J. P., Ring, S. M., Ho, K., Montgomery, S. B., Dermitzakis, E. T., Evans, D. M., Sterne, J. A. 2013; 131 (3): 685-694

    Abstract

    Genome-wide association studies have identified associations of genetic variants at 17q21 near ORMDL3 with childhood asthma. We sought to determine whether associations in this region are specific to particular asthma phenotypes and specific to ORMDL3. We examined associations between 244 independent single nucleotide polymorphisms (SNPs) plus 13 previously identified asthma-related SNPs in the region between 34 and 36 Mb on chromosome 17 and early wheezing phenotypes, doctor-diagnosed asthma and atopy at 7½ years, and bronchial hyperresponsiveness and lung function at 8½ years in 7045 children from the Avon Longitudinal Study of Parents and Children birth cohort study. With this, cis expression quantitative trait loci signals for the same SNPs were assessed in 875 samples across genes in the same region. The strongest evidence for phenotypic association was seen for persistent wheezing (rs8076131 near ORMDL3: relative risk ratio [RRR], 1.60 [95% CI, 1.40-1.84], P = 1.4 × 10^-11; rs2305480 near GSDML: RRR, 1.60 [95% CI, 1.39-1.83], P = 1.5 × 10^-11; and rs9303277 near IKZF3: RRR, 1.57 [95% CI, 1.37-1.79], P = 4.4 × 10^-11). Similar but less precisely estimated effects were seen for intermediate-onset wheeze, but there was little evidence of associations with other wheezing phenotypes. There was some evidence of associations with bronchial hyperresponsiveness. SNPs across the whole region show strong evidence of association with differential levels of expression at GSDML, IKZF3, and MED24, as well as ORMDL3. Associations of SNPs in the 17q21 locus are specific to asthma and specific wheezing phenotypes and are not explained by associations with intermediate phenotypes, such as atopy or lung function.

    View details for DOI 10.1016/j.jaci.2012.09.021

    View details for Web of Science ID 000315587800008

    View details for PubMedID 23154084

  • Integrating GWAS and Expression Data for Functional Characterization of Disease-Associated SNPs: An Application to Follicular Lymphoma AMERICAN JOURNAL OF HUMAN GENETICS Conde, L., Bracci, P. M., Richardson, R., Montgomery, S. B., Skibola, C. F. 2013; 92 (1): 126-130

    Abstract

    Development of post-GWAS (genome-wide association study) methods is greatly needed for characterizing the function of trait-associated SNPs. Strategies integrating various biological data sets with GWAS results will provide insights into the mechanistic role of associated SNPs. Here, we present a method that integrates RNA sequencing (RNA-seq) and allele-specific expression data with GWAS data to further characterize SNPs associated with follicular lymphoma (FL). We investigated the influence on gene expression of three established FL-associated loci (rs10484561, rs2647012, and rs6457327) by measuring their correlation with human-leukocyte-antigen (HLA) expression levels obtained from publicly available RNA-seq expression data sets from lymphoblastoid cell lines. Our results suggest that SNPs linked to the protective variant rs2647012 exert their effect by a cis-regulatory mechanism involving modulation of HLA-DQB1 expression. In contrast, no effect on HLA expression was observed for the colocalized risk variant rs10484561. The application of integrative methods, such as those presented here, to other post-GWAS investigations will help identify causal disease variants and enhance our understanding of biological disease mechanisms.

    View details for DOI 10.1016/j.ajhg.2012.11.009

    View details for Web of Science ID 000313759000013

    View details for PubMedID 23246294

    View details for PubMedCentralID PMC3542469

  • Passive and active DNA methylation and the interplay with genetic variation in gene regulation. eLife Gutierrez-Arcelus, M., Lappalainen, T., Montgomery, S. B., Buil, A., Ongen, H., Yurovsky, A., Bryois, J., Giger, T., Romano, L., Planchon, A., Falconnet, E., Bielser, D., Gagnebin, M., Padioleau, I., Borel, C., Letourneau, A., Makrythanasis, P., Guipponi, M., Gehrig, C., Antonarakis, S. E., Dermitzakis, E. T. 2013; 2

    Abstract

    DNA methylation is an essential epigenetic mark whose role in gene regulation and its dependency on genomic sequence and environment are not fully understood. In this study we provide novel insights into the mechanistic relationships between genetic variation, DNA methylation and transcriptome sequencing data in three different cell-types of the GenCord human population cohort. We find that the association between DNA methylation and gene expression variation among individuals is likely due to different mechanisms from those establishing methylation-expression patterns during differentiation. Furthermore, cell-type differential DNA methylation may delineate a platform in which local inter-individual changes may respond to or act in gene regulation. We show that unlike genetic regulatory variation, DNA methylation alone does not significantly drive allele specific expression. Finally, inferred mechanistic relationships using genetic variation as well as correlations with TF abundance reveal both a passive and active role of DNA methylation to regulatory interactions influencing gene expression.

    View details for DOI 10.7554/eLife.00523

    View details for PubMedID 23755361

  • Cancer Transcriptome Sequencing and Analysis Cancer Genomics: From Bench to Personalized Medicine Morin, R. D., Montgomery, S. B. Elsevier. 2013; 1: 31–49
  • Normalizing RNA-sequencing data by modeling hidden covariates with prior knowledge. PloS one Mostafavi, S., Battle, A., Zhu, X., Urban, A. E., Levinson, D., Montgomery, S. B., Koller, D. 2013; 8 (7)

    Abstract

    Transcriptomic assays that measure expression levels are widely used to study the manifestation of environmental or genetic variations in cellular processes. RNA-sequencing in particular has the potential to considerably improve such understanding because of its capacity to assay the entire transcriptome, including novel transcriptional events. However, as with earlier expression assays, analysis of RNA-sequencing data requires carefully accounting for factors that may introduce systematic, confounding variability in the expression measurements, resulting in spurious correlations. Here, we consider the problem of modeling and removing the effects of known and hidden confounding factors from RNA-sequencing data. We describe a unified residual framework that encapsulates existing approaches, and using this framework, present a novel method, HCP (Hidden Covariates with Prior). HCP uses a more informed assumption about the confounding factors, and performs as well or better than existing approaches while having a much lower computational cost. Our experiments demonstrate that accounting for known and hidden factors with appropriate models improves the quality of RNA-sequencing data in two very different tasks: detecting genetic variations that are associated with nearby expression variations (cis-eQTLs), and constructing accurate co-expression networks.

    View details for DOI 10.1371/journal.pone.0068141

    View details for PubMedID 23874524

  • Detection and impact of rare regulatory variants in human disease. Frontiers in genetics Li, X., Montgomery, S. B. 2013; 4: 67-?

    Abstract

    Advances in genome sequencing are providing unprecedented resolution of rare and private variants. However, methods which assess the effect of these variants have relied predominantly on information within coding sequences. Assessing their impact in non-coding sequences remains a significant contemporary challenge. In this review, we highlight the role of regulatory variation as causative agents and modifiers of monogenic disorders. We further discuss how advances in functional genomics are now providing new opportunity to assess the impact of rare non-coding variants and their role in disease.

    View details for DOI 10.3389/fgene.2013.00067

    View details for PubMedID 23755067

    View details for PubMedCentralID PMC3668132

  • Sex-biased genetic effects on gene regulation in humans GENOME RESEARCH Dimas, A. S., Nica, A. C., Montgomery, S. B., Stranger, B. E., Raj, T., Buil, A., Giger, T., Lappalainen, T., Gutierrez-Arcelus, M., McCarthy, M. I., Dermitzakis, E. T. 2012; 22 (12): 2368-2375

    Abstract

    Human regulatory variation, reported as expression quantitative trait loci (eQTLs), contributes to differences between populations and tissues. The contribution of eQTLs to differences between sexes, however, has not been investigated to date. Here we explore regulatory variation in females and males and demonstrate that 12%-15% of autosomal eQTLs function in a sex-biased manner. We show that genes possessing sex-biased eQTLs are expressed at similar levels across the sexes and highlight cases of genes controlling sexually dimorphic and shared traits that are under the control of distinct regulatory elements in females and males. This study illustrates that sex provides important context that can modify the effects of functional genetic variants.

    View details for DOI 10.1101/gr.134981.111

    View details for Web of Science ID 000311895500005

    View details for PubMedID 22960374

  • Mapping cis- and trans-regulatory effects across multiple tissues in twins NATURE GENETICS Grundberg, E., Small, K. S., Hedman, A. K., Nica, A. C., Buil, A., Keildson, S., Bell, J. T., Yang, T., Meduri, E., Barrett, A., Nisbett, J., Sekowska, M., Wilk, A., Shin, S., Glass, D., Travers, M., Min, J. L., Ring, S., Ho, K., Thorleifsson, G., Kong, A., Thorsteindottir, U., Ainali, C., Dimas, A. S., Hassanali, N., Ingle, C., Knowles, D., Krestyaninova, M., Lowe, C. E., Di Meglio, P., Montgomery, S. B., Parts, L., Potter, S., Surdulescu, G., Tsaprouni, L., Tsoka, S., Bataille, V., Durbin, R., Nestle, F. O., O'Rahilly, S., Soranzo, N., Lindgren, C. M., Zondervan, K. T., Ahmadi, K. R., Schadt, E. E., Stefansson, K., Smith, G. D., McCarthy, M. I., Deloukas, P., Dermitzakis, E. T., Spector, T. D. 2012; 44 (10): 1084-?

    Abstract

    Sequence-based variation in gene expression is a key driver of disease risk. Common variants regulating expression in cis have been mapped in many expression quantitative trait locus (eQTL) studies, typically in single tissues from unrelated individuals. Here, we present a comprehensive analysis of gene expression across multiple tissues conducted in a large set of mono- and dizygotic twins that allows systematic dissection of genetic (cis and trans) and non-genetic effects on gene expression. Using identity-by-descent estimates, we show that at least 40% of the total heritable cis effect on expression cannot be accounted for by common cis variants, a finding that reveals the contribution of low-frequency and rare regulatory variants with respect to both transcriptional regulation and complex trait susceptibility. We show that a substantial proportion of gene expression heritability is trans to the structural gene, and we identify several replicating trans variants that act predominantly in a tissue-restricted manner and may regulate the transcription of many genes.

    View details for DOI 10.1038/ng.2394

    View details for Web of Science ID 000309550200006

    View details for PubMedID 22941192

  • Genotype-Based Test in Mapping Cis-Regulatory Variants from Allele-Specific Expression Data PLOS ONE Lefebvre, J. F., Vello, E., Ge, B., Montgomery, S. B., Dermitzakis, E. T., Pastinen, T., Labuda, D. 2012; 7 (6)

    Abstract

    Identifying and understanding the impact of gene regulatory variation is of considerable importance in evolutionary and medical genetics; such variants are thought to be responsible for human-specific adaptation and to have an important role in genetic disease. Regulatory variation in cis is readily detected in individuals showing uneven expression of a transcript from its two allelic copies, an observation referred to as allelic imbalance (AI). Identifying individuals exhibiting AI allows mapping of regulatory DNA regions and the potential to identify the underlying causal genetic variant(s). However, existing mapping methods require knowledge of the haplotypes, which makes them sensitive to phasing errors. In this study, we introduce a genotype-based mapping test that does not require haplotype-phase inference to locate regulatory regions. The test relies on partitioning genotypes of individuals exhibiting AI and those not expressing AI in a 2×3 contingency table. The performance of this test to detect linkage disequilibrium (LD) between a potential regulatory site and a SNP located in this region was examined by analyzing the simulated and the empirical AI datasets. In simulation experiments, the genotype-based test outperforms the haplotype-based tests with the increasing distance separating the regulatory region from its regulated transcript. The genotype-based test performed equally well with the experimental AI datasets, either from genome-wide cDNA hybridization arrays or from RNA sequencing. By avoiding the need for haplotype inference, the genotype-based test will suit AI analyses in population samples of unknown haplotype structure and will additionally facilitate the identification of cis-regulatory variants that are located far away from the regulated transcript.

    View details for DOI 10.1371/journal.pone.0038667

    View details for Web of Science ID 000305351700058

    View details for PubMedID 22685595

  • Patterns of Cis Regulatory Variation in Diverse Human Populations PLOS GENETICS Stranger, B. E., Montgomery, S. B., Dimas, A. S., Parts, L., Stegle, O., Ingle, C. E., Sekowska, M., Smith, G. D., Evans, D., Gutierrez-Arcelus, M., Price, A., Raj, T., Nisbett, J., Nica, A. C., Beazley, C., Durbin, R., Deloukas, P., Dermitzakis, E. T. 2012; 8 (4): 272-284

    Abstract

    The genetic basis of gene expression variation has long been studied with the aim to understand the landscape of regulatory variants, but also more recently to assist in the interpretation and elucidation of disease signals. To date, many studies have looked in specific tissues and population-based samples, but there has been limited assessment of the degree of inter-population variability in regulatory variation. We analyzed genome-wide gene expression in lymphoblastoid cell lines from a total of 726 individuals from 8 global populations from the HapMap3 project and correlated gene expression levels with HapMap3 SNPs located in cis to the genes. We describe the influence of ancestry on gene expression levels within and between these diverse human populations and uncover a non-negligible impact on global patterns of gene expression. We further dissect the specific functional pathways differentiated between populations. We also identify 5,691 expression quantitative trait loci (eQTLs) after controlling for both non-genetic factors and population admixture and observe that half of the cis-eQTLs are replicated in one or more of the populations. We highlight patterns of eQTL-sharing between populations, which are partially determined by population genetic relatedness, and discover significant sharing of eQTL effects between Asians, European-admixed, and African subpopulations. Specifically, we observe that both the effect size and the direction of effect for eQTLs are highly conserved across populations. We observe an increasing proximity of eQTLs toward the transcription start site as sharing of eQTLs among populations increases, highlighting that variants close to TSS have stronger effects and therefore are more likely to be detected across a wider panel of populations. Together these results offer a unique picture and resource of the degree of differentiation among human populations in functional regulatory variation and provide an estimate for the transferability of complex trait variants across populations.

    View details for DOI 10.1371/journal.pgen.1002639

    View details for Web of Science ID 000303441800020

    View details for PubMedID 22532805

  • A Systematic Survey of Loss-of-Function Variants in Human Protein-Coding Genes SCIENCE MacArthur, D. G., Balasubramanian, S., Frankish, A., Huang, N., Morris, J., Walter, K., Jostins, L., Habegger, L., Pickrell, J. K., Montgomery, S. B., Albers, C. A., Zhang, Z. D., Conrad, D. F., Lunter, G., Zheng, H., Ayub, Q., DePristo, M. A., Banks, E., Hu, M., Handsaker, R. E., Rosenfeld, J. A., Fromer, M., Jin, M., Mu, X. J., Khurana, E., Ye, K., Kay, M., Saunders, G. I., Suner, M., Hunt, T., Barnes, I. H., Amid, C., Carvalho-Silva, D. R., Bignell, A. H., Snow, C., Yngvadottir, B., Bumpstead, S., Cooper, D. N., Xue, Y., Romero, I. G., Wang, J., Li, Y., Gibbs, R. A., McCarroll, S. A., Dermitzakis, E. T., Pritchard, J. K., Barrett, J. C., Harrow, J., Hurles, M. E., Gerstein, M. B., Tyler-Smith, C. 2012; 335 (6070): 823-828

    Abstract

    Genome-sequencing studies indicate that all humans carry many genetic variants predicted to cause loss of function (LoF) of protein-coding genes, suggesting unexpected redundancy in the human genome. Here we apply stringent filters to 2951 putative LoF variants obtained from 185 human genomes to determine their true prevalence and properties. We estimate that human genomes typically contain ~100 genuine LoF variants with ~20 genes completely inactivated. We identify rare and likely deleterious LoF alleles, including 26 known and 21 predicted severe disease-causing variants, as well as common LoF variants in nonessential genes. We describe functional and evolutionary differences between LoF-tolerant and recessive disease genes and a method for using these differences to prioritize candidate genes found in clinical sequencing studies.

    View details for DOI 10.1126/science.1215040

    View details for Web of Science ID 000300356400036

    View details for PubMedID 22344438

    View details for PubMedCentralID PMC3299548

  • Meta-analysis of genome-wide association studies identifies three new risk loci for atopic dermatitis NATURE GENETICS Paternoster, L., Standl, M., Chen, C., Ramasamy, A., Bonnelykke, K., Duijts, L., Ferreira, M. A., Alves, A. C., Thyssen, J. P., Albrecht, E., Baurecht, H., Feenstra, B., Sleiman, P. M., Hysi, P., Warrington, N. M., Curjuric, I., Myhre, R., Curtin, J. A., Groen-Blokhuis, M. M., Kerkhof, M., Saaf, A., Franke, A., Ellinghaus, D., Foelster-Holst, R., Dermitzakis, E., Montgomery, S. B., Prokisch, H., Heim, K., Hartikainen, A., Pouta, A., Pekkanen, J., Blakemore, A. I., Buxton, J. L., Kaakinen, M., Duffy, D. L., Madden, P. A., Heath, A. C., Montgomery, G. W., Thompson, P. J., Matheson, M. C., Le Souef, P., St Pourcain, B., Smith, G. D., Henderson, J., Kemp, J. P., Timpson, N. J., Deloukas, P., Ring, S. M., Wichmann, H., Mueller-Nurasyid, M., Novak, N., Klopp, N., Rodriguez, E., McArdle, W., Linneberg, A., Menne, T., Nohr, E. A., Hofman, A., Uitterlinden, A. G., van Duijin, C. M., Rivadeneira, F., de Jongste, J. C., van der Valk, R. J., Wjst, M., Jogi, R., Geller, F., Boyd, H. A., Murray, J. C., Kim, C., Mentch, F., March, M., Mangino, M., Spector, T. D., Bataille, V., Pennell, C. E., Holt, P. G., Sly, P., Tiesler, C. M., Thiering, E., Illig, T., Imboden, M., Nystad, W., Simpson, A., Hottenga, J., Postma, D., Koppelman, G. H., Smit, H. A., Soderhall, C., Chawes, B., Kreiner-Moller, E., Bisgaard, H., Melen, E., Boomsma, D. I., Custovic, A., Jacobsson, B., Probst-Hensch, N. M., Palmer, L. J., Glass, D., Hakonarson, H., Melbye, M., Jarvis, D. L., Jaddoe, V. W., Gieger, C., Strachan, D. P., Martin, N. G., Jarvelin, M., Heinrich, J., Evans, D. M., Weidinger, S. 2012; 44 (2): 187-192

    Abstract

    Atopic dermatitis (AD) is a commonly occurring chronic skin disease with high heritability. Apart from filaggrin (FLG), the genes influencing atopic dermatitis are largely unknown. We conducted a genome-wide association meta-analysis of 5,606 affected individuals and 20,565 controls from 16 population-based cohorts and then examined the ten most strongly associated new susceptibility loci in an additional 5,419 affected individuals and 19,833 controls from 14 studies. Three SNPs reached genome-wide significance in the discovery and replication cohorts combined, including rs479844 upstream of OVOL1 (odds ratio (OR) = 0.88, P = 1.1 × 10^-13) and rs2164983 near ACTL9 (OR = 1.16, P = 7.1 × 10^-9), both of which are near genes that have been implicated in epidermal proliferation and differentiation, as well as rs2897442 in KIF3A within the cytokine cluster at 5q31.1 (OR = 1.11, P = 3.8 × 10^-8). We also replicated association with the FLG locus and with two recently identified association signals at 11q13.5 (rs7927894; P = 0.008) and 20q13.33 (rs6010620; P = 0.002). Our results underline the importance of both epidermal barrier function and immune dysregulation in atopic dermatitis pathogenesis.

    View details for DOI 10.1038/ng.1017

    View details for Web of Science ID 000299664400018

    View details for PubMedID 22197932

    View details for PubMedCentralID PMC3272375

  • DNA methylation profiles of human active and inactive X chromosomes GENOME RESEARCH Sharp, A. J., Stathaki, E., Migliavacca, E., Brahmachary, M., Montgomery, S. B., Dupre, Y., Antonarakis, S. E. 2011; 21 (10): 1592-1600

    Abstract

    X-chromosome inactivation (XCI) is a dosage compensation mechanism that silences the majority of genes on one X chromosome in each female cell. To characterize epigenetic changes that accompany this process, we measured DNA methylation levels in 45,X patients carrying a single active X chromosome (X(a)), and in normal females, who carry one X(a) and one inactive X (X(i)). Methylated DNA was immunoprecipitated and hybridized to high-density oligonucleotide arrays covering the X chromosome, generating epigenetic profiles of active and inactive X chromosomes. We observed that XCI is accompanied by changes in DNA methylation specifically at CpG islands (CGIs). While the majority of CGIs show increased methylation levels on the X(i), XCI actually results in significant reductions in methylation at 7% of CGIs. Both intra- and inter-genic CGIs undergo epigenetic modification, with the biggest increase in methylation occurring at the promoters of genes silenced by XCI. In contrast, genes escaping XCI generally have low levels of promoter methylation, while genes that show inter-individual variation in silencing show intermediate increases in methylation. Thus, promoter methylation and susceptibility to XCI are correlated. We also observed a global correlation between CGI methylation and the evolutionary age of X-chromosome strata, and that genes escaping XCI show increased methylation within gene bodies. We used our epigenetic map to predict 26 novel genes escaping XCI, and searched for parent-of-origin-specific methylation differences, but found no evidence to support imprinting on the human X chromosome. Our study provides a detailed analysis of the epigenetic profile of active and inactive X chromosomes.

    View details for DOI 10.1101/gr.112680.110

    View details for Web of Science ID 000295407800004

    View details for PubMedID 21862626

  • Epistatic Selection between Coding and Regulatory Variation in Human Evolution and Disease AMERICAN JOURNAL OF HUMAN GENETICS Lappalainen, T., Montgomery, S. B., Nica, A. C., Dermitzakis, E. T. 2011; 89 (3): 459-463

    Abstract

    Interaction (nonadditive effects) between genetic variants has been highlighted as an important mechanism underlying phenotypic variation, but the discovery of genetic interactions in humans has proved difficult. In this study, we show that the spectrum of variation in the human genome has been shaped by modifier effects of cis-regulatory variation on the functional impact of putatively deleterious protein-coding variants. We analyzed 1000 Genomes population-scale resequencing data from Europe (CEU [Utah residents with Northern and Western European ancestry from the CEPH collection]) and Africa (YRI [Yoruba in Ibadan, Nigeria]) together with gene expression data from arrays and RNA sequencing for the same samples. We observed an underrepresentation of derived putatively functional coding variation on the more highly expressed regulatory haplotype, which suggests stronger purifying selection against deleterious coding variants that have increased penetrance because of their regulatory background. Furthermore, the frequency spectrum and impact size distribution of common regulatory polymorphisms (eQTLs) appear to be shaped in order to minimize the selective disadvantage of having deleterious coding mutations on the more highly expressed haplotype. Interestingly, eQTLs explaining common disease GWAS signals showed an enrichment of putative epistatic effects, suggesting that some disease associations might arise from interactions increasing the penetrance of rare coding variants. In conclusion, our results indicate that regulatory and coding variants often modify the functional impact of each other. This specific type of genetic interaction is detectable from sequencing data in a genome-wide manner, and characterizing these joint effects might help us understand functional mechanisms behind genetic associations to human phenotypes, including both Mendelian and common disease.

    View details for DOI 10.1016/j.ajhg.2011.08.004

    View details for Web of Science ID 000294939800012

    View details for PubMedID 21907014

  • Rare and Common Regulatory Variation in Population-Scale Sequenced Human Genomes PLOS GENETICS Montgomery, S. B., Lappalainen, T., Gutierrez-Arcelus, M., Dermitzakis, E. T. 2011; 7 (7)

    Abstract

    Population-scale genome sequencing allows the characterization of functional effects of a broad spectrum of genetic variants underlying human phenotypic variation. Here, we investigate the influence of rare and common genetic variants on gene expression patterns, using variants identified from sequencing data from the 1000 genomes project in an African and European population sample and gene expression data from lymphoblastoid cell lines. We detect comparable numbers of expression quantitative trait loci (eQTLs) when compared to genotypes obtained from HapMap 3, but as many as 80% of the top expression quantitative trait variants (eQTVs) discovered from 1000 genomes data are novel. The properties of the newly discovered variants suggest that mapping common causal regulatory variants is challenging even with full resequencing data; however, we observe significant enrichment of regulatory effects in splice-site and nonsense variants. Using RNA sequencing data, we show that 46.2% of nonsynonymous variants are differentially expressed in at least one individual in our sample, creating widespread potential for interactions between functional protein-coding and regulatory variants. We also use allele-specific expression to identify putative rare causal regulatory variants. Furthermore, we demonstrate that outlier expression values can be due to rare variant effects, and we approximate the number of such effects harboured in an individual by effect size. Our results demonstrate that integration of genomic and RNA sequencing analyses allows for the joint assessment of genome sequence and genome function.

    View details for DOI 10.1371/journal.pgen.1002144

    View details for Web of Science ID 000293338600007

    View details for PubMedID 21811411

  • Genome-wide association study identifies a common variant associated with risk of endometrial cancer NATURE GENETICS Spurdle, A. B., Thompson, D. J., Ahmed, S., Ferguson, K., Healey, C. S., O'Mara, T., Walker, L. C., Montgomery, S. B., Dermitzakis, E. T., Fahey, P., Montgomery, G. W., Webb, P. M., Fasching, P. A., Beckmann, M. W., Ekici, A. B., Hein, A., Lambrechts, D., Coenegrachts, L., Vergote, I., Amant, F., Salvesen, H. B., Trovik, J., Njolstad, T. S., Helland, H., Scott, R. J., Ashton, K., Proietto, T., Otton, G., Tomlinson, I., Gorman, M., Howarth, K., Hodgson, S., Garcia-Closas, M., Wentzensen, N., Yang, H., Chanock, S., Hall, P., Czene, K., Liu, J., Li, J., Shu, X., Zheng, W., Long, J., Xiang, Y., Shah, M., Morrison, J., Michailidou, K., Pharoah, P. D., Dunning, A. M., Easton, D. F. 2011; 43 (5): 451-?

    Abstract

    Endometrial cancer is the most common malignancy of the female genital tract in developed countries. To identify genetic variants associated with endometrial cancer risk, we performed a genome-wide association study involving 1,265 individuals with endometrial cancer (cases) from Australia and the UK and 5,190 controls from the Wellcome Trust Case Control Consortium. We compared genotype frequencies in cases and controls for 519,655 SNPs. Forty-seven SNPs that showed evidence of association with endometrial cancer in stage 1 were genotyped in 3,957 additional cases and 6,886 controls. We identified an endometrial cancer susceptibility locus close to HNF1B at 17q12 (rs4430796, P = 7.1 × 10^-10) that is also associated with risk of prostate cancer and is inversely associated with risk of type 2 diabetes.

    View details for DOI 10.1038/ng.812

    View details for Web of Science ID 000289972600015

    View details for PubMedID 21499250

  • From expression QTLs to personalized transcriptomics NATURE REVIEWS GENETICS Montgomery, S. B., Dermitzakis, E. T. 2011; 12 (4): 277-282

    Abstract

    Approaches that combine expression quantitative trait loci (eQTLs) and genome-wide association (GWA) studies are offering new functional information about the aetiology of complex human traits and diseases. Improved study designs--which take into account technological advances in resolving the transcriptome, cell history and state, population of origin and diverse endophenotypes--are providing insights into the architecture of disease and the landscape of gene regulation in humans. Furthermore, these advances are helping to establish links between cellular effects and organismal traits.

    View details for DOI 10.1038/nrg2969

    View details for Web of Science ID 000288531700011

    View details for PubMedID 21386863

  • The Architecture of Gene Regulatory Variation across Multiple Human Tissues: The MuTHER Study PLOS GENETICS Nica, A. C., Parts, L., Glass, D., Nisbet, J., Barrett, A., Sekowska, M., Travers, M., Potter, S., Grundberg, E., Small, K., Hedman, A. K., Bataille, V., Bell, J. T., Surdulescu, G., Dimas, A. S., Ingle, C., Nestle, F. O., Di Meglio, P., Min, J. L., Wilk, A., Hammond, C. J., Hassanali, N., Yang, T., Montgomery, S. B., O'Rahilly, S., Lindgren, C. M., Zondervan, K. T., Soranzo, N., Barroso, I., Durbin, R., Ahmadi, K., Deloukas, P., McCarthy, M. I., Dermitzakis, E. T., Spector, T. D. 2011; 7 (2)

    Abstract

    While there have been studies exploring regulatory variation in one or more tissues, the complexity of tissue-specificity in multiple primary tissues is not yet well understood. We explore in depth the role of cis-regulatory variation in three human tissues: lymphoblastoid cell lines (LCL), skin, and fat. The samples (156 LCL, 160 skin, 166 fat) were derived simultaneously from a subset of well-phenotyped healthy female twins of the MuTHER resource. We discover an abundance of cis-eQTLs in each tissue similar to previous estimates (858 or 4.7% of genes). In addition, we apply factor analysis (FA) to remove effects of latent variables, thus more than doubling the number of our discoveries (1,822 eQTL genes). The unique study design (Matched Co-Twin Analysis--MCTA) permits immediate replication of eQTLs using co-twins (93%-98%) and validation of the considerable gain in eQTL discovery after FA correction. We highlight the challenges of comparing eQTLs between tissues. After verifying previous significance threshold-based estimates of tissue-specificity, we show their limitations given their dependency on statistical power. We propose that continuous estimates of the proportion of tissue-shared signals and direct comparison of the magnitude of effect on the fold change in expression are essential properties that jointly provide a biologically realistic view of tissue-specificity. Under this framework we demonstrate that 30% of eQTLs are shared among the three tissues studied, while another 29% appear exclusively tissue-specific. However, even among the shared eQTLs, a substantial proportion (10%-20%) have significant differences in the magnitude of fold change between genotypic classes across tissues. Our results underline the need to account for the complexity of eQTL tissue-specificity in an effort to assess consequences of such variants for complex traits.

    View details for DOI 10.1371/journal.pgen.1002003

    View details for Web of Science ID 000287697300035

    View details for PubMedID 21304890

  • Identification of cis- and trans- regulatory variation modulating microRNA expression levels in human fibroblasts GENOME RESEARCH Borel, C., Deutsch, S., Letourneau, A., Migliavacca, E., Montgomery, S. B., Dimas, A. S., Vejnar, C. E., Attar, H., Gagnebin, M., Gehrig, C., Falconnet, E., Dupre, Y., Dermitzakis, E. T., Antonarakis, S. E. 2011; 21 (1): 68-73

    Abstract

    MicroRNAs (miRNAs) are regulatory noncoding RNAs that affect the production of a significant fraction of human mRNAs via post-transcriptional regulation. Interindividual variation of the miRNA expression levels is likely to influence the expression of miRNA target genes and may therefore contribute to phenotypic differences in humans, including susceptibility to common disorders. The extent to which miRNA levels are genetically controlled is largely unknown. In this report, we assayed the expression levels of miRNAs in primary fibroblasts from 180 European newborns of the GenCord project and performed association analysis to identify eQTLs (expression quantitative trait loci). We detected robust expression for 121 miRNAs out of 365 interrogated. We have identified significant cis- (10%) and trans- (11%) eQTLs. Furthermore, we detected one genomic locus (rs1522653) that influences the expression levels of five miRNAs, thus unraveling a novel mechanism for coregulation of miRNA expression.

    View details for DOI 10.1101/gr.109371.110

    View details for Web of Science ID 000285868300007

    View details for PubMedID 21147911

  • A map of human genome variation from population-scale sequencing NATURE Altshuler, D., Durbin, R. M., Abecasis, G. R., Bentley, D. R., Chakravarti, A., Clark, A. G., Collins, F. S., De La Vega, F. M., Donnelly, P., Egholm, M., Flicek, P., Gabriel, S. B., Gibbs, R. A., Knoppers, B. M., Lander, E. S., Lehrach, H., Mardis, E. R., McVean, G. A., Nickerson, D., Peltonen, L., Schafer, A. J., Sherry, S. T., Wang, J., Wilson, R. K., Gibbs, R. A., Deiros, D., Metzker, M., Muzny, D., Reid, J., Wheeler, D., Wang, J., Li, J., Jian, M., Li, G., Li, R., Liang, H., Tian, G., Wang, B., Wang, J., Wang, W., Yang, H., Zhang, X., Zheng, H., Lander, E. S., Altshuler, D. L., Ambrogio, L., Bloom, T., Cibulskis, K., Fennell, T. J., Gabriel, S. B., Jaffe, D. B., Shefler, E., Sougnez, C. L., Bentley, D. R., Gormley, N., Humphray, S., Kingsbury, Z., Koko-Gonzales, P., Stone, J., McKernan, K. J., Costa, G. L., Ichikawa, J. K., Lee, C. C., Sudbrak, R., Lehrach, H., Borodina, T. A., Dahl, A., Davydov, A. N., Marquardt, P., Mertes, F., Nietfeld, W., Rosenstiel, P., Schreiber, S., Soldatov, A. V., Timmermann, B., Tolzmann, M., Egholm, M., Affourtit, J., Ashworth, D., Attiya, S., Bachorski, M., Buglione, E., Burke, A., Caprio, A., Celone, C., Clark, S., Conners, D., Desany, B., Gu, L., Guccione, L., Kao, K., Kebbel, A., Knowlton, J., Labrecque, M., McDade, L., Mealmaker, C., Minderman, M., Nawrocki, A., Niazi, F., Pareja, K., Ramenani, R., Riches, D., Song, W., Turcotte, C., Wang, S., Mardis, E. R., Dooling, D., Fulton, L., Fulton, R., Weinstock, G., Durbin, R. M., Burton, J., Carter, D. M., Churcher, C., Coffey, A., Cox, A., Palotie, A., Quail, M., Skelly, T., Stalker, J., Swerdlow, H. P., Turner, D., De Witte, A., Giles, S., Gibbs, R. A., Wheeler, D., Bainbridge, M., Challis, D., Sabo, A., Yu, F., Yu, J., Wang, J., Fang, X., Guo, X., Li, R., Li, Y., Luo, R., Tai, S., Wu, H., Zheng, H., Zheng, X., Zhou, Y., Yang, H., Marth, G. T., Garrison, E. P., Huang, W., Indap, A., Kural, D., Lee, W., Leong, W. F., Huang, W., Indap, A., Kural, D., Lee, W., Leong, W. F., Quinlan, A. R., Stewart, C., Stromberg, M. P., Ward, A. N., Wu, J., Lee, C., Mills, R. E., Shi, X., Daly, M. J., DePristo, M. A., Altshuler, D. L., Ball, A. D., Banks, E., Bloom, T., Browning, B. L., Cibulskis, K., Fennell, T. J., Garimella, K. V., Grossman, S. R., Handsaker, R. E., Hanna, M., Hartl, C., Jaffe, D. B., Kernytsky, A. M., Korn, J. M., Li, H., Maguire, J. R., McCarroll, S. A., McKenna, A., Nemesh, J. C., Philippakis, A. A., Poplin, R. E., Price, A., Rivas, M. A., Sabeti, P. C., Schaffner, S. F., Shefler, E., Shlyakhter, I. A., Cooper, D. N., Ball, E. V., Mort, M., Phillips, A. D., Stenson, P. D., Sebat, J., Makarov, V., Ye, K., Yoon, S. C., Bustamante, C. D., Clark, A. G., Boyko, A., Degenhardt, J., Gravel, S., Gutenkunst, R. N., Kaganovich, M., Keinan, A., Lacroute, P., Ma, X., Reynolds, A., Clarke, L., Flicek, P., Cunningham, F., Herrero, J., Keenen, S., Kulesha, E., Leinonen, R., McLaren, W., Radhakrishnan, R., Smith, R. E., Zalunin, V., Zheng-Bradley, X., Korbel, J. O., Stuetz, A. M., Humphray, S., Bauer, M., Cheetham, R. K., Cox, T., Eberle, M., James, T., Kahn, S., Murray, L., Ye, K., De La Vega, F. M., Fu, Y., Hyland, F. C., Manning, J. M., McLaughlin, S. F., Peckham, H. E., Sakarya, O., Sun, Y. A., Tsung, E. F., Batzer, M. A., Konkel, M. K., Walker, J. A., Sudbrak, R., Albrecht, M. W., Amstislavskiy, V. S., Herwig, R., Parkhomchuk, D. V., Sherry, S. T., Agarwala, R., Khouri, H., Morgulis, A. O., Paschall, J. E., Phan, L. D., Rotmistrovsky, K. E., Sanders, R. D., Shumway, M. F., Xiao, C., McVean, G. A., Auton, A., Iqbal, Z., Lunter, G., Marchini, J. L., Moutsianas, L., Myers, S., Tumian, A., Desany, B., Knight, J., Winer, R., Craig, D. W., Beckstrom-Sternberg, S. M., Christoforides, A., Kurdoglu, A. A., Pearson, J., Sinari, S. A., Tembe, W. D., Haussler, D., Hinrichs, A. S., Katzman, S. J., Kern, A., Kuhn, R. M., Przeworski, M., Hernandez, R. D., Howie, B., Kelley, J. L., Melton, S. C., Abecasis, G. R., Li, Y., Anderson, P., Blackwell, T., Chen, W., Cookson, W. O., Ding, J., Kang, H. M., Lathrop, M., Liang, L., Moffatt, M. F., Scheet, P., Sidore, C., Snyder, M., Zhan, X., Zoellner, S., Awadalla, P., Casals, F., Idaghdour, Y., Keebler, J., Stone, E. A., Zilversmit, M., Jorde, L., Xing, J., Eichler, E. E., Aksay, G., Alkan, C., Hajirasouliha, I., Hormozdiari, F., Kidd, J. M., Sahinalp, S. C., Sudmant, P. H., Mardis, E. R., Chen, K., Chinwalla, A., Ding, L., Koboldt, D. C., McLellan, M. D., Dooling, D., Weinstock, G., Wallis, J. W., Wendl, M. C., Zhang, Q., Durbin, R. M., Albers, C. A., Ayub, Q., Balasubramaniam, S., Barrett, J. C., Carter, D. M., Chen, Y., Conrad, D. F., Danecek, P., Dermitzakis, E. T., Hu, M., Huang, N., Hurles, M. E., Jin, H., Jostins, L., Keane, T. M., Keane, T. M., Le, S. Q., Lindsay, S., Long, Q., MacArthur, D. G., Montgomery, S. B., Parts, L., Stalker, J., Tyler-Smith, C., Walter, K., Zhang, Y., Gerstein, M. B., Snyder, M., Abyzov, A., Abyzov, A., Balasubramanian, S., Bjornson, R., Du, J., Grubert, F., Habegger, L., Haraksingh, R., Jee, J., Khurana, E., Lam, H. Y., Leng, J., Mu, X. J., Urban, A. E., Zhang, Z., Li, Y., Luo, R., Marth, G. T., Garrison, E. P., Kural, D., Quinlan, A. R., Stewart, C., Stromberg, M. P., Ward, A. N., Wu, J., Lee, C., Mills, R. E., Shi, X., McCarroll, S. A., Banks, E., DePristo, M. A., Handsaker, R. E., Hartl, C., Korn, J. M., Li, H., Nemesh, J. C., Sebat, J., Makarov, V., Ye, K., Yoon, S. C., Degenhardt, J., Kaganovich, M., Clarke, L., Smith, R. E., Zheng-Bradley, X., Korbel, J. O., Humphray, S., Cheetham, R. K., Eberle, M., Kahn, S., Murray, L., Ye, K., De La Vega, F. M., Fu, Y., Peckham, H. E., Sun, Y. A., Batzer, M. A., Konkel, M. K., Xiao, C., Iqbal, Z., Desany, B., Blackwell, T., Snyder, M., Xing, J., Eichler, E. E., Aksay, G., Alkan, C., Hajirasouliha, I., Hormozdiari, F., Kidd, J. M., Chen, K., Chinwalla, A., Ding, L., McLellan, M. D., Wallis, J. W., Hurles, M. E., Conrad, D. F., Walter, K., Zhang, Y., Gerstein, M. B., Snyder, M., Abyzov, A., Du, J., Grubert, F., Haraksingh, R., Jee, J., Khurana, E., Lam, H. Y., Leng, J., Mu, X. J., Urban, A. E., Zhang, Z., Gibbs, R. A., Bainbridge, M., Challis, D., Coafra, C., Dinh, H., Kovar, C., Lee, S., Muzny, D., Nazareth, L., Reid, J., Sabo, A., Yu, F., Yu, J., Marth, G. T., Garrison, E. P., Indap, A., Leong, W. F., Quinlan, A. R., Stewart, C., Ward, A. N., Wu, J., Cibulskis, K., Fennell, T. J., Gabriel, S. B., Garimella, K. V., Hartl, C., Shefler, E., Sougnez, C. L., Wilkinson, J., Clark, A. G., Gravel, S., Grubert, F., Clarke, L., Flicek, P., Smith, R. E., Zheng-Bradley, X., Sherry, S. T., Khouri, H. M., Paschall, J. E., Shumway, M. F., Xiao, C., McVean, G. A., Katzman, S. J., Abecasis, G. R., Blackwell, T., Mardis, E. R., Dooling, D., Fulton, L., Fulton, R., Koboldt, D. C., Durbin, R. M., Balasubramaniam, S., Coffey, A., Keane, T. M., MacArthur, D. G., Palotie, A., Scott, C., Stalker, J., Tyler-Smith, C., Gerstein, M. B., Balasubramanian, S., Chakravarti, A., Knoppers, B. M., Peltonen, L., Abecasis, G. R., Bustamante, C. D., Gharani, N., Gibbs, R. A., Jorde, L., Kaye, J. S., Kent, A., Li, T., McGuire, A. L., McVean, G. A., Ossorio, P. N., Rotimi, C. N., Su, Y., Toji, L. H., Tyler-Smith, C., Brooks, L. D., Felsenfeld, A. L., McEwen, J. E., Abdallah, A., Juenger, C. R., Clemm, N. C., Collins, F. S., Duncanson, A., Green, E. D., Guyer, M. S., Peterson, J. L., Schafer, A. J., Abecasis, G. R., Altshuler, D. L., Auton, A., Brooks, L. D., Durbin, R. M., Gibbs, R. A., Hurles, M. E., McVean, G. A. 2010; 467 (7319): 1061-1073

    Abstract

    The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10^-8 per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.

    View details for DOI 10.1038/nature09534

    View details for Web of Science ID 000283548600039

    View details for PubMedCentralID PMC3042601

  • Genevar: a database and Java application for the analysis and visualization of SNP-gene associations in eQTL studies BIOINFORMATICS Yang, T., Beazley, C., Montgomery, S. B., Dimas, A. S., Gutierrez-Arcelus, M., Stranger, B. E., Deloukas, P., Dermitzakis, E. T. 2010; 26 (19): 2474-2476

    Abstract

    Genevar (GENe Expression VARiation) is a database and Java tool designed to integrate multiple datasets, and provides analysis and visualization of associations between sequence variation and gene expression. Genevar allows researchers to investigate expression quantitative trait loci (eQTL) associations within a gene locus of interest in real time. The database and application can be installed on a standard computer in database mode and, in addition, on a server to share discoveries among affiliations or the broader community over the Internet via web services protocols. http://www.sanger.ac.uk/resources/software/genevar.

    View details for DOI 10.1093/bioinformatics/btq452

    View details for Web of Science ID 000282170000023

    View details for PubMedID 20702402

  • Integrating common and rare genetic variation in diverse human populations NATURE Altshuler, D. M., Gibbs, R. A., Peltonen, L., Dermitzakis, E., Schaffner, S. F., Yu, F., Bonnen, P. E., de Bakker, P. I., Deloukas, P., Gabriel, S. B., Gwilliam, R., Hunt, S., Inouye, M., Jia, X., Palotie, A., Parkin, M., Whittaker, P., Chang, K., Hawes, A., Lewis, L. R., Ren, Y., Wheeler, D., Muzny, D. M., Barnes, C., Darvishi, K., Hurles, M., Korn, J. M., Kristiansson, K., Lee, C., McCarroll, S. A., Nemesh, J., Keinan, A., Montgomery, S. B., Pollack, S., Price, A. L., Soranzo, N., Gonzaga-Jauregui, C., Anttila, V., Brodeur, W., Daly, M. J., Leslie, S., McVean, G., Moutsianas, L., Nguyen, H., Zhang, Q., Ghori, M. J., McGinnis, R., McLaren, W., Takeuchi, F., Grossman, S. R., Shlyakhter, I., Hostetter, E. B., Sabeti, P. C., Adebamowo, C. A., Foster, M. W., Gordon, D. R., Licinio, J., Manca, M. C., Marshall, P. A., Matsuda, I., Ngare, D., Wang, V. O., Reddy, D., Rotimi, C. N., Royal, C. D., Sharp, R. R., Zeng, C., Brooks, L. D., McEwen, J. E. 2010; 467 (7311): 52-58

    Abstract

    Despite great progress in identifying genetic variants that influence human disease, most inherited risk remains unexplained. A more complete understanding requires genome-wide studies that fully examine less common alleles in populations with a wide range of ancestry. To inform the design and interpretation of such studies, we genotyped 1.6 million common single nucleotide polymorphisms (SNPs) in 1,184 reference individuals from 11 global populations, and sequenced ten 100-kilobase regions in 692 of these individuals. This integrated data set of common and rare alleles, called 'HapMap 3', includes both SNPs and copy number polymorphisms (CNPs). We characterized population-specific differences among low-frequency variants, measured the improvement in imputation accuracy afforded by the larger reference panel, especially in imputing SNPs with a minor allele frequency of

    View details for DOI 10.1038/nature09298

    View details for Web of Science ID 000281461200033

    View details for PubMedID 20811451

  • Transcriptome genetics using second generation sequencing in a Caucasian population NATURE Montgomery, S. B., Sammeth, M., Gutierrez-Arcelus, M., Lach, R. P., Ingle, C., Nisbett, J., Guigo, R., Dermitzakis, E. T. 2010; 464 (7289): 773-U151

    Abstract

    Gene expression is an important phenotype that informs about genetic and environmental effects on cellular state. Many studies have previously identified genetic variants for gene expression phenotypes using custom and commercially available microarrays. Second generation sequencing technologies are now providing unprecedented access to the fine structure of the transcriptome. We have sequenced the mRNA fraction of the transcriptome in 60 extended HapMap individuals of European descent and have combined these data with genetic variants from the HapMap3 project. We have quantified exon abundance based on read depth and have also developed methods to quantify whole transcript abundance. We have found that approximately 10 million reads of sequencing can provide access to the same dynamic range as arrays with better quantification of alternative and highly abundant transcripts. Correlation with SNPs (small nucleotide polymorphisms) leads to a larger discovery of eQTLs (expression quantitative trait loci) than with arrays. We also detect a substantial number of variants that influence the structure of mature transcripts indicating variants responsible for alternative splicing. Finally, measures of allele-specific expression allowed the identification of rare eQTLs and allelic differences in transcript structure. This analysis shows that high throughput sequencing technologies reveal new properties of genetic effects on the transcriptome and allow the exploration of genetic effects in cellular processes.

    View details for DOI 10.1038/nature08903

    View details for Web of Science ID 000276205000048

    View details for PubMedID 20220756

  • Candidate Causal Regulatory Effects by Integration of Expression QTLs with Complex Trait Genetic Associations PLOS GENETICS Nica, A. C., Montgomery, S. B., Dimas, A. S., Stranger, B. E., Beazley, C., Barroso, I., Dermitzakis, E. T. 2010; 6 (4)

    Abstract

    The recent success of genome-wide association studies (GWAS) is now followed by the challenge to determine how the reported susceptibility variants mediate complex traits and diseases. Expression quantitative trait loci (eQTLs) have been implicated in disease associations through overlaps between eQTLs and GWAS signals. However, the abundance of eQTLs and the strong correlation structure (LD) in the genome make it likely that some of these overlaps are coincidental and not driven by the same functional variants. In the present study, we propose an empirical methodology, which we call Regulatory Trait Concordance (RTC), that accounts for local LD structure and integrates eQTLs and GWAS results in order to reveal the subset of association signals that are due to cis eQTLs. We simulate genomic regions of various LD patterns with either one or two causal variants and show that our score outperforms SNP correlation metrics, be they statistical (r^2) or historical (D'). Following the observation of a significant abundance of regulatory signals among currently published GWAS loci, we apply our method with the goal to prioritize relevant genes for each of the respective complex traits. We detect several potential disease-causing regulatory effects, with a strong enrichment for immunity-related conditions, consistent with the nature of the cell line tested (LCLs). Furthermore, we present an extension of the method in trans, where interrogating the whole genome for downstream effects of the disease variant can be informative regarding its unknown primary biological effect. We conclude that integrating cellular phenotype associations with organismal complex traits will facilitate the biological interpretation of the genetic effects on these traits.

    View details for DOI 10.1371/journal.pgen.1000895

    View details for Web of Science ID 000277354200012

    View details for PubMedID 20369022

  • Out of the sequencer and into the wiki as we face new challenges in genome informatics. Genome biology Ning, Z., Montgomery, S. B. 2010; 11 (10): 308-?

    Abstract

    A report on the joint Cold Spring Harbor Laboratory/Wellcome Trust Conference 'Genome Informatics', 15-19 September 2010, Hinxton, Cambridge, UK.

    View details for DOI 10.1186/gb-2010-11-10-308

    View details for PubMedID 21067526

  • Annotating the regulatory genome. Methods in molecular biology (Clifton, N.J.) Montgomery, S. B., Kasaian, K., Jones, S. J., Griffith, O. L. 2010; 674: 313-349

    Abstract

    Determining the timing and molecular repertoire responsible for gene expression is fundamental to understanding a gene's function. Heritable differences in this character are increasingly regarded as explanatory for complex and common traits. For many known trait-predisposing genes, studies have sought to elucidate the associated logic behind gene regulation. However, there exist many challenges in deciphering these mechanisms. Among them, it is recognized that we have limited understanding of regulatory complexity, the current models of gene regulation have low specificity and any gene's regulatory logic is dependent on biological context. Addressing these limitations and defining the regulatory genome is an ongoing challenge for molecular biology. We discuss current efforts to define and annotate the regulatory genome by focusing on curation and text-mining activities. We further highlight the type of information and curation process for describing regulatory elements within the ORegAnno database (www.oreganno.org) and how the general standards for such information are changing.

    View details for DOI 10.1007/978-1-60761-854-6_20

    View details for PubMedID 20827601

  • The resolution of the genetics of gene expression HUMAN MOLECULAR GENETICS Montgomery, S. B., Dermitzakis, E. T. 2009; 18: R211-R215

    Abstract

    Understanding the influence of genetics on the molecular mechanisms underpinning human phenotypic diversity is fundamental to being able to predict health outcomes and treat disease. To interrogate the role of genetics on cellular state and function, gene expression has been extensively used. Past and present studies have highlighted important patterns of heritability, population differentiation and tissue-specificity in gene expression. Current and future studies are taking advantage of systems biology-based approaches and advances in sequencing technology: new methodology aims to translate regulatory networks to enrich pathways responsible for disease etiology and 2nd generation sequencing now offers single-molecular resolution of the transcriptome providing unprecedented information on the structural and genetic characteristics of gene expression. Such advances are leading to a future where rich cellular phenotypes will facilitate understanding of the transmission of genetic effect from the gene to organism.

    View details for DOI 10.1093/hmg/ddp400

    View details for Web of Science ID 000271265600012

    View details for PubMedID 19808798

  • Common Regulatory Variation Impacts Gene Expression in a Cell Type-Dependent Manner SCIENCE Dimas, A. S., Deutsch, S., Stranger, B. E., Montgomery, S. B., Borel, C., Attar-Cohen, H., Ingle, C., Beazley, C., Arcelus, M. G., Sekowska, M., Gagnebin, M., Nisbett, J., Deloukas, P., Dermitzakis, E. T., Antonarakis, S. E. 2009; 325 (5945): 1246-1250

    Abstract

    Studies correlating genetic variation to gene expression facilitate the interpretation of common human phenotypes and disease. As functional variants may be operating in a tissue-dependent manner, we performed gene expression profiling and association with genetic variants (single-nucleotide polymorphisms) on three cell types of 75 individuals. We detected cell type-specific genetic effects, with 69 to 80% of regulatory variants operating in a cell type-specific manner, and identified multiple expression quantitative trait loci (eQTLs) per gene, unique or shared among cell types and positively correlated with the number of transcripts per gene. Cell type-specific eQTLs were found at larger distances from genes and at lower effect size, similar to known enhancers. These data suggest that the complete regulatory variant repertoire can only be uncovered in the context of cell-type specificity.

    View details for DOI 10.1126/science.1174148

    View details for Web of Science ID 000269523200038

    View details for PubMedID 19644074

  • Is the thrifty genotype hypothesis supported by evidence based on confirmed type 2 diabetes- and obesity-susceptibility variants? DIABETOLOGIA Southam, L., Soranzo, N., Montgomery, S. B., Frayling, T. M., McCarthy, M. I., Barroso, I., Zeggini, E. 2009; 52 (9): 1846-1851

    Abstract

    According to the thrifty genotype hypothesis, the high prevalence of type 2 diabetes and obesity is a consequence of genetic variants that have undergone positive selection during historical periods of erratic food supply. The recent expansion in the number of validated type 2 diabetes- and obesity-susceptibility loci, coupled with access to empirical data, enables us to look for evidence in support (or otherwise) of the thrifty genotype hypothesis using proven loci. We employed a range of tests to obtain complementary views of the evidence for selection: we determined whether the risk allele at associated 'index' single-nucleotide polymorphisms is derived or ancestral, calculated the integrated haplotype score (iHS) and assessed the population differentiation statistic fixation index (F(ST)) for 17 type 2 diabetes and 13 obesity loci. We found no evidence for significant differences for the derived/ancestral allele test. None of the studied loci showed strong evidence for selection based on the iHS score. We find a high F(ST) for rs7901695 at TCF7L2, the largest type 2 diabetes effect size found to date. Our results provide some evidence for selection at specific loci, but there are no consistent patterns of selection that provide conclusive confirmation of the thrifty genotype hypothesis. Discovery of more signals and more causal variants for type 2 diabetes and obesity is likely to allow more detailed examination of these issues.

    View details for DOI 10.1007/s00125-009-1419-3

    View details for Web of Science ID 000268776100018

    View details for PubMedID 19526209

  • Current computational methods for prioritizing candidate regulatory polymorphisms. Methods in molecular biology (Clifton, N.J.) Montgomery, S. 2009; 569: 89-114

    Abstract

    Discovery of DNA sequence variants responsible for human phenotypic variation is key to advances in molecular diagnostics and medicines. Historically, variants that alter the protein-coding sequence of genes have been targeted when attempting to identify a trait's etiology; this is done because the rules governing these regions are generally well-understood and candidate variants can be easily selected. However, the effects of variants on gene regulation are increasingly regarded as being as important as protein-coding variation in uncovering the nature of phenotypic variation. I discuss resources and methodology that have recently been developed to computationally prioritize variants that may alter gene expression.

    View details for DOI 10.1007/978-1-59745-524-4_5

    View details for PubMedID 19623487

  • ORegAnno: an open-access community-driven resource for regulatory annotation NUCLEIC ACIDS RESEARCH Griffith, O. L., Montgomery, S. B., Bernier, B., Chu, B., Kasaian, K., Aerts, S., Mahony, S., Sleumer, M. C., Bilenky, M., Haeussler, M., Griffith, M., Gallo, S. M., Giardine, B., Hooghe, B., Van Loo, P., Blanco, E., Ticoll, A., Lithwick, S., Portales-Casamar, E., Donaldson, I. J., Robertson, G., Wadelius, C., De Bleser, P., Vlieghe, D., Halfon, M. S., Wasserman, W., Hardison, R., Bergman, C. M., Jones, S. J. 2008; 36: D107-D113

    Abstract

    ORegAnno is an open-source, open-access database and literature curation system for community-based annotation of experimentally identified DNA regulatory regions, transcription factor binding sites and regulatory variants. The current release comprises 30 145 records curated from 922 publications and describing regulatory sequences for over 3853 genes and 465 transcription factors from 19 species. A new feature called the 'publication queue' allows users to input relevant papers from scientific literature as targets for annotation. The queue contains 4438 gene regulation papers entered by experts and another 54 351 identified by text-mining methods. Users can enter or 'check out' papers from the queue for manual curation using a series of user-friendly annotation pages. A typical record entry consists of species, sequence type, sequence, target gene, binding factor, experimental outcome and one or more lines of experimental evidence. An evidence ontology was developed to describe and categorize these experiments. Records are cross-referenced to Ensembl or Entrez gene identifiers, PubMed and dbSNP and can be visualized in the Ensembl or UCSC genome browsers. All data are freely available through search pages, XML data dumps or web services at: http://www.oreganno.org.

    View details for DOI 10.1093/nar/gkm967

    View details for Web of Science ID 000252545400020

    View details for PubMedID 18006570

  • Text-mining assisted regulatory annotation GENOME BIOLOGY Aerts, S., Haeussler, M., Van Vooren, S., Griffith, O. L., Hulpiau, P., Jones, S. J., Montgomery, S. B., Bergman, C. M. 2008; 9 (2)

    Abstract

    Decoding transcriptional regulatory networks and the genomic cis-regulatory logic implemented in their control nodes is a fundamental challenge in genome biology. High-throughput computational and experimental analyses of regulatory networks and sequences rely heavily on positive control data from prior small-scale experiments, but the vast majority of previously discovered regulatory data remains locked in the biomedical literature. We develop text-mining strategies to identify relevant publications and extract sequence information to assist the regulatory annotation process. Using a vector space model to identify Medline abstracts from papers likely to have high cis-regulatory content, we demonstrate that document relevance ranking can assist the curation of transcriptional regulatory networks and estimate that, minimally, 30,000 papers harbor unannotated cis-regulatory data. In addition, we show that DNA sequences can be extracted from primary text with high cis-regulatory content and mapped to genome sequences as a means of identifying the location, organism and target gene information that is critical to the cis-regulatory annotation process. Our results demonstrate that text-mining technologies can be successfully integrated with genome annotation systems, thereby increasing the availability of annotated cis-regulatory data needed to catalyze advances in the field of gene regulation.

    View details for DOI 10.1186/gb-2008-9-2-r31

    View details for Web of Science ID 000254659300013

    View details for PubMedID 18271954

  • Population genomics of human gene expression NATURE GENETICS Stranger, B. E., Nica, A. C., Forrest, M. S., Dimas, A., Bird, C. P., Beazley, C., Ingle, C. E., Dunning, M., Flicek, P., Koller, D., Montgomery, S., Tavare, S., Deloukas, P., Dermitzakis, E. T. 2007; 39 (10): 1217-1224

    Abstract

    Genetic variation influences gene expression, and this variation in gene expression can be efficiently mapped to specific genomic regions and variants. Here we have used gene expression profiling of Epstein-Barr virus-transformed lymphoblastoid cell lines of all 270 individuals genotyped in the HapMap Consortium to elucidate the detailed features of genetic variation underlying gene expression variation. We find that gene expression is heritable and that differentiation between populations is in agreement with earlier small-scale studies. A detailed association analysis of over 2.2 million common SNPs per population (5% frequency in HapMap) with gene expression identified at least 1,348 genes with association signals in cis and at least 180 in trans. Replication in at least one independent population was achieved for 37% of cis signals and 15% of trans signals, respectively. Our results strongly support an abundance of cis-regulatory variation in the human genome. Detection of trans effects is limited but suggests that regulatory variation may be the key primary effect contributing to phenotypic variation in humans. We also explore several methodologies that improve the current state of analysis of gene expression variation.

    View details for DOI 10.1038/ng2142

    View details for Web of Science ID 000249737400017

    View details for PubMedID 17873874

    View details for PubMedCentralID PMC2683249

  • A survey of genomic properties for the detection of regulatory polymorphisms PLOS COMPUTATIONAL BIOLOGY Montgomery, S. B., Griffith, O. L., Schuetz, J. M., Brooks-Wilson, A., Jones, S. J. 2007; 3 (6): 1000-1010

    Abstract

    Advances in the computational identification of functional noncoding polymorphisms will aid in cataloging novel determinants of health and identifying genetic variants that explain human evolution. To date, however, the development and evaluation of such techniques has been limited by the availability of known regulatory polymorphisms. We have attempted to address this by assembling, from the literature, a computationally tractable set of regulatory polymorphisms within the ORegAnno database (http://www.oreganno.org). We have further used 104 regulatory single-nucleotide polymorphisms from this set and 951 polymorphisms of unknown function, from 2-kb and 152-bp noncoding upstream regions of genes, to investigate the discriminatory potential of 23 properties related to gene regulation and population genetics. Among the most important properties detected in this region are distance to transcription start site, local repetitive content, sequence conservation, minor and derived allele frequencies, and presence of a CpG island. We further used the entire set of properties to evaluate their collective performance in detecting regulatory polymorphisms. Using a 10-fold cross-validation approach, we were able to achieve a sensitivity and specificity of 0.82 and 0.71, respectively, and we show that this performance is strongly influenced by the distance to the transcription start site.

    View details for DOI 10.1371/journal.pcbi.0030106

    View details for Web of Science ID 000249105500010

    View details for PubMedID 17559298

  • ORegAnno: an open access database and curation system for literature-derived promoters, transcription factor binding sites and regulatory variation BIOINFORMATICS Montgomery, S. B., Griffith, O. L., Sleumer, M. C., Bergman, C. M., Bilenky, M., Pleasance, E. D., Prychyna, Y., Zhang, X., Jones, S. J. 2006; 22 (5): 637-640

    Abstract

    Our understanding of gene regulation is currently limited by our ability to collectively synthesize and catalogue transcriptional regulatory elements stored in scientific literature. Over the past decade, this task has become increasingly challenging as the accrual of biologically validated regulatory sequences has accelerated. To meet this challenge, novel community-based approaches to regulatory element annotation are required. Here, we present the Open Regulatory Annotation (ORegAnno) database as a dynamic collection of literature-curated regulatory regions, transcription factor binding sites and regulatory mutations (polymorphisms and haplotypes). ORegAnno has been designed to manage the submission, indexing and validation of new annotations from users worldwide. Submissions to ORegAnno are immediately cross-referenced to EnsEMBL, dbSNP, Entrez Gene, the NCBI Taxonomy database and PubMed, where appropriate. ORegAnno is available directly through MySQL, Web services, and online at http://www.oreganno.org. All software is licensed under the Lesser GNU Public License (LGPL).

    View details for DOI 10.1093/bioinformatics/btk027

    View details for Web of Science ID 000235604400024

    View details for PubMedID 16397004

  • cisRED: a database system for genome-scale computational discovery of regulatory elements NUCLEIC ACIDS RESEARCH Robertson, G., Bilenky, M., Lin, K., He, A., Yuen, W., Dagpinar, M., Varhol, R., Teague, K., Griffith, O. L., Zhang, X., Pan, Y., Hassel, M., Sleumer, M. C., Pan, W., Pleasance, E. D., Chuang, M., Hao, H., Li, Y. Y., Robertson, N., Fjell, C., Li, B., Montgomery, S. B., Astakhova, T., Zhou, J., Sander, J., Siddiqui, A. S., Jones, S. J. 2006; 34: D68-D73

    Abstract

    We describe cisRED, a database for conserved regulatory elements that are identified and ranked by a genome-scale computational system (www.cisred.org). The database and high-throughput predictive pipeline are designed to address diverse target genomes in the context of rapidly evolving data resources and tools. Motifs are predicted in promoter regions using multiple discovery methods applied to sequence sets that include corresponding sequence regions from vertebrates. We estimate motif significance by applying discovery and post-processing methods to randomized sequence sets that are adaptively derived from target sequence sets, retain motifs with p-values below a threshold and identify groups of similar motifs and co-occurring motif patterns. The database offers information on atomic motifs, motif groups and patterns. It is web-accessible, and can be queried directly, downloaded or installed locally.

    View details for DOI 10.1093/nar/gkj075

    View details for Web of Science ID 000239307700015

    View details for PubMedID 16381958

  • An application of peer-to-peer technology to the discovery, use and assessment of bioinformatics programs NATURE METHODS Montgomery, S. B., Fu, T., Guan, J., Lin, K., Jones, S. J. 2005; 2 (8): 563-563

    View details for Web of Science ID 000230884500002

    View details for PubMedID 16094378

  • Sockeye: A 3D environment for comparative genomics GENOME RESEARCH Montgomery, S. B., Astakhova, T., Bilenky, M., Birney, E., Fu, T., Hassel, M., Melsopp, C., Rak, M., Robertson, A. G., Sleumer, M., Siddiqui, A. S., Jones, S. J. 2004; 14 (5): 956-962

    Abstract

    Comparative genomics techniques are used in bioinformatics analyses to identify the structural and functional properties of DNA sequences. As the amount of available sequence data steadily increases, the ability to perform large-scale comparative analyses has become increasingly relevant. In addition, the growing complexity of genomic feature annotation means that new approaches to genomic visualization need to be explored. We have developed a Java-based application called Sockeye that uses three-dimensional (3D) graphics technology to facilitate the visualization of annotation and conservation across multiple sequences. This software uses the Ensembl database project to import sequence and annotation information from several eukaryotic species. A user can additionally import their own custom sequence and annotation data. Individual annotation objects are displayed in Sockeye by using custom 3D models. Ensembl-derived and imported sequences can be analyzed by using a suite of multiple and pair-wise alignment algorithms. The results of these comparative analyses are also displayed in the 3D environment of Sockeye. By using the Java3D API to visualize genomic data in a 3D environment, we are able to compactly display cross-sequence comparisons. This provides the user with a novel platform for visualizing and comparing genomic feature organization.

    View details for DOI 10.1101/gr.1890304

    View details for Web of Science ID 000221171700022

    View details for PubMedID 15123592

  • The genome sequence of the SARS-associated coronavirus SCIENCE Marra, M. A., Jones, S. J., Astell, C. R., Holt, R. A., Brooks-Wilson, A., Butterfield, Y. S., Khattra, J., Asano, J. K., Barber, S. A., Chan, S. Y., Cloutier, A., Coughlin, S. M., Freeman, D., Girn, N., Griffith, O. L., Leach, S. R., Mayo, M., McDonald, H., Montgomery, S. B., Pandoh, P. K., Petrescu, A. S., Robertson, A. G., Schein, J. E., Siddiqui, A., Smailus, D. E., Stott, J. E., Yang, G. S., Plummer, F., Andonov, A., Artsob, H., Bastien, N., Bernard, K., Booth, T. F., Bowness, D., Czub, M., Drebot, M., Fernando, L., Flick, R., Garbutt, M., Gray, M., Grolla, A., Jones, S., Feldmann, H., Meyers, A., Kabani, A., Li, Y., Normand, S., Stroher, U., Tipples, G. A., Tyler, S., Vogrig, R., Ward, D., Watson, B., Brunham, R. C., Krajden, M., Petric, M., Skowronski, D. M., Upton, C., Roper, R. L. 2003; 300 (5624): 1399-1404

    Abstract

    We sequenced the 29,751-base genome of the severe acute respiratory syndrome (SARS)-associated coronavirus known as the Tor2 isolate. The genome sequence reveals that this coronavirus is only moderately related to other known coronaviruses, including two human coronaviruses, HCoV-OC43 and HCoV-229E. Phylogenetic analysis of the predicted viral proteins indicates that the virus does not closely resemble any of the three previously known groups of coronaviruses. The genome sequence will aid in the diagnosis of SARS virus infection in humans and potential animal hosts (using polymerase chain reaction and immunological tests), in the development of antivirals (including neutralizing antibodies), and in the identification of putative epitopes for vaccine development.

    View details for DOI 10.1126/science.1085953

    View details for Web of Science ID 000183181800036

    View details for PubMedID 12730501