All Publications


  • Impact of genome build on RNA-seq interpretation and diagnostics. American journal of human genetics Ungar, R. A., Goddard, P. C., Jensen, T. D., Degalez, F., Smith, K. S., Jin, C. A., Bonner, D. E., Bernstein, J. A., Wheeler, M. T., Montgomery, S. B. 2024

    Abstract

    Transcriptomics is a powerful tool for unraveling the molecular effects of genetic variants and disease diagnosis. Prior studies have demonstrated that choice of genome build impacts variant interpretation and diagnostic yield for genomic analyses. To identify the extent to which genome build also impacts transcriptomics analyses, we studied the effect of the hg19, hg38, and CHM13 genome builds on expression quantification and outlier detection in 386 rare disease and familial control samples from both the Undiagnosed Diseases Network and Genomics Research to Elucidate the Genetics of Rare Disease Consortium. Across six routinely collected biospecimens, 61% of quantified genes were not influenced by genome build. However, we identified 1,492 genes with build-dependent quantification, 3,377 genes with build-exclusive expression, and 9,077 genes with annotation-specific expression across these biospecimens, including 566 clinically relevant and 512 known OMIM genes. Further, we demonstrate that between builds for a given gene, a larger difference in quantification is well correlated with a larger change in expression outlier calling. Combined, we provide a database of genes impacted by build choice and recommend that transcriptomics-guided analyses and diagnoses are cross-referenced with these data for robustness.

    View details for DOI 10.1016/j.ajhg.2024.05.005

    View details for PubMedID 38834072

  • Loss of function of FAM177A1, a Golgi complex localized protein, causes a novel neurodevelopmental disorder. Genetics in medicine : official journal of the American College of Medical Genetics Kohler, J. N., Legro, N. R., Baldridge, D., Shin, J., Bowman, A., Ugur, B., Jackstadt, M. M., Shriver, L. P., Patti, G. J., Zhang, B., Feng, W., McAdow, A. R., Goddard, P., Ungar, R. A., Jensen, T., Smith, K. S., Fresard, L., Alvarez, R., Bonner, D., Reuter, C. M., McCormack, C., Kravets, E., Marwaha, S., Holt, J. M., Worthey, E., Ashley, E. A., Montgomery, S. B., Fisher, P., Postlethwait, J., De Camilli, P., Solnica-Krezel, L., Bernstein, J. A., Wheeler, M. T. 2024: 101166

    Abstract

    The function of FAM177A1 and its relationship to human disease are largely unknown. Recent studies have demonstrated FAM177A1 to be a critical immune-associated gene. One previous case study has linked FAM177A1 to a neurodevelopmental disorder in four siblings. We identified five individuals from three unrelated families with biallelic variants in FAM177A1. The physiological function of FAM177A1 was studied in a zebrafish model organism and human cell lines with loss-of-function variants similar to the affected cohort. These individuals share a characteristic phenotype defined by macrocephaly, global developmental delay, intellectual disability, seizures, behavioral abnormalities, hypotonia, and gait disturbance. We show that FAM177A1 localizes to the Golgi complex in mammalian and zebrafish cells. Intersection of the RNA-seq and metabolomic datasets from FAM177A1-deficient human fibroblasts and whole zebrafish larvae demonstrated dysregulation of pathways associated with apoptosis, inflammation, and negative regulation of cell proliferation. Our data shed light on the emerging function of FAM177A1 and define FAM177A1-related neurodevelopmental disorder as a new clinical entity.

    View details for DOI 10.1016/j.gim.2024.101166

    View details for PubMedID 38767059

  • Increasing equity in science requires better ethics training: A course by trainees, for trainees. Cell genomics Patel, R. A., Ungar, R. A., Pyke, A. L., Adimoelja, A., Chakraborty, M., Cotter, D. J., Freund, M., Goddard, P., Gomez-Stafford, J., Greenwald, E., Higgs, E., Hunter, N., MacKenzie, T. M., Narain, A., Gjorgjieva, T., Martschenko, D. O. 2024: 100554

    Abstract

    Despite the profound impacts of scientific research, few scientists have received the necessary training to productively discuss the ethical and societal implications of their work. To address this critical gap, we, a group of predominantly human genetics trainees, developed a course on genetics, ethics, and society. We intend for this course to serve as a template for other institutions and scientific disciplines. Our curriculum positions human genetics within its historical and societal context and encourages students to evaluate how societal norms and structures impact the conduct of scientific research. We demonstrate the utility of this course via surveys of enrolled students and provide resources and strategies for others hoping to teach a similar course. We conclude by arguing that if we are to work toward rectifying the inequities and injustices produced by our field, we must first learn to view our own research as impacting and being impacted by society.

    View details for DOI 10.1016/j.xgen.2024.100554

    View details for PubMedID 38697124

  • Integration of transcriptomics and long-read genomics prioritizes structural variants in rare disease. medRxiv : the preprint server for health sciences Jensen, T. D., Ni, B., Reuter, C. M., Gorzynski, J. E., Fazal, S., Bonner, D., Ungar, R. A., Goddard, P. C., Raja, A., Ashley, E. A., Bernstein, J. A., Zuchner, S., Greicius, M. D., Montgomery, S. B., Schatz, M. C., Wheeler, M. T., Battle, A. 2024

    Abstract

    Rare structural variants (SVs) - insertions, deletions, and complex rearrangements - can cause Mendelian disease, yet they remain difficult to accurately detect and interpret. We sequenced and analyzed Oxford Nanopore long-read genomes of 68 individuals from the Undiagnosed Diseases Network (UDN) with no previously identified diagnostic mutations from short-read sequencing. Using our optimized SV detection pipelines and 571 control long-read genomes, we detected 716 long-read rare (MAF < 0.01) SV alleles per genome on average, achieving a 2.4x increase over short reads. To characterize the functional effects of rare SVs, we assessed their relationship with gene expression from blood or fibroblasts from the same individuals, and found that rare SVs overlapping enhancers were enriched (LOR = 0.46) near expression outliers. We also evaluated tandem repeat expansions (TREs) and found 14 rare TREs per genome; notably, these TREs were also enriched near overexpression outliers. To prioritize candidate functional SVs, we developed Watershed-SV, a probabilistic model that integrates expression data with SV-specific genomic annotations, which significantly outperforms baseline models that do not incorporate expression data. Watershed-SV identified a median of eight high-confidence functional SVs per UDN genome. Notably, this included compound heterozygous deletions in FAM177A1 shared by two siblings, which were likely causal for a rare neurodevelopmental disorder. Our observations demonstrate the promise of integrating long-read sequencing with gene expression towards improving the prioritization of functional SVs and TREs in rare disease patients.

    View details for DOI 10.1101/2024.03.22.24304565

    View details for PubMedID 38585781

    View details for PubMedCentralID PMC10996727

  • Impact of genome build on RNA-seq interpretation and diagnostics. medRxiv : the preprint server for health sciences Ungar, R. A., Goddard, P. C., Jensen, T. D., Degalez, F., Smith, K. S., Jin, C. A., Bonner, D. E., Bernstein, J. A., Wheeler, M. T., Montgomery, S. B. 2024

    Abstract

    Transcriptomics is a powerful tool for unraveling the molecular effects of genetic variants and disease diagnosis. Prior studies have demonstrated that choice of genome build impacts variant interpretation and diagnostic yield for genomic analyses. To identify the extent to which genome build also impacts transcriptomics analyses, we studied the effect of the hg19, hg38, and CHM13 genome builds on expression quantification and outlier detection in 386 rare disease and familial control samples from both the Undiagnosed Diseases Network (UDN) and Genomics Research to Elucidate the Genetics of Rare Disease (GREGoR) Consortium. We identified 2,800 genes with build-dependent quantification across six routinely collected biospecimens, including 1,391 protein-coding genes and 341 known rare disease genes. We further observed multiple genes that only have detectable expression in a subset of genome builds. Finally, we characterized how genome build impacts the detection of outlier transcriptomic events. Combined, we provide a database of genes impacted by build choice, and recommend that transcriptomics-guided analyses and diagnoses are cross-referenced with these data for robustness.

    View details for DOI 10.1101/2024.01.11.24301165

    View details for PubMedID 38260490

    View details for PubMedCentralID PMC10802764

  • Transcriptomics and chromatin accessibility in multiple African population samples. bioRxiv : the preprint server for biology DeGorter, M. K., Goddard, P. C., Karakoc, E., Kundu, S., Yan, S. M., Nachun, D., Abell, N., Aguirre, M., Carstensen, T., Chen, Z., Durrant, M., Dwaracherla, V. R., Feng, K., Gloudemans, M. J., Hunter, N., Moorthy, M. P., Pomilla, C., Rodrigues, K. B., Smith, C. J., Smith, K. S., Ungar, R. A., Balliu, B., Fellay, J., Flicek, P., McLaren, P. J., Henn, B., McCoy, R. C., Sugden, L., Kundaje, A., Sandhu, M. S., Gurdasani, D., Montgomery, S. B. 2023

    Abstract

    Mapping the functional human genome and impact of genetic variants is often limited to European-descendent population samples. To aid in overcoming this limitation, we measured gene expression using RNA sequencing in lymphoblastoid cell lines (LCLs) from 599 individuals from six African populations to identify novel transcripts including those not represented in the hg38 reference genome. We used whole genomes from the 1000 Genomes Project and 164 Maasai individuals to identify 8,881 expression and 6,949 splicing quantitative trait loci (eQTLs/sQTLs), and 2,611 structural variants associated with gene expression (SV-eQTLs). We further profiled chromatin accessibility using ATAC-Seq in a subset of 100 representative individuals, to identify chromatin accessibility quantitative trait loci (caQTLs) and allele-specific chromatin accessibility, and provide predictions for the functional effect of 78.9 million variants on chromatin accessibility. Using this map of eQTLs and caQTLs, we fine-mapped GWAS signals for a range of complex diseases. Combined, this work expands global functional genomic data to identify novel transcripts, functional elements and variants, understand population genetic history of molecular quantitative trait loci, and further resolve the genetic basis of multiple human traits and disease.

    View details for DOI 10.1101/2023.11.04.564839

    View details for PubMedID 37986808

    View details for PubMedCentralID PMC10659267