Michael Snyder, Ph.D.
Stanford W. Ascherman Professor of Genetics
Web page: http://snyderlab.stanford.edu/
Bio
1977 B.A., Chemistry and Biology, University of Rochester, NY
1978-1982 Ph.D. California Institute of Technology, CA Advisor: Dr. Norman Davidson
1982-1986 Postdoctoral Research Stanford University School of Medicine, CA Advisor: Dr. Ronald Davis
1986-2009 Faculty Dept of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT
2009-present Dept of Genetics, Stanford University School of Medicine, Stanford, CA
Academic Appointments
-
Professor, Genetics
-
Member, Bio-X
-
Member, Cardiovascular Institute
-
Member, Stanford Cancer Institute
-
Member, Wu Tsai Neurosciences Institute
Administrative Appointments
-
Chair, Dept. of Genetics (2009 - Present)
-
Director, Center for Genomics and Personalized Medicine (2009 - Present)
Honors & Awards
-
George Beadle Award, GSA (2019)
-
Elected Member, American Academy of Arts and Sciences (2015)
-
Stanford W. Ascherman Professor, Stanford (2011)
-
Pioneer Award, Human Proteome Organization (2009)
-
CT Medal of Science, Connecticut Academy of Science (2007)
-
Pew Scholar Award, Pew Foundation (1987-1991)
Current Research and Scholarly Interests
We are presently in an omics revolution in which genomes and other omes can be readily characterized. Our laboratory uses a variety of approaches to analyze genomes and regulatory networks. Our research focuses on yeast, a model organism ideally suited to genetic analysis, and on humans.
1) Transcriptomes
To annotate genomes, we developed RNA sequencing for annotating the yeast and human transcriptomes. We discovered that the eukaryotic transcriptome is much more complex than previously appreciated and that embryonic stem cells have more transcript isoforms than differentiated cells.
2) Transcription Factor Binding Networks
We have also developed methods for mapping transcription factor binding sites throughout the genome. We have used these methods to build regulatory maps and to help decipher the combinatorial regulatory code, i.e., which factors work together to regulate which genes. Using this approach we have mapped out pathways crucial for metabolism and inflammation.
3) Integrated Regulatory Networks
In addition to transcription factor binding networks, we have also been mapping phosphorylation and metabolite-protein interaction networks. These studies have revealed novel global regulators and key points in integrated regulatory networks.
4) Variation
We have been analyzing differences between individuals and species at two levels: DNA sequence variation and variation in regulatory information. We developed paired-end sequencing for humans and found that humans have extensive structural variation (SV), i.e., deletions, insertions and inversions. This is likely to be a major cause of phenotypic variation and human disease. In addition, by mapping binding site differences among different yeast strains and among humans, we have found that individuals differ much more in their regulatory information than in their coding sequences. We can correlate these differences with SNPs and SVs, thereby associating noncoding DNA differences with regulatory information.
5) Human Disease
Finally, we are applying omics approaches, including genome sequencing, transcriptomics, proteomics, metabolomics, DNA methylation and microbiome assays, to the analysis of human disease. These integrative omics approaches are being applied to help understand the molecular basis of disease and to develop diagnostics and therapeutics.
Clinical Trials
-
Multiomic Signatures of Microbial Metabolites Following Prebiotic Fiber Supplementation
Recruiting
The investigators propose a comprehensive, multiomic study that will integrate longitudinal data associating changes in specific gut bacteria and host in response to prebiotic fiber supplementation. These data will guide the development of an integrative biological signature relating bacterial-derived metabolites with biological outcome in the host. The open sharing of data generated by the proposed research represents a significant public resource that will support and accelerate future novel studies.
-
Precision Diets for Diabetes Prevention
Recruiting
With this study the investigators want to understand the physiological differences for people developing pre-diabetes and diabetes. The investigators hypothesize that different individuals go through different paths in the development of the disease. By understanding the personal mechanism for developing disease, the investigators will find a personalized approach to prevent that development. The investigators are also hoping to find a biomarker that will pinpoint the particular defect and thus diagnose the problem at an earlier stage and provide the information needed to give personalized diet recommendations that prevent the development of diabetes more effectively.
-
The 28 Day Challenge
Not Recruiting
The purpose of this study is to determine how a 28 Day Challenge influences mental health and well-being. This is a blinded study. Participants both with and without depression and anxiety will be included. A moderation analysis will be performed to see whether changes in depression after the intervention are a function of baseline depression and anxiety levels.
Stanford is currently not accepting patients for this trial. For more information, please contact Ariel Ganz, PhD, 650-736-8099.
-
The Lasting Change Study
Not Recruiting
The study approach is to leverage cutting-edge techniques of multi-omics biology, wearable physiology, and digital real-time psychology profiling, together with machine learning models, to understand the mechanisms underlying the strategies and techniques that enable participants to initiate and maintain sustainable behavior change. Over the years, millions of people worldwide have attended immersive personal development seminars aiming to improve participants' health behaviors and wellness. Nevertheless, there's a scarcity of large-scale studies to assess their effects on behavior change and investigate their mechanism of action. A recent publication by the Science of Behavior Change Program (SOBC), launched by the National Institutes of Health (NIH), recognized that: "science has not yet delivered a unified understanding of basic mechanisms of behavior change across a broad range of health-related behaviors, limiting progress in the development and translation of effective and efficacious behavioral intervention." As such, understanding the mechanisms underlying sustainable behavior change is key. The Date With Destiny (DWD) seminar is among the largest worldwide, and tens of thousands of people have already attended and testified to its transformative effect. The main objective of the study is to uncover the underlying mechanism of behavior change through longitudinal data collection of psychometrics (Ecological Momentary Assessments), physiology (wearables), and biology (multi-omics) in study participants.
The study-specific objectives include: (1) To evaluate the impact of DWD on sustainable behavior change; (2) To investigate the mechanism of behavior change by collecting longitudinal real-time measurements of psychometric (e.g., Ecological Momentary Assessments [EMA]), physiological (e.g., heart rate, blood oxygen level, breathing rate, and EDA), and biological (multi-omics analyses) features in study participants; (3) To assess the effect of the DWD on professional fulfillment, resilience, and mental wellness.
Stanford is currently not accepting patients for this trial.
-
Understanding and Diagnosing Allergic Disease in Twins
Not Recruiting
The purpose of this study is to gain better understanding of how the immune system works in twins with and without allergic disease. Healthy volunteers are not specifically targeted. Healthy non-allergic study participants may be found through the course of evaluation for the presence of allergies.
Stanford is currently not accepting patients for this trial. For more information, please contact Kari A Nadeau, MD, PhD, 650-521-7237.
2024-25 Courses
- AI for Beginners
GENE 231 (Spr)
- Aging: Science and Technology for Longevity
GENE 223 (Spr)
- Cloud Computing for Biology and Healthcare
BIOMEDIN 222, CS 273C, GENE 222 (Spr)
- Genomics
GENE 211 (Win)
- Healthcare Entrepreneurship
GENE 134, GENE 234 (Aut)
- How We Age
GENE 229 (Win)
Independent Studies (21)
- Biomedical Informatics Teaching Methods
BIOMEDIN 290 (Aut, Win, Spr)
- Directed Reading and Research
BIOMEDIN 299 (Aut, Win, Spr)
- Directed Reading in Genetics
GENE 299 (Aut, Win, Spr)
- Directed Reading in Immunology
IMMUNOL 299 (Aut, Win, Spr)
- Directed Reading in Stem Cell Biology and Regenerative Medicine
STEMREM 299 (Aut, Win, Spr)
- Directed Study
BIOE 391 (Aut, Win, Spr)
- Early Clinical Experience in Immunology
IMMUNOL 280 (Aut, Win, Spr)
- Graduate Research
BIOPHYS 300 (Aut, Win, Spr)
- Graduate Research
GENE 399 (Aut, Win, Spr)
- Graduate Research
IMMUNOL 399 (Aut, Win, Spr)
- Graduate Research
STEMREM 399 (Aut, Win, Spr)
- Medical Scholars Research
BIOMEDIN 370 (Aut, Win, Spr)
- Medical Scholars Research
GENE 370 (Aut, Win, Spr)
- Medical Scholars Research
STEMREM 370 (Aut, Win, Spr)
- Out-of-Department Advanced Research Laboratory in Bioengineering
BIOE 191X (Aut, Win, Spr)
- Out-of-Department Undergraduate Research
BIO 199X (Aut, Win, Spr)
- Supervised Study
GENE 260 (Aut, Win, Spr)
- Teaching in Immunology
IMMUNOL 290 (Aut, Win, Spr)
- Undergraduate Research
GENE 199 (Aut, Win, Spr)
- Undergraduate Research
IMMUNOL 199 (Aut, Win, Spr)
- Undergraduate Research
STEMREM 199 (Aut, Win, Spr)
Prior Year Courses
2023-24 Courses
- AI for Beginners
GENE 231 (Win)
- Aging: Science and Technology for Longevity
GENE 223 (Spr)
- Cloud Computing for Biology and Healthcare
BIOMEDIN 222, CS 273C, GENE 222 (Spr)
- Genomics
GENE 211 (Win)
- Healthcare Venture Capital
GENE 225 (Aut)
2022-23 Courses
- AI, Genes and Ethics
GENE 213 (Aut)
- Aging: Science and Technology for Longevity
GENE 223 (Win)
- Chronic Disease I: Applications of Novel Advances in Biology and Biotechnology
BIO 109A (Win)
- Chronic Disease II: Applications of Advances in Precision Medicine and Digital Health Technologies
BIO 109B (Spr)
- Cloud Computing for Biology and Healthcare
BIOMEDIN 222, CS 273C, GENE 222 (Spr)
- Engineering Wellness
BIOS 237 (Spr)
- Genomics
GENE 211 (Win)
- LONGEVITY VENTURE CAPITAL
GENE 226 (Spr)
- Stanford SKY Campus Happiness Retreat
BIOS 215 (Aut)
2021-22 Courses
- Cloud Computing for Biology and Healthcare
BIOMEDIN 222, CS 273C, GENE 222 (Spr)
- Genomics
GENE 211 (Win)
- Healthcare Venture Capital
GENE 225 (Spr)
- How We Age
GENE 229 (Win)
- LONGEVITY VENTURE CAPITAL
GENE 226 (Aut)
- Stanford SKY Campus Happiness Retreat
BIOS 215 (Aut, Spr)
Stanford Advisees
-
Med Scholar Project Advisor
Joseph Allen, Long Sha Liu, Isha Mehrotra, Jessalyn Ubellacker
-
Doctoral Dissertation Reader (AC)
Michael Hittle, Eric Sun, Ronghao Zhou
-
Postdoctoral Faculty Sponsor
Abdalla Ahmed, Mohan Babu, Nasim Bararpour, John Cao, Varuna Chander, Faye Chleilat, Sara Fakhretaha Aval, Shubham Gupta, Brady Hislop, Hirotaka Ieki, Linda Lan, Xiangping Lin, Tim MacKenzie, Caleb Mayer, Curtis McGinity, Pardis Miri, Mihir Mongia, Daniel Panyard, Majid Rodgar, M. Reza Sailani, Jou-Ho Shih, Mahasish Shome, Morgan Smith, Mingming Tong, Shannon White, Yue Wu, Yizhou Zhu
-
Doctoral Dissertation Advisor (AC)
Martin Acosta Parra, Siranush Babakhanova, Naiomi Hunter, Jessica Kain, Ziv Lautman
-
Doctoral Dissertation Co-Advisor (AC)
Ben Ehlert
-
Doctoral (Program)
Alexander Johansen
Graduate and Fellowship Programs
-
Biomedical Informatics (PhD Program)
All Publications
-
Nonlinear dynamics of multi-omics profiles during human aging.
Nature aging
2024
Abstract
Aging is a complex process associated with nearly all diseases. Understanding the molecular changes underlying aging and identifying therapeutic targets for aging-related diseases are crucial for increasing healthspan. Although many studies have explored linear changes during aging, the prevalence of aging-related diseases and mortality risk accelerates after specific time points, indicating the importance of studying nonlinear molecular changes. In this study, we performed comprehensive multi-omics profiling on a longitudinal human cohort of 108 participants, aged between 25 years and 75 years. The participants resided in California, United States, and were tracked for a median period of 1.7 years, with a maximum follow-up duration of 6.8 years. The analysis revealed consistent nonlinear patterns in molecular markers of aging, with substantial dysregulation occurring at two major periods occurring at approximately 44 years and 60 years of chronological age. Distinct molecules and functional pathways associated with these periods were also identified, such as immune regulation and carbohydrate metabolism that shifted during the 60-year transition and cardiovascular disease, lipid and alcohol metabolism changes at the 40-year transition. Overall, this research demonstrates that functions and risks of aging-related diseases change nonlinearly across the human lifespan and provides insights into the molecular and biological pathways involved in these changes.
View details for DOI 10.1038/s43587-024-00692-2
View details for PubMedID 39143318
View details for PubMedCentralID 3341616
-
Post-GWAS multiomic functional investigation of the TNIP1 locus in Alzheimer's disease highlights a potential role for GPX3.
Alzheimer's & dementia : the journal of the Alzheimer's Association
2024
Abstract
Recent genome-wide association studies (GWAS) have reported a genetic association with Alzheimer's disease (AD) at the TNIP1/GPX3 locus, but the mechanism is unclear. We used cerebrospinal fluid (CSF) proteomics data to test (n = 137) and replicate (n = 446) the association of glutathione peroxidase 3 (GPX3) with CSF biomarkers (including amyloid and tau) and the GWAS-implicated variants (rs34294852 and rs871269). CSF GPX3 levels decreased with amyloid and tau positivity (analysis of variance P = 1.5 × 10^-5) and higher CSF phosphorylated tau (p-tau) levels (P = 9.28 × 10^-7). The rs34294852 minor allele was associated with decreased GPX3 (P = 0.041). The replication cohort found associations of GPX3 with amyloid and tau positivity (P = 2.56 × 10^-6) and CSF p-tau levels (P = 4.38 × 10^-9). These results suggest variants in the TNIP1 locus may affect the oxidative stress response in AD via altered GPX3 levels.
Cerebrospinal fluid (CSF) glutathione peroxidase 3 (GPX3) levels decreased with amyloid and tau positivity and higher CSF phosphorylated tau. The minor allele of rs34294852 was associated with lower CSF GPX3 levels when also controlling for amyloid and tau category. GPX3 transcript levels in the prefrontal cortex were lower in Alzheimer's disease than controls. rs34294852 is an expression quantitative trait locus for GPX3 in blood, neutrophils, and microglia.
View details for DOI 10.1002/alz.13848
View details for PubMedID 38809917
-
Temporal dynamics of the multi-omic response to endurance exercise training.
Nature
2024; 629 (8010): 174-183
Abstract
Regular exercise promotes whole-body health and prevents disease, but the underlying molecular mechanisms are incompletely understood1-3. Here, the Molecular Transducers of Physical Activity Consortium4 profiled the temporal transcriptome, proteome, metabolome, lipidome, phosphoproteome, acetylproteome, ubiquitylproteome, epigenome and immunome in whole blood, plasma and 18 solid tissues in male and female Rattus norvegicus over eight weeks of endurance exercise training. The resulting data compendium encompasses 9,466 assays across 19 tissues, 25 molecular platforms and 4 training time points. Thousands of shared and tissue-specific molecular alterations were identified, with sex differences found in multiple tissues. Temporal multi-omic and multi-tissue analyses revealed expansive biological insights into the adaptive responses to endurance training, including widespread regulation of immune, metabolic, stress response and mitochondrial pathways. Many changes were relevant to human health, including non-alcoholic fatty liver disease, inflammatory bowel disease, cardiovascular health and tissue injury and recovery. The data and analyses presented in this study will serve as valuable resources for understanding and exploring the multi-tissue molecular effects of endurance training and are provided in a public repository ( https://motrpac-data.org/ ).
View details for DOI 10.1038/s41586-023-06877-w
View details for PubMedID 38693412
View details for PubMedCentralID PMC11062907
-
Longitudinal profiling of the microbiome at four body sites reveals core stability and individualized dynamics during health and disease.
Cell host & microbe
2024
Abstract
To understand the dynamic interplay between the human microbiome and host during health and disease, we analyzed the microbial composition, temporal dynamics, and associations with host multi-omics, immune, and clinical markers of microbiomes from four body sites in 86 participants over 6 years. We found that microbiome stability and individuality are body-site specific and heavily influenced by the host. The stool and oral microbiome are more stable than the skin and nasal microbiomes, possibly due to their interaction with the host and environment. We identify individual-specific and commonly shared bacterial taxa, with individualized taxa showing greater stability. Interestingly, microbiome dynamics correlate across body sites, suggesting systemic dynamics influenced by host-microbial-environment interactions. Notably, insulin-resistant individuals show altered microbial stability and associations among microbiome, molecular markers, and clinical features, suggesting their disrupted interaction in metabolic disease. Our study offers comprehensive views of multi-site microbial dynamics and their relationship with host health and disease.
View details for DOI 10.1016/j.chom.2024.02.012
View details for PubMedID 38479397
-
The importance, challenges, and possible solutions for sharing proteomics data while safeguarding individuals' privacy.
Molecular & cellular proteomics : MCP
2024: 100731
Abstract
Proteomics data sharing has profound benefits at individual level as well as at community level. While data sharing has increased over the years, mostly due to journal and funding agency requirements, the reluctance of researchers with regards to data sharing is evident as many share only the bare minimum dataset required to publish an article. In many cases, proper metadata is missing, essentially making the dataset useless. This behavior can be explained by lack of incentives, insufficient awareness, or a lack of clarity surrounding ethical issues. Through adequate training at research institutes, researchers can realize the benefits associated with data sharing and can accelerate the norm of data sharing for the field of proteomics, as has been the standard in genomics for decades. In this article, we have put together various repository options available for proteomics data. We have also added pros and cons of those repositories to facilitate researchers in selecting the repository most suitable for their data submission. It is also important to note that a few types of proteomics data have the potential to re-identify an individual in certain scenarios. In such cases, extra caution should be taken to remove any personal identifiers before sharing on public repositories. Datasets which will be useless without personal identifiers need to be shared in a controlled access repository so that only authorized researchers can access the data and personal identifiers are kept safe.
View details for DOI 10.1016/j.mcpro.2024.100731
View details for PubMedID 38331191
-
Digital health application integrating wearable data and behavioral patterns improves metabolic health.
NPJ digital medicine
2023; 6 (1): 216
Abstract
The effectiveness of lifestyle interventions in reducing caloric intake and increasing physical activity for preventing Type 2 Diabetes (T2D) has been previously demonstrated. The use of modern technologies can potentially further improve the success of these interventions, promote metabolic health, and prevent T2D at scale. To test this concept, we built a remote program that uses continuous glucose monitoring (CGM) and wearables to make lifestyle recommendations that improve health. We enrolled 2,217 participants with varying degrees of glucose levels (normal range, and prediabetes and T2D ranges), using continuous glucose monitoring (CGM) over 28 days to capture glucose patterns. Participants logged food intake, physical activity, and body weight via a smartphone app that integrated wearables data and provided daily insights, including overlaying glucose patterns with activity and food intake, macronutrient breakdown, glycemic index (GI), glycemic load (GL), and activity measures. The app furthermore provided personalized recommendations based on users' preferences, goals, and observed glycemic patterns. Users could interact with the app for an additional 2 months without CGM. Here we report significant improvements in hyperglycemia, glucose variability, and hypoglycemia, particularly in those who were not diabetic at baseline. Body weight decreased in all groups, especially those who were overweight or obese. Healthy eating habits improved significantly, with reduced daily caloric intake and carbohydrate-to-calorie ratio and increased intake of protein, fiber, and healthy fats relative to calories. These findings suggest that lifestyle recommendations, in addition to behavior logging and CGM data integration within a mobile app, can enhance the metabolic health of both nondiabetic and T2D individuals, leading to healthier lifestyle choices. This technology can be a valuable tool for T2D prevention and treatment.
View details for DOI 10.1038/s41746-023-00956-y
View details for PubMedID 38001287
View details for PubMedCentralID 3891203
-
Wearable Devices: Implications for Precision Medicine and the Future of Health Care.
Annual review of medicine
2023
Abstract
Wearable devices are integrated analytical units equipped with sensitive physical, chemical, and biological sensors capable of noninvasive and continuous monitoring of vital physiological parameters. Recent advances in disciplines including electronics, computation, and material science have resulted in affordable and highly sensitive wearable devices that are routinely used for tracking and managing health and well-being. Combined with longitudinal monitoring of physiological parameters, wearables are poised to transform the early detection, diagnosis, and treatment/management of a range of clinical conditions. Smartwatches are the most commonly used wearable devices and have already demonstrated valuable biomedical potential in detecting clinical conditions such as arrhythmias, Lyme disease, inflammation, and, more recently, COVID-19 infection. Despite significant clinical promise shown in research settings, there remain major hurdles in translating the medical uses of wearables to the clinic. There is a clear need for more effective collaboration among stakeholders, including users, data scientists, clinicians, payers, and governments, to improve device security, user privacy, data standardization, regulatory approval, and clinical validity. This review examines the potential of wearables to offer affordable and reliable measures of physiological status that are on par with FDA-approved specialized medical devices. We briefly examine studies where wearables proved critical for the early detection of acute and chronic clinical conditions with a particular focus on cardiovascular disease, viral infections, and mental health. Finally, we discuss current obstacles to the clinical implementation of wearables and provide perspectives on their potential to deliver increasingly personalized proactive health care across a wide variety of conditions. Expected final online publication date for the Annual Review of Medicine, Volume 75 is January 2024. 
Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
View details for DOI 10.1146/annurev-med-052422-020437
View details for PubMedID 37983384
-
Dietary fiber deficiency in individuals with metabolic syndrome: a review.
Current opinion in clinical nutrition and metabolic care
2023
Abstract
PURPOSE OF REVIEW: Metabolic syndrome (MetS) refers to a group of risk factors, which increase the risk of cardiovascular disease (CVD), type 2 diabetes (T2D), and other chronic diseases. Dietary fiber has been shown to mitigate many of the effects of various risk factors associated with MetS. Our review summarizes the recent findings on the association between dietary fiber deficiency and MetS. RECENT FINDINGS: A number of studies have shown that dietary fiber deficiency is associated with an increased risk of MetS. The main mechanisms by which dietary fiber may reduce the risk of MetS include reduction of cholesterol levels; improvement of blood sugar control; reduction of inflammation; and promotion of weight loss. SUMMARY: Literature suggests that a deficiency in dietary fiber consumption is a risk factor for MetS. An increase in dietary fiber intake may help to reduce the risk of developing MetS and its associated chronic diseases.
View details for DOI 10.1097/MCO.0000000000000971
View details for PubMedID 37751374
-
Dynamic lipidome alterations associated with human health, disease and ageing.
Nature metabolism
2023
Abstract
Lipids can be of endogenous or exogenous origin and affect diverse biological functions, including cell membrane maintenance, energy management and cellular signalling. Here, we report >800 lipid species, many of which are associated with health-to-disease transitions in diabetes, ageing and inflammation, as well as cytokine-lipidome networks. We performed comprehensive longitudinal lipidomic profiling and analysed >1,500 plasma samples from 112 participants followed for up to 9 years (average 3.2 years) to define the distinct physiological roles of complex lipid subclasses, including large and small triacylglycerols, ester- and ether-linked phosphatidylethanolamines, lysophosphatidylcholines, lysophosphatidylethanolamines, cholesterol esters and ceramides. Our findings reveal dynamic changes in the plasma lipidome during respiratory viral infection, insulin resistance and ageing, suggesting that lipids may have roles in immune homoeostasis and inflammation regulation. Individuals with insulin resistance exhibit disturbed immune homoeostasis, altered associations between lipids and clinical markers, and accelerated changes in specific lipid subclasses during ageing. Our dataset based on longitudinal deep lipidome profiling offers insights into personalized ageing, metabolic health and inflammation, potentially guiding future monitoring and intervention strategies.
View details for DOI 10.1038/s42255-023-00880-1
View details for PubMedID 37697054
View details for PubMedCentralID 7736650
-
Biomarkers of aging for the identification and evaluation of longevity interventions.
Cell
2023; 186 (18): 3758-3775
Abstract
With the rapid expansion of aging biology research, the identification and evaluation of longevity interventions in humans have become key goals of this field. Biomarkers of aging are critically important tools in achieving these objectives over realistic time frames. However, the current lack of standards and consensus on the properties of a reliable aging biomarker hinders their further development and validation for clinical applications. Here, we advance a framework for the terminology and characterization of biomarkers of aging, including classification and potential clinical use cases. We discuss validation steps and highlight ongoing challenges as potential areas in need of future research. This framework sets the stage for the development of valid biomarkers of aging and their ultimate utilization in clinical trials and practice.
View details for DOI 10.1016/j.cell.2023.08.003
View details for PubMedID 37657418
-
Advances and prospects for the Human BioMolecular Atlas Program (HuBMAP).
Nature cell biology
2023
Abstract
The Human BioMolecular Atlas Program (HuBMAP) aims to create a multi-scale spatial atlas of the healthy human body at single-cell resolution by applying advanced technologies and disseminating resources to the community. As the HuBMAP moves past its first phase, creating ontologies, protocols and pipelines, this Perspective introduces the production phase: the generation of reference spatial maps of functional tissue units across many organs from diverse populations and the creation of mapping tools and infrastructure to advance biomedical research.
View details for DOI 10.1038/s41556-023-01194-w
View details for PubMedID 37468756
View details for PubMedCentralID 8238499
-
Organization of the human intestine at single-cell resolution.
Nature
2023; 619 (7970): 572-584
Abstract
The intestine is a complex organ that promotes digestion, extracts nutrients, participates in immune surveillance, maintains critical symbiotic relationships with microbiota and affects overall health1. The intestine has a length of over nine metres, along which there are differences in structure and function2. The localization of individual cell types, cell type development trajectories and detailed cell transcriptional programs probably drive these differences in function. Here, to better understand these differences, we evaluated the organization of single cells using multiplexed imaging and single-nucleus RNA and open chromatin assays across eight different intestinal sites from nine donors. Through systematic analyses, we find cell compositions that differ substantially across regions of the intestine and demonstrate the complexity of epithelial subtypes, and find that the same cell types are organized into distinct neighbourhoods and communities, highlighting distinct immunological niches that are present in the intestine. We also map gene regulatory differences in these cells that are suggestive of a regulatory differentiation cascade, and associate intestinal disease heritability with specific cell types. These results describe the complexity of the cell composition, regulation and organization for this organ, and serve as an important reference map for understanding human biology and disease.
View details for DOI 10.1038/s41586-023-05915-x
View details for PubMedID 37468586
View details for PubMedCentralID PMC10356619
-
Dynamic monitoring of thousands of biochemical analytes using microsampling
NATURE BIOMEDICAL ENGINEERING
2023
View details for DOI 10.1038/s41551-023-01005-5
View details for Web of Science ID 000920584900001
View details for PubMedID 36697922
-
Multi-omics microsampling for the profiling of lifestyle-associated changes in health.
Nature biomedical engineering
2023
Abstract
Current healthcare practices are reactive and use limited physiological and clinical information, often collected months or years apart. Moreover, the discovery and profiling of blood biomarkers in clinical and research settings are constrained by geographical barriers, the cost and inconvenience of in-clinic venepuncture, low sampling frequency and the low depth of molecular measurements. Here we describe a strategy for the frequent capture and analysis of thousands of metabolites, lipids, cytokines and proteins in 10 μl of blood alongside physiological information from wearable sensors. We show the advantages of such frequent and dense multi-omics microsampling in two applications: the assessment of the reactions to a complex mixture of dietary interventions, to discover individualized inflammatory and metabolic responses; and deep individualized profiling, to reveal large-scale molecular fluctuations as well as thousands of molecular relationships associated with intra-day physiological variations (in heart rate, for example) and with the levels of clinical biomarkers (specifically, glucose and cortisol) and of physical activity. Combining wearables and multi-omics microsampling for frequent and scalable omics may facilitate dynamic health profiling and biomarker discovery.
View details for DOI 10.1038/s41551-022-00999-8
View details for PubMedID 36658343
-
Recurrent repeat expansions in human cancer genomes.
Nature
2022
Abstract
Expansion of a single repetitive DNA sequence, termed a tandem repeat (TR), is known to cause more than 50 diseases1,2. However, repeat expansions are often not explored beyond neurological and neurodegenerative disorders. In some cancers, mutations accumulate in short tracts of TRs, a phenomenon termed microsatellite instability; however, larger repeat expansions have not been systematically analysed in cancer3-8. Here we identified TR expansions in 2,622 cancer genomes spanning 29 cancer types. In seven cancer types, we found 160 recurrent repeat expansions (rREs), most of which (155/160) were subtype specific. We found that rREs were non-uniformly distributed in the genome with enrichment near candidate cis-regulatory elements, suggesting a potential role in gene regulation. One rRE, a GAAA-repeat expansion, located near a regulatory element in the first intron of UGT2B7 was detected in 34% of renal cell carcinoma samples and was validated by long-read DNA sequencing. Moreover, in preliminary experiments, treating cells that harbour this rRE with a GAAA-targeting molecule led to a dose-dependent decrease in cell proliferation. Overall, our results suggest that rREs may be an important but unexplored source of genetic variation in human cancer, and we provide a comprehensive catalogue for further study.
View details for DOI 10.1038/s41586-022-05515-1
View details for PubMedID 36517591
-
Distinct factors associated with short-term and long-term weight loss induced by low-fat or low-carbohydrate diet intervention.
Cell reports. Medicine
2022: 100870
Abstract
To understand what determines the success of short- and long-term weight loss, we conduct a secondary analysis of dietary, metabolic, and molecular data collected from 609 participants before, during, and after a 1-year weight-loss intervention with either a healthy low-carbohydrate (HLC) or a healthy low-fat (HLF) diet. Through systematic analysis of multidomain datasets, we find that dietary adherence and diet quality, not just caloric restriction, are important for short-term weight loss in both diets. Interestingly, we observe minimal dietary differences between those who succeeded in long-term weight loss and those who did not. Instead, proteomic and gut microbiota signatures significantly differ between these two groups at baseline. Moreover, the baseline respiratory quotient may suggest a specific diet for better weight-loss outcomes. Overall, the identification of these dietary, molecular, and metabolic factors, common or unique to the HLC and HLF diets, provides a roadmap for developing individualized weight-loss strategies.
View details for DOI 10.1016/j.xcrm.2022.100870
View details for PubMedID 36516846
-
Longitudinally tracking personal physiomes for precision management of childhood epilepsy.
PLOS digital health
2022; 1 (12): e0000161
Abstract
Our current understanding of human physiology and activities is largely derived from sparse and discrete individual clinical measurements. To achieve precise, proactive, and effective health management of an individual, longitudinal and dense tracking of personal physiomes and activities is required, which is only feasible by utilizing wearable biosensors. As a pilot study, we implemented a cloud computing infrastructure to integrate wearable sensors, mobile computing, digital signal processing, and machine learning to improve early detection of seizure onsets in children. We recruited 99 children diagnosed with epilepsy and longitudinally tracked them at single-second resolution using a wearable wristband, and prospectively acquired more than one billion data points. This unique dataset offered us an opportunity to quantify physiological dynamics (e.g., heart rate, stress response) across age groups and to identify physiological irregularities upon epilepsy onset. The high-dimensional personal physiome and activity profiles displayed a clustering pattern anchored by patient age groups. These signatory patterns included strong age and sex-specific effects on varying circadian rhythms and stress responses across major childhood developmental stages. For each patient, we further compared the physiological and activity profiles associated with seizure onsets with the personal baseline and developed a machine learning framework to accurately capture these onset moments. The performance of this framework was further replicated in another independent patient cohort. We next referenced our predictions with the electroencephalogram (EEG) signals on selected patients and demonstrated that our approach could detect subtle seizures not recognized by humans and could detect seizures prior to clinical onset. Our work demonstrated the feasibility of a real-time mobile infrastructure in a clinical setting, which has the potential to be valuable in caring for epileptic patients. Extension of such a system has the potential to be leveraged as a health management device or longitudinal phenotyping tool in clinical cohort studies.
View details for DOI 10.1371/journal.pdig.0000161
View details for PubMedID 36812648
View details for PubMedCentralID PMC9931296
-
Identification of non-coding silencer elements and their regulation of gene expression.
Nature reviews. Molecular cell biology
2022
Abstract
Cell type- and differentiation-specific gene expression is precisely controlled by genomic non-coding regulatory elements (NCREs), which include promoters, enhancers, silencers and insulators. It is estimated that more than 90% of disease-associated sequence variants lie within the non-coding part of the genome, potentially affecting the activity of NCREs. Consequently, the functional annotation of NCREs is a major driver of genome research. Compared with our knowledge of other regulatory elements, our knowledge of silencers, which are NCREs that repress the transcription of genes, is largely lacking. Multiple recent studies have reported large-scale identification of transcription silencer elements, indicating their importance in homeostasis and disease. In this Review, we discuss the biology of silencers, including methods for their discovery, epigenomic and other characteristics, and modes of function of silencers. We also discuss important silencer-relevant considerations in assessing data from genome-wide association studies and shed light on potential future silencer-based therapeutic applications.
View details for DOI 10.1038/s41580-022-00549-9
View details for PubMedID 36344659
-
Performance effectiveness of vital parameter combinations for early warning of sepsis-an exhaustive study using machine learning
JAMIA OPEN
2022; 5 (4): ooac080
Abstract
To carry out exhaustive data-driven computations for the performance of noninvasive vital signs heart rate (HR), respiratory rate (RR), peripheral oxygen saturation (SpO2), and temperature (Temp), considered both independently and in all possible combinations, for early detection of sepsis. By extracting features interpretable by clinicians, we applied Gradient Boosted Decision Tree machine learning on a dataset of 2630 patients to build 240 models. Validation was performed on a geographically distinct dataset. Relative to onset, predictions were clocked as per 16 pairs of monitoring intervals and prediction times, and the outcomes were ranked. The combination of HR and Temp was found to be a minimal feature set yielding maximal predictability with area under receiver operating curve 0.94, sensitivity of 0.85, and specificity of 0.90. Whereas HR and RR each directly enhance prediction, the effects of SpO2 and Temp are significant only when combined with HR or RR. In benchmarking relative to standard methods Systemic Inflammatory Response Syndrome (SIRS), National Early Warning Score (NEWS), and quick-Sequential Organ Failure Assessment (qSOFA), Vital-SEP outperformed all 3 of them. It can be concluded that, using intensive care unit data, even 2 vital signs are adequate to predict sepsis up to 6 h in advance with promising accuracy comparable to standard scoring methods and other sepsis predictive tools reported in literature. Vital-SEP can be used for fast-track prediction especially in limited resource hospital settings where laboratory based hematologic or biochemical assays may be unavailable, inaccurate, or entail clinically inordinate delays. A prospective study is essential to determine the clinical impact of the proposed sepsis prediction model and evaluate other outcomes such as mortality and duration of hospital stay.
View details for DOI 10.1093/jamiaopen/ooac080
View details for Web of Science ID 000868349400001
View details for PubMedID 36267121
View details for PubMedCentralID PMC9566305
-
Systems analysis of de novo mutations in congenital heart diseases identified a protein network in the hypoplastic left heart syndrome.
Cell systems
2022
Abstract
Despite a strong genetic component, only a few genes have been identified in congenital heart diseases (CHDs). We introduced systems analyses to uncover the hidden organization on biological networks of mutations in CHDs and leveraged network analysis to integrate the protein interactome, patient exomes, and single-cell transcriptomes of the developing heart. We identified a CHD network regulating heart development and observed that a sub-network also regulates fetal brain development, thereby providing mechanistic insights into the clinical comorbidities between CHDs and neurodevelopmental conditions. At a small scale, we experimentally verified uncharacterized cardiac functions of several proteins. At a global scale, our study revealed developmental dynamics of the network and observed its association with the hypoplastic left heart syndrome (HLHS), which was further supported by the dysregulation of the network in HLHS endothelial cells. Overall, our work identified previously uncharacterized CHD factors and provided a generalizable framework applicable to studying many other complex diseases. A record of this paper's Transparent Peer Review process is included in the supplemental information.
View details for DOI 10.1016/j.cels.2022.09.001
View details for PubMedID 36167075
-
Chimpanzee and pig-tailed macaque iPSCs: Improved culture and generation of primate cross-species embryos.
Cell reports
2022; 40 (9): 111264
Abstract
As our closest living relatives, non-human primates uniquely enable explorations of human health, disease, development, and evolution. Considerable effort has thus been devoted to generating induced pluripotent stem cells (iPSCs) from multiple non-human primate species. Here, we establish improved culture methods for chimpanzee (Pan troglodytes) and pig-tailed macaque (Macaca nemestrina) iPSCs. Such iPSCs spontaneously differentiate in conventional culture conditions, but can be readily propagated by inhibiting endogenous WNT signaling. As a unique functional test of these iPSCs, we injected them into the pre-implantation embryos of another non-human species, rhesus macaques (Macaca mulatta). Ectopic expression of gene BCL2 enhances the survival and proliferation of chimpanzee and pig-tailed macaque iPSCs within the pre-implantation embryo, although the identity and long-term contribution of the transplanted cells warrants further investigation. In summary, we disclose transcriptomic and proteomic data, cell lines, and cell culture resources that may be broadly enabling for non-human primate iPSCs research.
View details for DOI 10.1016/j.celrep.2022.111264
View details for PubMedID 36044843
-
massDatabase: utilities for the operation of the public compound and pathway database.
Bioinformatics (Oxford, England)
2022
Abstract
SUMMARY: One of the major challenges in LC-MS data is converting many metabolic feature entries to biological function information, such as metabolite annotation and pathway enrichment, which are based on the compound and pathway databases. Multiple online databases have been developed. However, no tool has been developed for operating all these databases for biological analysis. Therefore, we developed massDatabase, an R package that operates the online public databases and combines with other tools for streamlined compound annotation and pathway enrichment. massDatabase is a flexible, simple, and powerful tool that can be installed on all platforms, allowing the users to leverage all the online public databases for biological function mining. A detailed tutorial and a case study are provided in the Supplementary Materials. AVAILABILITY AND IMPLEMENTATION: https://massdatabase.tidymass.org/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
View details for DOI 10.1093/bioinformatics/btac546
View details for PubMedID 35944213
-
TidyMass an object-oriented reproducible analysis framework for LC-MS data.
Nature communications
2022; 13 (1): 4365
Abstract
Reproducibility, traceability, and transparency have been long-standing issues for metabolomics data analysis. Multiple tools have been developed, but limitations still exist. Here, we present the tidyMass project ( https://www.tidymass.org/ ), a comprehensive R-based computational framework that can achieve the traceable, shareable, and reproducible workflow needs of data processing and analysis for LC-MS-based untargeted metabolomics. TidyMass is an ecosystem of R packages that share an underlying design philosophy, grammar, and data structure, which provides a comprehensive, reproducible, and object-oriented computational framework. The modular architecture makes tidyMass a highly flexible and extensible tool, which other users can improve and integrate with other tools to customize their own pipeline.
View details for DOI 10.1038/s41467-022-32155-w
View details for PubMedID 35902589
-
Single-cell analyses define a continuum of cell state and composition changes in the malignant transformation of polyps to colorectal cancer.
Nature genetics
2022
Abstract
To chart cell composition and cell state changes that occur during the transformation of healthy colon to precancerous adenomas to colorectal cancer (CRC), we generated single-cell chromatin accessibility profiles and single-cell transcriptomes from 1,000 to 10,000 cells per sample for 48 polyps, 27 normal tissues and 6 CRCs collected from patients with or without germline APC mutations. A large fraction of polyp and CRC cells exhibit a stem-like phenotype, and we define a continuum of epigenetic and transcriptional changes occurring in these stem-like cells as they progress from homeostasis to CRC. Advanced polyps contain increasing numbers of stem-like cells, regulatory T cells and a subtype of pre-cancer-associated fibroblasts. In the cancerous state, we observe T cell exhaustion, RUNX1-regulated cancer-associated fibroblasts and increasing accessibility associated with HNF4A motifs in epithelia. DNA methylation changes in sporadic CRC are strongly anti-correlated with accessibility changes along this continuum, further identifying regulatory markers for molecular staging of polyps.
View details for DOI 10.1038/s41588-022-01088-x
View details for PubMedID 35726067
-
A genome-wide atlas of recurrent repeat expansions in human cancer genomes
AMER ASSOC CANCER RESEARCH. 2022
View details for Web of Science ID 000892509504476
-
Precision environmental health monitoring by longitudinal exposome and multi-omics profiling.
Genome research
2022
Abstract
Conventional environmental health studies have primarily focused on limited environmental stressors at the population level, which lacks the power to dissect the complexity and heterogeneity of individualized environmental exposures. Here, as a pilot case study, we integrated deep-profiled longitudinal personal exposome and internal multi-omics to systematically investigate how the exposome shapes a single individual's phenome. We annotated thousands of chemical and biological components in the personal exposome cloud and found they were significantly correlated with thousands of internal biomolecules, which was further cross-validated using corresponding clinical data. Our results showed that agrochemicals and fungi predominated in the highly diverse and dynamic personal exposome, and the biomolecules and pathways related to the individual's immune system, kidney, and liver were highly associated with the personal external exposome. Overall, this data-driven longitudinal monitoring study shows the potential dynamic interactions between the personal exposome and internal multi-omics, as well as the impact of the exposome on precision health by producing abundant testable hypotheses.
View details for DOI 10.1101/gr.276521.121
View details for PubMedID 35667843
-
Multiomic analysis reveals cell-type-specific molecular determinants of COVID-19 severity.
Cell systems
2022
Abstract
The determinants of severe COVID-19 in healthy adults are poorly understood, which limits the opportunity for early intervention. We present a multiomic analysis using machine learning to characterize the genomic basis of COVID-19 severity. We use single-cell multiome profiling of human lungs to link genetic signals to cell-type-specific functions. We discover >1,000 risk genes across 19 cell types, which account for 77% of the SNP-based heritability for severe disease. Genetic risk is particularly focused within natural killer (NK) cells and T cells, placing the dysfunction of these cells upstream of severe disease. Mendelian randomization and single-cell profiling of human NK cells support the role of NK cells and further localize genetic risk to CD56bright NK cells, which are key cytokine producers during the innate immune response. Rare variant analysis confirms the enrichment of severe-disease-associated genetic variation within NK-cell risk genes. Our study provides insights into the pathogenesis of severe COVID-19 with potential therapeutic targets.
View details for DOI 10.1016/j.cels.2022.05.007
View details for PubMedID 35690068
-
Global, distinctive, and personal changes in molecular and microbial profiles by specific fibers in humans.
Cell host & microbe
2022
Abstract
Dietary fibers act through the microbiome to improve cardiovascular health and prevent metabolic disorders and cancer. To understand the health benefits of dietary fiber supplementation, we investigated two popular purified fibers, arabinoxylan (AX) and long-chain inulin (LCI), and a mixture of five fibers. We present multiomic signatures of metabolomics, lipidomics, proteomics, metagenomics, a cytokine panel, and clinical measurements on healthy and insulin-resistant participants. Each fiber is associated with fiber-dependent biochemical and microbial responses. AX consumption associates with a significant reduction in LDL and an increase in bile acids, contributing to its observed cholesterol reduction. LCI is associated with an increase in Bifidobacterium. However, at the highest LCI dose, there is increased inflammation and elevation in the liver enzyme alanine aminotransferase. This study yields insights into the effects of fiber supplementation and the mechanisms behind fiber-induced cholesterol reduction, and it shows effects of individual, purified fibers on the microbiome.
View details for DOI 10.1016/j.chom.2022.03.036
View details for PubMedID 35483363
-
Adverse childhood experiences, diabetes and associated conditions, preventive care practices and healthcare access: A population-based study.
Preventive medicine
2022: 107044
Abstract
Our objective was to examine associations between Adverse Childhood Experiences (ACEs) and diabetes mellitus, including related conditions and preventive care practices. We used data from the Behavioral Risk Factor Surveillance System (BRFSS) 2009-2012, a cross-sectional, population-based survey, to assess ACEs, diabetes, and healthcare access in 179,375 adults. In those with diabetes (n = 21,007), we assessed the association of ACEs with myocardial infarction, stroke, and five Healthy People 2020 (HP2020) diabetes-related preventive-care objectives (n = 13,152). Healthcare access indicators included lack of a regular healthcare provider, insurance, and difficulty affording healthcare. Regression analyses adjusted for age, sex, and race. The adjusted odds ratio (AOR) of diabetes increased in a stepwise fashion by ACE exposure, ranging from 1.2 (95% CI 1.1-1.3) for 1 ACE to 1.7 (95% CI 1.6-1.9) for ≥4 ACEs, versus having no ACEs. In persons with diabetes, those with ≥4 ACEs had an elevated adjusted odds of myocardial infarction (AOR = 1.6, 95% CI 1.2-2.0) and stroke (AOR = 1.8, 95% CI 1.3-2.4), versus having no ACEs. ACEs were also associated with a reduction in the adjusted percent of HP2020 diabetes objectives met: 72.9% (95% CI 71.3-74.5) for those with no ACEs versus only 66.5% (95% CI 63.8-69.3%) for those with ≥4 ACEs (p = 0.0002). Finally, ACEs predicted worse healthcare access in a stepwise fashion for all indicators. In conclusion, ACEs are associated with greater prevalence of diabetes and associated conditions, and with meeting fewer HP2020 prevention goals. ACEs screening and trauma-informed care practices are thus recommended.
View details for DOI 10.1016/j.ypmed.2022.107044
View details for PubMedID 35398366
-
Genome-wide identification of the genetic basis of amyotrophic lateral sclerosis.
Neuron
2022
Abstract
Amyotrophic lateral sclerosis (ALS) is a complex disease that leads to motor neuron death. Despite heritability estimates of 52%, genome-wide association studies (GWASs) have discovered relatively few loci. We developed a machine learning approach called RefMap, which integrates functional genomics with GWAS summary statistics for gene discovery. With transcriptomic and epigenetic profiling of motor neurons derived from induced pluripotent stem cells (iPSCs), RefMap identified 690 ALS-associated genes that represent a 5-fold increase in recovered heritability. Extensive conservation, transcriptome, network, and rare variant analyses demonstrated the functional significance of candidate genes in healthy and diseased motor neurons and brain tissues. Genetic convergence between common and rare variation highlighted KANK1 as a new ALS gene. Reproducing KANK1 patient mutations in human neurons led to neurotoxicity and demonstrated that TDP-43 mislocalization, a hallmark pathology of ALS, is downstream of axonal dysfunction. RefMap can be readily applied to other complex diseases.
View details for DOI 10.1016/j.neuron.2021.12.019
View details for PubMedID 35045337
-
Phenotypic characteristics of peripheral immune cells of Myalgic encephalomyelitis/chronic fatigue syndrome via transmission electron microscopy: A pilot study.
PloS one
2022; 17 (8): e0272703
Abstract
Myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) is a complex chronic multi-systemic disease characterized by extreme fatigue that is not improved by rest, and worsens after exertion, whether physical or mental. Previous studies have shown ME/CFS-associated alterations in the immune system and mitochondria. We used transmission electron microscopy (TEM) to investigate the morphology and ultrastructure of unstimulated and stimulated ME/CFS immune cells and their intracellular organelles, including mitochondria. PBMCs from four participants were studied: a pair of identical twins discordant for moderate ME/CFS, as well as two age- and gender-matched unrelated subjects, one with an extremely severe form of ME/CFS and the other healthy. TEM analysis of CD3/CD28-stimulated T cells suggested a significant increase in the levels of apoptotic and necrotic cell death in T cells from ME/CFS patients (over 2-fold). Stimulated T cells of ME/CFS patients also had higher numbers of swollen mitochondria. We also found a large increase in intracellular giant lipid droplet-like organelles in the stimulated PBMCs from the extremely severe ME/CFS patient, potentially indicative of a lipid storage disorder. Lastly, we observed a slight increase in platelet aggregation in stimulated cells, suggestive of a possible role of platelet activity in ME/CFS pathophysiology and disease severity. These results indicate extensive morphological alterations in the cellular and mitochondrial phenotypes of ME/CFS patients' immune cells and suggest new insights into ME/CFS biology.
View details for DOI 10.1371/journal.pone.0272703
View details for PubMedID 35943990
-
Real-time alerting system for COVID-19 and other stress events using wearable data.
Nature medicine
2021
Abstract
Early detection of infectious diseases is crucial for reducing transmission and facilitating early intervention. In this study, we built a real-time smartwatch-based alerting system that detects aberrant physiological and activity signals (heart rates and steps) associated with the onset of early infection and implemented this system in a prospective study. In a cohort of 3,318 participants, of whom 84 were infected with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), this system generated alerts for pre-symptomatic and asymptomatic SARS-CoV-2 infection in 67 (80%) of the infected individuals. Pre-symptomatic signals were observed at a median of 3 days before symptom onset. Examination of detailed survey responses provided by the participants revealed that other respiratory infections as well as events not associated with infection, such as stress, alcohol consumption and travel, could also trigger alerts, albeit at a much lower mean frequency (1.15 alert days per person compared to 3.42 alert days per person for coronavirus disease 2019 cases). Thus, analysis of smartwatch signals by an online detection algorithm provides advance warning of SARS-CoV-2 infection in a high percentage of cases. This study shows that a real-time alerting system can be used for early detection of infection and other stressors and employed on an open-source platform that is scalable to millions of users.
View details for DOI 10.1038/s41591-021-01593-2
View details for PubMedID 34845389
-
A scalable, secure, and interoperable platform for deep data-driven health management.
Nature communications
2021; 12 (1): 5757
Abstract
The large amount of biomedical data derived from wearable sensors, electronic health records, and molecular profiling (e.g., genomics data) is rapidly transforming our healthcare systems. The increasing scale and scope of biomedical data not only is generating enormous opportunities for improving health outcomes but also raises new challenges ranging from data acquisition and storage to data analysis and utilization. To meet these challenges, we developed the Personal Health Dashboard (PHD), which utilizes state-of-the-art security and scalability technologies to provide an end-to-end solution for big biomedical data analytics. The PHD platform is an open-source software framework that can be easily configured and deployed to any big data health project to store, organize, and process complex biomedical data sets, support real-time data analysis at both the individual level and the cohort level, and ensure participant privacy at every step. In addition to presenting the system, we illustrate the use of the PHD framework for large-scale applications in emerging multi-omics disease studies, such as collecting and visualization of diverse data types (wearable, clinical, omics) at a personal level, investigation of insulin resistance, and an infrastructure for the detection of presymptomatic COVID-19.
View details for DOI 10.1038/s41467-021-26040-1
View details for PubMedID 34599181
-
Chromatin accessibility associates with protein-RNA correlation in human cancer.
Nature communications
2021; 12 (1): 5732
Abstract
Although alterations in chromatin structure are known to exist in tumors, how these alterations relate to molecular phenotypes in cancer remains to be demonstrated. Multi-omics profiling of human tumors can provide insight into how alterations in chromatin structure are propagated through the pathway of gene expression to result in malignant protein expression. We applied multi-omics profiling of chromatin accessibility, RNA abundance, and protein abundance to 36 human thyroid cancer primary tumors, metastases, and patient-matched normal tissue. Through quantification of chromatin accessibility associated with active transcription units and global protein expression, we identify a local chromatin structure that is highly correlated with coordinated RNA and protein expression. In particular, we identify enhancers located within gene bodies as predictive of correlated RNA and protein expression, independent of overall transcriptional activity. To demonstrate the generalizability of these findings, we also identify similar results in an independent cohort of human breast cancers. Taken together, these analyses suggest that local enhancers, rather than distal enhancers, are likely most predictive of cancer gene expression phenotypes. This allows for identification of potential targets for cancer therapeutic approaches and reinforces the utility of multi-omics profiling as a methodology to understand human disease.
View details for DOI 10.1038/s41467-021-25872-1
View details for PubMedID 34593797
-
metID: a R package for automatable compound annotation for LC-MS-based data.
Bioinformatics (Oxford, England)
2021
Abstract
SUMMARY: Accurate and efficient compound annotation is a long-standing challenge for LC-MS-based data (e.g., untargeted metabolomics and exposomics). Substantial efforts have been devoted to overcoming this obstacle, whereas current tools are limited by the sources of spectral information used (in-house and public databases) and are not automated and streamlined. Therefore, we developed metID, an R package that combines information from all major databases for comprehensive and streamlined compound annotation. metID is a flexible, simple, and powerful tool that can be installed on all platforms, allowing the compound annotation process to be fully automatic and reproducible. A detailed tutorial and a case study are provided in Supplementary Materials. AVAILABILITY AND IMPLEMENTATION: https://jaspershen.github.io/metID. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
View details for DOI 10.1093/bioinformatics/btab583
View details for PubMedID 34432001
-
Five-year pediatric use of a digital wearable fitness device: lessons from a pilot case study.
JAMIA open
2021; 4 (3): ooab054
Abstract
Objectives: Wearable fitness devices are increasingly being used by the general population, with many new applications being proposed for healthy adults as well as for adults with chronic diseases. Fewer, if any, studies of these devices have been conducted in healthy adolescents and teenagers, especially over a long period of time. The goal of this work was to document the successes and challenges involved in 5 years of wearable fitness device use in a pediatric case study. Materials and methods: Comparison of 5 years of step counts and minutes asleep from a teenaged girl and her father. Results: At 60 months, this may be the longest reported pediatric study involving a wearable fitness device, and the first simultaneously involving a parent and a child. We find step counts to be significantly higher for both the adult and teen on school/work days, along with less sleep. The teen walked significantly less towards the end of the 5-year study. Surprisingly, many of the adult's and teen's sleeping and step counts were correlated, possibly due to coordinated behaviors. Discussion: We end with several recommendations for pediatricians and device manufacturers, including the need for constant adjustments of stride length and calorie counts as teens are growing. Conclusion: With periodic adjustments for growth, this pilot study shows these devices can be used for more accurate and consistent measurements in adolescents and teenagers over longer periods of time, to potentially promote healthy behaviors.
View details for DOI 10.1093/jamiaopen/ooab054
View details for PubMedID 34350390
-
Wearable sensors enable personalized predictions of clinical laboratory measurements.
Nature medicine
2021
Abstract
Vital signs, including heart rate and body temperature, are useful in detecting or monitoring medical conditions, but are typically measured in the clinic and require follow-up laboratory testing for more definitive diagnoses. Here we examined whether vital signs as measured by consumer wearable devices (that is, continuously monitored heart rate, body temperature, electrodermal activity and movement) can predict clinical laboratory test results using machine learning models, including random forest and Lasso models. Our results demonstrate that vital sign data collected from wearables give a more consistent and precise depiction of resting heart rate than do measurements taken in the clinic. Vital sign data collected from wearables can also predict several clinical laboratory measurements with lower prediction error than predictions made using clinically obtained vital sign measurements. The length of time over which vital signs are monitored and the proximity of the monitoring period to the date of prediction play a critical role in the performance of the machine learning models. These results demonstrate the value of commercial wearable devices for continuous and longitudinal assessment of physiological measurements that today can be measured only with clinical laboratory tests.
View details for DOI 10.1038/s41591-021-01339-0
View details for PubMedID 34031607
-
Pre-symptomatic detection of COVID-19 from smartwatch data.
Nature biomedical engineering
2020
Abstract
Consumer wearable devices that continuously measure vital signs have been used to monitor the onset of infectious disease. Here, we show that data from consumer smartwatches can be used for the pre-symptomatic detection of coronavirus disease 2019 (COVID-19). We analysed physiological and activity data from 32 individuals infected with COVID-19, identified from a cohort of nearly 5,300 participants, and found that 26 of them (81%) had alterations in their heart rate, number of daily steps or time asleep. Of the 25 cases of COVID-19 with detected physiological alterations for which we had symptom information, 22 were detected before (or at) symptom onset, with four cases detected at least nine days earlier. Using retrospective smartwatch data, we show that 63% of the COVID-19 cases could have been detected before symptom onset in real time via a two-tiered warning system based on the occurrence of extreme elevations in resting heart rate relative to the individual baseline. Our findings suggest that activity tracking and health monitoring via consumer wearable devices may be used for the large-scale, real-time detection of respiratory infections, often pre-symptomatically.
View details for DOI 10.1038/s41551-020-00640-6
View details for PubMedID 33208926
-
An integrative ENCODE resource for cancer genomics.
Nature communications
2020; 11 (1): 3696
Abstract
ENCODE comprises thousands of functional genomics datasets, and the encyclopedia covers hundreds of cell types, providing a universal annotation for genome interpretation. However, for particular applications, it may be advantageous to use a customized annotation. Here, we develop such a custom annotation by leveraging advanced assays, such as eCLIP, Hi-C, and whole-genome STARR-seq on a number of data-rich ENCODE cell types. A key aspect of this annotation is comprehensive and experimentally derived networks of both transcription factors and RNA-binding proteins (TFs and RBPs). Cancer, a disease of system-wide dysregulation, is an ideal application for such a network-based annotation. Specifically, for cancer-associated cell types, we put regulators into hierarchies and measure their network change (rewiring) during oncogenesis. We also extensively survey TF-RBP crosstalk, highlighting how SUB1, a previously uncharacterized RBP, drives aberrant tumor expression and amplifies the effect of MYC, a well-known oncogenic TF. Furthermore, we show how our annotation allows us to place oncogenic transformations in the context of a broad cell space; here, many normal-to-tumor transitions move towards a stem-like state, while oncogene knockdowns show an opposing trend. Finally, we organize the resource into a coherent workflow to prioritize key elements and variants, in addition to regulators. We showcase the application of this prioritization to somatic burdening, cancer differential expression and GWAS. Targeted validations of the prioritized regulators, elements and variants using siRNA knockdowns, CRISPR-based editing, and luciferase assays demonstrate the value of the ENCODE resource.
View details for DOI 10.1038/s41467-020-14743-w
View details for PubMedID 32728046
-
Perspectives on ENCODE.
Nature
2020; 583 (7818): 693–98
Abstract
The Encyclopedia of DNA Elements (ENCODE) Project launched in 2003 with the long-term goal of developing a comprehensive map of functional elements in the human genome. These included genes, biochemical regions associated with gene regulation (for example, transcription factor binding sites, open chromatin, and histone marks) and transcript isoforms. The marks serve as sites for candidate cis-regulatory elements (cCREs) that may serve functional roles in regulating gene expression. The project has been extended to model organisms, particularly the mouse. In the third phase of ENCODE, nearly a million and more than 300,000 cCRE annotations have been generated for human and mouse, respectively, and these have provided a valuable resource for the scientific community.
View details for DOI 10.1038/s41586-020-2449-8
View details for PubMedID 32728248
-
Metabolic Dynamics and Prediction of Gestational Age and Time to Delivery in Pregnant Women.
Cell
2020; 181 (7): 1680
Abstract
Metabolism during pregnancy is a dynamic and precisely programmed process, the failure of which can bring devastating consequences to the mother and fetus. To define a high-resolution temporal profile of metabolites during healthy pregnancy, we analyzed the untargeted metabolome of 784 weekly blood samples from 30 pregnant women. Broad changes and a highly choreographed profile were revealed: 4,995 metabolic features (of 9,651 total), 460 annotated compounds (of 687 total), and 34 human metabolic pathways (of 48 total) were significantly changed during pregnancy. Using linear models, we built a metabolic clock with five metabolites that time gestational age in high accordance with ultrasound (R = 0.92). Furthermore, two to three metabolites can identify when labor occurs (time to delivery within two, four, and eight weeks, AUROC ≥ 0.85). Our study represents a weekly characterization of the human pregnancy metabolome, providing a high-resolution landscape for understanding pregnancy with potential clinical utilities.
View details for DOI 10.1016/j.cell.2020.05.002
View details for PubMedID 32589958
-
Systematic identification of silencers in human cells.
Nature genetics
2020
Abstract
The majority of the human genome does not encode proteins. Many of these noncoding regions contain important regulatory sequences that control gene expression. To date, most studies have focused on activators such as enhancers, but regions that repress gene expression (silencers) have not been systematically studied. We have developed a system that identifies silencer regions in a genome-wide fashion on the basis of silencer-mediated transcriptional repression of caspase9. We found that silencers are widely distributed and may function in a tissue-specific fashion. These silencers harbor unique epigenetic signatures and are associated with specific transcription factors. Silencers also act at multiple genes, and at the level of chromosomal domains and long-range interactions. Deletion of silencer regions linked to the drug transporter genes ABCC2 and ABCG2 caused chemo-resistance. Overall, our study demonstrates that tissue-specific silencing is widespread throughout the human genome and probably contributes substantially to the regulation of gene expression and human biology.
View details for DOI 10.1038/s41588-020-0578-5
View details for PubMedID 32094911
-
Deep Characterization of the Human Antibody Response to Natural Infection Using Longitudinal Immune Repertoire Sequencing.
Molecular & cellular proteomics : MCP
2020; 19 (2): 278-293
Abstract
Human antibody response studies are largely restricted to periods of high immune activity (e.g. vaccination). To comprehensively understand the healthy B cell immune repertoire and how this changes over time and through natural infection, we conducted immune repertoire RNA sequencing on flow cytometry-sorted B cell subsets to profile a single individual's antibodies over 11 months through two periods of natural viral infection. We found that 1) a baseline of healthy variable (V) gene usage in antibodies exists and is stable over time, but antibodies in memory cells consistently have a different usage profile relative to earlier B cell stages; 2) a single complementarity-determining region 3 (CDR3) is potentially generated from more than one VJ gene combination; and 3) IgG and IgA antibody transcripts are found at low levels in early human B cell development, suggesting that class switching may occur earlier than previously realized. These findings provide insight into immune repertoire stability, response to natural infections, and human B cell development.
View details for DOI 10.1074/mcp.RA119.001633
View details for PubMedID 33451388
-
Molecular Choreography of Acute Exercise.
Cell
2020; 181 (5): 1112–30.e16
Abstract
Acute physical activity leads to several changes in metabolic, cardiovascular, and immune pathways. Although studies have examined selected changes in these pathways, the system-wide molecular response to an acute bout of exercise has not been fully characterized. We performed longitudinal multi-omic profiling of plasma and peripheral blood mononuclear cells including metabolome, lipidome, immunome, proteome, and transcriptome from 36 well-characterized volunteers, before and after a controlled bout of symptom-limited exercise. Time-series analysis revealed thousands of molecular changes and an orchestrated choreography of biological processes involving energy metabolism, oxidative stress, inflammation, tissue repair, and growth factor response, as well as regulatory pathways. Most of these processes were dampened and some were reversed in insulin-resistant participants. Finally, we discovered biological pathways involved in cardiopulmonary exercise response and developed prediction models revealing potential resting blood-based biomarkers of peak oxygen consumption.
View details for DOI 10.1016/j.cell.2020.04.043
View details for PubMedID 32470399
-
Personal aging markers and ageotypes revealed by deep longitudinal profiling.
Nature medicine
2020; 26 (1): 83–90
Abstract
The molecular changes that occur with aging are not well understood. Here, we performed longitudinal and deep multiomics profiling of 106 healthy individuals from 29 to 75 years of age and examined how different types of 'omic' measurements, including transcripts, proteins, metabolites, cytokines, microbes and clinical laboratory values, correlate with age. We identified both known and new markers that associated with age, as well as distinct molecular patterns of aging in insulin-resistant as compared to insulin-sensitive individuals. In a longitudinal setting, we identified personal aging markers whose levels changed over a short time frame of 2-3 years. Further, we defined different types of aging patterns in different individuals, termed 'ageotypes', on the basis of the types of molecular pathways that changed over time in a given individual. Ageotypes may provide a molecular assessment of personal aging, reflective of personal lifestyle and medical history, that may ultimately be useful in monitoring and intervening in the aging process.
View details for DOI 10.1038/s41591-019-0719-5
View details for PubMedID 31932806
-
A Quantitative Proteome Map of the Human Body.
Cell
2020
Abstract
Determining protein levels in each tissue and how they compare with RNA levels is important for understanding human biology and disease as well as regulatory processes that control protein levels. We quantified the relative protein levels from over 12,000 genes across 32 normal human tissues. Tissue-specific or tissue-enriched proteins were identified and compared to transcriptome data. Many ubiquitous transcripts are found to encode tissue-specific proteins. Discordance of RNA and protein enrichment revealed potential sites of synthesis and action of secreted proteins. The tissue-specific distribution of proteins also provides an in-depth view of complex biological events that require the interplay of multiple tissues. Most importantly, our study demonstrated that protein tissue-enrichment information can explain phenotypes of genetic diseases, which cannot be obtained by transcript information alone. Overall, our results demonstrate how understanding protein levels can provide insights into regulation, secretome, metabolism, and human diseases.
View details for DOI 10.1016/j.cell.2020.08.036
View details for PubMedID 32916130
-
Candidate variants in TUB are associated with familial tremor.
PLoS genetics
2020; 16 (9): e1009010
Abstract
Essential tremor (ET) is the most common adult-onset movement disorder. In the present study, we performed whole exome sequencing of a large ET-affected family (10 affected and 6 un-affected family members) and identified a TUB p.V431I variant (rs75594955) segregating in a manner consistent with autosomal-dominant inheritance. Subsequent targeted re-sequencing of TUB in 820 unrelated individuals with sporadic ET and 630 controls revealed significant enrichment of rare nonsynonymous TUB variants (e.g. rs75594955: p.V431I, rs1241709665: p.Ile20Phe, rs55648406: p.Arg49Gln) in the ET cohort (SKAT-O test p-value = 6.20e-08). TUB encodes a transcription factor predominantly expressed in neuronal cells and has been previously implicated in obesity. ChIP-seq analyses of the TUB transcription factor across different regions of the mouse brain revealed that TUB regulates the pathways responsible for neurotransmitter production as well as thyroid hormone signaling. Together, these results support the association of rare variants in TUB with ET.
View details for DOI 10.1371/journal.pgen.1009010
View details for PubMedID 32956375
-
Deep longitudinal multiomics profiling reveals two biological seasonal patterns in California.
Nature communications
2020; 11 (1): 4933
Abstract
The influence of seasons on biological processes is poorly understood. In order to identify biological seasonal patterns based on diverse molecular data, rather than calendar dates, we performed a deep longitudinal multiomics profiling of 105 individuals over 4 years. Here, we report more than 1000 seasonal variations in omics analytes and clinical measures. The different molecules group into two major seasonal patterns which correlate with peaks in late spring and late fall/early winter in California. The two patterns are enriched for molecules involved in human biological processes such as inflammation, immunity, cardiovascular health, as well as neurological and psychiatric conditions. Lastly, we identify molecules and microbes that demonstrate different seasonal patterns in insulin sensitive and insulin resistant individuals. The results of our study have important implications in healthcare and highlight the value of considering seasonality when assessing population wide health risk and management.
View details for DOI 10.1038/s41467-020-18758-1
View details for PubMedID 33004787
-
Landscape of cohesin-mediated chromatin loops in the human genome.
Nature
2020; 583 (7818): 737–43
Abstract
Physical interactions between distal regulatory elements have a key role in regulating gene expression, but the extent to which these interactions vary between cell types and contribute to cell-type-specific gene expression remains unclear. Here, to address these questions as part of phase III of the Encyclopedia of DNA Elements (ENCODE), we mapped cohesin-mediated chromatin loops, using chromatin interaction analysis by paired-end tag sequencing (ChIA-PET), and analysed gene expression in 24 diverse human cell types, including core ENCODE cell lines. Twenty-eight per cent of all chromatin loops vary across cell types; these variations modestly correlate with changes in gene expression and are effective at grouping cell types according to their tissue of origin. The connectivity of genes corresponds to different functional classes, with housekeeping genes having few contacts, and dosage-sensitive genes being more connected to enhancer elements. This atlas of chromatin loops complements the diverse maps of regulatory architecture that comprise the ENCODE Encyclopedia, and will help to support emerging analyses of genome structure and function.
View details for DOI 10.1038/s41586-020-2151-x
View details for PubMedID 32728247
-
Expanded encyclopaedias of DNA elements in the human and mouse genomes.
Nature
2020; 583 (7818): 699–710
Abstract
The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development. All data are available through the ENCODE data portal (https://www.encodeproject.org), including phase II ENCODE and Roadmap Epigenomics data. We have developed a registry of 926,535 human and 339,815 mouse candidate cis-regulatory elements, covering 7.9 and 3.4% of their respective genomes, by integrating selected datatypes associated with gene regulation, and constructed a web-based server (SCREEN; http://screen.encodeproject.org) to provide flexible, user-defined access to this resource. Collectively, the ENCODE data and registry provide an expansive resource for the scientific community to build a better understanding of the organization and function of the human and mouse genomes.
View details for DOI 10.1038/s41586-020-2493-4
View details for PubMedID 32728249
-
The human body at cellular resolution: the NIH Human Biomolecular Atlas Program
Nature
2019; 574 (7777): 187–92
Abstract
Transformative technologies are enabling the construction of three-dimensional maps of tissues with unprecedented spatial and molecular resolution. Over the next seven years, the NIH Common Fund Human Biomolecular Atlas Program (HuBMAP) intends to develop a widely accessible framework for comprehensively mapping the human body at single-cell resolution by supporting technology development, data acquisition, and detailed spatial mapping. HuBMAP will integrate its efforts with other funding agencies, programs, consortia, and the biomedical research community at large towards the shared vision of a comprehensive, accessible three-dimensional molecular and cellular atlas of the human body, in health and under various disease conditions.
View details for DOI 10.1038/s41586-019-1629-x
View details for Web of Science ID 000489784200035
View details for PubMedID 31597973
View details for PubMedCentralID PMC6800388
-
Big data and health.
The Lancet. Digital health
2019; 1 (6): e252-e254
View details for DOI 10.1016/S2589-7500(19)30109-8
View details for PubMedID 33323249
-
HAT1 Coordinates Histone Production and Acetylation via H4 Promoter Binding.
Molecular cell
2019
Abstract
The energetic costs of duplicating chromatin are large and therefore likely depend on nutrient sensing checkpoints and metabolic inputs. By studying chromatin modifiers regulated by epithelial growth factor, we identified histone acetyltransferase 1 (HAT1) as an induced gene that enhances proliferation through coordinating histone production, acetylation, and glucose metabolism. In addition to its canonical role as a cytoplasmic histone H4 acetyltransferase, we isolated a HAT1-containing complex bound specifically at promoters of H4 genes. HAT1-dependent transcription of H4 genes required an acetate-sensitive promoter element. HAT1 expression was critical for S-phase progression and maintenance of H3 lysine 9 acetylation at proliferation-associated genes, including histone genes. Therefore, these data describe a feedforward circuit whereby HAT1 captures acetyl groups on nascent histones and drives H4 production by chromatin binding to support chromatin replication and acetylation. These findings have important implications for human disease, since high HAT1 levels associate with poor outcomes across multiple cancer types.
View details for DOI 10.1016/j.molcel.2019.05.034
View details for PubMedID 31278053
-
The Integrative Human Microbiome Project
Nature
2019; 569 (7758): 641–48
Abstract
The NIH Human Microbiome Project (HMP) has been carried out over ten years and two phases to provide resources, methods, and discoveries that link interactions between humans and their microbiomes to health-related outcomes. The recently completed second phase, the Integrative Human Microbiome Project, comprised studies of dynamic changes in the microbiome and host under three conditions: pregnancy and preterm birth; inflammatory bowel diseases; and stressors that affect individuals with prediabetes. The associated research begins to elucidate mechanisms of host-microbiome interactions under these conditions, provides unique data resources (at the HMP Data Coordination Center), and represents a paradigm for future multi-omic studies of the human microbiome.
View details for DOI 10.1038/s41586-019-1238-8
View details for Web of Science ID 000470144100031
View details for PubMedID 31142853
-
A longitudinal big data approach for precision health
Nature medicine
2019; 25 (5): 792–804
View details for DOI 10.1038/s41591-019-0414-6
View details for Web of Science ID 000468247800023
-
Longitudinal multi-omics of host-microbe dynamics in prediabetes.
Nature
2019; 569 (7758): 663–71
Abstract
Type 2 diabetes mellitus (T2D) is a growing health problem, but little is known about its early disease stages, its effects on biological processes or the transition to clinical T2D. To understand the earliest stages of T2D better, we obtained samples from 106 healthy individuals and individuals with prediabetes over approximately four years and performed deep profiling of transcriptomes, metabolomes, cytokines, and proteomes, as well as changes in the microbiome. This rich longitudinal data set revealed many insights: first, healthy profiles are distinct among individuals while displaying diverse patterns of intra- and/or inter-personal variability. Second, extensive host and microbial changes occur during respiratory viral infections and immunization, and immunization triggers potentially protective responses that are distinct from responses to respiratory viral infections. Moreover, during respiratory viral infections, insulin-resistant participants respond differently than insulin-sensitive participants. Third, global co-association analyses among the thousands of profiled molecules reveal specific host-microbe interactions that differ between insulin-resistant and insulin-sensitive individuals. Last, we identified early personal molecular signatures in one individual that preceded the onset of T2D, including the inflammation markers interleukin-1 receptor antagonist (IL-1RA) and high-sensitivity C-reactive protein (CRP) paired with xenobiotic-induced immune signalling. Our study reveals insights into pathways and responses that differ between glucose-dysregulated and healthy individuals during health and disease and provides an open-access data resource to enable further research into healthy, prediabetic and T2D states.
View details for DOI 10.1038/s41586-019-1236-x
View details for PubMedID 31142858
-
The NASA Twins Study: A multidimensional analysis of a year-long human spaceflight
Science
2019; 364 (6436): 144-+
View details for DOI 10.1126/science.aau8650
View details for Web of Science ID 000464620000031
-
Gene-Environment Interaction in the Era of Precision Medicine
Cell
2019; 177 (1): 38–44
View details for DOI 10.1016/j.cell.2019.03.004
View details for Web of Science ID 000462034400011
-
A longitudinal big data approach for precision health.
Nature medicine
2019; 25 (5): 792–804
Abstract
Precision health relies on the ability to assess disease risk at an individual level, detect early preclinical conditions and initiate preventive strategies. Recent technological advances in omics and wearable monitoring enable deep molecular and physiological profiling and may provide important tools for precision health. We explored the ability of deep longitudinal profiling to make health-related discoveries, identify clinically relevant molecular pathways and affect behavior in a prospective longitudinal cohort (n = 109) enriched for risk of type 2 diabetes mellitus. The cohort underwent integrative personalized omics profiling from samples collected quarterly for up to 8 years (median, 2.8 years) using clinical measures and emerging technologies including genome, immunome, transcriptome, proteome, metabolome, microbiome and wearable monitoring. We discovered more than 67 clinically actionable health discoveries and identified multiple molecular pathways associated with metabolic, cardiovascular and oncologic pathophysiology. We developed prediction models for insulin resistance by using omics measurements, illustrating their potential to replace burdensome tests. Finally, study participation led the majority of participants to implement diet and exercise changes. Altogether, we conclude that deep longitudinal profiling can lead to actionable health discoveries and provide relevant information for precision health.
View details for PubMedID 31068711
-
Chromatin Remodeling in Response to BRCA2-Crisis.
Cell reports
2019; 28 (8): 2182–93.e6
Abstract
Individuals with a single functional copy of the BRCA2 tumor suppressor have elevated risks for breast, ovarian, and other solid tumor malignancies. The exact mechanisms of carcinogenesis due to BRCA2 haploinsufficiency remain unclear, but one possibility is that at-risk cells are subject to acute periods of decreased BRCA2 availability and function ("BRCA2-crisis"), which may contribute to disease. Here, we establish an in vitro model for BRCA2-crisis that demonstrates chromatin remodeling and activation of an NF-κB survival pathway in response to transient BRCA2 depletion. Mechanistically, we identify BRCA2 chromatin binding, histone acetylation, and associated transcriptional activity as critical determinants of the epigenetic response to BRCA2-crisis. These chromatin alterations are reflected in transcriptional profiles of pre-malignant tissues from BRCA2 carriers and, therefore, may reflect natural steps in human disease. By modeling BRCA2-crisis in vitro, we have derived insights into pre-neoplastic molecular alterations that may enhance the development of preventative therapies.
View details for DOI 10.1016/j.celrep.2019.07.057
View details for PubMedID 31433991
-
High-Resolution Bisulfite-Sequencing of Peripheral Blood DNA Methylation in Early-Onset and Familial Risk Breast Cancer Patients.
Clinical cancer research : an official journal of the American Association for Cancer Research
2019
Abstract
Understanding and explaining hereditary predisposition to cancer has focused on the genetic etiology of the disease. However, mutations in known genes associated with breast cancer, such as BRCA1 and BRCA2, account for less than 25% of familial cases of breast cancer. Recently, specific epigenetic modifications at BRCA1 have been shown to promote hereditary breast cancer, but the broader potential for epigenetic contribution to hereditary breast cancer is not yet well understood. We examined DNA methylation through deep bisulfite sequencing of CpG islands and known promoter or regulatory regions in peripheral blood DNA from 99 familial or early-onset breast or ovarian cancer patients, 6 unaffected BRCA-mutation carriers, and 49 unaffected controls. In 9% of patients, we observed altered methylation in the promoter regions of genes known to be involved in cancer, including hypermethylation at the tumor suppressor PTEN and hypomethylation at the proto-oncogene TEX14. These alterations occur in the form of allelic methylation that spans up to hundreds of base-pairs in length. Our observations suggest a broader role for DNA methylation in early-onset, familial risk breast cancer. Further studies are warranted to clarify these mechanisms and the benefits of DNA methylation screening for early risk prediction of familial cancers.
View details for DOI 10.1158/1078-0432.CCR-18-2423
View details for PubMedID 31175093
-
The NASA Twins Study: A multidimensional analysis of a year-long human spaceflight.
Science (New York, N.Y.)
2019; 364 (6436)
Abstract
To understand the health impact of long-duration spaceflight, one identical twin astronaut was monitored before, during, and after a 1-year mission onboard the International Space Station; his twin served as a genetically matched ground control. Longitudinal assessments identified spaceflight-specific changes, including decreased body mass, telomere elongation, genome instability, carotid artery distension and increased intima-media thickness, altered ocular structure, transcriptional and metabolic changes, DNA methylation changes in immune and oxidative stress-related pathways, gastrointestinal microbiota alterations, and some cognitive decline postflight. Although average telomere length, global gene expression, and microbiome changes returned to near preflight levels within 6 months after return to Earth, increased numbers of short telomeres were observed and expression of some genes was still disrupted. These multiomic, molecular, physiological, and behavioral datasets provide a valuable roadmap of the putative health risks for future human spaceflight.
View details for PubMedID 30975860
-
Metformin Affects Heme Function as a Possible Mechanism of Action.
G3 (Bethesda, Md.)
2018
Abstract
Metformin elicits pleiotropic effects that are beneficial for treating diabetes, as well as particular cancers and aging. In spite of its importance, a convincing and unifying mechanism to explain how metformin operates is lacking. Here we describe investigations into the mechanism of metformin action through heme and hemoprotein(s). Metformin suppresses heme production by 50% in yeast, and this suppression requires mitochondria function, which is necessary for heme synthesis. At high concentrations comparable to those in the clinic, metformin also suppresses heme production in human erythrocytes, erythropoietic cells and hepatocytes by 30-50%; the heme-targeting drug artemisinin operates at a greater potency. Significantly, metformin prevents oxidation of heme in three protein scaffolds, cytochrome c, myoglobin and hemoglobin, with Kd values < 3 mM suggesting a dual oxidation and reduction role in the regulation of heme redox transition. Since heme- and porphyrin-like groups operate in diverse enzymes that control important metabolic processes, we suggest that metformin acts, at least in part, through stabilizing appropriate redox states in heme and other porphyrin-containing groups to control cellular metabolism.
View details for PubMedID 30554148
-
High Frequency Actionable Pathogenic Exome Variants in an Average-Risk Cohort.
Cold Spring Harbor molecular case studies
2018
Abstract
Exome sequencing is increasingly utilized in both clinical and non-clinical settings, but little is known about its utility in healthy individuals. Most previous studies on this topic have examined a small subset of genes known to be implicated in human disease and/or have used automated pipelines to assess pathogenicity of known variants. In order to determine the frequency of both medically actionable and non-actionable but medically relevant exome findings in the general population we assessed the exomes of 70 participants who have been extensively characterized over the past several years as part of a longitudinal integrated multi-omics profiling study. We analyzed exomes by identifying rare likely pathogenic and pathogenic variants in genes associated with Mendelian disease in the Online Mendelian Inheritance in Man (OMIM) database. We then used American College of Medical Genetics (ACMG) guidelines for the classification of rare sequence variants. Additionally, we assessed pharmacogenetic variants. Twelve out of 70 (17%) participants had medically actionable findings in Mendelian disease genes. Five had phenotypes or family histories associated with their genetic variants. The frequency of actionable variants is higher than that reported in most previous studies and suggests added benefit from utilizing expanded gene lists and manual curation to assess actionable findings. A total of 63 participants (90%) had additional non-actionable findings, including 60 who were found to be carriers for recessive diseases and 21 who have increased Alzheimer's disease risk due to heterozygous or homozygous APOE e4 alleles (18 participants had both). Our results suggest that exome sequencing may have considerably more utility for health management in the general population than previously thought.
View details for PubMedID 30487145
-
Longitudinal personal DNA methylome dynamics in a human with a chronic condition.
Nature medicine
2018
Abstract
Epigenomics regulates gene expression and is as important as genomics in precision personal health, as it is heavily influenced by environment and lifestyle. We profiled whole-genome DNA methylation and the corresponding transcriptome of peripheral blood mononuclear cells collected from a human volunteer over a period of 36 months, generating 28 methylome and 57 transcriptome datasets. We found that DNA methylomic changes are associated with infrequent glucose level alteration, whereas the transcriptome underwent dynamic changes during events such as viral infections. Most DNA meta-methylome changes occurred 80-90 days before clinically detectable glucose elevation. Analysis of the deep personal methylome dataset revealed an unprecedented number of allelic differentially methylated regions that remain stable longitudinally and are preferentially associated with allele-specific gene regulation. Our results revealed that changes in different types of 'omics' data associate with different physiological aspects of this individual: DNA methylation with chronic conditions and transcriptome with acute events.
View details for PubMedID 30397358
-
Dynamic Human Environmental Exposome Revealed by Longitudinal Personal Monitoring.
Cell
2018; 175 (1): 277
Abstract
Human health is dependent upon environmental exposures, yet the diversity and variation in exposures are poorly understood. We developed a sensitive method to monitor personal airborne biological and chemical exposures and followed the personal exposomes of 15 individuals for up to 890 days and over 66 distinct geographical locations. We found that individuals are potentially exposed to thousands of pan-domain species and chemical compounds, including insecticides and carcinogens. Personal biological and chemical exposomes are highly dynamic and vary spatiotemporally, even for individuals located in the same general geographical region. Integrated analysis of biological and chemical exposomes revealed strong location-dependent relationships. Finally, construction of an exposome interaction network demonstrated the presence of distinct yet interconnected human- and environment-centric clouds, comprised of interacting ecosystems such as human, flora, pets, and arthropods. Overall, we demonstrate that human exposomes are diverse, dynamic, spatiotemporally-driven interaction networks with the potential to impact human health.
View details for PubMedID 30241608
-
Decoding the Genomics of Abdominal Aortic Aneurysm.
Cell
2018; 174 (6): 1361
Abstract
A key aspect of genomic medicine is to make individualized clinical decisions from personal genomes. We developed a machine-learning framework to integrate personal genomes and electronic health record (EHR) data and used this framework to study abdominal aortic aneurysm (AAA), a prevalent irreversible cardiovascular disease with unclear etiology. Performing whole-genome sequencing on AAA patients and controls, we demonstrated its predictive precision solely from personal genomes. By modeling personal genomes with EHRs, this framework quantitatively assessed the effectiveness of adjusting personal lifestyles given personal genome baselines, demonstrating its utility as a personal health management tool. We showed that this new framework agnostically identified genetic components involved in AAA, which were subsequently validated in human aortic tissues and in murine models. Our study presents a new framework for disease genome analysis, which can be used for both health management and understanding the biological architecture of complex diseases.
View details for PubMedID 30193110
-
Glucotypes reveal new patterns of glucose dysregulation.
PLoS biology
2018; 16 (7): e2005143
Abstract
Diabetes is an increasing problem worldwide; almost 30 million people, nearly 10% of the population, in the United States are diagnosed with diabetes. Another 84 million are prediabetic, and without intervention, up to 70% of these individuals may progress to type 2 diabetes. Current methods for quantifying blood glucose dysregulation in diabetes and prediabetes are limited by reliance on single-time-point measurements or on average measures of overall glycemia and neglect glucose dynamics. We have used continuous glucose monitoring (CGM) to evaluate the frequency with which individuals demonstrate elevations in postprandial glucose, the types of patterns, and how patterns vary between individuals given an identical nutrient challenge. Measurement of insulin resistance and secretion highlights the fact that the physiology underlying dysglycemia is highly variable between individuals. We developed an analytical framework that can group individuals according to specific patterns of glycemic responses called "glucotypes" that reveal heterogeneity, or subphenotypes, within traditional diagnostic categories of glucose regulation. Importantly, we found that even individuals considered normoglycemic by standard measures exhibit high glucose variability using CGM, with glucose levels reaching prediabetic and diabetic ranges 15% and 2% of the time, respectively. We thus show that glucose dysregulation, as characterized by CGM, is more prevalent and heterogeneous than previously thought and can affect individuals considered normoglycemic by standard measures, and specific patterns of glycemic responses reflect variable underlying physiology. The interindividual variability in glycemic responses to standardized meals also highlights the personal nature of glucose regulation. Through extensive phenotyping, we developed a model for identifying potential mechanisms of personal glucose dysregulation and built a webtool for visualizing a user-uploaded CGM profile and classifying individualized glucose patterns into glucotypes.
View details for PubMedID 30040822
-
Natural Selection Has Differentiated the Progesterone Receptor among Human Populations.
American journal of human genetics
2018
Abstract
The progesterone receptor (PGR) plays a central role in maintaining pregnancy and is significantly associated with medical conditions such as preterm birth, which affects 12.6% of all births in the U.S. PGR has been evolving rapidly since the common ancestor of human and chimpanzee, and we herein investigated evolutionary dynamics of PGR during recent human migration and population differentiation. Our study revealed substantial population differentiation at the PGR locus driven by natural selection, where very recent positive selection in East Asians has substantially decreased its genetic diversity by nearly fixing evolutionarily novel alleles. On the contrary, in European populations, the PGR locus has been promoted to a highly polymorphic state likely due to balancing selection. Integrating transcriptome data across multiple tissue types together with large-scale genome-wide association data for preterm birth, our study demonstrated the consequence of the selection event in East Asians on remodeling PGR expression specifically in the ovary and determined a significant association of early spontaneous preterm birth with the evolutionarily selected variants. To reconstruct its evolutionary trajectory on the human lineage, we observed substantial differentiation between modern and archaic humans at the PGR locus, including fixation of a deleterious missense allele in the Neanderthal genome that was later introgressed in modern human populations. Taken together, our study revealed substantial evolutionary innovation in PGR even during very recent human evolution, and its different forms among human populations likely result in differential susceptibility to progesterone-associated disease conditions including preterm birth.
View details for PubMedID 29937092
-
Systematic Protein Prioritization for Targeted Proteomics Studies through Literature Mining
JOURNAL OF PROTEOME RESEARCH
2018; 17 (4): 1383–96
Abstract
There are more than 3.7 million published articles on the biological functions or disease implications of proteins, constituting an important resource of proteomics knowledge. However, it is difficult to summarize the millions of proteomics findings in the literature manually and quantify their relevance to the biology and diseases of interest. We developed a fully automated bioinformatics framework to identify and prioritize proteins associated with any biological entity. We used the 22 targeted areas of the Biology/Disease-driven (B/D)-Human Proteome Project (HPP) as examples, prioritized the relevant proteins through their Protein Universal Reference Publication-Originated Search Engine (PURPOSE) scores, validated the relevance of the score by comparing the protein prioritization results with a curated database, computed the scores of proteins across the topics of B/D-HPP, and characterized the top proteins in the common model organisms. We further extended the bioinformatics workflow to identify the relevant proteins in all organ systems and human diseases and deployed a cloud-based tool to prioritize proteins related to any custom search terms in real time. Our tool can facilitate the prioritization of proteins for any organ system or disease of interest and can contribute to the development of targeted proteomic studies for precision medicine.
View details for PubMedID 29505266
-
Microfluidic isoform sequencing shows widespread splicing coordination in the human transcriptome
GENOME RESEARCH
2018; 28 (2): 231–42
Abstract
Understanding transcriptome complexity is crucial for understanding human biology and disease. Technologies such as Synthetic long-read RNA sequencing (SLR-RNA-seq) delivered 5 million isoforms and allowed assessing splicing coordination. Pacific Biosciences and Oxford Nanopore increase throughput also but require high input amounts or amplification. Our new droplet-based method, sparse isoform sequencing (spISO-seq), sequences 100k-200k partitions of 10-200 molecules at a time, enabling analysis of 10-100 million RNA molecules. SpISO-seq requires less than 1 ng of input cDNA, limiting or removing the need for prior amplification with its associated biases. Adjusting the number of reads devoted to each molecule reduces sequencing lanes and cost, with little loss in detection power. The increased number of molecules expands our understanding of isoform complexity. In addition to confirming our previously published cases of splicing coordination (e.g., BIN1), the greater depth reveals many new cases, such as MAPT. Coordination of internal exons is found to be extensive among protein coding genes: 23.5%-59.3% (95% confidence interval) of highly expressed genes with distant alternative exons exhibit coordination, showcasing the need for long-read transcriptomics. However, coordination is less frequent for noncoding sequences, suggesting a larger role of splicing coordination in shaping proteins. Groups of genes with coordination are involved in protein-protein interactions with each other, raising the possibility that coordination facilitates complex formation and/or function. We also find new splicing coordination types, involving initial and terminal exons. Our results provide a more comprehensive understanding of the human transcriptome and a general, cost-effective method to analyze it.
View details for PubMedID 29196558
View details for PubMedCentralID PMC5793787
-
A genome-wide association study identifies only two ancestry specific variants associated with spontaneous preterm birth
SCIENTIFIC REPORTS
2018; 8: 226
Abstract
Preterm birth (PTB), or delivery prior to 37 weeks of gestation, is a significant cause of infant morbidity and mortality. Although twin studies estimate that maternal genetic contributions account for approximately 30% of the incidence of PTB, and other studies reported fetal gene polymorphism association, to date no consistent associations have been identified. In this study, we performed the largest reported genome-wide association study analysis on 1,349 cases of PTB and 12,595 ancestry-matched controls, focusing on genomic fetal signals. We tested over 2 million single nucleotide polymorphisms (SNPs) for associations with PTB across five subpopulations: African (AFR), the Americas (AMR), European, South Asian, and East Asian. We identified only two intergenic loci associated with PTB at a genome-wide level of significance: rs17591250 (P = 4.55E-09) on chromosome 1 in the AFR population and rs1979081 (P = 3.72E-08) on chromosome 8 in the AMR group. We have queried several existing replication cohorts and found no support of these associations. We conclude that the fetal genetic contribution to PTB is unlikely due to a single common genetic variant, but could be explained by interactions of multiple common variants, or of rare variants affected by environmental influences, all not detectable using a GWAS alone.
View details for PubMedID 29317701
-
Integrative Personal Omics Profiles during Periods of Weight Gain and Loss.
Cell systems
2018
Abstract
Advances in omics technologies now allow an unprecedented level of phenotyping for human diseases, including obesity, in which individual responses to excess weight are heterogeneous and unpredictable. To aid the development of better understanding of these phenotypes, we performed a controlled longitudinal weight perturbation study combining multiple omics strategies (genomics, transcriptomics, multiple proteomics assays, metabolomics, and microbiomics) during periods of weight gain and loss in humans. Results demonstrated that: (1) weight gain is associated with the activation of strong inflammatory and hypertrophic cardiomyopathy signatures in blood; (2) although weight loss reverses some changes, a number of signatures persist, indicative of long-term physiologic changes; (3) we observed omics signatures associated with insulin resistance that may serve as novel diagnostics; (4) specific biomolecules were highly individualized and stable in response to perturbations, potentially representing stable personalized markers. Most data are available open access and serve as a valuable resource for the community.
View details for PubMedID 29361466
-
Association of Omics Features with Histopathology Patterns in Lung Adenocarcinoma
CELL SYSTEMS
2017; 5 (6): 620-+
Abstract
Adenocarcinoma accounts for more than 40% of lung malignancy, and microscopic pathology evaluation is indispensable for its diagnosis. However, how histopathology findings relate to molecular abnormalities remains largely unknown. Here, we obtained H&E-stained whole-slide histopathology images, pathology reports, RNA sequencing, and proteomics data of 538 lung adenocarcinoma patients from The Cancer Genome Atlas and used these to identify molecular pathways associated with histopathology patterns. We report cell-cycle regulation and nucleotide binding pathways underpinning tumor cell dedifferentiation, and we predicted histology grade using transcriptomics and proteomics signatures (area under curve >0.80). We built an integrative histopathology-transcriptomics model to generate better prognostic predictions for stage I patients (p = 0.0182 ± 0.0021) compared with gene expression or histopathology studies alone, and the results were replicated in an independent cohort (p = 0.0220 ± 0.0070). These results motivate the integration of histopathology and omics data to investigate molecular mechanisms of pathology findings and enhance clinical prognostic prediction.
View details for PubMedID 29153840
View details for PubMedCentralID PMC5746468
-
Plasma sterols and depressive symptom severity in a population-based cohort
PLOS ONE
2017; 12 (9): e0184382
Abstract
Convergent evidence strongly suggests major depressive disorder is heterogeneous in its etiology and clinical characteristics. Depression biomarkers hold potential for identifying etiological subtypes, improving diagnostic accuracy, predicting treatment response, and personalization of treatment. Human plasma contains numerous sterols that have not been systematically studied. Changes in cholesterol concentrations have been implicated in suicide and depression, suggesting plasma sterols may be depression biomarkers. Here, we investigated associations between plasma levels of 34 sterols (measured by mass spectrometry) and scores on the Quick Inventory of Depressive Symptomatology-Self Report (QIDS-SR16) scale in 3117 adult participants in the Dallas Heart Study, an ethnically diverse, population-based cohort. We built a random forest model using feature selection from a pool of 43 variables including demographics, general health indicators, and sterol concentrations. This model comprised 19 variables, 13 of which were sterol concentrations, and explained 15.5% of the variation in depressive symptoms. Desmosterol concentrations below the fifth percentile (1.9 ng/mL, OR 1.9, 95% CI 1.2-2.9) were significantly associated with depressive symptoms of at least moderate severity (QIDS-SR16 score ≥10.5). This is the first study reporting a novel association between plasma concentrations of cholesterol precursors and depressive symptom severity.
View details for PubMedID 28886149
-
Fetal de novo mutations and preterm birth.
PLoS genetics
2017; 13 (4)
Abstract
Preterm birth (PTB) affects ~12% of pregnancies in the US. Despite its high mortality and morbidity, the molecular etiology underlying PTB has been unclear. Numerous studies have been devoted to identifying genetic factors in maternal and fetal genomes, but so far few genomic loci have been associated with PTB. By analyzing whole-genome sequencing data from 816 trio families, for the first time, we observed the role of fetal de novo mutations in PTB. We observed a significant increase in de novo mutation burden in PTB fetal genomes. Our genomic analyses further revealed that genes affected by PTB de novo mutations were dosage sensitive, intolerant to genomic deletions, and their mouse orthologs were likely developmentally essential. These genes were significantly involved in early fetal brain development, which was further supported by our analysis of copy number variants identified from an independent PTB cohort. Our study indicates a new mechanism in PTB occurrence independently contributed from fetal genomes, and thus opens a new avenue for future PTB research.
View details for DOI 10.1371/journal.pgen.1006689
View details for PubMedID 28388617
-
De novo and rare mutations in the HSPA1L heat shock gene associated with inflammatory bowel disease
GENOME MEDICINE
2017; 9
Abstract
Inflammatory bowel disease (IBD) is a chronic, relapsing inflammatory disease of the gastrointestinal tract which includes ulcerative colitis and Crohn's disease. Genetic risk factors for IBD are not well understood. We performed a family-based whole exome sequencing (WES) analysis on a core family (Family A) to identify potential causal mutations and then analyzed exome data from a Caucasian pediatric cohort (136 patients and 106 controls) to validate the presence of mutations in the candidate gene, heat shock 70 kDa protein 1-like (HSPA1L). Biochemical assays of the de novo and rare (minor allele frequency, MAF < 0.01) mutation variant proteins further validated the predicted deleterious effects of the identified alleles. In the proband of Family A, we found a heterozygous de novo mutation (c.830C > T; p.Ser277Leu) in HSPA1L. Through analysis of WES data of 136 patients, we identified five additional rare HSPA1L mutations (p.Gly77Ser, p.Leu172del, p.Thr267Ile, p.Ala268Thr, p.Glu558Asp) in six patients. In contrast, rare HSPA1L mutations were not observed in controls, and were significantly enriched in patients (P = 0.02). Interestingly, we did not find non-synonymous rare mutations in the HSP70 isoforms HSPA1A and HSPA1B. Biochemical assays revealed that all six rare HSPA1L variant proteins showed decreased chaperone activity in vitro. Moreover, three variants demonstrated dominant negative effects on HSPA1L and HSPA1A protein activity. Our results indicate that de novo and rare mutations in HSPA1L are associated with IBD and provide insights into the pathogenesis of IBD, and also expand our understanding of the roles of HSP70s in human disease.
View details for DOI 10.1186/s13073-016-0394-9
View details for PubMedID 28126021
-
Digital Health: Tracking Physiomes and Activity Using Wearable Biosensors Reveals Useful Health-Related Information.
PLoS biology
2017; 15 (1)
Abstract
A new wave of portable biosensors allows frequent measurement of health-related physiology. We investigated the use of these devices to monitor human physiological changes during various activities and their role in managing health and diagnosing and analyzing disease. By recording over 250,000 daily measurements for up to 43 individuals, we found personalized circadian differences in physiological parameters, replicating previous physiological findings. Interestingly, we found striking changes in particular environments, such as airline flights (decreased peripheral capillary oxygen saturation [SpO2] and increased radiation exposure). These events are associated with physiological macro-phenotypes such as fatigue, providing a strong association between reduced pressure/oxygen and fatigue on high-altitude flights. Importantly, we combined biosensor information with frequent medical measurements and made two important observations: First, wearable devices were useful in identification of early signs of Lyme disease and inflammatory responses; we used this information to develop a personalized, activity-based normalization framework to identify abnormal physiological signals from longitudinal data for facile disease detection. Second, wearables distinguish physiological differences between insulin-sensitive and -resistant individuals. Overall, these results indicate that portable biosensors provide useful information for monitoring personal activities and physiology and are likely to play an important role in managing health and enabling affordable health care access to groups traditionally limited by socioeconomic class or remote geography.
View details for DOI 10.1371/journal.pbio.2001402
View details for PubMedID 28081144
-
Static and Dynamic DNA Loops form AP-1-Bound Activation Hubs during Macrophage Development.
Molecular cell
2017; 67 (6): 1037–48.e6
Abstract
The three-dimensional arrangement of the human genome comprises a complex network of structural and regulatory chromatin loops important for coordinating changes in transcription during human development. To better understand the mechanisms underlying context-specific 3D chromatin structure and transcription during cellular differentiation, we generated comprehensive in situ Hi-C maps of DNA loops in human monocytes and differentiated macrophages. We demonstrate that dynamic looping events are regulatory rather than structural in nature and uncover widespread coordination of dynamic enhancer activity at preformed and acquired DNA loops. Enhancer-bound loop formation and enhancer activation of preformed loops together form multi-loop activation hubs at key macrophage genes. Activation hubs connect 3.4 enhancers per promoter and exhibit a strong enrichment for activator protein 1 (AP-1)-binding events, suggesting that multi-loop activation hubs involving cell-type-specific transcription factors represent an important class of regulatory chromatin structures for the spatiotemporal control of transcription.
View details for PubMedID 28890333
-
Patient-Specific iPSC-Derived Endothelial Cells Uncover Pathways that Protect against Pulmonary Hypertension in BMPR2 Mutation Carriers.
Cell stem cell
2016
Abstract
In familial pulmonary arterial hypertension (FPAH), the autosomal dominant disease-causing BMPR2 mutation is only 20% penetrant, suggesting that genetic variation provides modifiers that alleviate the disease. Here, we used comparison of induced pluripotent stem cell-derived endothelial cells (iPSC-ECs) from three families with unaffected mutation carriers (UMCs), FPAH patients, and gender-matched controls to investigate this variation. Our analysis identified features of UMC iPSC-ECs related to modifiers of BMPR2 signaling or to differentially expressed genes. FPAH-iPSC-ECs showed reduced adhesion, survival, migration, and angiogenesis compared to UMC-iPSC-ECs and control cells. The "rescued" phenotype of UMC cells was related to an increase in specific BMPR2 activators and/or a reduction in inhibitors, and the improved cell adhesion could be attributed to preservation of related signaling. The improved survival was related to increased BIRC3 and was independent of BMPR2. Our findings therefore highlight protective modifiers for FPAH that could help inform development of future treatment strategies.
View details for DOI 10.1016/j.stem.2016.08.019
View details for PubMedID 28017794
-
Simul-seq: combined DNA and RNA sequencing for whole-genome and transcriptome profiling.
Nature methods
2016; 13 (11): 953-958
Abstract
Paired DNA and RNA profiling is increasingly employed in genomics research to uncover molecular mechanisms of disease and to explore personal genotype and phenotype correlations. Here, we introduce Simul-seq, a technique for the production of high-quality whole-genome and transcriptome sequencing libraries from small quantities of cells or tissues. We apply the method to laser-capture-microdissected esophageal adenocarcinoma tissue, revealing a highly aneuploid tumor genome with extensive blocks of increased homozygosity and corresponding increases in allele-specific expression. Among this widespread allele-specific expression, we identify germline polymorphisms that are associated with response to cancer therapies. We further leverage this integrative data to uncover expressed mutations in several known cancer genes as well as a recurrent mutation in the motor domain of KIF3B that significantly affects kinesin-microtubule interactions. Simul-seq provides a new streamlined approach for generating comprehensive genome and transcriptome profiles from limited quantities of clinically relevant samples.
View details for DOI 10.1038/nmeth.4028
View details for PubMedID 27723755
-
Yeast longevity promoted by reversing aging-associated decline in heavy isotope content.
NPJ aging and mechanisms of disease
2016; 2: 16004
Abstract
Dysregulation of metabolism develops with organismal aging. Both genetic and environmental manipulations promote longevity by effectively diverting various metabolic processes against aging. How these processes converge on the metabolome is not clear. Here we report that the heavy isotopic forms of common elements, a universal feature of metabolites, decline in yeast cells undergoing chronological aging. Supplementation of deuterium, a heavy hydrogen isotope, through heavy water (D2O) uptake extends yeast chronological lifespan (CLS) by up to 85% with minimal effects on growth. The CLS extension by D2O bypasses several known genetic regulators, but is abrogated by calorie restriction and mitochondrial deficiency. Heavy water substantially suppresses endogenous generation of reactive oxygen species (ROS) and slows the pace of metabolic consumption and disposal. Protection from aging by heavy isotopes might result from kinetic modulation of biochemical reactions. Altogether, our findings reveal a novel perspective of aging and new means for promoting longevity.
View details for DOI 10.1038/npjamd.2016.4
View details for PubMedID 28721263
View details for PubMedCentralID PMC5515009
-
Identification of significantly mutated regions across cancer types highlights a rich landscape of functional molecular alterations
NATURE GENETICS
2016; 48 (2): 117-125
Abstract
Cancer sequencing studies have primarily identified cancer driver genes by the accumulation of protein-altering mutations. An improved method would be annotation independent, sensitive to unknown distributions of functions within proteins and inclusive of noncoding drivers. We employed density-based clustering methods in 21 tumor types to detect variably sized significantly mutated regions (SMRs). SMRs reveal recurrent alterations across a spectrum of coding and noncoding elements, including transcription factor binding sites and untranslated regions mutated in up to ∼15% of specific tumor types. SMRs demonstrate spatial clustering of alterations in molecular domains and at interfaces, often with associated changes in signaling. Mutation frequencies in SMRs demonstrate that distinct protein regions are differentially mutated across tumor types, as exemplified by a linker region of PIK3CA in which biophysical simulations suggest that mutations affect regulatory interactions. The functional diversity of SMRs underscores both the varied mechanisms of oncogenic misregulation and the advantage of functionally agnostic driver identification.
View details for DOI 10.1038/ng.3471
View details for Web of Science ID 000369043900008
-
Synthetic long-read sequencing reveals intraspecies diversity in the human microbiome.
Nature biotechnology
2016; 34 (1): 64-69
Abstract
Identifying bacterial strains in metagenome and microbiome samples using computational analyses of short-read sequences remains a difficult problem. Here, we present an analysis of a human gut microbiome using TruSeq synthetic long reads combined with computational tools for metagenomic long-read assembly, variant calling and haplotyping (Nanoscope and Lens). Our analysis identifies 178 bacterial species, of which 51 were not found using shotgun reads alone. We recover bacterial contigs that comprise multiple operons, including 22 contigs of >1 Mbp. Furthermore, we observe extensive intraspecies variation within microbial strains in the form of haplotypes that span up to hundreds of Kbp. Incorporation of synthetic long-read sequencing technology with standard short-read approaches enables more precise and comprehensive analyses of metagenomic samples.
View details for DOI 10.1038/nbt.3416
View details for PubMedID 26655498
View details for PubMedCentralID PMC4884093
-
Predicting non-small cell lung cancer prognosis by fully automated microscopic pathology image features.
Nature communications
2016; 7: 12474
Abstract
Lung cancer is the most prevalent cancer worldwide, and histopathological assessment is indispensable for its diagnosis. However, human evaluation of pathology slides cannot accurately predict patients' prognoses. In this study, we obtain 2,186 haematoxylin and eosin stained histopathology whole-slide images of lung adenocarcinoma and squamous cell carcinoma patients from The Cancer Genome Atlas (TCGA), and 294 additional images from Stanford Tissue Microarray (TMA) Database. We extract 9,879 quantitative image features and use regularized machine-learning methods to select the top features and to distinguish shorter-term survivors from longer-term survivors with stage I adenocarcinoma (P<0.003) or squamous cell carcinoma (P=0.023) in the TCGA data set. We validate the survival prediction framework with the TMA cohort (P<0.036 for both tumour types). Our results suggest that automatically derived image features can predict the prognosis of lung cancer patients and thereby contribute to precision oncology. Our methods are extensible to histopathology images of other organs.
View details for DOI 10.1038/ncomms12474
View details for PubMedID 27527408
-
Identification of Human Neuronal Protein Complexes Reveals Biochemical Activities and Convergent Mechanisms of Action in Autism Spectrum Disorders
CELL SYSTEMS
2015; 1 (5): 361-374
Abstract
The prevalence of autism spectrum disorders (ASDs) is rapidly growing, yet its molecular basis is poorly understood. We used a systems approach in which ASD candidate genes were mapped onto the ubiquitous human protein complexes and the resulting complexes were characterized. The studies revealed the role of histone deacetylases (HDAC1/2) in regulating the expression of ASD orthologs in the embryonic mouse brain. Proteome-wide screens for the co-complexed subunits with HDAC1 and six other key ASD proteins in neuronal cells revealed a protein interaction network, which displayed preferential expression in fetal brain development, exhibited increased deleterious mutations in ASD cases, and were strongly regulated by FMRP and MECP2 causal for Fragile X and Rett syndromes, respectively. Overall, our study reveals molecular components in ASD, suggests a shared mechanism between the syndromic and idiopathic forms of ASDs, and provides a systems framework for analyzing complex human diseases.
View details for DOI 10.1016/j.cels.2015.11.002
View details for Web of Science ID 000209926300009
View details for PubMedCentralID PMC4776331
-
Genetic Control of Chromatin States in Humans Involves Local and Distal Chromosomal Interactions
CELL
2015; 162 (5): 1051-1065
Abstract
Deciphering the impact of genetic variants on gene regulation is fundamental to understanding human disease. Although gene regulation often involves long-range interactions, it is unknown to what extent non-coding genetic variants influence distal molecular phenotypes. Here, we integrate chromatin profiling for three histone marks in lymphoblastoid cell lines (LCLs) from 75 sequenced individuals with LCL-specific Hi-C and ChIA-PET-based chromatin contact maps to uncover one of the largest collections of local and distal histone quantitative trait loci (hQTLs). Distal QTLs are enriched within topologically associated domains and exhibit largely concordant variation of chromatin state coordinated by proximal and distal non-coding genetic variants. Histone QTLs are enriched for common variants associated with autoimmune diseases and enable identification of putative target genes of disease-associated variants from genome-wide association studies. These analyses provide insights into how genetic variation can affect human disease phenotypes by coordinated changes in chromatin at interacting regulatory elements.
View details for DOI 10.1016/j.cell.2015.07.048
View details for Web of Science ID 000360589900015
View details for PubMedCentralID PMC4556133
-
Recurrent somatic mutations in regulatory regions of human cancer genomes.
Nature genetics
2015; 47 (7): 710-716
Abstract
Aberrant regulation of gene expression in cancer can promote survival and proliferation of cancer cells. Here we integrate whole-genome sequencing data from The Cancer Genome Atlas (TCGA) for 436 patients from 8 cancer subtypes with ENCODE and other regulatory annotations to identify point mutations in regulatory regions. We find evidence for positive selection of mutations in transcription factor binding sites, consistent with these sites regulating important cancer cell functions. Using a new method that adjusts for sample- and genomic locus-specific mutation rates, we identify recurrently mutated sites across individuals with cancer. Mutated regulatory sites include known sites in the TERT promoter and many new sites, including a subset in proximity to cancer-related genes. In reporter assays, two new sites display decreased enhancer activity upon mutation. These data demonstrate that many regulatory regions contain mutations under selective pressure and suggest a greater role for regulatory mutations in cancer than previously appreciated.
View details for DOI 10.1038/ng.3332
View details for PubMedID 26053494
View details for PubMedCentralID PMC4485503
-
Comprehensive transcriptome analysis using synthetic long-read sequencing reveals molecular co-association of distant splicing events
NATURE BIOTECHNOLOGY
2015; 33 (7): 736-742
Abstract
Alternative splicing shapes mammalian transcriptomes, with many RNA molecules undergoing multiple distant alternative splicing events. Comprehensive transcriptome analysis, including analysis of exon co-association in the same molecule, requires deep, long-read sequencing. Here we introduce an RNA sequencing method, synthetic long-read RNA sequencing (SLR-RNA-seq), in which small pools (≤1,000 molecules/pool, ≤1 molecule/gene for most genes) of full-length cDNAs are amplified, fragmented and short-read-sequenced. We demonstrate that these RNA sequences reconstructed from the short reads from each of the pools are mostly close to full length and contain few insertion and deletion errors. We report many previously undescribed isoforms (human brain: ∼13,800 affected genes, 14.5% of molecules; mouse brain ∼8,600 genes, 18% of molecules) and up to 165 human distant molecularly associated exon pairs (dMAPs) and distant molecularly and mutually exclusive pairs (dMEPs). Of 16 associated pairs detected in the mouse brain, 9 are conserved in human. Our results indicate conserved mechanisms that can produce distant but phased features on transcript and proteome isoforms.
View details for DOI 10.1038/nbt.3242
View details for Web of Science ID 000358396100029
View details for PubMedID 25985263
-
Comparison of the transcriptional landscapes between human and mouse tissues
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
2014; 111 (48): 17224-17229
Abstract
Although the similarities between humans and mice are typically highlighted, morphologically and genetically, there are many differences. To better understand these two species on a molecular level, we performed a comparison of the expression profiles of 15 tissues by deep RNA sequencing and examined the similarities and differences in the transcriptome for both protein-coding and -noncoding transcripts. Although commonalities are evident in the expression of tissue-specific genes between the two species, the expression for many sets of genes was found to be more similar in different tissues within the same species than between species. These findings were further corroborated by associated epigenetic histone mark analyses. We also find that many noncoding transcripts are expressed at a low level and are not detectable at appreciable levels across individuals. Moreover, the majority lack obvious sequence homologs between species, even when we restrict our attention to those which are most highly reproducible across biological replicates. Overall, our results indicate that there is considerable RNA expression diversity between humans and mice, well beyond what was described previously, likely reflecting the fundamental physiological differences between these two organisms.
View details for DOI 10.1073/pnas.1413624111
View details for Web of Science ID 000345920800059
View details for PubMedID 25413365
View details for PubMedCentralID PMC4260565
-
Principles of regulatory information conservation between mouse and human.
Nature
2014; 515 (7527): 371-375
Abstract
To broaden our understanding of the evolution of gene regulation mechanisms, we generated occupancy profiles for 34 orthologous transcription factors (TFs) in human-mouse erythroid progenitor, lymphoblast and embryonic stem-cell lines. By combining the genome-wide transcription factor occupancy repertoires, associated epigenetic signals, and co-association patterns, here we deduce several evolutionary principles of gene regulatory features operating since the mouse and human lineages diverged. The genomic distribution profiles, primary binding motifs, chromatin states, and DNA methylation preferences are well conserved for TF-occupied sequences. However, the extent to which orthologous DNA segments are bound by orthologous TFs varies both among TFs and with genomic location: binding at promoters is more highly conserved than binding at distal elements. Notably, occupancy-conserved TF-occupied sequences tend to be pleiotropic; they function in several tissues and also co-associate with many TFs. Single nucleotide variants at sites with potential regulatory functions are enriched in occupancy-conserved TF-occupied sequences.
View details for DOI 10.1038/nature13985
View details for PubMedID 25409826
View details for PubMedCentralID PMC4343047
-
Regulatory analysis of the C. elegans genome with spatiotemporal resolution.
Nature
2014; 512 (7515): 400-405
View details for DOI 10.1038/nature13497
View details for PubMedID 25164749
-
Comparative analysis of regulatory information and circuits across distant species.
Nature
2014; 512 (7515): 453-456
View details for DOI 10.1038/nature13668
View details for PubMedID 25164757
-
Defining a personal, allele-specific, and single-molecule long-read transcriptome
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
2014; 111 (27): 9869-9874
Abstract
Personal transcriptomes in which all of an individual's genetic variants (e.g., single nucleotide variants) and transcript isoforms (transcription start sites, splice sites, and polyA sites) are defined and quantified for full-length transcripts are expected to be important for understanding individual biology and disease, but have not been described previously. To obtain such transcriptomes, we sequenced the lymphoblastoid transcriptomes of three family members (GM12878 and the parents GM12891 and GM12892) by using a Pacific Biosciences long-read approach complemented with Illumina 101-bp sequencing and made the following observations. First, we found that reads representing all splice sites of a transcript are evident for most sufficiently expressed genes ≤3 kb and often for genes longer than that. Second, we added and quantified previously unidentified splicing isoforms to an existing annotation, thus creating the first personalized annotation to our knowledge. Third, we determined SNVs in a de novo manner and connected them to RNA haplotypes, including HLA haplotypes, thereby assigning single full-length RNA molecules to their transcribed allele, and demonstrated Mendelian inheritance of RNA molecules. Fourth, we show how RNA molecules can be linked to personal variants on a one-by-one basis, which allows us to assess differential allelic expression (DAE) and differential allelic isoforms (DAI) from the phased full-length isoform reads. The DAI method is largely independent of the distance between exon and SNV, in contrast to fragmentation-based methods. Overall, in addition to improving eukaryotic transcriptome annotation, these results describe, to our knowledge, the first large-scale and full-length personal transcriptome.
View details for DOI 10.1073/pnas.1400447111
View details for Web of Science ID 000338514800044
View details for PubMedCentralID PMC4103364
-
Clinical interpretation and implications of whole-genome sequencing.
JAMA : the journal of the American Medical Association
2014; 311 (10): 1035-1045
Abstract
Whole-genome sequencing (WGS) is increasingly applied in clinical medicine and is expected to uncover clinically significant findings regardless of sequencing indication. To examine coverage and concordance of clinically relevant genetic variation provided by WGS technologies; to quantitate inherited disease risk and pharmacogenomic findings in WGS data and resources required for their discovery and interpretation; and to evaluate clinical action prompted by WGS findings. An exploratory study of 12 adult participants recruited at Stanford University Medical Center who underwent WGS between November 2011 and March 2012. A multidisciplinary team reviewed all potentially reportable genetic findings. Five physicians proposed initial clinical follow-up based on the genetic findings. Genome coverage and sequencing platform concordance in different categories of genetic disease risk, person-hours spent curating candidate disease-risk variants, interpretation agreement between trained curators and disease genetics databases, burden of inherited disease risk and pharmacogenomic findings, and burden and interrater agreement of proposed clinical follow-up. Depending on sequencing platform, 10% to 19% of inherited disease genes were not covered to accepted standards for single nucleotide variant discovery. Genotype concordance was high for previously described single nucleotide genetic variants (99%-100%) but low for small insertion/deletion variants (53%-59%). Curation of 90 to 127 genetic variants in each participant required a median of 54 minutes (range, 5-223 minutes) per genetic variant, resulted in moderate classification agreement between professionals (Gross κ, 0.52; 95% CI, 0.40-0.64), and reclassified 69% of genetic variants cataloged as disease causing in mutation databases to variants of uncertain or lesser significance. Two to 6 personal disease-risk findings were discovered in each participant, including 1 frameshift deletion in the BRCA1 gene implicated in hereditary breast and ovarian cancer. Physician review of sequencing findings prompted consideration of a median of 1 to 3 initial diagnostic tests and referrals per participant, with fair interrater agreement about the suitability of WGS findings for clinical follow-up (Fleiss κ, 0.24; P < .001). In this exploratory study of 12 volunteer adults, the use of WGS was associated with incomplete coverage of inherited disease genes, low reproducibility of detection of genetic variation with the highest potential clinical effects, and uncertainty about clinically reportable findings. In certain cases, WGS will identify clinically actionable genetic variants warranting early medical intervention. These issues should be considered when determining the role of WGS in clinical medicine.
View details for DOI 10.1001/jama.2014.1717
View details for PubMedID 24618965
View details for PubMedCentralID PMC4119063
-
Divergence in a master variator generates distinct phenotypes and transcriptional responses
GENES & DEVELOPMENT
2014; 28 (4): 409-421
Abstract
The genetic basis of phenotypic differences in individuals is an important area in biology and personalized medicine. Analysis of divergent Saccharomyces cerevisiae strains grown under different conditions revealed extensive variation in response to both drugs (e.g., 4-nitroquinoline 1-oxide [4NQO]) and different carbon sources. Differences in 4NQO resistance were due to amino acid variation in the transcription factor Yrr1. Yrr1(YJM789) conferred 4NQO resistance but caused slower growth on glycerol, and vice versa with Yrr1(S96), indicating that alleles of Yrr1 confer distinct phenotypes. The binding targets of Yrr1 alleles from diverse yeast strains varied considerably among different strains grown under the same conditions as well as for the same strain under different conditions, indicating that distinct molecular programs are conferred by the different Yrr1 alleles. Our results demonstrate that genetic variations in one important control gene (YRR1) lead to distinct regulatory programs and phenotypes in individuals. We term these polymorphic control genes "master variators."
View details for DOI 10.1101/gad.228940.113
View details for Web of Science ID 000331616100009
View details for PubMedID 24532717
View details for PubMedCentralID PMC3937518
-
Integrated systems analysis reveals a molecular network underlying autism spectrum disorders.
Molecular systems biology
2014; 10 (12): 774
Abstract
Autism is a complex disease whose etiology remains elusive. We integrated previously and newly generated data and developed a systems framework involving the interactome, gene expression and genome sequencing to identify a protein interaction module with members strongly enriched for autism candidate genes. Sequencing of 25 patients confirmed the involvement of this module in autism, which was subsequently validated using an independent cohort of over 500 patients. Expression of this module was dichotomized with a ubiquitously expressed subcomponent and another subcomponent preferentially expressed in the corpus callosum, which was significantly affected by our identified mutations in the network center. RNA-sequencing of the corpus callosum from patients with autism exhibited extensive gene mis-expression in this module, and our immunochemical analysis showed that the human corpus callosum is predominantly populated by oligodendrocyte cells. Analysis of functional genomic data further revealed a significant involvement of this module in the development of oligodendrocyte cells in mouse brain. Our analysis delineates a natural network involved in autism, helps uncover novel candidate genes for this disease and improves our understanding of its molecular pathology.
View details for DOI 10.15252/msb.20145487
View details for PubMedID 25549968
View details for PubMedCentralID PMC4300495
-
Extensive Variation in Chromatin States Across Humans
SCIENCE
2013; 342 (6159): 750-752
Abstract
The majority of disease-associated variants lie outside protein-coding regions, suggesting a link between variation in regulatory regions and disease predisposition. We studied differences in chromatin states using five histone modifications, cohesin, and CTCF in lymphoblastoid lines from 19 individuals of diverse ancestry. We found extensive signal variation in regulatory regions, which often switch between active and repressed states across individuals. Enhancer activity is particularly diverse among individuals, whereas gene expression remains relatively stable. Chromatin variability shows genetic inheritance in trios, correlates with genetic variation and population divergence, and is associated with disruptions of transcription factor binding motifs. Overall, our results provide insights into chromatin variation among humans.
View details for DOI 10.1126/science.1242510
View details for PubMedID 24136358
-
A single-molecule long-read survey of the human transcriptome.
Nature biotechnology
2013; 31 (11): 1009-1014
Abstract
Global RNA studies have become central to understanding biological processes, but methods such as microarrays and short-read sequencing are unable to describe an entire RNA molecule from 5' to 3' end. Here we use single-molecule long-read sequencing technology from Pacific Biosciences to sequence the polyadenylated RNA complement of a pooled set of 20 human organs and tissues without the need for fragmentation or amplification. We show that full-length RNA molecules of up to 1.5 kb can readily be monitored with little sequence loss at the 5' ends. For longer RNA molecules more 5' nucleotides are missing, but complete intron structures are often preserved. In total, we identify ∼14,000 spliced GENCODE genes. High-confidence mappings are consistent with GENCODE annotations, but >10% of the alignments represent intron structures that were not previously annotated. As a group, transcripts mapping to unannotated regions have features of long, noncoding RNAs. Our results show the feasibility of deep sequencing full-length RNA from complex eukaryotic transcriptomes on a single-molecule level.
View details for DOI 10.1038/nbt.2705
View details for PubMedID 24108091
-
Dynamic trans-Acting Factor Colocalization in Human Cells
CELL
2013; 155 (3): 713-724
Abstract
Different trans-acting factors (TFs) collaborate and act in concert at distinct loci to perform accurate regulation of their target genes. To date, the cobinding of TF pairs has been investigated in a limited context both in terms of the number of factors within a cell type and across cell types and the extent of combinatorial colocalizations. Here, we use an approach to analyze TF colocalization within a cell type and across multiple cell lines at an unprecedented level. We extend this approach with large-scale mass spectrometry analysis of immunoprecipitations of 50 TFs. Our combined approach reveals large numbers of interesting TF-TF associations. We observe extensive change in TF colocalizations both within a cell type exposed to different conditions and across multiple cell types. We show distinct functional annotations and properties of different TF cobinding patterns and provide insights into the complex regulatory landscape of the cell.
View details for DOI 10.1016/j.cell.2013.09.043
View details for Web of Science ID 000326571800023
View details for PubMedID 24243024
-
Whole-exome sequencing identifies tetratricopeptide repeat domain 7A (TTC7A) mutations for combined immunodeficiency with intestinal atresias.
Journal of allergy and clinical immunology
2013; 132 (3): 656-664.e17
Abstract
Combined immunodeficiency with multiple intestinal atresias (CID-MIA) is a rare hereditary disease characterized by intestinal obstructions and profound immune defects. We sought to determine the underlying genetic causes of CID-MIA by analyzing the exomic sequences of 5 patients and their healthy direct relatives from 5 unrelated families. We performed whole-exome sequencing on 5 patients with CID-MIA and 10 healthy direct family members belonging to 5 unrelated families with CID-MIA. We also performed targeted Sanger sequencing for the candidate gene tetratricopeptide repeat domain 7A (TTC7A) on 3 additional patients with CID-MIA. Through analysis and comparison of the exomic sequence of the subjects from these 5 families, we identified biallelic damaging mutations in the TTC7A gene, for a total of 7 distinct mutations. Targeted TTC7A gene sequencing in 3 additional unrelated patients with CID-MIA revealed biallelic deleterious mutations in 2 of them, as well as an aberrant splice product in the third patient. Staining of normal thymus showed that the TTC7A protein is expressed in thymic epithelial cells, as well as in thymocytes. Moreover, severe lymphoid depletion was observed in the thymus and peripheral lymphoid tissues from 2 patients with CID-MIA. We identified deleterious mutations of the TTC7A gene in 8 unrelated patients with CID-MIA and demonstrated that the TTC7A protein is expressed in the thymus. Our results strongly suggest that TTC7A gene defects cause CID-MIA.
View details for DOI 10.1016/j.jaci.2013.06.013
View details for Web of Science ID 000323612000018
View details for PubMedID 23830146
-
Systematic functional regulatory assessment of disease-associated variants.
Proceedings of the National Academy of Sciences of the United States of America
2013; 110 (23): 9607-9612
Abstract
Genome-wide association studies have discovered many genetic loci associated with disease traits, but the functional molecular basis of these associations is often unresolved. Genome-wide regulatory and gene expression profiles measured across individuals and diseases reflect downstream effects of genetic variation and may allow for functional assessment of disease-associated loci. Here, we present a unique approach for systematic integration of genetic disease associations, transcription factor binding among individuals, and gene expression data to assess the functional consequences of variants associated with hundreds of human diseases. In an analysis of genome-wide binding profiles of NFκB, we find that disease-associated SNPs are enriched in NFκB binding regions overall, and specifically for inflammatory-mediated diseases, such as asthma, rheumatoid arthritis, and coronary artery disease. Using genome-wide variation in transcription factor-binding data, we find that NFκB binding is often correlated with disease-associated variants in a genotype-specific and allele-specific manner. Furthermore, we show that this binding variation is often related to expression of nearby genes, which are also found to have altered expression in independent profiling of the variant-associated disease condition. Thus, using this integrative approach, we provide a unique means to assign putative function to many disease-associated SNPs.
View details for DOI 10.1073/pnas.1219099110
View details for PubMedID 23690573
-
Specific plasma autoantibody reactivity in myelodysplastic syndromes.
Scientific reports
2013; 3: 3311
Abstract
Increased autoantibody reactivity in plasma from Myelodysplastic Syndromes (MDS) patients may provide novel disease signatures and possible early detection. In a two-stage study we investigated Immunoglobulin G reactivity in plasma from MDS, Acute Myeloid Leukemia post MDS patients, and a healthy cohort. In exploratory Stage I we utilized high-throughput protein arrays to identify 35 high-interest proteins showing increased reactivity in patient subgroups compared to healthy controls. In validation Stage II we designed new arrays focusing on 25 of the proteins identified in Stage I and expanded the initial cohort. We validated increased antibody reactivity against AKT3, FCGR3A and ARL8B in patients, which enabled sample classification into stable MDS and healthy individuals. We also detected elevated AKT3 protein levels in MDS patient plasma. The discovery of increased specific autoantibody reactivity in MDS patients provides molecular signatures for classification, supplementing existing risk categorizations, and may enhance diagnostic and prognostic capabilities for MDS.
View details for DOI 10.1038/srep03311
View details for PubMedID 24264604
-
Extensive genetic variation in somatic human tissues
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
2012; 109 (44): 18018-18023
Abstract
Genetic variation between individuals has been extensively investigated, but differences between tissues within individuals are far less understood. It is commonly assumed that all healthy cells that arise from the same zygote possess the same genomic content, with a few known exceptions in the immune system and germ line. However, a growing body of evidence shows that genomic variation exists between differentiated tissues. We investigated the scope of somatic genomic variation between tissues within humans. Analysis of copy number variation by high-resolution array-comparative genomic hybridization in diverse tissues from six unrelated subjects reveals a significant number of intraindividual genomic changes between tissues. Many (79%) of these events affect genes. Our results have important consequences for understanding normal genetic and phenotypic variation within individuals, and they have significant implications for both the etiology of genetic diseases such as cancer and for immortalized cell lines that might be used in research and therapeutics.
View details for DOI 10.1073/pnas.1213736109
View details for Web of Science ID 000311149900070
View details for PubMedID 23043118
View details for PubMedCentralID PMC3497787
-
An integrated encyclopedia of DNA elements in the human genome
NATURE
2012; 489 (7414): 57-74
Abstract
The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.
View details for DOI 10.1038/nature11247
View details for Web of Science ID 000308347000039
View details for PubMedID 22955616
View details for PubMedCentralID PMC3439153
-
Architecture of the human regulatory network derived from ENCODE data
NATURE
2012; 489 (7414): 91-100
Abstract
Transcription factors bind in a combinatorial fashion to specify the on-and-off states of genes; the ensemble of these binding events forms a regulatory network, constituting the wiring diagram for a cell. To examine the principles of the human transcriptional regulatory network, we determined the genomic binding information of 119 transcription-related factors in over 450 distinct experiments. We found the combinatorial, co-association of transcription factors to be highly context specific: distinct combinations of factors bind at specific genomic locations. In particular, there are significant differences in the binding proximal and distal to genes. We organized all the transcription factor binding into a hierarchy and integrated it with other genomic information (for example, microRNA regulation), forming a dense meta-network. Factors at different levels have different properties; for instance, top-level transcription factors more strongly influence expression and middle-level ones co-regulate targets to mitigate information-flow bottlenecks. Moreover, these co-regulations give rise to many enriched network motifs (for example, noise-buffering feed-forward loops). Finally, more connected network components are under stronger selection and exhibit a greater degree of allele-specific activity (that is, differential binding to the two parental alleles). The regulatory information obtained in this study will be crucial for interpreting personal genome sequences and understanding basic principles of human biology and disease.
View details for DOI 10.1038/nature11245
View details for PubMedID 22955619
-
Linking disease associations with regulatory information in the human genome
GENOME RESEARCH
2012; 22 (9): 1748-1759
Abstract
Genome-wide association studies have been successful in identifying single nucleotide polymorphisms (SNPs) associated with a large number of phenotypes. However, an associated SNP is likely part of a larger region of linkage disequilibrium. This makes it difficult to precisely identify the SNPs that have a biological link with the phenotype. We have systematically investigated the association of multiple types of ENCODE data with disease-associated SNPs and show that there is significant enrichment for functional SNPs among the currently identified associations. This enrichment is strongest when integrating multiple sources of functional information and when highest confidence disease-associated SNPs are used. We propose an approach that integrates multiple types of functional data generated by the ENCODE Consortium to help identify "functional SNPs" that may be associated with the disease phenotype. Our approach generates putative functional annotations for up to 80% of all previously reported associations. We show that for most associations, the functional SNP most strongly supported by experimental evidence is a SNP in linkage disequilibrium with the reported association rather than the reported SNP itself. Our results show that the experimental data sets generated by the ENCODE Consortium can be successfully used to suggest functional hypotheses for variants associated with diseases and other phenotypes.
View details for DOI 10.1101/gr.136127.111
View details for PubMedID 22955986
-
Annotation of functional variation in personal genomes using RegulomeDB
GENOME RESEARCH
2012; 22 (9): 1790-1797
Abstract
As the sequencing of healthy and disease genomes becomes more commonplace, detailed annotation provides interpretation for individual variation responsible for normal and disease phenotypes. Current approaches focus on direct changes in protein coding genes, particularly nonsynonymous mutations that directly affect the gene product. However, most individual variation occurs outside of genes and, indeed, most markers generated from genome-wide association studies (GWAS) identify variants outside of coding segments. Identification of potential regulatory changes that perturb these sites will lead to a better localization of truly functional variants and interpretation of their effects. We have developed a novel approach and database, RegulomeDB, which guides interpretation of regulatory variants in the human genome. RegulomeDB includes high-throughput, experimental data sets from ENCODE and other sources, as well as computational predictions and manual annotations to identify putative regulatory potential and identify functional variants. These data sources are combined into a powerful tool that scores variants to help separate functional variants from a large pool and provides a small set of putative sites with testable hypotheses as to their function. We demonstrate the applicability of this tool to the annotation of noncoding variants from 69 fully sequenced genomes as well as that of a personal genome, where thousands of functionally associated variants were identified. Moreover, we demonstrate a GWAS where the database is able to quickly identify the known associated functional variant and provide a hypothesis as to its function. Overall, we expect this approach and resource to be valuable for the annotation of human genome sequences.
View details for DOI 10.1101/gr.137323.112
View details for PubMedID 22955989
-
ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia
GENOME RESEARCH
2012; 22 (9): 1813-1831
Abstract
Chromatin immunoprecipitation (ChIP) followed by high-throughput DNA sequencing (ChIP-seq) has become a valuable and widely used approach for mapping the genomic location of transcription-factor binding and histone modifications in living cells. Despite its widespread use, there are considerable differences in how these experiments are conducted, how the results are scored and evaluated for quality, and how the data and metadata are archived for public use. These practices affect the quality and utility of any global ChIP experiment. Through our experience in performing ChIP-seq experiments, the ENCODE and modENCODE consortia have developed a set of working standards and guidelines for ChIP experiments that are updated routinely. The current guidelines address antibody validation, experimental replication, sequencing depth, data and metadata reporting, and data quality assessment. We discuss how ChIP quality, assessed in these ways, affects different uses of ChIP-seq data. All data sets used in the analysis have been deposited for public viewing and downloading at the ENCODE (http://encodeproject.org/ENCODE/) and modENCODE (http://www.modencode.org/) portals.
View details for DOI 10.1101/gr.136184.111
View details for PubMedID 22955991
-
Personal Omics Profiling Reveals Dynamic Molecular and Medical Phenotypes
CELL
2012; 148 (6): 1293-1307
Abstract
Personalized medicine is expected to benefit from combining genomic information with regular monitoring of physiological states by multiple high-throughput methods. Here, we present an integrative personal omics profile (iPOP), an analysis that combines genomic, transcriptomic, proteomic, metabolomic, and autoantibody profiles from a single individual over a 14 month period. Our iPOP analysis revealed various medical risks, including type 2 diabetes. It also uncovered extensive, dynamic changes in diverse molecular components and biological pathways across healthy and diseased conditions. Extremely high-coverage genomic and transcriptomic data, which provide the basis of our iPOP, revealed extensive heteroallelic changes during healthy and diseased states and an unexpected RNA editing mechanism. This study demonstrates that longitudinal iPOP can be used to interpret healthy and diseased states by connecting genomic information with additional dynamic omics activity.
View details for DOI 10.1016/j.cell.2012.02.009
View details for PubMedID 22424236
-
Detecting and annotating genetic variations using the HugeSeq pipeline
NATURE BIOTECHNOLOGY
2012; 30 (3): 226-229
View details for Web of Science ID 000301303800013
View details for PubMedID 22398614
-
Extensive Promoter-Centered Chromatin Interactions Provide a Topological Basis for Transcription Regulation
CELL
2012; 148 (1-2): 84-98
Abstract
Higher-order chromosomal organization for transcription regulation is poorly understood in eukaryotes. Using genome-wide Chromatin Interaction Analysis with Paired-End-Tag sequencing (ChIA-PET), we mapped long-range chromatin interactions associated with RNA polymerase II in human cells and uncovered widespread promoter-centered intragenic, extragenic, and intergenic interactions. These interactions further aggregated into higher-order clusters, wherein proximal and distal genes were engaged through promoter-promoter interactions. Most genes with promoter-promoter interactions were active and transcribed cooperatively, and some interacting promoters could influence each other implying combinatorial complexity of transcriptional controls. Comparative analyses of different cell lines showed that cell-specific chromatin interactions could provide structural frameworks for cell-specific transcription, and suggested significant enrichment of enhancer-promoter interactions for cell-specific functions. Furthermore, genetically-identified disease-associated noncoding elements were found to be spatially engaged with corresponding genes through long-range interactions. Overall, our study provides insights into transcription regulation by three-dimensional chromatin interactions for both housekeeping and cell-specific genes in human cells.
View details for DOI 10.1016/j.cell.2011.12.014
View details for Web of Science ID 000299540700016
View details for PubMedID 22265404
View details for PubMedCentralID PMC3339270
-
Performance comparison of whole-genome sequencing platforms
NATURE BIOTECHNOLOGY
2012; 30 (1): 78-U118
Abstract
Whole-genome sequencing is becoming commonplace, but the accuracy and completeness of variant calling by the most widely used platforms from Illumina and Complete Genomics have not been reported. Here we sequenced the genome of an individual with both technologies to a high average coverage of ∼76×, and compared their performance with respect to sequence coverage and calling of single-nucleotide variants (SNVs), insertions and deletions (indels). Although 88.1% of the ∼3.7 million unique SNVs were concordant between platforms, there were tens of thousands of platform-specific calls located in genes and other genomic regions. In contrast, 26.5% of indels were concordant between platforms. Target enrichment validated 92.7% of the concordant SNVs, whereas validation by genotyping array revealed a sensitivity of 99.3%. The validation experiments also suggested that >60% of the platform-specific variants were indeed present in the genome. Our results have important implications for understanding the accuracy and completeness of the genome sequencing platforms.
View details for DOI 10.1038/nbt.2065
View details for Web of Science ID 000299110600023
-
Dissecting phosphorylation networks: lessons learned from yeast
EXPERT REVIEW OF PROTEOMICS
2011; 8 (6): 775-786
Abstract
Protein phosphorylation continues to be regarded as one of the most important post-translational modifications found in eukaryotes and has been implicated in key roles in the development of a number of human diseases. In order to elucidate roles for the 518 human kinases, phosphorylation has routinely been studied using the budding yeast Saccharomyces cerevisiae as a model system. In recent years, a number of technologies have emerged to globally map phosphorylation in yeast. In this article, we review these technologies and discuss how these phosphorylation mapping efforts have shed light on our understanding of kinase signaling pathways and eukaryotic proteomic networks in general.
View details for DOI 10.1586/EPR.11.64
View details for Web of Science ID 000297299000013
View details for PubMedID 22087660
View details for PubMedCentralID PMC3262144
-
Genomic binding sites of the yeast cell-cycle transcription factors SBF and MBF
NATURE
2001; 409 (6819): 533-538
Abstract
Proteins interact with genomic DNA to bring the genome to life; and these interactions also define many functional features of the genome. SBF and MBF are sequence-specific transcription factors that activate gene expression during the G1/S transition of the cell cycle in yeast. SBF is a heterodimer of Swi4 and Swi6, and MBF is a heterodimer of Mbp1 and Swi6 (refs 1, 3). The related Swi4 and Mbp1 proteins are the DNA-binding components of the respective factors, and Swi6 may have a regulatory function. A small number of SBF and MBF target genes have been identified. Here we define the genomic binding sites of the SBF and MBF transcription factors in vivo, by using DNA microarrays. In addition to the previously characterized targets, we have identified about 200 new putative targets. Our results support the hypothesis that SBF activated genes are predominantly involved in budding, and in membrane and cell-wall biosynthesis, whereas DNA replication and repair are the dominant functions among MBF activated genes. The functional specialization of these factors may provide a mechanism for independent regulation of distinct molecular processes that normally occur in synchrony during the mitotic cell cycle.
View details for Web of Science ID 000166570500053
View details for PubMedID 11206552
-
Mapping spatial organization and genetic cell-state regulators to target immune evasion in ovarian cancer.
Nature immunology
2024
Abstract
The drivers of immune evasion are not entirely clear, limiting the success of cancer immunotherapies. Here we applied single-cell spatial and perturbational transcriptomics to delineate immune evasion in high-grade serous tubo-ovarian cancer. To this end, we first mapped the spatial organization of high-grade serous tubo-ovarian cancer by profiling more than 2.5 million cells in situ in 130 tumors from 94 patients. This revealed a malignant cell state that reflects tumor genetics and is predictive of T cell and natural killer cell infiltration levels and response to immune checkpoint blockade. We then performed Perturb-seq screens and identified genetic perturbations (including knockout of PTPN1 and ACTR8) that trigger this malignant cell state. Finally, we show that these perturbations, as well as a PTPN1/PTPN2 inhibitor, sensitize ovarian cancer cells to T cell and natural killer cell cytotoxicity, as predicted. This study thus identifies ways to study and target immune evasion by linking genetic variation, cell-state regulators and spatial biology.
View details for DOI 10.1038/s41590-024-01943-5
View details for PubMedID 39179931
View details for PubMedCentralID 7969354
-
PRC2-AgeIndex as a universal biomarker of aging and rejuvenation.
Nature communications
2024; 15 (1): 5956
Abstract
DNA methylation (DNAm) is one of the most reliable biomarkers of aging across mammalian tissues. While the age-dependent global loss of DNAm has been well characterized, DNAm gain is less characterized. Studies have demonstrated that CpGs which gain methylation with age are enriched in Polycomb Repressive Complex 2 (PRC2) targets. However, whole-genome examination of all PRC2 targets as well as determination of the pan-tissue or tissue-specific nature of these associations is lacking. Here, we show that low-methylated regions (LMRs) which are highly bound by PRC2 in embryonic stem cells (PRC2 LMRs) gain methylation with age in all examined somatic mitotic cells. We estimated that this epigenetic change represents around 90% of the age-dependent DNAm gain genome-wide. Therefore, we propose the "PRC2-AgeIndex," defined as the average DNAm in PRC2 LMRs, as a universal biomarker of cellular aging in somatic cells which can distinguish the effect of different anti-aging interventions.
View details for DOI 10.1038/s41467-024-50098-2
View details for PubMedID 39009581
View details for PubMedCentralID PMC11250797
-
Glycan clock of ageing: analytical precision and time-dependent inter- and intra-individual variability.
GeroScience
2024
Abstract
Ageing is a complex biological process with variations among individuals, leading to the development of ageing clocks to estimate biological age. Glycans, particularly in immunoglobulin G (IgG), have emerged as potential biomarkers of ageing, with changes in glycosylation patterns correlating with chronological age. For precision analysis, three different plasma pools were analysed over 26 days in tetraplicates, 312 samples in total. In short-term variability analysis, two cohorts were analysed: the AstraZeneca MFO cohort of 26 healthy individuals (median age 20) and a cohort of 70 premenopausal Chinese women (median age 22.5) monitored over 3 months. Long-term variability analysis involved two adult men aged 47 and 57, monitored for 5 and 10 years, respectively. Samples were collected every 3 months and 3 weeks, respectively. IgG N-glycan analysis followed a standardized approach by isolating IgG, its subsequent denaturation and deglycosylation followed by glycan cleanup and labelling. Capillary gel electrophoresis with laser-induced fluorescence (CGE-LIF) and ultra-performance liquid chromatography analyses were employed for glycan profiling. Statistical analysis involved normalization, batch correction, and linear mixed models to assess time effects on derived glycan traits. The intermediate precision results consistently exhibited very low coefficient of variation values across all three test samples. This consistent pattern underscores the high level of precision inherent in the CGE method for analysing the glycan clock of ageing. The AstraZeneca MFO cohort did not show any statistically significant trends, whereas the menstrual cycle cohort exhibited statistically significant trends in digalactosylated (G2), agalactosylated (G0) and fucosylation (F). These trends were attributed to the effects of the menstrual cycle. Long-term stability analysis identified enduring age-related trends in both subjects, showing a positive time effect in G0 and bisected N-acetylglucosamine, as well as a negative time effect in G2 and sialylation, aligning with earlier findings. Time effects measured for monogalactosylation and F remained substantially lower than the ones observed for other traits. The study found that IgG N-glycome analysis using CGE-LIF exhibited remarkably high intermediate precision. Moreover, the study highlights the short- and long-term stability of IgG glycome composition, coupled with a notable capacity to adapt and respond to physiological changes and environmental influences such as hormonal changes, disease, and interventions. The discoveries from this study propel personalized medicine forward by deepening our understanding of how IgG glycome relates to age-related health concerns. This study underscores the reliability of glycans as a biomarker for tracking age-related changes and individual health paths.
View details for DOI 10.1007/s11357-024-01239-4
View details for PubMedID 38877341
View details for PubMedCentralID 9234382
-
Psychogenic Aging: A Novel Prospect to Integrate Psychobiological Hallmarks of Aging.
Translational psychiatry
2024; 14 (1): 226
Abstract
Psychological factors are amongst the most robust predictors of healthspan and longevity, yet are rarely incorporated into scientific and medical frameworks of aging. The prospect of characterizing and integrating the psychological influences of aging is therefore an unmet step for the advancement of geroscience. Psychogenic Aging research is an emerging branch of biogerontology that aims to address this gap by investigating the impact of psychological factors on human longevity. It is an interdisciplinary field that integrates complex psychological, neurological, and molecular relationships that can be best understood with precision medicine methodologies. This perspective argues that psychogenic aging should be considered an integral component of the Hallmarks of Aging framework, opening the doors for future biopsychosocial integration in longevity research. By providing a unique perspective on frequently overlooked aspects of organismal aging, psychogenic aging offers new insights and targets for anti-aging therapeutics on individual and societal levels that can significantly benefit the scientific and medical communities.
View details for DOI 10.1038/s41398-024-02919-7
View details for PubMedID 38816369
View details for PubMedCentralID PMC11139997
-
Evolution of diapause in the African turquoise killifish by remodeling the ancient gene regulatory landscape.
Cell
2024
Abstract
Suspended animation states allow organisms to survive extreme environments. The African turquoise killifish has evolved diapause as a form of suspended development to survive a complete drought. However, the mechanisms underlying the evolution of extreme survival states are unknown. To understand diapause evolution, we performed integrative multi-omics (gene expression, chromatin accessibility, and lipidomics) in the embryos of multiple killifish species. We find that diapause evolved by a recent remodeling of regulatory elements at very ancient gene duplicates (paralogs) present in all vertebrates. CRISPR-Cas9-based perturbations identify the transcription factors REST/NRSF and FOXOs as critical for the diapause gene expression program, including genes involved in lipid metabolism. Indeed, diapause shows a distinct lipid profile, with an increase in triglycerides with very-long-chain fatty acids. Our work suggests a mechanism for the evolution of complex adaptations and offers strategies to promote long-term survival by activating suspended animation programs in other species.
View details for DOI 10.1016/j.cell.2024.04.048
View details for PubMedID 38810644
-
Genome-wide Cas9-mediated screening of essential non-coding regulatory elements via libraries of paired single-guide RNAs.
Nature biomedical engineering
2024
Abstract
The functions of non-coding regulatory elements (NCREs), which constitute a major fraction of the human genome, have not been systematically studied. Here we report a method involving libraries of paired single-guide RNAs targeting both ends of an NCRE as a screening system for the Cas9-mediated deletion of thousands of NCREs genome-wide to study their functions in distinct biological contexts. By using K562 and 293T cell lines and human embryonic stem cells, we show that NCREs can have redundant functions, and that many ultra-conserved elements have silencer activity and play essential roles in cell growth and in cellular responses to drugs (notably, the ultra-conserved element PAX6_Tarzan may be critical for heart development, as removing it from human embryonic stem cells led to defects in cardiomyocyte differentiation). The high-throughput screen, which is compatible with single-cell sequencing, may allow for the identification of druggable NCREs.
View details for DOI 10.1038/s41551-024-01204-8
View details for PubMedID 38778183
-
Integrative multi-omics profiling in human decedents receiving pig heart xenografts.
Nature medicine
2024
Abstract
In a previous study, heart xenografts from 10-gene-edited pigs transplanted into two human decedents did not show evidence of acute-onset cellular- or antibody-mediated rejection. Here, to better understand the detailed molecular landscape following xenotransplantation, we carried out bulk and single-cell transcriptomics, lipidomics, proteomics and metabolomics on blood samples obtained from the transplanted decedents every 6 h, as well as histological and transcriptomic tissue profiling. We observed substantial early immune responses in peripheral blood mononuclear cells and xenograft tissue obtained from decedent 1 (male), associated with downstream T cell and natural killer cell activity. Longitudinal analyses indicated the presence of ischemia reperfusion injury, exacerbated by inadequate immunosuppression of T cells, consistent with previous findings of perioperative cardiac xenograft dysfunction in pig-to-nonhuman primate studies. Moreover, at 42 h after transplantation, substantial alterations in cellular metabolism and liver-damage pathways occurred, correlating with profound organ-wide physiological dysfunction. By contrast, relatively minor changes in RNA, protein, lipid and metabolism profiles were observed in decedent 2 (female) as compared to decedent 1. Overall, these multi-omics analyses delineate distinct responses to cardiac xenotransplantation in the two human decedents and reveal new insights into early molecular and immune responses after xenotransplantation. These findings may aid in the development of targeted therapeutic approaches to limit ischemia reperfusion injury-related phenotypes and improve outcomes.
View details for DOI 10.1038/s41591-024-02972-1
View details for PubMedID 38760586
View details for PubMedCentralID 6666404
-
Deconvolution of polygenic risk score in single cells unravels cellular and molecular heterogeneity of complex human diseases.
bioRxiv : the preprint server for biology
2024
Abstract
Polygenic risk scores (PRSs) are commonly used for predicting an individual's genetic risk of complex diseases. Yet, their implication for disease pathogenesis remains largely limited. Here, we introduce scPRS, a geometric deep learning model that constructs single-cell-resolved PRS leveraging reference single-cell chromatin accessibility profiling data to enhance biological discovery as well as disease prediction. Real-world applications across multiple complex diseases, including type 2 diabetes (T2D), hypertrophic cardiomyopathy (HCM), and Alzheimer's disease (AD), showcase the superior prediction power of scPRS compared to traditional PRS methods. Importantly, scPRS not only predicts disease risk but also uncovers disease-relevant cells, such as hormone-high alpha and beta cells for T2D, cardiomyocytes and pericytes for HCM, and astrocytes, microglia and oligodendrocyte progenitor cells for AD. Facilitated by a layered multi-omic analysis, scPRS further identifies cell-type-specific genetic underpinnings, linking disease-associated genetic variants to gene regulation within corresponding cell types. We substantiate the disease relevance of scPRS-prioritized HCM genes and demonstrate that the suppression of these genes in HCM cardiomyocytes is rescued by Mavacamten treatment. Additionally, we establish a novel microglia-specific regulatory relationship between the AD risk variant rs7922621 and its target genes ANXA11 and TSPAN14. We further illustrate the detrimental effects of suppressing these two genes on microglia phagocytosis. Our work provides a multi-tasking, interpretable framework for precise disease prediction and systematic investigation of the genetic, cellular, and molecular basis of complex diseases, laying the methodological foundation for single-cell genetics.
View details for DOI 10.1101/2024.05.14.594252
View details for PubMedID 38798507
View details for PubMedCentralID PMC11118500
-
Personalized transcriptome signatures in a cardiomyopathy stem cell biobank.
bioRxiv : the preprint server for biology
2024
Abstract
There is growing evidence that pathogenic mutations do not fully explain hypertrophic (HCM) or dilated (DCM) cardiomyopathy phenotypes. We hypothesized that if a patient's genetic background was influencing cardiomyopathy, this should be detectable as signatures in gene expression. We built a cardiomyopathy biobank resource for interrogating personalized genotype-phenotype relationships in human cell lines. We recruited 308 diseased and control patients for our cardiomyopathy stem cell biobank. We successfully reprogrammed PBMCs (peripheral blood mononuclear cells) into induced pluripotent stem cells (iPSCs) for 300 donors. These iPSCs underwent whole genome sequencing and were differentiated into cardiomyocytes for RNA-seq. In addition to annotating pathogenic variants, mutation burden in a panel of cardiomyopathy genes was assessed for correlation with echocardiogram measurements. Line-specific co-expression networks were inferred to evaluate transcriptomic subtypes. Drug treatment targeted the sarcomere, either by activation with omecamtiv mecarbil or inhibition with mavacamten, to alter contractility. We generated an iPSC biobank from 300 donors, which included 101 individuals with HCM and 88 with DCM. Whole genome sequencing of 299 iPSC lines identified 78 unique pathogenic or likely pathogenic mutations in the diseased lines. Notably, only DCM lines lacking a known pathogenic or likely pathogenic mutation replicated a finding in the literature that greater nonsynonymous SNV mutation burden in 102 cardiomyopathy genes correlates with lower left ventricular ejection fraction in DCM. We analyzed RNA-sequencing data from iPSC-derived cardiomyocytes for 102 donors. Inferred personalized co-expression networks revealed two transcriptional subtypes of HCM. The first subtype exhibited concerted activation of the co-expression network, with the degree of activation reflective of the disease severity of the donor. In contrast, the second HCM subtype and the entire DCM cohort exhibited partial activation of the respective disease network, with the strength of specific gene by gene relationships dependent on the iPSC-derived cardiomyocyte line. ADCY5 was the largest hub node in both the HCM and DCM networks and was partially corrected in response to drug treatment. We have established a stem cell biobank for studying cardiomyopathy. Our analysis supports the hypothesis that genetic background influences pathologic gene expression programs and supports a role for ADCY5 in cardiomyopathy.
View details for DOI 10.1101/2024.05.10.593618
View details for PubMedID 38798547
View details for PubMedCentralID PMC11118309
-
The impact of exercise on gene regulation in association with complex trait genetics.
Nature communications
2024; 15 (1): 3346
Abstract
Endurance exercise training is known to reduce risk for a range of complex diseases. However, the molecular basis of this effect has been challenging to study and largely restricted to analyses of either few or easily biopsied tissues. Extensive transcriptome data collected across 15 tissues during exercise training in rats as part of the Molecular Transducers of Physical Activity Consortium has provided a unique opportunity to clarify how exercise can affect tissue-specific gene expression and further suggest how exercise adaptation may impact complex disease-associated genes. To build this map, we integrate this multi-tissue atlas of gene expression changes with gene-disease targets, genetic regulation of expression, and trait relationship data in humans. Consensus from multiple approaches prioritizes specific tissues and genes where endurance exercise impacts disease-relevant gene expression. Specifically, we identify a total of 5523 trait-tissue-gene triplets to serve as a valuable starting point for future investigations [Exercise; Transcription; Human Phenotypic Variation].
View details for DOI 10.1038/s41467-024-45966-w
View details for PubMedID 38693125
-
Sexual dimorphism and the multi-omic response to exercise training in rat subcutaneous white adipose tissue.
Nature metabolism
2024
Abstract
Subcutaneous white adipose tissue (scWAT) is a dynamic storage and secretory organ that regulates systemic homeostasis, yet the impact of endurance exercise training (ExT) and sex on its molecular landscape is not fully established. Utilizing an integrative multi-omics approach, and leveraging data generated by the Molecular Transducers of Physical Activity Consortium (MoTrPAC), we show profound sexual dimorphism in the scWAT of sedentary rats and in the dynamic response of this tissue to ExT. Specifically, the scWAT of sedentary females displays -omic signatures related to insulin signaling and adipogenesis, whereas the scWAT of sedentary males is enriched in terms related to aerobic metabolism. These sex-specific -omic signatures are preserved or amplified with ExT. Integration of multi-omic analyses with phenotypic measures identifies molecular hubs predicted to drive sexually distinct responses to training. Overall, this study underscores the powerful impact of sex on adipose tissue biology and provides a rich resource to investigate the scWAT response to ExT.
View details for DOI 10.1038/s42255-023-00959-9
View details for PubMedID 38693320
-
Molecular adaptations in response to exercise training are associated with tissue-specific transcriptomic and epigenomic signatures.
Cell genomics
2024: 100421
Abstract
Regular exercise has many physical and brain health benefits, yet the molecular mechanisms mediating exercise effects across tissues remain poorly understood. Here we analyzed 400 high-quality DNA methylation, ATAC-seq, and RNA-seq datasets from eight tissues from control and endurance exercise-trained (EET) rats. Integration of baseline datasets mapped the gene location dependence of epigenetic control features and identified differing regulatory landscapes in each tissue. The transcriptional responses to 8 weeks of EET showed little overlap across tissues and predominantly comprised tissue-type enriched genes. We identified sex differences in the transcriptomic and epigenomic changes induced by EET. However, the sex-biased gene responses were linked to shared signaling pathways. We found that many G protein-coupled receptor-encoding genes are regulated by EET, suggesting a role for these receptors in mediating the molecular adaptations to training across tissues. Our findings provide new insights into the mechanisms underlying EET-induced health benefits across organs.
View details for DOI 10.1016/j.xgen.2023.100421
View details for PubMedID 38697122
-
Molecular Transducers of Physical Activity Consortium (MoTrPAC): Human Studies Design and Protocol.
Journal of applied physiology (Bethesda, Md. : 1985)
2024
Abstract
Physical activity, including structured exercise, is associated with favorable health-related chronic disease outcomes. While there is evidence of various molecular pathways that affect these responses, a comprehensive molecular map of these molecular responses to exercise has not been developed. The Molecular Transducers of Physical Activity Consortium (MoTrPAC) is a multi-center study designed to isolate the effects of structured exercise training on the molecular mechanisms underlying the health benefits of exercise and physical activity. MoTrPAC contains both a pre-clinical and human component. The details of the human studies component of MoTrPAC that include the design and methods are presented here. The human studies contain both an adult and pediatric component. In the adult component, sedentary participants are randomized to 12 weeks of Control, Endurance Exercise Training, or Resistance Exercise Training with outcomes measures completed before and following the 12 weeks. The adult component also includes recruitment of highly active endurance trained or resistance trained participants who only complete measures once. A similar design is used for the pediatric component; however, only endurance exercise is examined. Phenotyping measures include weight, body composition, vital signs, cardiorespiratory fitness, muscular strength, physical activity and diet, and other questionnaires. Participants also complete an acute rest period (adults only) or exercise session (adults, pediatrics) with collection of biospecimens (blood only for pediatrics) to allow for examination of the molecular responses. The design and methods of MoTrPAC may inform other studies. Moreover, MoTrPAC will provide a repository of data that can be used broadly across the scientific community.
View details for DOI 10.1152/japplphysiol.00102.2024
View details for PubMedID 38634503
-
The mitochondrial multi-omic response to exercise training across rat tissues.
Cell metabolism
2024
Abstract
Mitochondria have diverse functions critical to whole-body metabolic homeostasis. Endurance training alters mitochondrial activity, but systematic characterization of these adaptations is lacking. Here, the Molecular Transducers of Physical Activity Consortium mapped the temporal, multi-omic changes in mitochondrial analytes across 19 tissues in male and female rats trained for 1, 2, 4, or 8 weeks. Training elicited substantial changes in the adrenal gland, brown adipose, colon, heart, and skeletal muscle. The colon showed non-linear response dynamics, whereas mitochondrial pathways were downregulated in brown adipose and adrenal tissues. Protein acetylation increased in the liver, with a shift in lipid metabolism, whereas oxidative proteins increased in striated muscles. Exercise-upregulated networks were downregulated in human diabetes and cirrhosis. Knockdown of the central network protein 17-beta-hydroxysteroid dehydrogenase 10 (HSD17B10) elevated oxygen consumption, indicative of metabolic stress. We provide a multi-omic, multi-tissue, temporal atlas of the mitochondrial response to exercise training and identify candidates linked to mitochondrial dysfunction.
View details for DOI 10.1016/j.cmet.2023.12.021
View details for PubMedID 38701776
-
Vaginal microbiomes show ethnic evolutionary dynamics and positive selection of Lactobacillus adhesins driven by a long-term niche-specific process.
Cell reports
2024; 43 (4): 114078
Abstract
The vaginal microbiome's composition varies among ethnicities. However, the evolutionary landscape of the vaginal microbiome in the multi-ethnic context remains understudied. We perform a systematic evolutionary analysis of 351 vaginal microbiome samples from 35 multi-ethnic pregnant women, in addition to two validation cohorts, totaling 462 samples from 90 women. Microbiome alpha diversity and community state dynamics show strong ethnic signatures. Lactobacillaceae have a higher ratio of non-synonymous to synonymous polymorphism and lower nucleotide diversity than non-Lactobacillaceae in all ethnicities, with a large repertoire of positively selected genes, including the mucin-binding and cell wall anchor genes. These evolutionary dynamics are driven by the long-term evolutionary process unique to the human vaginal niche. Finally, we propose an evolutionary model reflecting the environmental niches of microbes. Our study reveals the extensive ethnic signatures in vaginal microbial ecology and evolution, highlighting the importance of studying the host-microbiome ecosystem from an evolutionary perspective.
View details for DOI 10.1016/j.celrep.2024.114078
View details for PubMedID 38598334
-
Longitudinal cytokine and multi-modal health data of an extremely severe ME/CFS patient with HSD reveals insights into immunopathology, and disease severity.
Frontiers in immunology
2024; 15: 1369295
Abstract
Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) presents substantial challenges in patient care due to its intricate multisystem nature, comorbidities, and global prevalence. The heterogeneity among patient populations, coupled with the absence of FDA-approved diagnostics and therapeutics, further complicates research into disease etiology and patient management. Integrating longitudinal multi-omics data with clinical, health, textual, pharmaceutical, and nutraceutical data offers a promising avenue to address these complexities, aiding in the identification of underlying causes and providing insights into effective therapeutics and diagnostic strategies. This study focused on an exceptionally severe ME/CFS patient with hypermobility spectrum disorder (HSD) during a period of marginal symptom improvements. Longitudinal cytokine profiling was conducted alongside the collection of extensive multi-modal health data to explore the dynamic nature of symptoms, severity, triggers, and modifying factors. Additionally, an updated severity assessment platform and two applications, ME-CFSTrackerApp and LexiTime, were introduced to facilitate real-time symptom tracking and enhance patient-physician/researcher communication, and evaluate response to medical intervention. Longitudinal cytokine profiling revealed the significance of Th2-type cytokines and highlighted synergistic activities between mast cells and eosinophils, skewing Th1 toward Th2 immune responses in ME/CFS pathogenesis, particularly in cognitive impairment and sensorial intolerance. This suggests a potentially shared underlying mechanism with major ME/CFS comorbidities such as HSD, Mast cell activation syndrome, postural orthostatic tachycardia syndrome (POTS), and small fiber neuropathy.
Additionally, the data identified potential roles of BCL6 and TP53 pathways in ME/CFS etiology and emphasized the importance of investigating adverse reactions to medication and supplements and drug interactions in ME/CFS severity and progression. Our study advocates for the integration of longitudinal multi-omics with multi-modal health data and artificial intelligence (AI) techniques to better understand ME/CFS and its major comorbidities. These findings highlight the significance of dysregulated Th2-type cytokines in patient stratification and precision medicine strategies. Additionally, our results suggest exploring the use of low-dose drugs with partial agonist activity as a potential avenue for ME/CFS treatment. This comprehensive approach emphasizes the importance of adopting a patient-centered care approach to improve ME/CFS healthcare management, disease severity assessment, and personalized medicine. Overall, these findings contribute to our understanding of ME/CFS and offer avenues for future research and clinical practice.
View details for DOI 10.3389/fimmu.2024.1369295
View details for PubMedID 38650940
View details for PubMedCentralID PMC11033372
-
Deep learning modeling of rare noncoding genetic variants in human motor neurons defines CCDC146 as a therapeutic target for ALS.
medRxiv : the preprint server for health sciences
2024
Abstract
Amyotrophic lateral sclerosis (ALS) is a fatal and incurable neurodegenerative disease caused by the selective and progressive death of motor neurons (MNs). Understanding the genetic and molecular factors influencing ALS survival is crucial for disease management and therapeutics. In this study, we introduce a deep learning-powered genetic analysis framework to link rare noncoding genetic variants to ALS survival. Using data from human induced pluripotent stem cell (iPSC)-derived MNs, this method prioritizes functional noncoding variants using deep learning, links cis-regulatory elements (CREs) to target genes using epigenomics data, and integrates these data through gene-level burden tests to identify survival-modifying variants, CREs, and genes. We apply this approach to analyze 6,715 ALS genomes, and pinpoint four novel rare noncoding variants associated with survival, including chr7:76,009,472:C>T linked to CCDC146. CRISPR-Cas9 editing of this variant increases CCDC146 expression in iPSC-derived MNs and exacerbates ALS-specific phenotypes, including TDP-43 mislocalization. Suppressing CCDC146 with an antisense oligonucleotide (ASO), showing no toxicity, completely rescues ALS-associated survival defects in neurons derived from sporadic ALS patients and from carriers of the ALS-associated G4C2-repeat expansion within C9ORF72. ASO targeting of CCDC146 may be a broadly effective therapeutic approach for ALS. Our framework provides a generic and powerful approach for studying noncoding genetics of complex human diseases.
View details for DOI 10.1101/2024.03.30.24305115
View details for PubMedID 38633814
-
Emerging therapeutic drug monitoring technologies: considerations and opportunities in precision medicine.
Frontiers in pharmacology
2024; 15: 1348112
Abstract
In recent years, the development of sensor and wearable technologies has led to their increased adoption in clinical and health monitoring settings. One area that is in early, but promising, stages of development is the use of biosensors for therapeutic drug monitoring (TDM). Traditionally, TDM could only be performed in certified laboratories and was used in specific scenarios to optimize drug dosage based on measurement of plasma/blood drug concentrations. Although TDM has been typically pursued in settings involving medications that are challenging to manage, the basic approach is useful for characterizing drug activity. TDM is based on the idea that there is likely a clear relationship between plasma/blood drug concentration (or concentration in other matrices) and clinical efficacy. However, these relationships may vary across individuals and may be affected by genetic factors, comorbidities, lifestyle, and diet. TDM technologies will be valuable for enabling precision medicine strategies to determine the clinical efficacy of drugs in individuals, as well as optimizing personalized dosing, especially since therapeutic windows may vary inter-individually. In this mini-review, we discuss emerging TDM technologies and their applications, and factors that influence TDM including drug interactions, polypharmacy, and supplement use. We also discuss how using TDM within single subject (N-of-1) and aggregated N-of-1 clinical trial designs provides opportunities to better capture drug response and activity at the individual level. Individualized TDM solutions have the potential to help optimize treatment selection and dosing regimens so that the right drug and right dose may be matched to the right person and in the right context.
View details for DOI 10.3389/fphar.2024.1348112
View details for PubMedID 38545548
View details for PubMedCentralID PMC10965556
-
Immunotherapeutic IL-6R and targeting the MCT-1/IL-6/CXCL7/PD-L1 circuit prevent relapse and metastasis of triple-negative breast cancer.
Theranostics
2024; 14 (5): 2167-2189
Abstract
Rationale: Multiple copies in T-cell malignancy 1 (MCT-1) is a prognostic biomarker for aggressive breast cancers. Overexpressed MCT-1 stimulates the IL-6/IL-6R/gp130/STAT3 axis, which promotes epithelial-to-mesenchymal transition and cancer stemness. Because cancer stemness largely contributes to the tumor metastasis and recurrence, we aimed to identify whether the blockade of MCT-1 and IL-6R can render these effects and to understand the underlying mechanisms that govern the process. Methods: We assessed primary tumor invasion, postsurgical local recurrence and distant metastasis in orthotopic syngeneic mice given the indicated immunotherapy and MCT-1 silencing (shMCT-1). Results: We found that shMCT-1 suppresses the transcriptomes of the inflammatory response and metastatic signaling in TNBC cells and inhibits tumor recurrence, metastasis and mortality in xenograft mice. IL-6R immunotherapy and shMCT-1 combined further decreased intratumoral M2 macrophages and T regulatory cells (Tregs) and avoided postsurgical TNBC expansion. shMCT-1 also enhances IL-6R-based immunotherapy effectively in preventing postsurgical TNBC metastasis, recurrence and mortality. Anti-IL-6R improved helper T, cytotoxic T and natural killer (NK) cells in the lymphatic system and decreased Tregs in the recurrent and metastatic tumors. Combined IL-6R and PD-L1 immunotherapies abridged TNBC cell stemness and M2 macrophage activity to a greater extent than monotherapy. Sequential immunotherapy of PD-L1 and IL-6R demonstrated the best survival outcome and lowest postoperative recurrence and metastasis compared with synchronized therapy, particularly in the shMCT-1 context. Multiple positive feedforward loops of the MCT-1/IL-6/IL-6R/CXCL7/PD-L1 axis were identified in TNBC cells, which boosted metastatic niches and immunosuppressive microenvironments. 
Clinically, MCT-1high/PD-L1high/CXCL7high and CXCL7high/IL-6high/IL-6Rhigh expression patterns predict worse prognosis and poorer survival of breast cancer patients. Conclusion: Systemic targeting the MCT-1/IL-6/IL-6R/CXCL7/PD-L1 interconnections enhances immune surveillance that inhibits the aggressiveness of TNBC.
View details for DOI 10.7150/thno.92922
View details for PubMedID 38505617
View details for PubMedCentralID PMC10945351
-
Author Correction: Advances and prospects for the Human BioMolecular Atlas Program (HuBMAP).
Nature cell biology
2024
View details for DOI 10.1038/s41556-024-01384-0
View details for PubMedID 38429479
-
Corrigendum: Advances and potential of omics studies for understanding the development of food allergy.
Frontiers in allergy
2024; 5: 1373485
Abstract
[This corrects the article DOI: 10.3389/falgy.2023.1149008.].
View details for DOI 10.3389/falgy.2024.1373485
View details for PubMedID 38464397
View details for PubMedCentralID PMC10921899
-
CAGI, the Critical Assessment of Genome Interpretation, establishes progress and prospects for computational genetic variant interpretation methods
GENOME BIOLOGY
2024; 25 (1): 53
Abstract
The Critical Assessment of Genome Interpretation (CAGI) aims to advance the state-of-the-art for computational prediction of genetic variant impact, particularly where relevant to disease. The five complete editions of the CAGI community experiment comprised 50 challenges, in which participants made blind predictions of phenotypes from genetic data, and these were evaluated by independent assessors. Performance was particularly strong for clinical pathogenic variants, including some difficult-to-diagnose cases, and extends to interpretation of cancer-related variants. Missense variant interpretation methods were able to estimate biochemical effects with increasing accuracy. Assessment of methods for regulatory variants and complex trait disease risk was less definitive and indicates performance potentially suitable for auxiliary use in the clinic. Results show that while current methods are imperfect, they have major utility for research and clinical applications. Emerging methods and increasingly large, robust datasets for training and assessment promise further progress ahead.
View details for DOI 10.1186/s13059-023-03113-6
View details for Web of Science ID 001184832400002
View details for PubMedID 38389099
View details for PubMedCentralID PMC10882881
-
Short-chain fatty acids propionate and butyrate control growth and differentiation linked to cellular metabolism.
Research square
2024
Abstract
The short-chain fatty acids (SCFA) propionate and butyrate are produced in large amounts by microbial metabolism and have been identified as unique acyl lysine histone marks. In order to better understand the function of these modifications we used ChIP-seq to map the genome-wide location of four short-chain acyl histone marks H3K18pr/bu and H4K12pr/bu in treated and untreated colorectal cancer (CRC) and normal cells, as well as in mouse intestines in vivo. We correlate these marks with open chromatin regions along with gene expression to assess the function of the target regions. Our data demonstrate that propionate and butyrate act as promoters of growth, differentiation as well as ion transport. We propose a mechanism involving direct modification of specific genomic regions, resulting in increased chromatin accessibility, and in the case of butyrate, opposing effects on the proliferation of normal versus CRC cells.
View details for DOI 10.21203/rs.3.rs-3935562/v1
View details for PubMedID 38410440
View details for PubMedCentralID PMC10896393
-
Rare and common genetic determinants of mitochondrial function determine severity but not risk of amyotrophic lateral sclerosis.
Heliyon
2024; 10 (3): e24975
Abstract
Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disease involving selective vulnerability of energy-intensive motor neurons (MNs). It has been unclear whether mitochondrial function is an upstream driver or a downstream modifier of neurotoxicity. We separated upstream genetic determinants of mitochondrial function, including genetic variation within the mitochondrial genome or autosomes, from downstream changeable factors including mitochondrial DNA copy number (mtCN). Across three cohorts including 6,437 ALS patients, we discovered that a set of mitochondrial haplotypes, chosen because they are linked to measurements of mitochondrial function, are a determinant of ALS survival following disease onset, but do not modify ALS risk. One particular haplotype appeared to be neuroprotective and was significantly over-represented in two cohorts of long-surviving ALS patients. Causal inference for mitochondrial function was achievable using mitochondrial haplotypes, but not autosomal SNPs in traditional Mendelian randomization (MR). Furthermore, rare loss-of-function genetic variants within, and reduced MN expression of, ACADM and DNA2 lead to ∼50% shorter ALS survival; both proteins are implicated in mitochondrial function. Both mtCN and cellular vulnerability are linked to DNA2 function in ALS patient-derived neurons. Finally, mtCN responds dynamically to the onset of ALS independently of mitochondrial haplotype, and is correlated with disease severity. We conclude that, based on the genetic measures we have employed, mitochondrial function is a therapeutic target for amelioration of disease severity but not prevention of ALS.
View details for DOI 10.1016/j.heliyon.2024.e24975
View details for PubMedID 38317984
View details for PubMedCentralID PMC10839612
-
Harnessing human genetics and stem cells for precision cardiovascular medicine.
Cell genomics
2024; 4 (2): 100445
Abstract
Human induced pluripotent stem cell (iPSC) platforms are valuable for biomedical and pharmaceutical research by providing tissue-specific human cells that retain patients' genetic integrity and display disease phenotypes in a dish. Looking forward, combining iPSC phenotyping platforms with genomic and screening technologies will continue to pave new directions for precision medicine, including genetic prediction, visualization, and treatment of heart disease. This review summarizes the recent use of iPSC technology to unpack the influence of genetic variants in cardiovascular pathology. We focus on various state-of-the-art genomic tools for cardiovascular therapies (including the expansion of genetic toolkits for molecular interrogation, in vitro population studies, and function-based drug screening) and their current applications in patient- and genome-edited iPSC platforms that are heralding new avenues for cardiovascular research.
View details for DOI 10.1016/j.xgen.2023.100445
View details for PubMedID 38359791
-
Validation of biomarkers of aging.
Nature medicine
2024
Abstract
The search for biomarkers that quantify biological aging (particularly 'omic'-based biomarkers) has intensified in recent years. Such biomarkers could predict aging-related outcomes and could serve as surrogate endpoints for the evaluation of interventions promoting healthy aging and longevity. However, no consensus exists on how biomarkers of aging should be validated before their translation to the clinic. Here, we review current efforts to evaluate the predictive validity of omic biomarkers of aging in population studies, discuss challenges in comparability and generalizability and provide recommendations to facilitate future validation of biomarkers of aging. Finally, we discuss how systematic validation can accelerate clinical translation of biomarkers of aging and their use in gerotherapeutic clinical trials.
View details for DOI 10.1038/s41591-023-02784-9
View details for PubMedID 38355974
View details for PubMedCentralID 9792204
-
Miscarriage risk assessment: a bioinformatic approach to identifying candidate lethal genes and variants.
Human genetics
2024
Abstract
PURPOSE: Miscarriage, often resulting from a variety of genetic factors, is a common pregnancy outcome. Preconception genetic carrier screening (PGCS) identifies at-risk partners for newborn genetic disorders; however, PGCS panels currently lack miscarriage-related genes. In this study, we evaluated the potential impact of both known and candidate genes on prenatal lethality and the effectiveness of PGCS in diverse populations. METHODS: We analyzed 125,748 human exome sequences and mouse and human gene function databases. Our goals were to identify genes crucial for human fetal survival (lethal genes), to find variants not present in a homozygous state in healthy humans, and to estimate carrier rates of known and candidate lethal genes in various populations and ethnic groups. RESULTS: This study identified 138 genes in which heterozygous lethal variants are present in the general population with a frequency of 0.5% or greater. Screening for these 138 genes could identify 4.6% (in the Finnish population) to 39.8% (in the East Asian population) of couples at risk of miscarriage. This explains the cause of pregnancy loss in approximately 1.1-10% of cases affected by biallelic lethal variants. CONCLUSION: This study has identified a set of genes and variants potentially associated with lethality across different ethnic backgrounds. The variation of these genes across ethnic groups underscores the need for a comprehensive, pan-ethnic PGCS panel that includes genes related to miscarriage.
View details for DOI 10.1007/s00439-023-02637-y
View details for PubMedID 38302665
-
Untargeted metabolomic profiling in children identifies novel pathways in asthma and atopy.
The Journal of allergy and clinical immunology
2024; 153 (2): 418-434
Abstract
Asthma and other atopic disorders can present with varying clinical phenotypes marked by differential metabolomic manifestations and enriched biological pathways. We sought to identify these unique metabolomic profiles in atopy and asthma. We analyzed baseline nonfasted plasma samples from a large multisite pediatric population of 470 children aged <13 years from 3 different sites in the United States and France. Atopy positivity (At+) was defined as skin prick test result of ≥3 mm and/or specific IgE ≥ 0.35 IU/mL and/or total IgE ≥ 173 IU/mL. Asthma positivity (As+) was based on physician diagnosis. The cohort was divided into 4 groups of varying combinations of asthma and atopy, and 6 pairwise analyses were conducted to best assess the differential metabolomic profiles between groups. Two hundred ten children were classified as At-As-, 42 as At+As-, 74 as At-As+, and 144 as At+As+. Untargeted global metabolomic profiles were generated through ultra-high-performance liquid chromatography-tandem mass spectroscopy. We applied 2 independent machine learning classifiers and short-listed 362 metabolites as discriminant features. Our analysis showed the most diverse metabolomic profile in the At+As+/At-As- comparison, followed by the At-As+/At-As- comparison, indicating that asthma is the most discriminant condition associated with metabolomic changes. At+As+ metabolomic profiles were characterized by higher levels of bile acids, sphingolipids, and phospholipids, and lower levels of polyamine, tryptophan, and gamma-glutamyl amino acids. The At+As+ phenotype displays a distinct metabolomic profile suggesting underlying mechanisms such as modulation of host-pathogen and gut microbiota interactions, epigenetic changes in T-cell differentiation, and lower antioxidant properties of the airway epithelium.
View details for DOI 10.1016/j.jaci.2023.09.040
View details for PubMedID 38344970
-
Correction: Digital health application integrating wearable data and behavioral patterns improves metabolic health.
NPJ digital medicine
2024; 7 (1): 9
View details for DOI 10.1038/s41746-024-00996-y
View details for PubMedID 38216626
-
Semi-supervised Cooperative Learning for Multiomics Data Fusion
SPRINGER INTERNATIONAL PUBLISHING AG. 2024: 54-63
View details for DOI 10.1007/978-3-031-47679-2_5
View details for Web of Science ID 001148056600005
-
Multi-omics in stress and health research: study designs that will drive the field forward.
Stress (Amsterdam, Netherlands)
2024; 27 (1): 2321610
Abstract
Despite decades of stress research, there still exist substantial gaps in our understanding of how social, environmental, and biological factors interact and combine with developmental stressor exposures, cognitive appraisals of stressors, and psychosocial coping processes to shape individuals' stress reactivity, health, and disease risk. Relatively new biological profiling approaches, called multi-omics, are helping address these issues by enabling researchers to quantify thousands of molecules from a single blood or tissue sample, thus providing a panoramic snapshot of the molecular processes occurring in an organism from a systems perspective. In this review, we summarize two types of research designs for which multi-omics approaches are best suited, and describe how these approaches can help advance our understanding of stress processes and the development, prevention, and treatment of stress-related pathologies. We first discuss incorporating multi-omics approaches into theory-rich, intensive longitudinal study designs to characterize, in high-resolution, the transition to stress-related multisystem dysfunction and disease throughout development. Next, we discuss how multi-omics approaches should be incorporated into intervention research to better understand the transition from stress-related dysfunction back to health, which can help inform novel precision medicine approaches to managing stress and fostering biopsychosocial resilience. Throughout, we provide concrete recommendations for types of studies that will help advance stress research, and translate multi-omics data into better health and health care.
View details for DOI 10.1080/10253890.2024.2321610
View details for PubMedID 38425100
-
Using Ecological Momentary Assessments to Study How Daily Fluctuations in Psychological States Impact Stress, Well-Being, and Health.
Journal of clinical medicine
2023; 13 (1)
Abstract
Despite great interest in how dynamic fluctuations in psychological states such as mood, social safety, energy, present-focused attention, and burnout impact stress, well-being, and health, most studies examining these constructs use retrospective assessments with relatively long time-lags. Here, we discuss how ecological momentary assessments (EMAs) address methodological issues associated with retrospective reports to help reveal dynamic associations between psychological states at small timescales that are often missed in stress and health research. In addition to helping researchers characterize daily and within-day fluctuations and temporal dynamics between different health-relevant processes, EMAs can elucidate mechanisms through which interventions reduce stress and enhance well-being. EMAs can also be used to identify changes that precede critical health events, which can in turn be used to deliver ecological momentary interventions, or just-in-time interventions, to help prevent such events from occurring. To enable this work, we provide examples of scales and single-item questions used in EMA studies, recommend study designs and statistical approaches that capitalize on EMA data, and discuss limitations of EMA methods. In doing so, we aim to demonstrate how, when used carefully, EMA methods are well poised to greatly advance our understanding of how intrapersonal dynamics affect stress levels, well-being, and human health.
View details for DOI 10.3390/jcm13010024
View details for PubMedID 38202031
-
NGLY1 mutations cause protein aggregation in human neurons.
Cell reports
2023; 42 (12): 113466
Abstract
Biallelic mutations in the gene that encodes the enzyme N-glycanase 1 (NGLY1) cause a rare disease with multi-symptomatic features including developmental delay, intellectual disability, neuropathy, and seizures. NGLY1's activity in human neural cells is currently not well understood. To understand how NGLY1 gene loss leads to the specific phenotypes of NGLY1 deficiency, we employed direct conversion of NGLY1 patient-derived induced pluripotent stem cells (iPSCs) to functional cortical neurons. Transcriptomic, proteomic, and functional studies of iPSC-derived neurons lacking NGLY1 function revealed several major cellular processes that were altered, including protein aggregate-clearing functionality, mitochondrial homeostasis, and synaptic dysfunctions. These phenotypes were rescued by introduction of a functional NGLY1 gene and were observed in iPSC-derived mature neurons but not astrocytes. Finally, laser capture microscopy followed by mass spectrometry provided detailed characterization of the composition of protein aggregates specific to NGLY1-deficient neurons. Future studies will harness this knowledge for therapeutic development.
View details for DOI 10.1016/j.celrep.2023.113466
View details for PubMedID 38039131
-
Reduced FOXF1 links unrepaired DNA damage to pulmonary arterial hypertension.
Nature communications
2023; 14 (1): 7578
Abstract
Pulmonary arterial hypertension (PAH) is a progressive disease in which pulmonary arterial (PA) endothelial cell (EC) dysfunction is associated with unrepaired DNA damage. BMPR2 is the most common genetic cause of PAH. We report that human PAEC with reduced BMPR2 have persistent DNA damage in room air after hypoxia (reoxygenation), as do mice with EC-specific deletion of Bmpr2 (EC-Bmpr2-/-) and persistent pulmonary hypertension. Similar findings are observed in PAEC with loss of the DNA damage sensor ATM, and in mice with Atm deleted in EC (EC-Atm-/-). Gene expression analysis of EC-Atm-/- and EC-Bmpr2-/- lung EC reveals reduced Foxf1, a transcription factor with selectivity for lung EC. Reducing FOXF1 in control PAEC induces DNA damage and impaired angiogenesis whereas transfection of FOXF1 in PAH PAEC repairs DNA damage and restores angiogenesis. Lung EC targeted delivery of Foxf1 to reoxygenated EC-Bmpr2-/- mice repairs DNA damage, induces angiogenesis and reverses pulmonary hypertension.
View details for DOI 10.1038/s41467-023-43039-y
View details for PubMedID 37989727
View details for PubMedCentralID 4737700
-
Integrative multi-omic profiling of adult mouse brain endothelial cells and potential implications in Alzheimer's disease.
Cell reports
2023; 42 (11): 113392
Abstract
The blood-brain barrier (BBB) is primarily manifested by a variety of physiological properties of brain endothelial cells (ECs), but the molecular foundation for these properties remains incompletely clear. Here, we generate a comprehensive molecular atlas of adult brain ECs using acutely purified mouse ECs and integrated multi-omics. Using RNA sequencing (RNA-seq) and proteomics, we identify the transcripts and proteins selectively enriched in brain ECs and demonstrate that they are partially correlated. Using single-cell RNA-seq, we dissect the molecular basis of functional heterogeneity of brain ECs. Using integrative epigenomics and transcriptomics, we determine that TCF/LEF, SOX, and ETS families are top-ranked transcription factors regulating the BBB. We then validate the identified brain-EC-enriched proteins and transcription factors in normal mouse and human brain tissue and assess their expression changes in mice with Alzheimer's disease. Overall, we present a valuable resource with broad implications for regulation of the BBB and treatment of neurological disorders.
View details for DOI 10.1016/j.celrep.2023.113392
View details for PubMedID 37925638
-
Mental Health for All: The Case for Investing in Digital Mental Health to Improve Global Outcomes, Access, and Innovation in Low-Resource Settings.
Journal of clinical medicine
2023; 12 (21)
Abstract
Mental health disorders are an increasing global public health concern that contribute to morbidity, mortality, disability, and healthcare costs across the world. Biomedical and psychological research has come a long way in identifying the importance of mental health and its impact on behavioral risk factors, physiological health, and overall quality of life. Despite this, access to psychological and psychiatric services remains widely unavailable and is a challenge for many healthcare systems, particularly those in developing countries. This review article highlights the strengths and opportunities brought forward by digital mental health in narrowing this divide. Further, it points to the economic and societal benefits of effectively managing mental illness, making a case for investing resources into mental healthcare as a larger priority for large non-governmental organizations and individual nations across the globe.
View details for DOI 10.3390/jcm12216735
View details for PubMedID 37959201
-
Relationship of Heterologous Virus Responses and Outcomes in Hospitalized COVID-19 Patients.
Journal of immunology (Baltimore, Md. : 1950)
2023; 211 (8): 1224-1231
Abstract
The clinical trajectory of COVID-19 may be influenced by previous responses to heterologous viruses. We examined the relationship of Abs against different viruses to clinical trajectory groups from the National Institutes of Health IMPACC (Immunophenotyping Assessment in a COVID-19 Cohort) study of hospitalized COVID-19 patients. Whereas initial Ab titers to SARS-CoV-2 tended to be higher with increasing severity (excluding fatal disease), those to seasonal coronaviruses trended in the opposite direction. Initial Ab titers to influenza and parainfluenza viruses also tended to be lower with increasing severity. However, no significant relationship was observed for Abs to other viruses, including measles, CMV, EBV, and respiratory syncytial virus. We hypothesize that some individuals may produce lower or less durable Ab responses to respiratory viruses generally (reflected in lower baseline titers in our study), and that this may carry over into poorer outcomes for COVID-19 (despite high initial SARS-CoV-2 titers). We further looked at longitudinal changes in Ab responses to heterologous viruses, but found little change during the course of acute COVID-19 infection. We saw significant trends with age for Ab levels to many of these viruses, but no difference in longitudinal SARS-CoV-2 titers for those with high versus low seasonal coronavirus titers. We detected no difference in longitudinal SARS-CoV-2 titers for CMV seropositive versus seronegative patients, although there was an overrepresentation of CMV seropositives among the IMPACC cohort, compared with expected frequencies in the United States population. Our results both reinforce findings from other studies and suggest (to our knowledge) new relationships between the response to SARS-CoV-2 and Abs to heterologous viruses.
View details for DOI 10.4049/jimmunol.2300391
View details for PubMedID 37756530
View details for PubMedCentralID PMC10539027
-
Integrative omic profiling and analyses in two pig heart to human xenotransplants
LIPPINCOTT WILLIAMS & WILKINS. 2023: 137
View details for DOI 10.1097/01.tp.0000994532.65771.9d
View details for Web of Science ID 001089038800202
-
Integration of spatial and single-cell data across modalities with weakly linked features.
Nature biotechnology
2023
Abstract
Although single-cell and spatial sequencing methods enable simultaneous measurement of more than one biological modality, no technology can capture all modalities within the same cell. For current data integration methods, the feasibility of cross-modal integration relies on the existence of highly correlated, a priori 'linked' features. We describe matching X-modality via fuzzy smoothed embedding (MaxFuse), a cross-modal data integration method that, through iterative coembedding, data smoothing and cell matching, uses all information in each modality to obtain high-quality integration even when features are weakly linked. MaxFuse is modality-agnostic and demonstrates high robustness and accuracy in the weak linkage scenario, achieving 20~70% relative improvement over existing methods under key evaluation metrics on benchmarking datasets. A prototypical example of weak linkage is the integration of spatial proteomic data with single-cell sequencing data. On two example analyses of this type, MaxFuse enabled the spatial consolidation of proteomic, transcriptomic and epigenomic information at single-cell resolution on the same tissue section.
View details for DOI 10.1038/s41587-023-01935-0
View details for PubMedID 37679544
View details for PubMedCentralID 5669064
-
Author Correction: Lipid droplets and peroxisomes are co-regulated to drive lifespan extension in response to mono-unsaturated fatty acids.
Nature cell biology
2023
View details for DOI 10.1038/s41556-023-01220-x
View details for PubMedID 37567997
-
Multi-omics approaches in psychoneuroimmunology and health research: Conceptual considerations and methodological recommendations.
Brain, behavior, and immunity
2023
Abstract
The field of psychoneuroimmunology (PNI) has grown substantially in both relevance and prominence over the past 40 years. Notwithstanding its impressive trajectory, a majority of PNI studies are still based on a relatively small number of analytes. To advance this work, we suggest that PNI, and health research in general, can benefit greatly from adopting a multi-omics approach, which involves integrating data across multiple biological levels (e.g., the genome, proteome, transcriptome, metabolome, lipidome, and microbiome/metagenome) to more comprehensively profile biological functions and relate these profiles to clinical and behavioral outcomes. To assist investigators in this endeavor, we provide an overview of multi-omics research, highlight recent landmark multi-omics studies investigating human health and disease risk, and discuss how multi-omics can be applied to better elucidate links between psychological, nervous system, and immune system activity. In doing so, we describe how to design high-quality multi-omics PNI studies, decide which biological samples (e.g., blood, stool, urine, saliva, solid tissue) are most relevant, incorporate behavioral and wearable sensing data into multi-omics research, and understand key data quality, integration, analysis, and interpretation issues. PNI researchers are addressing some of the most interesting and important questions at the intersection of psychology, neuroscience, and immunology. Applying a multi-omics approach to this work will greatly expand the horizon of what is possible in PNI and has the potential to revolutionize our understanding of mind-body medicine.
View details for DOI 10.1016/j.bbi.2023.07.022
View details for PubMedID 37543247
-
Organ Mapping Antibody Panels: a community resource for standardized multiplexed tissue imaging.
Nature methods
2023
Abstract
Multiplexed antibody-based imaging enables the detailed characterization of molecular and cellular organization in tissues. Advances in the field now allow high-parameter data collection (>60 targets); however, considerable expertise and capital are needed to construct the antibody panels employed by these methods. Organ mapping antibody panels are community-validated resources that save time and money, increase reproducibility, accelerate discovery and support the construction of a Human Reference Atlas.
View details for DOI 10.1038/s41592-023-01846-7
View details for PubMedID 37468619
View details for PubMedCentralID 10335836
-
Segmentation of human functional tissue units in support of a Human Reference Atlas.
Communications biology
2023; 6 (1): 717
Abstract
The Human BioMolecular Atlas Program (HuBMAP) aims to compile a Human Reference Atlas (HRA) for the healthy adult body at the cellular level. Functional tissue units (FTUs), relevant for HRA construction, are of pathobiological significance. Manual segmentation of FTUs does not scale; highly accurate and performant, open-source machine-learning algorithms are needed. We designed and hosted a Kaggle competition that focused on development of such algorithms and 1200 teams from 60 countries participated. We present the competition outcomes and an expanded analysis of the winning algorithms on additional kidney and colon tissue data, and conduct a pilot study to understand spatial location and density of FTUs across the kidney. The top algorithm from the competition, Tom, outperforms other algorithms in the expanded study, while using fewer computational resources. Tom was added to the HuBMAP infrastructure to run kidney FTU segmentation at scale, showcasing the value of Kaggle competitions for advancing research.
View details for DOI 10.1038/s42003-023-04848-5
View details for PubMedID 37468557
View details for PubMedCentralID PMC10356924
-
Reverse-ChIP Techniques for Identifying Locus-Specific Proteomes: A Key Tool in Unlocking the Cancer Regulome.
Cells
2023; 12 (14)
Abstract
A phenotypic hallmark of cancer is aberrant transcriptional regulation. Transcriptional regulation is controlled by a complicated array of molecular factors, including the presence of transcription factors, the deposition of histone post-translational modifications, and long-range DNA interactions. Determining the molecular identity and function of these various factors is necessary to understand specific aspects of cancer biology and reveal potential therapeutic targets. Regulation of the genome by specific factors is typically studied using chromatin immunoprecipitation followed by sequencing (ChIP-Seq) that identifies genome-wide binding interactions through the use of factor-specific antibodies. A long-standing goal in many laboratories has been the development of a 'reverse-ChIP' approach to identify unknown binding partners at loci of interest. A variety of strategies have been employed to enable the selective biochemical purification of sequence-defined chromatin regions, including single-copy loci, and the subsequent analytical detection of associated proteins. This review covers mass spectrometry techniques that enable quantitative proteomics before providing a survey of approaches toward the development of strategies for the purification of sequence-specific chromatin as a 'reverse-ChIP' technique. A fully realized reverse-ChIP technique holds great potential for identifying cancer-specific targets and the development of personalized therapeutic regimens.
View details for DOI 10.3390/cells12141860
View details for PubMedID 37508524
-
Author Correction: Clonal haematopoiesis and risk of chronic liver disease.
Nature
2023
View details for DOI 10.1038/s41586-023-06375-z
View details for PubMedID 37400552
-
A Roadmap for the Human Gut Cell Atlas.
Nature reviews. Gastroenterology & hepatology
2023
Abstract
The number of studies investigating the human gastrointestinal tract using various single-cell profiling methods has increased substantially in the past few years. Although this increase provides a unique opportunity for the generation of the first comprehensive Human Gut Cell Atlas (HGCA), there remains a range of major challenges ahead. Above all, the ultimate success will largely depend on a structured and coordinated approach that aligns global efforts undertaken by a large number of research groups. In this Roadmap, we discuss a comprehensive forward-thinking direction for the generation of the HGCA on behalf of the Gut Biological Network of the Human Cell Atlas. Based on the consensus opinion of experts from across the globe, we outline the main requirements for the first complete HGCA by summarizing existing data sets and highlighting anatomical regions and/or tissues with limited coverage. We provide recommendations for future studies and discuss key methodologies and the importance of integrating the healthy gut atlas with related diseases and gut organoids. Importantly, we critically overview the computational tools available and provide recommendations to overcome key challenges.
View details for DOI 10.1038/s41575-023-00784-1
View details for PubMedID 37258747
View details for PubMedCentralID 5541232
-
Multiomic signals associated with maternal epidemiological factors contributing to preterm birth in low- and middle-income countries.
Science advances
2023; 9 (21): eade7692
Abstract
Preterm birth (PTB) is the leading cause of death in children under five, yet comprehensive studies are hindered by its multiple complex etiologies. Epidemiological associations between PTB and maternal characteristics have been previously described. This work used multiomic profiling and multivariate modeling to investigate the biological signatures of these characteristics. Maternal covariates were collected during pregnancy from 13,841 pregnant women across five sites. Plasma samples from 231 participants were analyzed to generate proteomic, metabolomic, and lipidomic datasets. Machine learning models showed robust performance for the prediction of PTB (AUROC = 0.70), time-to-delivery (r = 0.65), maternal age (r = 0.59), gravidity (r = 0.56), and BMI (r = 0.81). Time-to-delivery biological correlates included fetal-associated proteins (e.g., ALPP, AFP, and PGF) and immune proteins (e.g., PD-L1, CCL28, and LIFR). Maternal age negatively correlated with collagen COL9A1, gravidity with endothelial NOS and inflammatory chemokine CXCL13, and BMI with leptin and structural protein FABP4. These results provide an integrated view of epidemiological factors associated with PTB and identify biological signatures of clinical covariates affecting this disease.
View details for DOI 10.1126/sciadv.ade7692
View details for PubMedID 37224249
-
The ENCODE4 long-read RNA-seq collection reveals distinct classes of transcript structure diversity.
bioRxiv : the preprint server for biology
2023
Abstract
The majority of mammalian genes encode multiple transcript isoforms that result from differential promoter use, changes in exonic splicing, and alternative 3' end choice. Detecting and quantifying transcript isoforms across tissues, cell types, and species has been extremely challenging because transcripts are much longer than the short reads normally used for RNA-seq. By contrast, long-read RNA-seq (LR-RNA-seq) gives the complete structure of most transcripts. We sequenced 264 LR-RNA-seq PacBio libraries totaling over 1 billion circular consensus reads (CCS) for 81 unique human and mouse samples. We detect at least one full-length transcript from 87.7% of annotated human protein coding genes and a total of 200,000 full-length transcripts, 40% of which have novel exon junction chains. To capture and compute on the three sources of transcript structure diversity, we introduce a gene and transcript annotation framework that uses triplets representing the transcript start site, exon junction chain, and transcript end site of each transcript. Using triplets in a simplex representation demonstrates how promoter selection, splice pattern, and 3' processing are deployed across human tissues, with nearly half of multi-transcript protein coding genes showing a clear bias toward one of the three diversity mechanisms. Evaluated across samples, the predominantly expressed transcript changes for 74% of protein coding genes. In evolution, the human and mouse transcriptomes are globally similar in types of transcript structure diversity, yet among individual orthologous gene pairs, more than half (57.8%) show substantial differences in mechanism of diversification in matching tissues. This initial large-scale survey of human and mouse long-read transcriptomes provides a foundation for further analyses of alternative transcript usage, and is complemented by short-read and microRNA data on the same samples and by epigenome data elsewhere in the ENCODE4 collection.
View details for DOI 10.1101/2023.05.15.540865
View details for PubMedID 37292896
View details for PubMedCentralID PMC10245583
-
Organ-specific aging and the risk of chronic diseases.
Nature medicine
2023
View details for DOI 10.1038/s41591-023-02338-z
View details for PubMedID 37161069
-
Gut Microbiome-Based Management of Patients With Heart Failure: JACC Review Topic of the Week.
Journal of the American College of Cardiology
2023; 81 (17): 1729-1739
Abstract
Despite therapeutic advances, chronic heart failure (HF) is still associated with significant risk of morbidity and mortality. The course of disease and responses to therapies vary widely among individuals with HF, highlighting the need for precision medicine approaches. Gut microbiome stands to be an important aspect of precision medicine in HF. Exploratory clinical studies have revealed shared patterns of gut microbiome dysregulation in this disease, with mechanistic animal studies providing evidence for active involvement of the gut microbiome in development and pathophysiology of HF. Deeper insights into gut microbiome-host interactions in patients with HF promise to deliver novel disease biomarkers, preventative and therapeutic targets, and improve disease risk stratification. This knowledge may enable a paradigm shift in how we care for patients with HF, and pave the path toward improved clinical outcomes through personalized HF care.
View details for DOI 10.1016/j.jacc.2023.02.045
View details for PubMedID 37100490
-
Association between the dynamics of the gut microbiota and responsiveness to mental health therapy
AMER ASSOC IMMUNOLOGISTS. 2023
View details for Web of Science ID 001106506503167
-
Lipid droplets and peroxisomes are co-regulated to drive lifespan extension in response to mono-unsaturated fatty acids.
Nature cell biology
2023
Abstract
Dietary mono-unsaturated fatty acids (MUFAs) are linked to longevity in several species. But the mechanisms by which MUFAs extend lifespan remain unclear. Here we show that an organelle network involving lipid droplets and peroxisomes is critical for MUFA-induced longevity in Caenorhabditis elegans. MUFAs upregulate the number of lipid droplets in fat storage tissues. Increased lipid droplet number is necessary for MUFA-induced longevity and predicts remaining lifespan. Lipidomics datasets reveal that MUFAs also modify the ratio of membrane lipids and ether lipids, a signature associated with decreased lipid oxidation. In agreement with this, MUFAs decrease lipid oxidation in middle-aged individuals. Intriguingly, MUFAs upregulate not only lipid droplet number but also peroxisome number. A targeted screen identifies genes involved in the co-regulation of lipid droplets and peroxisomes, and reveals that induction of both organelles is optimal for longevity. Our study uncovers an organelle network involved in lipid homeostasis and lifespan regulation, opening new avenues for interventions to delay aging.
View details for DOI 10.1038/s41556-023-01136-6
View details for PubMedID 37127715
-
Organism-wide, cell-type-specific secretome mapping of exercise training in mice.
Cell metabolism
2023
Abstract
There is a significant interest in identifying blood-borne factors that mediate tissue crosstalk and function as molecular effectors of physical activity. Although past studies have focused on an individual molecule or cell type, the organism-wide secretome response to physical activity has not been evaluated. Here, we use a cell-type-specific proteomic approach to generate a 21-cell-type, 10-tissue map of exercise training-regulated secretomes in mice. Our dataset identifies >200 exercise training-regulated cell-type-secreted protein pairs, the majority of which have not been previously reported. Pdgfra-cre-labeled secretomes were the most responsive to exercise training. Finally, we show anti-obesity, anti-diabetic, and exercise performance-enhancing activities for proteoforms of intracellular carboxylesterases whose secretion from the liver is induced by exercise training.
View details for DOI 10.1016/j.cmet.2023.04.011
View details for PubMedID 37141889
-
Leveraging Physiology and Artificial Intelligence to Deliver Advancements in Healthcare.
Physiological reviews
2023
Abstract
Artificial Intelligence (AI) in healthcare has generated remarkable innovation and progress in the last decade. Significant advancements can be attributed to the utilization of AI to transform physiology data to advance healthcare. In this review, we will explore how past work has shaped the field and defined future challenges and directions. In particular, we focus on three areas of development. First, we give an overview of AI, with special attention to the most relevant AI models. We then detail how physiology data has been harnessed by AI to advance the main areas of healthcare such as automating existing healthcare tasks, increasing access to care, and augmenting healthcare capabilities. Finally, we discuss emerging concerns surrounding the use of individual physiology data and detail an increasingly important consideration for the field, namely the challenges of deploying AI models to achieve meaningful clinical impact.
View details for DOI 10.1152/physrev.00033.2022
View details for PubMedID 37104717
-
Multi-omics profiling for health.
Molecular & cellular proteomics : MCP
2023: 100561
Abstract
The world has witnessed a steady rise in both non-infectious and infectious chronic diseases, prompting a cross-disciplinary approach to understand and treat disease. Current medical care focuses on treating people after they become patients rather than on preventing illness, leading to high costs in treating chronic and late-stage diseases. Additionally, a 'one-size-fits-all' approach to healthcare does not take into account individual differences in genetics, environment, or lifestyle factors, decreasing the number of people benefiting from interventions. Rapid advances in omics technologies and progress in computational capabilities have led to the development of multi-omics deep phenotyping, which profiles the interaction of multiple levels of biology over time and empowers precision health approaches. This review highlights current and emerging multi-omics modalities for precision health and discusses applications in the following areas: genetic variation, cardio-metabolic diseases, cancer, infectious diseases, organ transplantation, pregnancy, and longevity/aging. We will briefly discuss the potential of multi-omics approaches in disentangling host-microbe and host-environmental interactions. We will touch on emerging areas of electronic health record and clinical imaging integration with multi-omics for precision health. Finally, we will briefly discuss the challenges in clinical implementation of multi-omics and their future prospects.
View details for DOI 10.1016/j.mcpro.2023.100561
View details for PubMedID 37119971
-
The ENCODE Imputation Challenge: a critical assessment of methods for cross-cell type imputation of epigenomic profiles.
Genome biology
2023; 24 (1): 79
Abstract
A promising alternative to comprehensively performing genomics experiments is to, instead, perform a subset of experiments and use computational methods to impute the remainder. However, identifying the best imputation methods and what measures meaningfully evaluate performance are open questions. We address these questions by comprehensively analyzing 23 methods from the ENCODE Imputation Challenge. We find that imputation evaluations are challenging and confounded by distributional shifts from differences in data collection and processing over time, the amount of available data, and redundancy among performance measures. Our analyses suggest simple steps for overcoming these issues and promising directions for more robust research.
View details for DOI 10.1186/s13059-023-02915-y
View details for PubMedID 37072822
View details for PubMedCentralID PMC10111747
-
Acetyl-Click Screening Platform Identifies Small-Molecule Inhibitors of Histone Acetyltransferase 1 (HAT1).
Journal of medicinal chemistry
2023
Abstract
HAT1 is a central regulator of chromatin synthesis that acetylates nascent histone H4. To ascertain whether targeting HAT1 is a viable anticancer treatment strategy, we sought to identify small-molecule inhibitors of HAT1 by developing a high-throughput HAT1 acetyl-click assay. Screening of small-molecule libraries led to the discovery of multiple riboflavin analogs that inhibited HAT1 enzymatic activity. Compounds were refined by synthesis and testing of over 70 analogs, which yielded structure-activity relationships. The isoalloxazine core was required for enzymatic inhibition, whereas modifications of the ribityl side chain improved enzymatic potency and cellular growth suppression. One compound (JG-2016 [24a]) showed relative specificity toward HAT1 compared to other acetyltransferases, suppressed the growth of human cancer cell lines, impaired enzymatic activity in cellulo, and interfered with tumor growth. This is the first report of a small-molecule inhibitor of the HAT1 enzyme complex and represents a step toward targeting this pathway for cancer therapy.
View details for DOI 10.1021/acs.jmedchem.3c00039
View details for PubMedID 37027002
-
Withdrawal of 'Precision Neoantigen Discovery Using Large-scale Immunopeptidomes and Composite Modeling of MHC Peptide Presentation'.
Molecular & cellular proteomics : MCP
2023; 22 (4): 100511
View details for DOI 10.1016/j.mcpro.2023.100511
View details for PubMedID 37019059
-
Leveraging electronic health records to identify risk factors for recurrent pregnancy loss across two medical centers: a case-control study.
Research square
2023
Abstract
Recurrent pregnancy loss (RPL), defined as 2 or more pregnancy losses, affects 5-6% of ever-pregnant individuals. Approximately half of these cases have no identifiable explanation. To generate hypotheses about RPL etiologies, we implemented a case-control study comparing the history of over 1,600 diagnoses between RPL and live-birth patients, leveraging the University of California San Francisco (UCSF) and Stanford University electronic health record databases. In total, our study included 8,496 RPL (UCSF: 3,840, Stanford: 4,656) and 53,278 Control (UCSF: 17,259, Stanford: 36,019) patients. Menstrual abnormalities and infertility-associated diagnoses were significantly positively associated with RPL in both medical centers. Age-stratified analysis revealed that the majority of RPL-associated diagnoses had higher odds ratios for patients <35 compared with 35+ patients. While Stanford results were sensitive to control for healthcare utilization, UCSF results were stable across analyses with and without utilization. Intersecting significant results between medical centers was an effective filter to identify associations that are robust across center-specific utilization patterns.
View details for DOI 10.21203/rs.3.rs-2631220/v1
View details for PubMedID 36993325
View details for PubMedCentralID PMC10055527
-
The EN-TEx resource of multi-tissue personal epigenomes & variant-impact models.
Cell
2023; 186 (7): 1493-1511.e40
Abstract
Understanding how genetic variants impact molecular phenotypes is a key goal of functional genomics, currently hindered by reliance on a single haploid reference genome. Here, we present the EN-TEx resource of 1,635 open-access datasets from four donors (∼30 tissues × ∼15 assays). The datasets are mapped to matched, diploid genomes with long-read phasing and structural variants, instantiating a catalog of >1 million allele-specific loci. These loci exhibit coordinated activity along haplotypes and are less conserved than corresponding, non-allele-specific ones. Surprisingly, a deep-learning transformer model can predict the allele-specific activity based only on local nucleotide-sequence context, highlighting the importance of transcription-factor-binding motifs particularly sensitive to variants. Furthermore, combining EN-TEx with existing genome annotations reveals strong associations between allele-specific and GWAS loci. It also enables models for transferring known eQTLs to difficult-to-profile tissues (e.g., from skin to heart). Overall, EN-TEx provides rich data and generalizable models for more accurate personal functional genomics.
View details for DOI 10.1016/j.cell.2023.02.018
View details for PubMedID 37001506
-
Advances and potential of omics studies for understanding the development of food allergy.
Frontiers in allergy
2023; 4: 1149008
Abstract
The prevalence of food allergy continues to rise globally, carrying with it substantial safety, economic, and emotional burdens. Although preventative strategies do exist, the heterogeneity of allergy trajectories and clinical phenotypes has made it difficult to identify patients who would benefit from these strategies. Therefore, further studies investigating the molecular mechanisms that differentiate these trajectories are needed. Large-scale omics studies have identified key insights into the molecular mechanisms for many different diseases; however, the application of these technologies to uncover the drivers of food allergy development is in its infancy. Here we review the use of omics approaches in food allergy and highlight key gaps in knowledge for applying these technologies for the characterization of food allergy development.
View details for DOI 10.3389/falgy.2023.1149008
View details for PubMedID 37034151
View details for PubMedCentralID PMC10080041
-
Proinflammatory polarization of monocytes by particulate air pollutants is mediated by induction of trained immunity in pediatric asthma.
Allergy
2023
Abstract
The impact of exposure to air pollutants, such as fine particulate matter (PM), on the immune system and its consequences on pediatric asthma are not well understood. We investigated whether ambient levels of fine PM with aerodynamic diameter ≤2.5 microns (PM2.5) are associated with alterations in circulating monocytes in children with or without asthma. Monocyte phenotyping was performed by cytometry time-of-flight (CyTOF). Cytokines were measured using cytometric bead array and Luminex assay. ChIP-Seq was utilized to address histone modifications in monocytes. Increased exposure to ambient PM2.5 was linked to specific monocyte subtypes, particularly in children with asthma. Mechanistically, we hypothesized that innate trained immunity is evoked by a primary exposure to fine PM and accounts for an enhanced inflammatory response after secondary stimulation in vitro. We determined that the trained immunity was induced in circulating monocytes by fine particulate pollutants, and it was characterized by the upregulation of proinflammatory mediators, such as TNF, IL-6, and IL-8, upon stimulation with house dust mite or lipopolysaccharide. This phenotype was epigenetically controlled by enhanced H3K27ac marks in circulating monocytes. The specific alterations of monocytes after ambient pollution exposure suggest a possible prognostic immune signature for pediatric asthma, and pollution-induced trained immunity may provide a potential therapeutic target for asthmatic children living in areas with increased air pollution.
View details for DOI 10.1111/all.15692
View details for PubMedID 36929161
-
Simultaneous profiling of host expression and microbial abundance by spatial metatranscriptome sequencing
GENOME RESEARCH
2023; 33 (3): 401-411
View details for DOI 10.1101/gr.277178.122
View details for Web of Science ID 000963913000007
-
Biomonitoring and precision health in deep space supported by artificial intelligence
NATURE MACHINE INTELLIGENCE
2023; 5 (3): 196-207
View details for DOI 10.1038/s42256-023-00617-5
View details for Web of Science ID 000974378500002
-
Biological research and self-driving labs in deep space supported by artificial intelligence
NATURE MACHINE INTELLIGENCE
2023; 5 (3): 208-219
View details for DOI 10.1038/s42256-023-00618-4
View details for Web of Science ID 000974378500003
-
Sensor-enabled Multilayer Artificial Intelligence Analysis for Predictive Wound Healing and Real-Time Patient Monitoring
WILEY. 2023: 268-269
View details for Web of Science ID 001005693800060
-
Simultaneous profiling of host expression and microbial abundance by spatial metatranscriptome sequencing.
Genome research
2023; 33 (3): 401-411
Abstract
We developed an analysis pipeline that can extract microbial sequences from spatial transcriptomic (ST) data and assign taxonomic labels, generating a spatial microbial abundance matrix in addition to the default host expression matrix, enabling simultaneous analysis of host expression and microbial distribution. We called the pipeline spatial metatranscriptome (SMT) and applied it on both human and murine intestinal sections and validated the spatial microbial abundance information with alternative assays. Biological insights were gained from these novel data that showed host-microbe interaction at various spatial scales. Finally, we tested experimental modification that can increase microbial capture while preserving host spatial expression quality and, by use of positive controls, quantitatively showed the capture efficiency and recall of our methods. This proof-of-concept work shows the feasibility of SMT analysis and paves the way for further experimental optimization and application.
View details for DOI 10.1101/gr.277178.122
View details for PubMedID 37310927
-
Precision neoantigen discovery using large-scale immunopeptidomes and composite modeling of MHC peptide presentation.
Molecular & cellular proteomics : MCP
2023: 100506
Abstract
Major histocompatibility complex (MHC)-bound peptides that originate from tumor-specific genetic alterations, known as neoantigens, are an important class of anti-cancer therapeutic targets. Accurately predicting peptide presentation by MHC complexes is a key aspect of discovering therapeutically relevant neoantigens. Technological improvements in mass-spectrometry-based immunopeptidomics and advanced modeling techniques have vastly improved MHC presentation prediction over the past two decades. However, improvement in the sensitivity and specificity of prediction algorithms is needed for clinical applications such as the development of personalized cancer vaccines, the discovery of biomarkers for response to checkpoint blockade and the quantification of autoimmune risk in gene therapies. Toward this end, we generated allele-specific immunopeptidomics data using 25 mono-allelic cell lines and created Systematic HLA Epitope Ranking Pan Algorithm (SHERPA™), a pan-allelic MHC-peptide algorithm for predicting MHC-peptide binding and presentation. In contrast to previously published large-scale mono-allelic data, we used an HLA-null K562 parental cell line and a stable transfection of HLA alleles to better emulate native presentation. Our dataset includes five previously unprofiled alleles that expand MHC binding pocket diversity in the training data and extend allelic coverage in underprofiled populations. To improve generalizability, SHERPA systematically integrates 128 mono-allelic and 384 multi-allelic samples with publicly available immunoproteomics data and binding assay data. Using this dataset, we developed two features that empirically estimate the propensities of genes and specific regions within gene bodies to engender immunopeptides to represent antigen processing. 
Using a composite model constructed with gradient boosting decision trees, multi-allelic deconvolution and 2.15 million peptides encompassing 167 alleles, we achieved a 1.44 fold improvement of positive predictive value compared to existing tools when evaluated on independent mono-allelic datasets and a 1.17 fold improvement when evaluating on tumor samples. With a high degree of accuracy, SHERPA has the potential to enable precision neoantigen discovery for future clinical applications.
View details for DOI 10.1016/j.mcpro.2023.100506
View details for PubMedID 36796642
-
Challenging obesity and sex based differences in resting energy expenditure using allometric modeling, a sub-study of the DIETFITS clinical trial.
Clinical nutrition ESPEN
2023; 53: 43-52
Abstract
BACKGROUND & AIMS: Resting energy expenditure (REE) is a major component of energy balance. While REE is usually indexed to total body weight (BW), this may introduce biases when assessing REE in obesity or during weight loss intervention. The main objective of the study was to quantify the bias introduced by ratiometric scaling of REE using BW both at baseline and following weight loss intervention. DESIGN: Participants in the DIETFITS Study (Diet Intervention Examining The Factors Interacting with Treatment Success) who completed indirect calorimetry and dual-energy X-ray absorptiometry (DXA) were included in the study. Data were available in 438 participants at baseline, 340 at 6 months and 323 at 12 months. We used multiplicative allometric modeling based on lean body mass (LBM) and fat mass (FM) to derive body size independent scaling of REE. Longitudinal changes in indexed REE were then assessed following weight loss intervention. RESULTS: A multiplicative model including LBM, FM, age, Black race and the double product (DP) of systolic blood pressure and heart rate explained 79% of variance in REE. REE indexed to [LBM^0.66 × FM^0.066] was body size and sex independent (p=0.91 and p=0.73, respectively) in contrast to BW based indexing which showed a significant inverse relationship to BW (r=-0.47 for female and r=-0.44 for male, both p<0.001). When indexed to BW, significant baseline differences in REE were observed between male and female (p<0.001) and between individuals who are overweight and obese (p<0.001), while no significant differences were observed when indexed as REE/[LBM^0.66 × FM^0.066] (p>0.05). Percentage predicted REE adjusted for LBM, FM and DP remained stable following weight loss intervention (p=0.614). CONCLUSION: Allometric scaling of REE based on LBM and FM removes body composition-associated biases and should be considered in obesity and weight-based intervention studies.
View details for DOI 10.1016/j.clnesp.2022.11.015
View details for PubMedID 36657929
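As a rough illustration of the two indexing schemes contrasted in the abstract above, the following Python sketch computes REE per kilogram of body weight (ratiometric) and REE divided by LBM^0.66 × FM^0.066 (allometric). Only the two exponents are taken from the abstract; the function names, the assumed units (kcal/day and kg), and the example values are hypothetical and are not part of the study.

```python
def ree_per_body_weight(ree_kcal_day, body_weight_kg):
    """Ratiometric indexing: REE divided by total body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return ree_kcal_day / body_weight_kg


def ree_allometric_index(ree_kcal_day, lbm_kg, fm_kg, lbm_exp=0.66, fm_exp=0.066):
    """Allometric indexing: REE / (LBM^lbm_exp * FM^fm_exp).

    The default exponents are the ones quoted in the abstract; everything
    else about this function is an illustrative assumption.
    """
    if lbm_kg <= 0 or fm_kg <= 0:
        raise ValueError("lean body mass and fat mass must be positive")
    return ree_kcal_day / (lbm_kg ** lbm_exp * fm_kg ** fm_exp)


if __name__ == "__main__":
    # Hypothetical participant, not data from the trial.
    ree, weight, lbm, fm = 1600.0, 80.0, 50.0, 25.0
    print("REE / BW:", ree_per_body_weight(ree, weight))
    print("REE / (LBM^0.66 * FM^0.066):", ree_allometric_index(ree, lbm, fm))
```

The exponents are kept as keyword arguments so that a different fitted model could be substituted without changing the calling code.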
-
Stem cell plasticity, acetylation of H3K14, and de novo gene activation rely on KAT7.
Cell reports
2023; 42 (1): 111980
Abstract
In the conventional model of transcriptional activation, transcription factors bind to response elements and recruit co-factors, including histone acetyltransferases. Contrary to this model, we show that the histone acetyltransferase KAT7 (HBO1/MYST2) is required genome wide for histone H3 lysine 14 acetylation (H3K14ac). Examining neural stem cells, we find that KAT7 and H3K14ac are present not only at transcribed genes but also at inactive genes, intergenic regions, and in heterochromatin. KAT7 and H3K14ac were not required for the continued transcription of genes that were actively transcribed at the time of loss of KAT7 but indispensable for the activation of repressed genes. The absence of KAT7 abrogates neural stem cell plasticity, diverse differentiation pathways, and cerebral cortex development. Re-expression of KAT7 restored stem cell developmental potential. Overexpression of KAT7 enhanced neuron and oligodendrocyte differentiation. Our data suggest that KAT7 prepares chromatin for transcriptional activation and is a prerequisite for gene activation.
View details for DOI 10.1016/j.celrep.2022.111980
View details for PubMedID 36641753
-
Multiomic identification of key transcriptional regulatory programs during endurance exercise training.
bioRxiv : the preprint server for biology
2023
Abstract
Transcription factors (TFs) play a key role in regulating gene expression and responses to stimuli. We conducted an integrated analysis of chromatin accessibility and RNA expression across various rat tissues following endurance exercise training (EET) to map epigenomic changes to transcriptional changes and determine key TFs involved. We uncovered tissue-specific changes across both omic layers, including highly correlated differentially accessible regions (DARs) and differentially expressed genes (DEGs). We identified open chromatin regions associated with DEGs (DEGaPs) and found tissue-specific and genomic feature-specific TF motif enrichment patterns among both DARs and DEGaPs. Accessible promoters of up- vs. down-regulated DEGs per tissue showed distinct TF enrichment patterns. Further, some EET-induced TFs in skeletal muscle were either validated at the proteomic level (MEF2C and NUR77) or correlated with exercise-related phenotypic changes. We provide an in-depth analysis of the epigenetic and trans-factor-dependent processes governing gene expression during EET.
View details for DOI 10.1101/2023.01.10.523450
View details for PubMedID 36711841
-
Low expression of EXOSC2 protects against clinical COVID-19 and impedes SARS-CoV-2 replication.
Life science alliance
2023; 6 (1)
Abstract
New therapeutic targets are a valuable resource for treatment of SARS-CoV-2 viral infection. Genome-wide association studies have identified risk loci associated with COVID-19, but many loci are associated with comorbidities and are not specific to host-virus interactions. Here, we identify and experimentally validate a link between reduced expression of EXOSC2 and reduced SARS-CoV-2 replication. EXOSC2 was one of the 332 host proteins examined, all of which interact directly with SARS-CoV-2 proteins. Aggregating COVID-19 genome-wide association studies statistics for gene-specific eQTLs revealed an association between increased expression of EXOSC2 and higher risk of clinical COVID-19. EXOSC2 interacts with Nsp8 which forms part of the viral RNA polymerase. EXOSC2 is a component of the RNA exosome, and here, LC-MS/MS analysis of protein pulldowns demonstrated interaction between the SARS-CoV-2 RNA polymerase and most of the human RNA exosome components. CRISPR/Cas9 introduction of nonsense mutations within EXOSC2 in Calu-3 cells reduced EXOSC2 protein expression and impeded SARS-CoV-2 replication without impacting cellular viability. Targeted depletion of EXOSC2 may be a safe and effective strategy to protect against clinical COVID-19.
View details for DOI 10.26508/lsa.202201449
View details for PubMedID 36241425
-
Harnessing human genetics and stem cells for precision cardiovascular medicine
Cell Genomics
2023
View details for DOI 10.1016/j.xgen.2023.100445
-
Leveraging Mobile Technology for Public Health Promotion: A Multidisciplinary Perspective.
Annual review of public health
2022
Abstract
Health behaviors are inextricably linked to health and well-being, yet issues such as physical inactivity and insufficient sleep remain significant global public health problems. Mobile technology, and the unprecedented scope and quantity of data it generates, has a promising but largely untapped potential to promote health behaviors at the individual and population levels. This perspective article provides multidisciplinary recommendations on the design and use of mobile technology, and the concomitant wealth of data, to promote behaviors that support overall health. Using physical activity as an exemplar health behavior, we review emerging strategies for health behavior change interventions. We describe progress on personalizing interventions to an individual and their social, cultural, and built environments, as well as on evaluating relationships between mobile technology data and health to establish evidence-based guidelines. In reviewing these strategies and highlighting directions for future research, we advance the use of theory-based, personalized, and human-centered approaches in promoting health behaviors. Expected final online publication date for the Annual Review of Public Health, Volume 44 is April 2023. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
View details for DOI 10.1146/annurev-publhealth-060220-041643
View details for PubMedID 36542772
-
Gut microbiota analyses of Saudi populations for type 2 diabetes-related phenotypes reveals significant association.
BMC microbiology
2022; 22 (1): 301
Abstract
Large-scale gut microbiome sequencing has revealed key links between microbiome dysfunction and metabolic diseases such as type 2 diabetes (T2D). To date, these efforts have largely focused on Western populations, with few studies assessing T2D microbiota associations in Middle Eastern communities where T2D prevalence is now over 20%. We analyzed the composition of stool 16S rRNA from 461 T2D and 119 non-T2D participants from the Eastern Province of Saudi Arabia. We quantified the abundance of microbial communities to examine any significant differences between subpopulations of samples based on diabetes status and glucose level. In this study we performed the largest microbiome study ever conducted in Saudi Arabia, as well as the first-ever characterization of gut microbiota T2D versus non-T2D in this population. We observed overall positive enrichment within diabetics compared to healthy individuals, and amongst diabetic participants, those with high glucose levels exhibited slightly more positive enrichment compared to those at lower risk of fasting hyperglycemia. In particular, the genus Firmicutes was upregulated in diabetic individuals compared to non-diabetic individuals, and T2D was associated with an elevated Firmicutes/Bacteroidetes ratio, consistent with previous findings. Based on diabetes status and glucose levels of Saudi participants, relatively stable differences in stool composition were perceived by differential abundance and alpha diversity measures. However, community level differences are evident in the Saudi population between T2D and non-T2D individuals, and diversity patterns appear to vary from well-characterized microbiota from Western cohorts. Comparing overlapping and varying patterns in gut microbiota with other studies is critical to assessing novel treatment options in light of a rapidly growing T2D health epidemic in the region. 
As a rapidly emerging chronic condition in Saudi Arabia and the Middle East, T2D burdens have grown more quickly and affect larger proportions of the population than any other global region, making a regional reference T2D-microbiome dataset critical to understanding the nuances of disease development on a global scale.
View details for DOI 10.1186/s12866-022-02714-8
View details for PubMedID 36510121
-
Early prediction and longitudinal modeling of preeclampsia from multiomics.
Patterns (New York, N.Y.)
2022; 3 (12): 100655
Abstract
Preeclampsia is a complex disease of pregnancy whose physiopathology remains unclear. We developed machine-learning models for early prediction of preeclampsia (first 16 weeks of pregnancy) and over gestation by analyzing six omics datasets from a longitudinal cohort of pregnant women. For early pregnancy, a prediction model using nine urine metabolites had the highest accuracy and was validated on an independent cohort (area under the receiver-operating characteristic curve [AUC] = 0.88, 95% confidence interval [CI] [0.76, 0.99] cross-validated; AUC = 0.83, 95% CI [0.62, 1] validated). Univariate analysis demonstrated statistical significance of identified metabolites. An integrated multiomics model further improved accuracy (AUC = 0.94). Several biological pathways were identified including tryptophan, caffeine, and arachidonic acid metabolisms. Integration with immune cytometry data suggested novel associations between immune and proteomic dynamics. While further validation in a larger population is necessary, these encouraging results can serve as a basis for a simple, early diagnostic test for preeclampsia.
View details for DOI 10.1016/j.patter.2022.100655
View details for PubMedID 36569558
-
Wireless, closed-loop, smart bandage with integrated sensors and stimulators for advanced wound care and accelerated healing.
Nature biotechnology
2022
Abstract
'Smart' bandages based on multimodal wearable devices could enable real-time physiological monitoring and active intervention to promote healing of chronic wounds. However, there has been limited development in incorporation of both sensors and stimulators for the current smart bandage technologies. Additionally, while adhesive electrodes are essential for robust signal transduction, detachment of existing adhesive dressings can lead to secondary damage to delicate wound tissues without switchable adhesion. Here we overcome these issues by developing a flexible bioelectronic system consisting of wirelessly powered, closed-loop sensing and stimulation circuits with skin-interfacing hydrogel electrodes capable of on-demand adhesion and detachment. In mice, we demonstrate that our wound care system can continuously monitor skin impedance and temperature and deliver electrical stimulation in response to the wound environment. Across preclinical wound models, the treatment group healed ~25% more rapidly and with ~50% enhancement in dermal remodeling compared with control. Further, we observed activation of proregenerative genes in monocyte and macrophage cell populations, which may enhance tissue regeneration, neovascularization and dermal recovery.
View details for DOI 10.1038/s41587-022-01528-3
View details for PubMedID 36424488
View details for PubMedCentralID 5350204
-
Author Correction: Prediction of gestational age using urinary metabolites in term and preterm pregnancies.
Scientific reports
2022; 12 (1): 19753
View details for DOI 10.1038/s41598-022-23715-7
View details for PubMedID 36396676
-
Annotation of spatially resolved single-cell data with STELLAR.
Nature methods
2022
Abstract
Accurate cell-type annotation from spatially resolved single cells is crucial to understand functional spatial biology that is the basis of tissue organization. However, current computational methods for annotating spatially resolved single-cell data are typically based on techniques established for dissociated single-cell technologies and thus do not take spatial organization into account. Here we present STELLAR, a geometric deep learning method for cell-type discovery and identification in spatially resolved single-cell datasets. STELLAR automatically assigns cells to cell types present in the annotated reference dataset and discovers novel cell types and cell states. STELLAR transfers annotations across different dissection regions, different tissues and different donors, and learns cell representations that capture higher-order tissue structures. We successfully applied STELLAR to CODEX multiplexed fluorescent microscopy data and multiplexed RNA imaging datasets. Within the Human BioMolecular Atlas Program, STELLAR has annotated 2.6 million spatially resolved single cells with dramatic time savings.
View details for DOI 10.1038/s41592-022-01651-8
View details for PubMedID 36280720
-
The metabolomics of human aging: Advances, challenges, and opportunities.
Science advances
2022; 8 (42): eadd6155
Abstract
As the global population becomes older, understanding the impact of aging on health and disease becomes paramount. Recent advancements in multiomic technology have allowed for the high-throughput molecular characterization of aging at the population level. Metabolomics studies that analyze the small molecules in the body can provide biological information across a diversity of aging processes. Here, we review the growing body of population-scale metabolomics research on aging in humans, identifying the major trends in the field, implicated biological pathways, and how these pathways relate to health and aging. We conclude by assessing the main challenges in the research to date, opportunities for advancing the field, and the outlook for precision health applications.
View details for DOI 10.1126/sciadv.add6155
View details for PubMedID 36260671
-
LEVERAGING ELECTRONIC HEALTH RECORD DATA TO IDENTIFY PHENOTYPES ASSOCIATED WITH PREGNANCY LOSS MAY LEAD TO IMPROVED UNDERSTANDING OF RECURRENT PREGNANCY LOSS
ELSEVIER SCIENCE INC. 2022: E107
View details for Web of Science ID 000891804600262
-
Precision Medicine Approaches to Mental Healthcare.
Physiology (Bethesda, Md.)
2022
Abstract
By developing a more comprehensive understanding of the physiological underpinnings of mental illness, precision medicine has the potential to revolutionize psychiatric care. With recent breakthroughs in next-generation multi-omics technologies and data analytics, it is becoming more feasible to leverage multimodal biomarkers, from genetic variants to neuroimaging biomarkers, to objectify diagnostics and treatment decisions in psychiatry and improve patient outcomes. Ongoing work in precision psychiatry will parallel progress in precision oncology and cardiology to develop an expanded suite of blood- and neuroimaging-based diagnostic tests, empower monitoring of treatment efficacy over time, and reduce patient exposure to ineffective treatments. The emerging model of precision psychiatry has the potential to mitigate some of psychiatry's most pressing issues, including improving disease classification, lengthy treatment duration, and suboptimal treatment outcomes. This narrative-style review summarizes some of the emerging breakthroughs and recurring challenges in the application of precision medicine approaches to mental healthcare.
View details for DOI 10.1152/physiol.00013.2022
View details for PubMedID 36099270
-
A method for intelligent allocation of diagnostic testing by leveraging data from commercial wearable devices: a case study on COVID-19.
NPJ digital medicine
2022; 5 (1): 130
Abstract
Mass surveillance testing can help control outbreaks of infectious diseases such as COVID-19. However, diagnostic test shortages are prevalent globally and continue to occur in the US with the onset of new COVID-19 variants and emerging diseases like monkeypox, demonstrating an unprecedented need for improving our current methods for mass surveillance testing. By targeting surveillance testing toward individuals who are most likely to be infected and, thus, increasing the testing positivity rate (i.e., percent positive in the surveillance group), fewer tests are needed to capture the same number of positive cases. Here, we developed an Intelligent Testing Allocation (ITA) method by leveraging data from the CovIdentify study (6765 participants) and the MyPHD study (8580 participants), including smartwatch data from 1265 individuals of whom 126 tested positive for COVID-19. Our rigorous model and parameter search uncovered the optimal time periods and aggregate metrics for monitoring continuous digital biomarkers to increase the positivity rate of COVID-19 diagnostic testing. We found that resting heart rate (RHR) features distinguished between COVID-19-positive and -negative cases earlier in the course of the infection than steps features, as early as 10 and 5 days prior to the diagnostic test, respectively. We also found that including steps features increased the area under the receiver operating characteristic curve (AUC-ROC) by 7-11% when compared with RHR features alone, while including RHR features improved the AUC of the ITA model's precision-recall curve (AUC-PR) by 38-50% when compared with steps features alone. The best AUC-ROC (0.73±0.14 and 0.77 on the cross-validated training set and independent test set, respectively) and AUC-PR (0.55±0.21 and 0.24) were achieved by using data from a single device type (Fitbit) with high-resolution (minute-level) data. 
Finally, we show that ITA generates up to a 6.5-fold increase in the positivity rate in the cross-validated training set and up to a 4.5-fold increase in the positivity rate in the independent test set, including both symptomatic and asymptomatic (up to 27%) individuals. Our findings suggest that, if deployed on a large scale and without needing self-reported symptoms, the ITA method could improve the allocation of diagnostic testing resources and reduce the burden of test shortages.
View details for DOI 10.1038/s41746-022-00672-z
View details for PubMedID 36050372
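The abstract above defines the positivity rate as the percent positive in the surveillance group and reports fold increases when testing is targeted. A minimal Python sketch of that arithmetic follows, assuming each participant has a model risk score and a known test result; the function names and the toy data are hypothetical and do not reproduce the ITA model itself.

```python
def positivity_rate(labels):
    """Fraction of tested individuals who are positive (labels are 0 or 1)."""
    if not labels:
        raise ValueError("at least one test result is required")
    return sum(labels) / len(labels)


def targeted_positivity_rate(scores, labels, n_tests):
    """Positivity rate when only the n_tests highest-scoring people are tested."""
    if not 0 < n_tests <= len(labels):
        raise ValueError("n_tests must be between 1 and the cohort size")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return positivity_rate([label for _, label in ranked[:n_tests]])


def fold_increase(scores, labels, n_tests):
    """Targeted positivity rate divided by the rate from testing everyone."""
    baseline = positivity_rate(labels)
    if baseline == 0:
        raise ValueError("baseline positivity rate is zero")
    return targeted_positivity_rate(scores, labels, n_tests) / baseline


if __name__ == "__main__":
    # Toy cohort: hypothetical risk scores and test outcomes.
    scores = [0.9, 0.8, 0.1, 0.2]
    labels = [1, 0, 0, 0]
    print(fold_increase(scores, labels, n_tests=2))
```

For scale, the smartwatch subset described in the abstract (126 positives among 1265 individuals) corresponds to a baseline positivity rate of roughly 10%.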
-
Deploying wearable sensors for pandemic mitigation: A counterfactual modelling study of Canada's second COVID-19 wave.
PLOS digital health
2022; 1 (9): e0000100
Abstract
Wearable sensors can continuously and passively detect potential respiratory infections before or absent symptoms. However, the population-level impact of deploying these devices during pandemics is unclear. We built a compartmental model of Canada's second COVID-19 wave and simulated wearable sensor deployment scenarios, systematically varying detection algorithm accuracy, uptake, and adherence. With current detection algorithms and 4% uptake, we observed a 16% reduction in the second wave burden of infection; however, 22% of this reduction was attributed to incorrectly quarantining uninfected device users. Improving detection specificity and offering confirmatory rapid tests each minimized unnecessary quarantines and lab-based tests. With a sufficiently low false positive rate, increasing uptake and adherence became effective strategies for scaling averted infections. We concluded that wearable sensors capable of detecting presymptomatic or asymptomatic infections have potential to help reduce the burden of infection during a pandemic; in the case of COVID-19, technology improvements or supporting measures are required to keep social and resource costs sustainable.
View details for DOI 10.1371/journal.pdig.0000100
View details for PubMedID 36812624
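To make the idea of a counterfactual compartmental simulation concrete, the Python sketch below runs a toy discrete-time SIR model in which a fraction of infectious people are additionally removed each day through wearable-based detection. This is not the authors' model: the abstract does not give its structure, and every parameter value here (transmission and recovery rates, population size, and the assumption that detection acts as an extra daily removal rate equal to uptake × adherence × sensitivity) is a hypothetical choice for illustration. False positives and quarantine costs, which the study does consider, are left out.

```python
def simulate_wave(uptake, adherence=0.9, sensitivity=0.8, beta=0.3, gamma=0.1,
                  population=1_000_000, initial_infected=100, days=200):
    """Return cumulative infections from a toy SIR model with daily time steps.

    Assumption: wearable detection removes infectious individuals at an extra
    per-day rate of uptake * adherence * sensitivity.
    """
    detect_rate = uptake * adherence * sensitivity
    susceptible = float(population - initial_infected)
    infectious = float(initial_infected)
    cumulative = float(initial_infected)
    for _ in range(days):
        new_infections = beta * susceptible * infectious / population
        removed = (gamma + detect_rate) * infectious  # recovery plus detection
        susceptible -= new_infections
        infectious += new_infections - removed
        cumulative += new_infections
    return cumulative


if __name__ == "__main__":
    baseline = simulate_wave(uptake=0.0)
    with_wearables = simulate_wave(uptake=0.04)
    print("relative reduction:", 1 - with_wearables / baseline)
```

Comparing a run with uptake set to zero against a run with nonzero uptake is the counterfactual step; varying `sensitivity` and `adherence` in the same way mirrors the kind of scenario sweep the abstract describes.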
-
KLF4 recruits SWI/SNF to increase chromatin accessibility and reprogram the endothelial enhancer landscape under laminar shear stress.
Nature communications
2022; 13 (1): 4941
Abstract
Physiologic laminar shear stress (LSS) induces an endothelial gene expression profile that is vasculo-protective. In this report, we delineate how LSS mediates changes in the epigenetic landscape to promote this beneficial response. We show that under LSS, KLF4 interacts with the SWI/SNF nucleosome remodeling complex to increase accessibility at enhancer sites that promote the expression of homeostatic endothelial genes. By combining molecular and computational approaches we discover enhancers that loop to promoters of KLF4- and LSS-responsive genes that stabilize endothelial cells and suppress inflammation, such as BMPR2, SMAD5, and DUSP5. By linking enhancers to genes that they regulate under physiologic LSS, our work establishes a foundation for interpreting how non-coding DNA variants in these regions might disrupt protective gene expression to influence vascular disease.
View details for DOI 10.1038/s41467-022-32566-9
View details for PubMedID 35999210
-
Deep learning-based pseudo-mass spectrometry imaging analysis for precision medicine.
Briefings in bioinformatics
2022
Abstract
Liquid chromatography-mass spectrometry (LC-MS)-based untargeted metabolomics provides systematic profiling of metabolic. Yet, its applications in precision medicine (disease diagnosis) have been limited by several challenges, including metabolite identification, information loss and low reproducibility. Here, we present the deep-learning-based Pseudo-Mass Spectrometry Imaging (deepPseudoMSI) project (https://www.deeppseudomsi.org/), which converts LC-MS raw data to pseudo-MS images and then processes them by deep learning for precision medicine, such as disease diagnosis. Extensive tests based on real data demonstrated the superiority of deepPseudoMSI over traditional approaches and the capacity of our method to achieve an accurate individualized diagnosis. Our framework lays the foundation for future metabolic-based precision medicine.
View details for DOI 10.1093/bib/bbac331
View details for PubMedID 35947990
-
Transcriptome variation in human tissues revealed by long-read sequencing.
Nature
2022
Abstract
Regulation of transcript structure generates transcript diversity and plays an important role in human disease [1-7]. The advent of long-read sequencing technologies offers the opportunity to study the role of genetic variation in transcript structure [8-16]. In this Article, we present a large human long-read RNA-seq dataset using the Oxford Nanopore Technologies platform from 88 samples from Genotype-Tissue Expression (GTEx) tissues and cell lines, complementing the GTEx resource. We identified just over 70,000 novel transcripts for annotated genes, and validated the protein expression of 10% of novel transcripts. We developed a new computational package, LORALS, to analyse the genetic effects of rare and common variants on the transcriptome by allele-specific analysis of long reads. We characterized allele-specific expression and transcript structure events, providing new insights into the specific transcript alterations caused by common and rare genetic variants and highlighting the resolution gained from long-read data. We were able to perturb the transcript structure upon knockdown of PTBP1, an RNA binding protein that mediates splicing, thereby finding genetic regulatory effects that are modified by the cellular environment. Finally, we used this dataset to enhance variant interpretation and study rare variants leading to aberrant splicing patterns.
View details for DOI 10.1038/s41586-022-05035-y
View details for PubMedID 35922509
-
Reply to 'Lactate as a major myokine and exerkine'.
Nature reviews. Endocrinology
2022
View details for DOI 10.1038/s41574-022-00726-y
View details for PubMedID 35915255
-
DSIF modulates RNA polymerase II occupancy according to template G plus C content
NAR GENOMICS AND BIOINFORMATICS
2022; 4 (3)
View details for DOI 10.1093/nargab/lqac054
View details for Web of Science ID 000830732000001
-
Robust Identification of Temporal Biomarkers in Longitudinal Omics Studies.
Bioinformatics (Oxford, England)
2022
Abstract
Longitudinal studies increasingly collect rich 'omics' data sampled frequently over time and across large cohorts to capture dynamic health fluctuations and disease transitions. However, the generation of longitudinal omics data has preceded the development of analysis tools that can efficiently extract insights from such data. In particular, there is a need for statistical frameworks that can identify not only which omics features are differentially regulated between groups but also over what time intervals. Additionally, longitudinal omics data may have inconsistencies, including nonuniform sampling intervals, missing data points, subject dropout, and differing numbers of samples per subject. In this work, we developed OmicsLonDA, a statistical method that provides robust identification of time intervals of temporal omics biomarkers. OmicsLonDA is based on a semi-parametric approach, in which we use smoothing splines to model longitudinal data and infer significant time intervals of omics features based on an empirical distribution constructed through a permutation procedure. We benchmarked OmicsLonDA on five simulated datasets with diverse temporal patterns, and the method showed specificity greater than 0.99 and sensitivity greater than 0.87. Applying OmicsLonDA to the iPOP cohort revealed temporal patterns of genes, proteins, hormone metabolites, and microbes that are differentially regulated in male versus female subjects following a respiratory infection. In addition, we applied OmicsLonDA to the longitudinal multi-omics dataset of pregnant women with and without preeclampsia, and the method identified potential lipid markers that are temporally significantly different between the two groups. We provide an open-source R package (https://bioconductor.org/packages/OmicsLonDA), to enable widespread use. Supplementary data are available at Bioinformatics online.
View details for DOI 10.1093/bioinformatics/btac403
View details for PubMedID 35762936
-
KMT2D-NOTCH Mediates Coronary Abnormalities in Hypoplastic Left Heart Syndrome.
Circulation research
2022: 101161CIRCRESAHA122320783
View details for DOI 10.1161/CIRCRESAHA.122.320783
View details for PubMedID 35762338
-
Serine biosynthesis as a novel therapeutic target for dilated cardiomyopathy.
European heart journal
2022
Abstract
AIMS: Genetic dilated cardiomyopathy (DCM) is a leading cause of heart failure. Despite significant progress in understanding the genetic aetiologies of DCM, the molecular mechanisms underlying the pathogenesis of familial DCM remain unknown, translating to a lack of disease-specific therapies. The discovery of novel targets for the treatment of DCM was sought using phenotypic screening assays in induced pluripotent stem cell-derived cardiomyocytes (iPSC-CMs) that recapitulate the disease phenotypes in vitro. METHODS AND RESULTS: Using patient-specific iPSCs carrying a pathogenic TNNT2 gene mutation (p.R183W) and CRISPR-based genome editing, a faithful DCM model in vitro was developed. An unbiased phenotypic screening in TNNT2 mutant iPSC-derived cardiomyocytes (iPSC-CMs) with small molecule kinase inhibitors (SMKIs) was performed to identify novel therapeutic targets. Two SMKIs, Gö 6976 and SB 203580, were discovered whose combinatorial treatment rescued contractile dysfunction in DCM iPSC-CMs carrying gene mutations of various ontologies (TNNT2, TTN, LMNA, PLN, TPM1, LAMA2). The combinatorial SMKI treatment upregulated the expression of genes that encode serine, glycine, and one-carbon metabolism enzymes and significantly increased the intracellular levels of glucose-derived serine and glycine in DCM iPSC-CMs. Furthermore, the treatment rescued the mitochondrial respiration defects and increased the levels of the tricarboxylic acid cycle metabolites and ATP in DCM iPSC-CMs. Finally, the rescue of the DCM phenotypes was mediated by the activating transcription factor 4 (ATF4) and its downstream effector genes, phosphoglycerate dehydrogenase (PHGDH), which encodes a critical enzyme of the serine biosynthesis pathway, and Tribbles 3 (TRIB3), a pseudokinase with pleiotropic cellular functions. CONCLUSIONS: A phenotypic screening platform using DCM iPSC-CMs was established for therapeutic target discovery. A combination of SMKIs ameliorated contractile and metabolic dysfunction in DCM iPSC-CMs mediated via the ATF4-dependent serine biosynthesis pathway. Together, these findings suggest that modulation of serine biosynthesis signalling may represent a novel genotype-agnostic therapeutic strategy for genetic DCM.
View details for DOI 10.1093/eurheartj/ehac305
View details for PubMedID 35728000
-
An exercise-inducible metabolite that suppresses feeding and obesity.
Nature
2022
Abstract
Exercise confers protection against obesity, type 2 diabetes and other cardiometabolic diseases. However, the molecular and cellular mechanisms that mediate the metabolic benefits of physical activity remain unclear. Here we show that exercise stimulates the production of N-lactoyl-phenylalanine (Lac-Phe), a blood-borne signalling metabolite that suppresses feeding and obesity. The biosynthesis of Lac-Phe from lactate and phenylalanine occurs in CNDP2+ cells, including macrophages, monocytes and other immune and epithelial cells localized to diverse organs. In diet-induced obese mice, pharmacologically mediated increases in Lac-Phe reduce food intake without affecting movement or energy expenditure. Chronic administration of Lac-Phe decreases adiposity and body weight and improves glucose homeostasis. Conversely, genetic ablation of Lac-Phe biosynthesis in mice increases food intake and obesity following exercise training. Last, large activity-inducible increases in circulating Lac-Phe are also observed in humans and racehorses, establishing this metabolite as a molecular effector associated with physical activity across multiple activity modalities and mammalian species. These data define a conserved exercise-inducible metabolite that controls food intake and influences systemic energy balance.
View details for DOI 10.1038/s41586-022-04828-5
View details for PubMedID 35705806
-
Ultra-Low Input High-Fidelity (ULI-HiFi) long-reads uncover variants in genomic dark matter from pre-cancer polyp and tumor samples
AMER ASSOC CANCER RESEARCH. 2022
View details for Web of Science ID 000892509506653
-
Endogenous Retroviral Elements Generate Pathologic Neutrophils in Pulmonary Arterial Hypertension.
American journal of respiratory and critical care medicine
2022
Abstract
RATIONALE: The role of neutrophils and their extracellular vesicles (EVs) in the pathogenesis of pulmonary arterial hypertension is unclear. OBJECTIVES: Relate functional abnormalities in pulmonary arterial hypertension neutrophils and their EVs to mechanisms uncovered by proteomic and transcriptomic profiling. METHODS: Production of elastase, release of extracellular traps, adhesion and migration were assessed in neutrophils from pulmonary arterial hypertension patients and control subjects. Proteomic analyses were applied to explain functional perturbations, and transcriptomic data were used to find underlying mechanisms. CD66b-specific neutrophil EVs were isolated from plasma of patients with pulmonary arterial hypertension and we determined whether they produce pulmonary hypertension in mice. MEASUREMENTS AND MAIN RESULTS: Neutrophils from pulmonary arterial hypertension patients produce and release increased neutrophil elastase, associated with enhanced extracellular traps. They exhibit reduced migration and increased adhesion attributed to elevated beta1 integrin and vinculin identified on proteomic analysis and previously linked to an antiviral response. This was substantiated by a transcriptomic interferon signature that we related to an increase in human endogenous retrovirus k envelope protein. Transfection of human endogenous retrovirus k envelope in a neutrophil cell line (HL-60) increases neutrophil elastase and interferon genes, whereas vinculin is increased by human endogenous retrovirus k dUTPase that is elevated in patient plasma. Neutrophil EVs from patient plasma contain increased neutrophil elastase and human endogenous retrovirus k envelope and induce pulmonary hypertension in mice, mitigated by elafin, an elastase inhibitor. CONCLUSIONS: Elevated human endogenous retroviral elements and elastase link a neutrophil innate immune response to pulmonary arterial hypertension.
View details for DOI 10.1164/rccm.202102-0446OC
View details for PubMedID 35696338
-
Wnt Signaling Interactor WTIP (Wilms Tumor Interacting Protein) Underlies Novel Mechanism for Cardiac Hypertrophy.
Circulation. Genomic and precision medicine
2022: 101161CIRCGEN121003563
Abstract
BACKGROUND: The study of hypertrophic cardiomyopathy (HCM), a severe Mendelian disease, can yield insight into the mechanisms underlying the complex trait of cardiac hypertrophy. To date, most genetic variants associated with HCM have been found in sarcomeric genes. Here, we describe a novel HCM-associated variant in the noncanonical Wnt signaling interactor WTIP (Wilms tumor interacting protein) and provide evidence of a role for WTIP in complex disease. METHODS: In a family affected by HCM, we used exome sequencing and identity-by-descent analysis to identify a novel variant in WTIP (p.Y233F). We knocked down WTIP in isolated neonatal rat ventricular myocytes with lentivirally delivered shRNAs and in Danio rerio via morpholino injection. We performed weighted gene coexpression network analysis for WTIP in human cardiac tissue, as well as association analysis for WTIP variation and left ventricular hypertrophy. Finally, we generated induced pluripotent stem cell-derived cardiomyocytes from patient tissue, characterized size and calcium cycling, and determined the effect of verapamil treatment on calcium dynamics. RESULTS: WTIP knockdown caused hypertrophy in neonatal rat ventricular myocytes and increased cardiac hypertrophy, peak calcium, and resting calcium in D. rerio. Network analysis of human cardiac tissue indicated WTIP as a central coordinator of prohypertrophic networks, while common variation at the WTIP locus was associated with human left ventricular hypertrophy. Patient-derived WTIP p.Y233F-induced pluripotent stem cell-derived cardiomyocytes recapitulated cellular hypertrophy and increased resting calcium, which was ameliorated by verapamil. CONCLUSIONS: We demonstrate that a novel genetic variant found in a family with HCM disrupts binding to a known Wnt signaling protein, misregulating cardiomyocyte calcium dynamics. Further, in orthogonal model systems, we show that expression of the gene WTIP is important in complex cardiac hypertrophy phenotypes. These findings, derived from the observation of a rare Mendelian disease variant, uncover a novel disease mechanism with implications across diverse forms of cardiac hypertrophy.
View details for DOI 10.1161/CIRCGEN.121.003563
View details for PubMedID 35671065
-
A cancer-associated RNA polymerase III identity drives robust transcription and expression of snaR-A noncoding RNA.
Nature communications
2022; 13 (1): 3007
Abstract
RNA polymerase III (Pol III) includes two alternate isoforms, defined by mutually exclusive incorporation of subunit POLR3G (RPC7alpha) or POLR3GL (RPC7beta), in mammals. The contributions of POLR3G and POLR3GL to transcription potential have remained poorly defined. Here, we discover that loss of subunit POLR3G is accompanied by a restricted repertoire of genes transcribed by Pol III. Particularly sensitive is snaR-A, a small noncoding RNA implicated in cancer proliferation and metastasis. Analysis of Pol III isoform biases and downstream chromatin features identifies loss of POLR3G and snaR-A during differentiation, and conversely, re-establishment of POLR3G gene expression and SNAR-A gene features in cancer contexts. Our results support a model in which Pol III identity functions as an important transcriptional regulatory mechanism. Upregulation of POLR3G, which is driven by MYC, identifies a subgroup of patients with unfavorable survival outcomes in specific cancers, further implicating the POLR3G-enhanced transcription repertoire as a potential disease factor.
View details for DOI 10.1038/s41467-022-30323-6
View details for PubMedID 35637192
-
Prediction of gestational age using urinary metabolites in term and preterm pregnancies.
Scientific reports
2022; 12 (1): 8033
Abstract
Assessment of gestational age (GA) is key to providing optimal care during pregnancy. However, its accurate determination remains challenging in low- and middle-income countries, where access to obstetric ultrasound is limited. Hence, there is an urgent need to develop clinical approaches that allow accurate and inexpensive estimations of GA. We investigated the ability of urinary metabolites to predict GA at time of collection in a diverse multi-site cohort of healthy and pathological pregnancies (n=99) using a broad-spectrum liquid chromatography coupled with mass spectrometry (LC-MS) platform. Our approach detected a myriad of steroid hormones and their derivatives including estrogens, progesterones, corticosteroids, and androgens which were associated with pregnancy progression. We developed a restricted model that predicted GA with high accuracy using three metabolites (rho=0.87, RMSE=1.58 weeks) that was validated in an independent cohort (n=20). The predictions were more robust in pregnancies that went to term in comparison to pregnancies that ended prematurely. Overall, we demonstrated the feasibility of implementing urine metabolomics analysis in large-scale multi-site studies and report a predictive model of GA with a potential clinical value.
View details for DOI 10.1038/s41598-022-11866-6
View details for PubMedID 35577875
-
Author Correction: Expanded encyclopaedias of DNA elements in the human and mouse genomes.
Nature
2022
View details for DOI 10.1038/s41586-021-04226-3
View details for PubMedID 35474001
-
Author Correction: Perspectives on ENCODE.
Nature
2022
View details for DOI 10.1038/s41586-021-04213-8
View details for PubMedID 35474002
-
A machine learning algorithm with subclonal sensitivity reveals widespread pan-cancer human leukocyte antigen loss of heterozygosity.
Nature communications
2022; 13 (1): 1925
Abstract
Human leukocyte antigen loss of heterozygosity (HLA LOH) allows cancer cells to escape immune recognition by deleting HLA alleles, causing the suppressed presentation of tumor neoantigens. Despite its importance in immunotherapy response, few methods exist to detect HLA LOH, and their accuracy is not well understood. Here, we develop DASH (Deletion of Allele-Specific HLAs), a machine learning-based algorithm to detect HLA LOH from paired tumor-normal sequencing data. With cell line mixtures, we demonstrate increased sensitivity compared to previously published tools. Moreover, our patient-specific digital PCR validation approach provides a sensitive, robust orthogonal approach that could be used for clinical validation. Using DASH on 610 patients across 15 tumor types, we find that 18% of patients have HLA LOH. Moreover, we show inflated HLA LOH rates compared to genome-wide LOH and correlations between CD274 (which encodes PD-L1) expression and microsatellite instability status, suggesting that HLA LOH is a key immune resistance strategy.
View details for DOI 10.1038/s41467-022-29203-w
View details for PubMedID 35414054
-
A Method for Intelligent Allocation of Diagnostic Testing by Leveraging Data from Commercial Wearable Devices: A Case Study on COVID-19.
Research square
2022
Abstract
Mass surveillance testing can help control outbreaks of infectious diseases such as COVID-19. However, diagnostic test shortages are prevalent globally and continue to occur in the US with the onset of new COVID-19 variants, demonstrating an unprecedented need for improving our current methods for mass surveillance testing. By targeting surveillance testing towards individuals who are most likely to be infected and, thus, increasing testing positivity rate (i.e., percent positive in the surveillance group), fewer tests are needed to capture the same number of positive cases. Here, we developed an Intelligent Testing Allocation (ITA) method by leveraging data from the CovIdentify study (6,765 participants) and the MyPHD study (8,580 participants), including smartwatch data from 1,265 individuals of whom 126 tested positive for COVID-19. Our rigorous model and parameter search uncovered the optimal time periods and aggregate metrics for monitoring continuous digital biomarkers to increase the positivity rate of COVID-19 diagnostic testing. We found that resting heart rate features distinguished between COVID-19 positive and negative cases earlier in the course of the infection than steps features, as early as ten and five days prior to the diagnostic test, respectively. We also found that including steps features increased the area under the receiver operating characteristic curve (AUC-ROC) by 7-11% when compared with RHR features alone, while including RHR features improved the AUC of the ITA model's precision-recall curve (AUC-PR) by 38-50% when compared with steps features alone. The best AUC-ROC (0.73 ± 0.14 and 0.77 on the cross-validated training set and independent test set, respectively) and AUC-PR (0.55 ± 0.21 and 0.24) were achieved by using data from a single device type (Fitbit) with high-resolution (minute-level) data. 
Finally, we show that ITA generates up to a 6.5-fold increase in the positivity rate in the cross-validated training set and up to a 3-fold increase in the positivity rate in the independent test set, including both symptomatic and asymptomatic (up to 27%) individuals. Our findings suggest that, if deployed on a large scale and without needing self-reported symptoms, the ITA method could improve allocation of diagnostic testing resources and reduce the burden of test shortages.
View details for DOI 10.21203/rs.3.rs-1490524/v1
View details for PubMedID 35378754
View details for PubMedCentralID PMC8978951
-
Exploring disease interrelationships in patients with lymphatic disorders: A single center retrospective experience.
Clinical and translational medicine
2022; 12 (4): e760
Abstract
The lymphatic contribution to the circulation is of paramount importance in regulating fluid homeostasis, immune cell trafficking/activation and lipid metabolism. In comparison to the blood vasculature, the impact of the lymphatics has been underappreciated, both in health and disease, likely due to a less well-delineated anatomy and function. Emerging data suggest that lymphatic dysfunction can be pivotal in the initiation and development of a variety of diseases across broad organ systems. Understanding the clinical associations between lymphatic dysfunction and non-lymphatic morbidity provides valuable evidence for future investigations and may foster the discovery of novel biomarkers and therapies. We retrospectively analysed the electronic medical records of 724 patients referred to the Stanford Center for Lymphatic and Venous Disorders. Patients with an established lymphatic diagnosis were assigned to groups of secondary lymphoedema, lipoedema or primary lymphovascular disease. Individuals found to have no lymphatic disorder served as the non-lymphatic controls. The prevalence of comorbid conditions was enumerated. Pairwise co-occurrence pattern analyses, validated by Jaccard similarity tests, were utilised to investigate disease-disease interrelationships. Comorbidity analyses underscored the expected relationship between the presence of secondary lymphoedema and those diseases that damage the lymphatics. Cardiovascular conditions were common in all lymphatic subgroups. Additionally, statistically significant alteration of disease-disease interrelationships was noted in all three lymphatic categories when compared to the control population. The presence or absence of a lymphatic disease significantly influences disease interrelationships in the study cohorts. As a physiologic substrate, the lymphatic circulation may be an underappreciated participant in disease pathogenesis. These relationships warrant further, prospective scrutiny and study.
View details for DOI 10.1002/ctm2.760
View details for PubMedID 35452183
-
Exerkines in health, resilience and disease.
Nature reviews. Endocrinology
2022
Abstract
The health benefits of exercise are well-recognized and are observed across multiple organ systems. These beneficial effects enhance overall resilience, healthspan and longevity. The molecular mechanisms that underlie the beneficial effects of exercise, however, remain poorly understood. Since the discovery in 2000 that muscle contraction releases IL-6, the number of exercise-associated signalling molecules that have been identified has multiplied. Exerkines are defined as signalling moieties released in response to acute and/or chronic exercise, which exert their effects through endocrine, paracrine and/or autocrine pathways. A multitude of organs, cells and tissues release these factors, including skeletal muscle (myokines), the heart (cardiokines), liver (hepatokines), white adipose tissue (adipokines), brown adipose tissue (baptokines) and neurons (neurokines). Exerkines have potential roles in improving cardiovascular, metabolic, immune and neurological health. As such, exerkines have potential for the treatment of cardiovascular disease, type 2 diabetes mellitus and obesity, and possibly in the facilitation of healthy ageing. This Review summarizes the importance and current state of exerkine research, prevailing challenges and future directions.
View details for DOI 10.1038/s41574-022-00641-2
View details for PubMedID 35304603
-
Effects of an immersive psychosocial training program on depression and well-being: A randomized clinical trial.
Journal of psychiatric research
2022; 150: 292-299
Abstract
Psychiatry stands to benefit from brief non-pharmacological treatments that effectively reduce depressive symptoms. To address this need, we conducted a single-blind randomized clinical trial assessing how a 6-day immersive psychosocial training program, followed by 10-min daily psychosocial exercises for 30 days, improves depressive symptoms. Forty-five adults were block-randomized by depression score to two arms: (a) the immersive psychosocial training program and 10-min daily exercise group (36 days total; total n=23; depressed at baseline n=14); or (b) a gratitude journaling control group (36 days total; total n=22; depressed at baseline n=13). The self-report PHQ-9 was used to assess depression levels in both groups at three time points: baseline, study week one, and study week six. Depression severity improved over time, with a significantly greater reduction in the psychosocial training program group (-82.7%) vs. the control group (-23%), p=0.02 for baseline vs. week six. The effect size for this reduction in depression symptoms was large for the intervention group (d=-1.3; 95% CI, -2.07, -0.45; p<0.001) and small for the control group (d=-0.3; 95% CI, -0.68, 0.03; p=0.22). Seventy-nine percent (11/14) of depressed participants in the intervention condition were in remission (PHQ-9≤4) by week one and 100% (14/14) were in remission at week six. Secondary measures of anxiety, stress, loneliness, and well-being also improved by 15-80% in the intervention group (vs. 0-34% in the control group), ps<0.05. Overall, this brief, immersive psychosocial training program rapidly and substantially improved depression levels and several related secondary outcomes, suggesting that immersive interventions may be useful for reducing depressive symptoms and enhancing well-being.
View details for DOI 10.1016/j.jpsychires.2022.02.034
View details for PubMedID 35429739
-
Low expression of EXOSC2 protects against clinical COVID-19 and impedes SARS-CoV-2 replication.
bioRxiv : the preprint server for biology
2022
Abstract
New therapeutic targets are a valuable resource in the struggle to reduce the morbidity and mortality associated with the COVID-19 pandemic, caused by the SARS-CoV-2 virus. Genome-wide association studies (GWAS) have identified risk loci, but some loci are associated with co-morbidities and are not specific to host-virus interactions. Here, we identify and experimentally validate a link between reduced expression of EXOSC2 and reduced SARS-CoV-2 replication. EXOSC2 was one of 332 host proteins examined, all of which interact directly with SARS-CoV-2 proteins; EXOSC2 interacts with Nsp8 which forms part of the viral RNA polymerase. Lung-specific eQTLs were identified from GTEx (v7) for each of the 332 host proteins. Aggregating COVID-19 GWAS statistics for gene-specific eQTLs revealed an association between increased expression of EXOSC2 and higher risk of clinical COVID-19 which survived stringent multiple testing correction. EXOSC2 is a component of the RNA exosome and indeed, LC-MS/MS analysis of protein pulldowns demonstrated an interaction between the SARS-CoV-2 RNA polymerase and the majority of human RNA exosome components. CRISPR/Cas9 introduction of nonsense mutations within EXOSC2 in Calu-3 cells reduced EXOSC2 protein expression, impeded SARS-CoV-2 replication and upregulated oligoadenylate synthase (OAS) genes, which have been linked to a successful immune response against SARS-CoV-2. Reduced EXOSC2 expression did not reduce cellular viability. OAS gene expression changes occurred independent of infection and in the absence of significant upregulation of other interferon-stimulated genes (ISGs). Targeted depletion or functional inhibition of EXOSC2 may be a safe and effective strategy to protect at-risk individuals against clinical COVID-19.
View details for DOI 10.1101/2022.03.06.483172
View details for PubMedID 35291294
View details for PubMedCentralID PMC8923113
-
MITI minimum information guidelines for highly multiplexed tissue images.
Nature methods
2022; 19 (3): 262-267
View details for DOI 10.1038/s41592-022-01415-4
View details for PubMedID 35277708
-
Whole transcriptome profiling of prospective endomyocardial biopsies reveals prognostic and diagnostic signatures of cardiac allograft rejection.
The Journal of heart and lung transplantation : the official publication of the International Society for Heart Transplantation
2022
Abstract
BACKGROUND: Heart transplantation provides a significant improvement in survival and quality of life for patients with end-stage heart disease; however, many recipients experience different levels of graft rejection that can be associated with significant morbidities and mortality. Current clinical standard-of-care for the evaluation of heart transplant acute rejection (AR) consists of routine endomyocardial biopsy (EMB) followed by visual assessment by histopathology for immune infiltration and cardiomyocyte damage. We assessed whether the sensitivity and/or specificity of this process could be improved upon by adding RNA sequencing (RNA-seq) of EMBs coupled with histopathological interpretation. METHODS: Up to 6 standard-of-care or for-cause EMBs were collected from 26 heart transplant recipients from the prospective observational Clinical Trials of Transplantation (CTOT)-03 study during the first 12 months post-transplant and subjected to RNA-seq (n=125 EMBs total). Differential expression and random-forest-based machine learning were applied to develop signatures for classification and prognostication. RESULTS: Leveraging the unique longitudinal nature of this study, we show that transcriptional hallmarks for significant rejection events occur months before the actual event and are not visible using traditional histopathology. Using this information, we identified a prognostic signature for 0R/1R biopsies that with 90% accuracy can predict whether the next biopsy will be 2R/3R. CONCLUSIONS: RNA-seq-based molecular characterization of EMBs shows significant promise for the early detection of cardiac allograft rejection.
View details for DOI 10.1016/j.healun.2022.01.1377
View details for PubMedID 35317953
-
Dual isoform sequencing reveals complex transcriptomic and epitranscriptomic landscapes of a prototype baculovirus.
Scientific reports
2022; 12 (1): 1291
Abstract
In this study, two long-read sequencing (LRS) techniques, MinION from Oxford Nanopore Technologies and Sequel from Pacific Biosciences, were used for the transcriptional characterization of a prototype baculovirus, Autographa californica multiple nucleopolyhedrovirus. LRS is able to read full-length RNA molecules, and thereby distinguish between transcript isoforms, mono- and polycistronic RNAs, and overlapping transcripts. Altogether, we detected 875 transcript species, of which 759 were novel and 116 were annotated previously. These RNA molecules include 41 novel putative protein coding transcripts [each containing 5'-truncated in-frame open reading frames (ORFs)], 14 monocistronic transcripts, 99 polygenic RNAs, 101 non-coding RNAs, and 504 untranslated region isoforms. This work also identified novel replication origin-associated transcripts, upstream ORFs, cis-regulatory sequences and poly(A) sites. We also detected RNA methylation in 99 viral genes and RNA hyper-editing in the longer 5'-UTR transcript isoform of the canonical ORF 19 transcript.
View details for DOI 10.1038/s41598-022-05457-8
View details for PubMedID 35079129
-
Unbiased metabolome screen leads to personalized medicine strategy for amyotrophic lateral sclerosis.
Brain communications
2022; 4 (2): fcac069
Abstract
Amyotrophic lateral sclerosis is a rapidly progressive neurodegenerative disease that affects 1/350 individuals in the United Kingdom. The cause of amyotrophic lateral sclerosis is unknown in the majority of cases. Two-sample Mendelian randomization enables causal inference between an exposure, such as the serum concentration of a specific metabolite, and disease risk. We obtained genome-wide association study summary statistics for serum concentrations of 566 metabolites which were population matched with a genome-wide association study of amyotrophic lateral sclerosis. For each metabolite, we performed Mendelian randomization using an inverse variance weighted estimate for significance testing. After stringent Bonferroni multiple testing correction, our unbiased screen revealed three metabolites that were significantly linked to the risk of amyotrophic lateral sclerosis: Estrone-3-sulphate and bradykinin were protective, which is consistent with literature describing a male preponderance of amyotrophic lateral sclerosis and a preventive effect of angiotensin-converting enzyme inhibitors which inhibit the breakdown of bradykinin. Serum isoleucine was positively associated with amyotrophic lateral sclerosis risk. All three metabolites were supported by robust Mendelian randomization measures and sensitivity analyses; estrone-3-sulphate and isoleucine were confirmed in a validation amyotrophic lateral sclerosis genome-wide association study. Estrone-3-sulphate is metabolized to the more active estradiol by the enzyme 17beta-hydroxysteroid dehydrogenase 1; further, Mendelian randomization demonstrated a protective effect of estradiol and rare variant analysis showed that missense variants within HSD17B1, the gene encoding 17beta-hydroxysteroid dehydrogenase 1, modify risk for amyotrophic lateral sclerosis. Finally, in a zebrafish model of C9ORF72-amyotrophic lateral sclerosis, we present evidence that estradiol is neuroprotective. 
Isoleucine is metabolized via methylmalonyl-CoA mutase encoded by the gene MMUT in a reaction that consumes vitamin B12. Multivariable Mendelian randomization revealed that the toxic effect of isoleucine is dependent on the depletion of vitamin B12; consistent with this, rare variants which reduce the function of MMUT are protective against amyotrophic lateral sclerosis. We propose that amyotrophic lateral sclerosis patients and family members with high serum isoleucine levels should be offered supplementation with vitamin B12.
View details for DOI 10.1093/braincomms/fcac069
View details for PubMedID 35441136
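The inverse variance weighted (IVW) estimate and the Bonferroni correction mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard fixed-effect IVW formula with first-order weights, and it is not the authors' pipeline. The function names and the toy SNP effect sizes are hypothetical; only the count of 566 metabolites comes from the abstract.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate from per-SNP summary statistics.

    Each SNP contributes a Wald ratio (outcome effect / exposure effect);
    the ratios are combined with weights beta_exposure**2 / se_outcome**2.
    """
    num = sum(bx * by / se ** 2
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / se ** 2
              for bx, se in zip(beta_exposure, se_outcome))
    beta = num / den
    se = math.sqrt(1.0 / den)
    z = beta / se
    # Two-sided p-value under a standard normal approximation.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p_value

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests

# Hypothetical toy instrument with two SNPs (values are illustrative only).
beta, se, p = ivw_estimate([0.1, 0.2], [0.05, 0.10], [0.01, 0.01])
threshold = bonferroni_threshold(0.05, 566)  # 566 metabolites were screened
significant = p < threshold
```

A screen of this kind would repeat `ivw_estimate` once per metabolite and compare each p-value with the same corrected threshold.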
-
Identification of end-stage renal disease metabolic signatures from human perspiration
Natural Sciences
2022
View details for DOI 10.1002/ntls.20220048
-
Patient-derived gene and protein expression signatures of NGLY1 deficiency.
Journal of biochemistry
2021
Abstract
N-Glycanase 1 (NGLY1) deficiency is a rare and complex genetic disorder. Although recent studies have shed light on the molecular underpinnings of NGLY1 deficiency, a systematic characterization of gene and protein expression changes in patient-derived cells has been lacking. Here, we performed RNA-sequencing and mass spectrometry to determine the transcriptomes and proteomes of 66 cell lines representing 4 different cell types derived from 14 NGLY1 deficient patients and 17 controls. Although NGLY1 protein levels were up to 9.5-fold downregulated in patients compared to parents, residual and likely non-functional NGLY1 protein was detectable in all patient-derived lymphoblastoid cell lines. Consistent with the role of NGLY1 as a regulator of the transcription factor Nrf1, we observed a cell type-independent downregulation of proteasomal genes in NGLY1 deficient cells. In contrast, genes involved in ribosome biogenesis and mRNA processing were upregulated in multiple cell types. In addition, we observed cell type-specific effects. For example, genes and proteins involved in glutathione synthesis, such as the glutamate-cysteine ligase subunits GCLC and GCLM, were downregulated specifically in lymphoblastoid cells. We provide a web application that enables access to all results generated in this study at https://apps.embl.de/ngly1browser. This resource will guide future studies of NGLY1 deficiency in directions that are most relevant to patients.
View details for DOI 10.1093/jb/mvab131
View details for PubMedID 34878535
-
Tet enzymes are essential for early embryogenesis and completion of embryonic genome activation.
EMBO reports
2021: e53968
Abstract
Mammalian development begins in transcriptional silence followed by a period of widespread activation of thousands of genes. DNA methylation reprogramming is integral to embryogenesis and linked to Tet enzymes, but their function in early development is not well understood. Here, we generate combined deficiencies of all three Tet enzymes in mouse oocytes using a morpholino-guided knockdown approach and study the impact of acute Tet enzyme deficiencies on preimplantation development. Tet1-3 deficient embryos arrest at the 2-cell stage with the most severe phenotype linked to Tet2. Individual Tet enzymes display non-redundant roles in the consecutive oxidation of 5-methylcytosine to 5-carboxylcytosine. Gene expression analysis uncovers that Tet enzymes are required for completion of embryonic genome activation (EGA) and fine-tuned expression of transposable elements and chimeric transcripts. Whole-genome bisulfite sequencing reveals minor changes of global DNA methylation in Tet-deficient 2-cell embryos, suggesting an important role of non-catalytic functions of Tet enzymes in early embryogenesis. Our results demonstrate that Tet enzymes are key components of the clock that regulates the timing and extent of EGA in mammalian embryos.
View details for DOI 10.15252/embr.202153968
View details for PubMedID 34866320
-
Cross-Laboratory Standardization of Preclinical Lipidomics Using Differential Mobility Spectrometry and Multiple Reaction Monitoring.
Analytical chemistry
2021
Abstract
Modern biomarker and translational research as well as personalized health care studies rely heavily on powerful omics technologies, including metabolomics and lipidomics. However, to translate metabolomics and lipidomics discoveries into a high-throughput clinical setting, standardization is of utmost importance. Here, we compared and benchmarked a quantitative lipidomics platform. The employed Lipidyzer platform is based on lipid class separation by means of differential mobility spectrometry with subsequent multiple reaction monitoring. Quantitation is achieved by the use of 54 deuterated internal standards and an automated informatics approach. We investigated the platform performance across nine laboratories using NIST SRM 1950-Metabolites in Frozen Human Plasma, and three NIST Candidate Reference Materials 8231-Frozen Human Plasma Suite for Metabolomics (high triglyceride, diabetic, and African-American plasma). In addition, we comparatively analyzed 59 plasma samples from individuals with familial hypercholesterolemia from a clinical cohort study. We provide evidence that the more practical methyl-tert-butyl ether extraction outperforms the classic Bligh and Dyer approach and compare our results with two previously published ring trials. In summary, we present standardized lipidomics protocols, allowing for the highly reproducible analysis of several hundred human plasma lipids, and present detailed molecular information for potentially disease relevant and ethnicity-related materials.
View details for DOI 10.1021/acs.analchem.1c02826
View details for PubMedID 34859676
-
Human exposome assessment platform.
Environmental epidemiology (Philadelphia, Pa.)
2021; 5 (6): e182
Abstract
The Human Exposome Assessment Platform (HEAP) is a research resource for the integrated and efficient management and analysis of human exposome data. The project will provide the complete workflow for obtaining exposome actionable knowledge from population-based cohorts. HEAP is a state-of-the-science service composed of computational resources from partner institutions, accessed through a software framework that provides the world's fastest Hadoop platform for data warehousing and applied artificial intelligence (AI). The software will provide a decision support system for researchers and policymakers. All the data managed and processed by HEAP, together with the analysis pipelines, will be available for future research. In addition, the platform enables adding new data and analysis pipelines. HEAP's final product can be deployed in multiple instances to create a network of shareable and reusable knowledge on the impact of exposures on public health.
View details for DOI 10.1097/EE9.0000000000000182
View details for PubMedID 34909561
-
Design and Methods of the Validating Injury to the Renal Transplant Using Urinary Signatures (VIRTUUS) Study in Children.
Transplantation direct
2021; 7 (12): e791
Abstract
Lack of noninvasive diagnostic and prognostic biomarkers to reliably detect early allograft injury poses a major hindrance to long-term allograft survival in pediatric kidney transplant recipients. Methods: Validating Injury to the Renal Transplant Using Urinary Signatures Children's Study, a North American multicenter prospective cohort study of pediatric kidney transplant recipients, aims to validate urinary cell mRNA and metabolite profiles that were diagnostic and prognostic of acute cellular rejection (ACR) and BK virus nephropathy (BKVN) in adult kidney transplant recipients in Clinical Trials in Organ Transplantation-4. Specifically, we are investigating: (1) whether a urinary cell mRNA 3-gene signature (18S-normalized CD3epsilon, CXCL10 mRNA, and 18S ribosomal RNA) discriminates biopsies with versus without ACR, (2) whether a combined metabolite profile with the 3-gene signature increases sensitivity and specificity of diagnosis and prognostication of ACR, and (3) whether BKV-VP1 mRNA levels in urinary cells are diagnostic of BKVN and prognostic for allograft failure. Results: To date, 204 subjects are enrolled, with 1405 urine samples, including 144 biopsy-associated samples. Among 424 urine samples processed for mRNA, the median A260:280 ratio (RNA purity) was 1.91, comparable with Clinical Trials in Organ Transplantation-4 (median 1.82). The quality control failure rate was 10%. Preliminary results from urine supernatant showed that our metabolomics platform successfully captured a broad array of metabolites. Clustering of pool samples and overlay of samples from various batches demonstrated platform robustness. No study site effect was noted. Conclusions: Multicenter efforts to ascertain urinary biomarkers in pediatric kidney transplant recipients are feasible with high-quality control. Further study will inform whether these signatures are discriminatory and predictive for rejection and infection.
View details for DOI 10.1097/TXD.0000000000001244
View details for PubMedID 34805493
-
Network biology bridges the gaps between quantitative genetics and multi-omics to map complex diseases.
Current opinion in chemical biology
2021; 66: 102101
Abstract
With advances in high-throughput sequencing technologies, quantitative genetics approaches have provided insights into the genetic basis of many complex diseases. Emerging in-depth multi-omics profiling technologies have created exciting opportunities for systematically investigating intricate interaction networks with different layers of biological molecules underlying disease etiology. Herein, we summarize two main categories of biological networks: evidence-based and statistically inferred. These different types of molecular networks complement each other at both bulk and single-cell levels. We also review three main strategies to incorporate quantitative genetics results with multi-omics data by network analysis: (a) network propagation, (b) functional module-based methods, (c) comparative/dynamic networks. These strategies not only aid in elucidating molecular mechanisms of complex diseases but can guide the search for therapeutic targets.
View details for DOI 10.1016/j.cbpa.2021.102101
View details for PubMedID 34861483
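Network propagation, the first strategy listed in the abstract above, can be illustrated with a minimal sketch. It assumes a random walk with restart on an undirected network, which is one common formulation and not necessarily the one the review emphasizes. The function name, the gene labels, the seed weight, and the restart probability are all hypothetical.

```python
def propagate(adjacency, seeds, restart=0.5, tol=1e-9, max_iter=1000):
    """Random walk with restart over an undirected network.

    adjacency: dict mapping each node to a list of neighbours (symmetric,
               every node has at least one neighbour).
    seeds:     dict mapping seed nodes to positive starting weights,
               e.g. scores derived from a genetic association study.
    Returns a dict of steady-state scores that sum to 1.
    """
    nodes = sorted(adjacency)
    degree = {n: len(adjacency[n]) for n in nodes}
    total = sum(seeds.values())
    p0 = {n: seeds.get(n, 0.0) / total for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {}
        for n in nodes:
            # Each neighbour m passes on an equal share of its current score.
            spread = sum(p[m] / degree[m] for m in adjacency[n])
            nxt[n] = (1.0 - restart) * spread + restart * p0[n]
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p

# Hypothetical three-gene chain: GENE_A - GENE_B - GENE_C, seeded at GENE_A.
network = {"GENE_A": ["GENE_B"],
           "GENE_B": ["GENE_A", "GENE_C"],
           "GENE_C": ["GENE_B"]}
scores = propagate(network, {"GENE_A": 1.0})
```

In this toy chain the score decays with distance from the seed, which is the intuition behind using propagation to prioritize genes near association signals.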
-
Master lineage transcription factors anchor trans mega transcriptional complexes at highly accessible enhancer sites to promote long-range chromatin clustering and transcription of distal target genes.
Nucleic acids research
2021
Abstract
The term 'super enhancers' (SE) has been widely used to describe stretches of closely localized enhancers that are occupied collectively by large numbers of transcription factors (TFs) and co-factors, and control the transcription of highly-expressed genes. Through integrated analysis of >600 DNase-seq, ChIP-seq, GRO-seq, STARR-seq, RNA-seq, Hi-C and ChIA-PET data in five human cancer cell lines, we identified a new class of autonomous SEs (aSEs) that are excluded from classic SE calls by the widely used Rank Ordering of Super-Enhancers (ROSE) method. TF footprint analysis revealed that compared to classic SEs and regular enhancers, aSEs are tightly bound by a dense array of master lineage TFs, which serve as anchors to recruit additional TFs and co-factors in trans. In addition, aSEs are preferentially enriched for Cohesins, which are likely involved in stabilizing long-distance interactions between aSEs and their distal target genes. Finally, we showed that aSEs can be reliably predicted using DNase-seq data alone or combined with Mediator and/or P300 ChIP-seq. Overall, our study demonstrates that aSEs represent a unique class of functionally important enhancer elements that distally regulate the transcription of highly expressed genes.
View details for DOI 10.1093/nar/gkab1105
View details for PubMedID 34850122
-
Spatial mapping of protein composition and tissue organization: a primer for multiplexed antibody-based imaging.
Nature methods
2021
Abstract
Tissues and organs are composed of distinct cell types that must operate in concert to perform physiological functions. Efforts to create high-dimensional biomarker catalogs of these cells have been largely based on single-cell sequencing approaches, which lack the spatial context required to understand critical cellular communication and correlated structural organization. To probe in situ biology with sufficient depth, several multiplexed protein imaging methods have been recently developed. Though these technologies differ in strategy and mode of immunolabeling and detection tags, they commonly utilize antibodies directed against protein biomarkers to provide detailed spatial and functional maps of complex tissues. As these promising antibody-based multiplexing approaches become more widely adopted, new frameworks and considerations are critical for training future users, generating molecular tools, validating antibody panels, and harmonizing datasets. In this Perspective, we provide essential resources, key considerations for obtaining robust and reproducible imaging data, and specialized knowledge from domain experts and technology developers.
View details for DOI 10.1038/s41592-021-01316-y
View details for PubMedID 34811556
-
A review of Mendelian randomization in amyotrophic lateral sclerosis.
Brain : a journal of neurology
2021
Abstract
Amyotrophic lateral sclerosis (ALS) is a relatively common and rapidly progressive neurodegenerative disease which, in the majority of cases, is thought to be determined by a complex gene-environment interaction. Exponential growth in the number of performed genome-wide association studies (GWAS), combined with the advent of Mendelian randomization (MR) is opening significant new opportunities to identify environmental exposures which increase or decrease the risk of ALS. Each of these discoveries has the potential to shape new therapeutic interventions. However, to do so rigorous methodological standards must be applied in the performance of MR. We have performed a review of MR studies performed in ALS to date. We identified 20 MR studies, including evaluation of physical exercise, adiposity, cognitive performance, immune function, blood lipids, sleep behaviours, educational attainment, alcohol consumption, smoking and type 2 diabetes mellitus. We have evaluated each study using gold standard methodology supported by the MR literature and the STROBE-MR checklist. Where discrepancies exist between MR studies, we suggest the underlying reasons. A number of studies conclude that there is a causal link between blood lipids and risk of ALS; replication across different datasets and even different populations adds confidence. For other putative risk factors, such as smoking and immune function, MR studies have provided cause for doubt. We highlight the use of positive control analyses in choosing exposure SNPs to make up the MR instrument, use of SNP clumping to avoid false positive results due to SNPs in linkage, and the importance of multiple testing correction. We discuss the implications of survival bias for study of late age of onset diseases such as ALS, and make recommendations to mitigate this potentially important confounder. For MR to be useful to the ALS field, high methodological standards must be applied to ensure reproducibility. 
MR is already an impactful tool but poor quality studies will lead to incorrect interpretations by a field which includes non-statisticians, wasted resources and missed opportunities.
View details for DOI 10.1093/brain/awab420
View details for PubMedID 34791088
-
In-depth triacylglycerol profiling using MS3 Q-Trap mass spectrometry.
Analytica chimica acta
2021; 1184: 339023
Abstract
Total triacylglycerol (TAG) level is a key clinical marker of metabolic and cardiovascular diseases. However, the roles of individual TAGs have not been thoroughly explored in part due to their extreme structural complexity. We present a targeted mass spectrometry-based method combining multiple reaction monitoring (MRM) and multiple stage mass spectrometry (MS3) for the comprehensive qualitative and semiquantitative profiling of TAGs. This method, referred to as TriP-MS3 (triacylglycerol profiling using MS3), screens for more than 6,700 TAG species in a fully automated fashion. TriP-MS3 demonstrated excellent reproducibility (median interday CV0.15) and linearity (median R2=0.978) and detected 285 individual TAG species in human plasma. The semiquantitative accuracy of the method was validated by comparison with a state-of-the-art reverse phase liquid chromatography (RPLC)-MS (R2=0.83), which is the most commonly used approach for TAG profiling. Finally, we demonstrate the utility and the versatility of the method by characterizing the effects of a fatty acid desaturase inhibitor on TAG profiles in vitro and by profiling TAGs in Caenorhabditis elegans.
View details for DOI 10.1016/j.aca.2021.339023
View details for PubMedID 34625255
-
COVID-19-Induced New-Onset Diabetes: Trends and Technologies.
Diabetes
2021
Abstract
The coronavirus disease 2019 (COVID-19) global pandemic continues to spread worldwide with approximately 216 million confirmed cases and 4.49 million deaths to date. Intensive efforts are ongoing to combat this disease by suppressing viral transmission, understanding its pathogenesis, developing vaccination strategies, and identifying effective therapeutic targets. Individuals with preexisting diabetes also show higher incidence of COVID-19 illness and poorer prognosis upon infection. Likewise, an increased frequency of diabetes onset and diabetes complications has been reported in patients following COVID-19 diagnosis. COVID-19 may elevate the risk of hyperglycemia and other complications in patients with and without prior diabetes history. It is unclear whether the virus induces type 1 or type 2 diabetes or instead causes a novel atypical form of diabetes. Moreover, it remains unknown if recovering COVID-19 patients exhibit a higher risk of developing new-onset diabetes or its complications going forward. The aim of this review is to summarize what is currently known about the epidemiology and mechanisms of this bidirectional relationship between COVID-19 and diabetes. We highlight major challenges that hinder the study of COVID-19-induced new-onset of diabetes and propose a potential framework for overcoming these obstacles. We also review state-of-the-art wearables and microsampling technologies that can further study diabetes management and progression in new-onset diabetes cases. We conclude by outlining current research initiatives investigating the bidirectional relationship between COVID-19 and diabetes, some with emphasis on wearable technology.
View details for DOI 10.2337/dbi21-0029
View details for PubMedID 34686519
-
Altered Cardiac Energetics and Mitochondrial Dysfunction in Hypertrophic Cardiomyopathy.
Circulation
2021
Abstract
Background: Hypertrophic cardiomyopathy (HCM) is a complex disease partly explained by the effects of individual gene variants on sarcomeric protein biomechanics. At the cellular level, HCM mutations most commonly enhance force production, leading to higher energy demands. Despite significant advances in elucidating sarcomeric structure-function relationships, there is still much to be learned about the mechanisms that link altered cardiac energetics to HCM phenotypes. In this work, we test the hypothesis that changes in cardiac energetics represent a common pathophysiologic pathway in HCM. Methods: We performed a comprehensive multi-omics profile of the molecular (transcripts, metabolites, and complex lipids), ultrastructural, and functional components of HCM energetics using myocardial samples from 27 HCM patients and 13 normal controls (donor hearts). Results: Integrated omics analysis revealed alterations in a wide array of biochemical pathways with major dysregulation in fatty acid metabolism, reduction of acylcarnitines, and accumulation of free fatty acids. HCM hearts showed evidence of global energetic decompensation manifested by a decrease in high energy phosphate metabolites [ATP, ADP, and phosphocreatine (PCr)] and a reduction in mitochondrial genes involved in creatine kinase and ATP synthesis. Accompanying these metabolic derangements, electron microscopy showed an increased fraction of severely damaged mitochondria with reduced cristae density, coinciding with reduced citrate synthase (CS) activity and mitochondrial oxidative respiration. These mitochondrial abnormalities were associated with elevated reactive oxygen species (ROS) and reduced antioxidant defenses. However, despite significant mitochondrial injury, HCM hearts failed to upregulate mitophagic clearance. Conclusions: Overall, our findings suggest that perturbed metabolic signaling and mitochondrial dysfunction are common pathogenic mechanisms in patients with HCM. 
These results highlight potential new drug targets for attenuation of the clinical disease through improving metabolic function and reducing mitochondrial injury.
View details for DOI 10.1161/CIRCULATIONAHA.121.053575
View details for PubMedID 34672721
-
The dynamic, combinatorial cis-regulatory lexicon of epidermal differentiation.
Nature genetics
2021
Abstract
Transcription factors bind DNA sequence motif vocabularies in cis-regulatory elements (CREs) to modulate chromatin state and gene expression during cell state transitions. A quantitative understanding of how motif lexicons influence dynamic regulatory activity has been elusive due to the combinatorial nature of the cis-regulatory code. To address this, we undertook multiomic data profiling of chromatin and expression dynamics across epidermal differentiation to identify 40,103 dynamic CREs associated with 3,609 dynamically expressed genes, then applied an interpretable deep-learning framework to model the cis-regulatory logic of chromatin accessibility. This analysis framework identified cooperative DNA sequence rules in dynamic CREs regulating synchronous gene modules with diverse roles in skin differentiation. Massively parallel reporter assay analysis validated temporal dynamics and cooperative cis-regulatory logic. Variants linked to human polygenic skin disease were enriched in these time-dependent combinatorial motif rules. This integrative approach shows the combinatorial cis-regulatory lexicon of epidermal differentiation and represents a general framework for deciphering the organizational principles of the cis-regulatory code of dynamic gene regulation.
View details for DOI 10.1038/s41588-021-00947-3
View details for PubMedID 34650237
-
Divergent patterns of selection on metabolite levels and gene expression.
BMC ecology and evolution
2021; 21 (1): 185
Abstract
BACKGROUND: Natural selection can act on multiple genes in the same pathway, leading to polygenic adaptation. For example, adaptive changes were found to down-regulate six genes involved in ergosterol biosynthesis (an essential pathway targeted by many antifungal drugs) in some strains of the yeast Saccharomyces cerevisiae. However, the impact of this polygenic adaptation on metabolite levels was unknown. Here, we performed targeted mass spectrometry to measure the levels of eight metabolites in this pathway in 74 yeast strains from a genetic cross. RESULTS: Through quantitative trait locus (QTL) mapping we identified 19 loci affecting ergosterol pathway metabolite levels, many of which overlap loci that also impact gene expression within the pathway. We then used the recently developed v-test, which identified selection acting upon three metabolite levels within the pathway, none of which were predictable from the gene expression adaptation. CONCLUSIONS: These data showed that effects of selection on metabolite levels were complex and not predictable from gene expression data. This suggests that a deeper understanding of metabolism is necessary before we can understand the impacts of even relatively straightforward gene expression adaptations on metabolic pathways.
View details for DOI 10.1186/s12862-021-01915-5
View details for PubMedID 34587900
-
Statins Are Associated With Increased Insulin Resistance and Secretion.
Arteriosclerosis, thrombosis, and vascular biology
2021: ATVBAHA121316159
Abstract
OBJECTIVE: Statin treatment reduces the risk of atherosclerotic cardiovascular disease but is associated with a modest increased risk of type 2 diabetes, especially in those with insulin resistance or prediabetes. Our objective was to determine the physiological mechanism for the increased type 2 diabetes risk. Approach and Results: We conducted an open-label clinical trial of atorvastatin 40 mg daily in adults without known atherosclerotic cardiovascular disease or type 2 diabetes at baseline. The co-primary outcomes were changes at 10 weeks versus baseline in insulin resistance as assessed by steady-state plasma glucose during the insulin suppression test and insulin secretion as assessed by insulin secretion rate area under the curve (ISRAUC) during the graded-glucose infusion test. Secondary outcomes included glucose and insulin, both fasting and during oral glucose tolerance test. Of 75 participants who enrolled, 71 completed the study (median age 61 years, 37% women, 65% non-Hispanic White, median body mass index, 27.8 kg/m2). Atorvastatin reduced LDL (low-density lipoprotein)-cholesterol (median decrease 53%, P<0.001) but did not change body weight. Compared with baseline, atorvastatin increased insulin resistance (steady-state plasma glucose) by a median of 8% (P=0.01) and insulin secretion (ISRAUC) by a median of 9% (P<0.001). There were small increases in oral glucose tolerance test glucoseAUC (median increase, 0.05%; P=0.03) and fasting insulin (median increase, 7%; P=0.01). CONCLUSIONS: In individuals without type 2 diabetes, high-intensity atorvastatin for 10 weeks increases insulin resistance and insulin secretion. Over time, the risk of new-onset diabetes with statin use may increase in individuals who become more insulin resistant but are unable to maintain compensatory increases in insulin secretion. REGISTRATION: URL: https://www.clinicaltrials.gov; Unique identifier: NCT02437084.
View details for DOI 10.1161/ATVBAHA.121.316159
View details for PubMedID 34433298
-
Temporal changes in soluble angiotensin-converting enzyme 2 associated with metabolic health, body composition, and proteome dynamics during a weight loss diet intervention: a randomized trial with implications for the COVID-19 pandemic.
The American journal of clinical nutrition
2021
Abstract
BACKGROUND: Angiotensin-converting enzyme 2 (ACE2) serves protective functions in metabolic, cardiovascular, renal, and pulmonary diseases and is linked to COVID-19 pathology. The correlates of temporal changes in soluble ACE2 (sACE2) remain understudied. OBJECTIVES: We explored the associations of sACE2 with metabolic health and proteome dynamics during a weight loss diet intervention. METHODS: We analyzed 457 healthy individuals (mean±SD age: 39.8±6.6 y) with BMI 28-40 kg/m2 in the DIETFITS (Diet Intervention Examining the Factors Interacting with Treatment Success) study. Biochemical markers of metabolic health and 236 proteins were measured by Olink CVDII, CVDIII, and Inflammation I arrays at baseline and at 6 mo during the dietary intervention. We determined clinical and routine biochemical correlates of the diet-induced change in sACE2 (ΔsACE2) using stepwise linear regression. We combined feature selection models and multivariable-adjusted linear regression to identify protein dynamics associated with ΔsACE2. RESULTS: sACE2 decreased on average at 6 mo during the diet intervention. Stronger decline in sACE2 during the diet intervention was independently associated with female sex, lower HOMA-IR and LDL cholesterol at baseline, and a stronger decline in HOMA-IR, triglycerides, HDL cholesterol, and fat mass. Participants with decreasing HOMA-IR (OR: 1.97; 95% CI: 1.28, 3.03) and triglycerides (OR: 2.71; 95% CI: 1.72, 4.26) had significantly higher odds for a decrease in sACE2 during the diet intervention than those without (P≤0.0073). Feature selection models linked ΔsACE2 to changes in alpha-1-microglobulin/bikunin precursor, E-selectin, hydroxyacid oxidase 1, kidney injury molecule 1, tyrosine-protein kinase Mer, placental growth factor, thrombomodulin, and TNF receptor superfamily member 10B.
ΔsACE2 remained associated with these protein changes in multivariable-adjusted linear regression. CONCLUSIONS: Decrease in sACE2 during a weight loss diet intervention was associated with improvements in metabolic health, fat mass, and markers of angiotensin peptide metabolism, hepatic and vascular injury, renal function, chronic inflammation, and oxidative stress. Our findings may improve the risk stratification, prevention, and management of cardiometabolic complications. This trial was registered at clinicaltrials.gov as NCT01826591.
View details for DOI 10.1093/ajcn/nqab243
View details for PubMedID 34375388
-
Prediction of Immunotherapy Response in Melanoma through Combined Modeling of Neoantigen Burden and Immune-Related Resistance Mechanisms.
Clinical cancer research : an official journal of the American Association for Cancer Research
2021; 27 (15): 4265-4276
Abstract
PURPOSE: While immune checkpoint blockade (ICB) has become a pillar of cancer treatment, biomarkers that consistently predict patient response remain elusive due to the complex mechanisms driving immune response to tumors. We hypothesized that a multi-dimensional approach modeling both tumor and immune-related molecular mechanisms would better predict ICB response than simpler mutation-focused biomarkers, such as tumor mutational burden (TMB). EXPERIMENTAL DESIGN: Tumors from a cohort of patients with late-stage melanoma (n = 51) were profiled using an immune-enhanced exome and transcriptome platform. We demonstrate increasing predictive power with deeper modeling of neoantigens and immune-related resistance mechanisms to ICB. RESULTS: Our neoantigen burden score, which integrates both exome and transcriptome features, more significantly stratified responders and nonresponders (P = 0.016) than TMB alone (P = 0.049). Extension of this model to include immune-related resistance mechanisms affecting the antigen presentation machinery, such as HLA allele-specific LOH, resulted in a composite neoantigen presentation score (NEOPS) that demonstrated further increased association with therapy response (P = 0.002). CONCLUSIONS: NEOPS proved the statistically strongest biomarker compared with all single-gene biomarkers, expression signatures, and TMB biomarkers evaluated in this cohort. Subsequent confirmation of these findings in an independent cohort of patients (n = 110) suggests that NEOPS is a robust, novel biomarker of ICB response in melanoma.
View details for DOI 10.1158/1078-0432.CCR-20-4314
View details for PubMedID 34341053
-
Longitudinal linked-read sequencing reveals ecological and evolutionary responses of a human gut microbiome during antibiotic treatment.
Genome research
2021
Abstract
Gut microbial communities can respond to antibiotic perturbations by rapidly altering their taxonomic and functional composition. However, little is known about the strain-level processes that drive this collective response. Here, we characterize the gut microbiome of a single individual at high temporal and genetic resolution through a period of health, disease, antibiotic treatment, and recovery. We used deep, linked-read metagenomic sequencing to track the longitudinal trajectories of thousands of single nucleotide variants within 36 species, which allowed us to contrast these genetic dynamics with the ecological fluctuations at the species level. We found that antibiotics can drive rapid shifts in the genetic composition of individual species, often involving incomplete genome-wide sweeps of pre-existing variants. These genetic changes were frequently observed in species without obvious changes in species abundance, emphasizing the importance of monitoring diversity below the species level. We also found that many sweeping variants quickly reverted to their baseline levels once antibiotic treatment had concluded, demonstrating that the ecological resilience of the microbiota can sometimes extend all the way down to the genetic level. Our results provide new insights into the population genetic forces that shape individual microbiomes on therapeutically relevant timescales, with potential implications for personalized health and disease.
View details for DOI 10.1101/gr.265058.120
View details for PubMedID 34301627
-
Time-Course Transcriptome Profiling of a Poxvirus Using Long-Read Full-Length Assay.
Pathogens (Basel, Switzerland)
2021; 10 (8)
Abstract
Viral transcriptomes that are determined using first- and second-generation sequencing techniques are incomplete. Due to the short read length, these methods are inefficient or fail to distinguish between transcript isoforms, polycistronic RNAs, and transcriptional overlaps and readthroughs. Additionally, these approaches are insensitive for the identification of splice and transcriptional start sites (TSSs) and, in most cases, transcriptional end sites (TESs), especially in transcript isoforms with varying transcript ends, and in multi-spliced transcripts. Long-read sequencing is able to read full-length nucleic acids and can therefore be used to assemble complete transcriptome atlases. Although vaccinia virus (VACV) does not produce spliced RNAs, its transcriptome has a high diversity of TSSs and TESs, and a high degree of polycistronism that leads to enormous complexity. We applied single-molecule, real-time, and nanopore-based sequencing methods to investigate the time-lapse transcriptome patterns of VACV gene expression.
View details for DOI 10.3390/pathogens10080919
View details for PubMedID 34451383
-
The Exposome in the Era of the Quantified Self.
Annual review of biomedical data science
2021; 4: 255-277
Abstract
Human health is regulated by complex interactions among the genome, the microbiome, and the environment. While extensive research has been conducted on the human genome and microbiome, little is known about the human exposome. The exposome comprises the totality of chemical, biological, and physical exposures that individuals encounter over their lifetimes. Traditional environmental and biological monitoring only targets specific substances, whereas exposomic approaches identify and quantify thousands of substances simultaneously using nontargeted high-throughput and high-resolution analyses. The quantified self (QS) aims at enhancing our understanding of human health and disease through self-tracking. QS measurements are critical in exposome research, as external exposures impact an individual's health, behavior, and biology. This review discusses both the achievements and the shortcomings of current research and methodologies on the QS and the exposome and proposes future research directions.
View details for DOI 10.1146/annurev-biodatasci-012721-122807
View details for PubMedID 34465170
-
Combined nanopore and single-molecule real-time sequencing survey of human betaherpesvirus 5 transcriptome.
Scientific reports
2021; 11 (1): 14487
Abstract
Long-read sequencing (LRS), a powerful novel approach, is able to read full-length transcripts and confers a major advantage over the earlier gold standard short-read sequencing in the efficiency of identifying for example polycistronic transcripts and transcript isoforms, including transcript length- and splice variants. In this work, we profile the human cytomegalovirus transcriptome using two third-generation LRS platforms: the Sequel from Pacific BioSciences, and MinION from Oxford Nanopore Technologies. We carried out both cDNA and direct RNA sequencing, and applied the LoRTIA software, developed in our laboratory, for the transcript annotations. This study identified a large number of novel transcript variants, including splice isoforms and transcript start and end site isoforms, as well as putative mRNAs with truncated in-frame ORFs (located within the larger ORFs of the canonical mRNAs), which potentially encode N-terminally truncated polypeptides. Our work also disclosed a highly complex meshwork of transcriptional read-throughs and overlaps.
View details for DOI 10.1038/s41598-021-93593-y
View details for PubMedID 34262076
-
Multi-Omic, Longitudinal Profile of Third-Trimester Pregnancies Identifies a Molecular Switch That Predicts the Onset of Labor.
SPRINGER HEIDELBERG. 2021: 233A-234A
View details for Web of Science ID 000675441000486
-
Pan-cancer survey of HLA loss of heterozygosity using a robustly validated NGS-based machine learning algorithm.
AMER ASSOC CANCER RESEARCH. 2021
View details for Web of Science ID 000680263502451
-
Mass spectrometry-based metabolomics: a guide for annotation, quantification and best reporting practices.
Nature methods
2021; 18 (7): 747-756
Abstract
Mass spectrometry-based metabolomics approaches can enable detection and quantification of many thousands of metabolite features simultaneously. However, compound identification and reliable quantification are greatly complicated owing to the chemical complexity and dynamic range of the metabolome. Simultaneous quantification of many metabolites within complex mixtures can additionally be complicated by ion suppression, fragmentation and the presence of isomers. Here we present guidelines covering sample preparation, replication and randomization, quantification, recovery and recombination, ion suppression and peak misidentification, as a means to enable high-quality reporting of liquid chromatography- and gas chromatography-mass spectrometry-based metabolomics-derived data.
View details for DOI 10.1038/s41592-021-01197-1
View details for PubMedID 34239102
-
Time-course transcriptome analysis of host cell response to poxvirus infection using a dual long-read sequencing approach.
BMC research notes
2021; 14 (1): 239
Abstract
OBJECTIVE: In this study, we applied two long-read sequencing (LRS) approaches, including single-molecule real-time and nanopore-based sequencing methods, to investigate the time-lapse transcriptome patterns of host gene expression as a response to Vaccinia virus infection. Transcriptomes determined using short-read sequencing approaches are incomplete because these platforms are inefficient or fail to distinguish between polycistronic RNAs, transcript isoforms, transcriptional start sites, as well as transcriptional readthroughs and overlaps. Long-read sequencing is able to read full-length nucleic acids and can therefore be used to assemble complete transcriptome atlases. RESULTS: In this work, we identified a number of novel transcripts and transcript isoforms of Chlorocebus sabaeus. Additionally, analysis of the most abundant 768 host transcripts revealed a significant overrepresentation of the class of genes in the "regulation of signaling receptor activity" Gene Ontology annotation as a result of viral infection.
View details for DOI 10.1186/s13104-021-05657-x
View details for PubMedID 34167576
-
AdaTiSS: A Novel Data-Adaptive Robust Method for Identifying Tissue Specificity Scores.
Bioinformatics (Oxford, England)
2021
Abstract
MOTIVATION: Accurately detecting tissue specificity (TS) in genes helps researchers understand tissue functions at the molecular level. The Genotype-Tissue Expression project is one of the publicly available data resources, providing large-scale gene expression data across multiple tissue types. Multiple tissue comparisons and heterogeneous tissue expression make it challenging to accurately identify tissue-specific gene expression. Distinguishing inlier expression from outlier expression becomes important for building population-level information and further quantifying the TS. A robust, data-adaptive TS method that takes into account heterogeneities of the data is still lacking. METHODS: We found that the key to identifying tissue-specific gene expression is to properly define a concept of expression population. In a linear regression problem, we developed a novel data-adaptive robust estimation based on density-power-weight under unknown outlier distribution and non-vanishing outlier proportion. The Gaussian-population mixture model was considered in the setting of identifying TS. We took into account heterogeneities of gene expression and applied the robust data-adaptive procedure to estimate the population parameters. With the well-estimated population parameters, we constructed the AdaTiSS algorithm. RESULTS: Our AdaTiSS profiled TS for each gene and each tissue, which standardized the gene expression in terms of TS. We provided a new robust and powerful tool to the literature of defining tissue specificity. AVAILABILITY: https://github.com/mwgrassgreen/AdaTiSS.
View details for DOI 10.1093/bioinformatics/btab460
View details for PubMedID 34146104
-
Precision neoantigen discovery using large-scale immunopeptidomes and composite modeling of MHC peptide presentation.
Molecular & cellular proteomics : MCP
2021: 100111
Abstract
Major histocompatibility complex (MHC)-bound peptides that originate from tumor-specific genetic alterations, known as neoantigens, are an important class of anti-cancer therapeutic targets. Accurately predicting peptide presentation by MHC complexes is a key aspect of discovering therapeutically relevant neoantigens. Technological improvements in mass-spectrometry-based immunopeptidomics and advanced modeling techniques have vastly improved MHC presentation prediction over the past two decades. However, improvement in the sensitivity and specificity of prediction algorithms is needed for clinical applications such as the development of personalized cancer vaccines, the discovery of biomarkers for response to checkpoint blockade and the quantification of autoimmune risk in gene therapies. Toward this end, we generated allele-specific immunopeptidomics data using 25 mono-allelic cell lines and created Systematic HLA Epitope Ranking Pan Algorithm (SHERPA), a pan-allelic MHC-peptide algorithm for predicting MHC-peptide binding and presentation. In contrast to previously published large-scale mono-allelic data, we used an HLA-null K562 parental cell line and a stable transfection of HLA alleles to better emulate native presentation. Our dataset includes five previously unprofiled alleles that expand MHC binding pocket diversity in the training data and extend allelic coverage in under profiled populations. To improve generalizability, SHERPA systematically integrates 128 mono-allelic and 384 multi-allelic samples with publicly available immunoproteomics data and binding assay data. Using this dataset, we developed two features that empirically estimate the propensities of genes and specific regions within gene bodies to engender immunopeptides to represent antigen processing. 
Using a composite model constructed with gradient boosting decision trees, multi-allelic deconvolution and 2.15 million peptides encompassing 167 alleles, we achieved a 1.44 fold improvement of positive predictive value compared to existing tools when evaluated on independent mono-allelic datasets and a 1.15 fold improvement when evaluating on tumor samples. With a high degree of accuracy, SHERPA has the potential to enable precision neoantigen discovery for future clinical applications.
View details for DOI 10.1016/j.mcpro.2021.100111
View details for PubMedID 34126241
-
Non-invasive wearables for remote monitoring of HbA1c and glucose variability: proof of concept.
BMJ open diabetes research & care
2021; 9 (1)
Abstract
Diabetes prevalence continues to grow and there remains a significant diagnostic gap in one-third of the US population that has pre-diabetes. Innovative, practical strategies to improve monitoring of glycemic health are desperately needed. In this proof-of-concept study, we explore the relationship between non-invasive wearables and glycemic metrics and demonstrate the feasibility of using non-invasive wearables to estimate glycemic metrics, including hemoglobin A1c (HbA1c) and glucose variability metrics. We recorded over 25 000 measurements from a continuous glucose monitor (CGM) with simultaneous wrist-worn wearable (skin temperature, electrodermal activity, heart rate, and accelerometry sensors) data over 8-10 days in 16 participants with normal glycemic state and pre-diabetes (HbA1c 5.2-6.4). We used data from the wearable to develop machine learning models to predict HbA1c recorded on day 0 and glucose variability calculated from the CGM. We tested the accuracy of the HbA1c model on a retrospective, external validation cohort of 10 additional participants and compared results against CGM-based HbA1c estimation models. A total of 250 days of data from 26 participants were collected. Out of the 27 models of glucose variability metrics that we developed using non-invasive wearables, 11 of the models achieved high accuracy (<10% mean average per cent error, MAPE). Our HbA1c estimation model using non-invasive wearables data achieved MAPE of 5.1% on an external validation cohort. The ranking of wearable sensors' importance in estimating HbA1c was skin temperature (33%), electrodermal activity (28%), accelerometry (25%), and heart rate (14%). This study demonstrates the feasibility of using non-invasive wearables to estimate glucose variability metrics and HbA1c for glycemic monitoring and investigates the relationship between non-invasive wearables and the glycemic metrics of glucose variability and HbA1c. 
The methods used in this study can be used to inform future studies confirming the results of this proof-of-concept study.
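The accuracy figures quoted in the abstract above are expressed as MAPE. As a minimal sketch, assuming the conventional definition of MAPE (mean absolute percentage error; the abstract does not give the formula), the metric could be computed as follows. The function name `mape` and the HbA1c values are hypothetical and purely illustrative; they are not data from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent.

    Assumes the conventional definition: mean of |actual - predicted| / |actual|.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("actual values must be non-zero")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Made-up HbA1c values (%), for illustration only.
reference = [5.0, 6.0, 5.5, 6.4]
estimated = [5.5, 5.7, 5.5, 6.4]

error = mape(reference, estimated)
print(f"MAPE: {error:.2f}%")
```

Under this definition, a model would meet the abstract's "high accuracy" threshold when `mape(...)` returns a value below 10.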
View details for DOI 10.1136/bmjdrc-2020-002027
View details for PubMedID 36170350
-
Physical exercise is a risk factor for amyotrophic lateral sclerosis: Convergent evidence from Mendelian randomisation, transcriptomics and risk genotypes.
EBioMedicine
2021; 68: 103397
Abstract
BACKGROUND: Amyotrophic lateral sclerosis (ALS) is a universally fatal neurodegenerative disease. ALS is determined by gene-environment interactions and improved understanding of these interactions may lead to effective personalised medicine. The role of physical exercise in the development of ALS is currently controversial. METHODS: First, we dissected the exercise-ALS relationship in a series of two-sample Mendelian randomisation (MR) experiments. Next we tested for enrichment of ALS genetic risk within exercise-associated transcriptome changes. Finally, we applied a validated physical activity questionnaire in a small cohort of genetically selected ALS patients. FINDINGS: We present MR evidence supporting a causal relationship between genetic liability to frequent and strenuous leisure-time exercise and ALS using a liberal instrument (multiplicative random effects IVW, p=0.01). Transcriptomic analysis revealed that genes with altered expression in response to acute exercise are enriched with known ALS risk genes (permutation test, p=0.013) including C9ORF72, and with ALS-associated rare variants of uncertain significance. Questionnaire evidence revealed that age of onset is inversely proportional to historical physical activity for C9ORF72-ALS (Cox proportional hazards model, Wald test p=0.007, likelihood ratio test p=0.01, concordance=74%) but not for non-C9ORF72-ALS. Variability in average physical activity was lower in C9ORF72-ALS compared to both non-C9ORF72-ALS (F-test, p=0.002) and neurologically normal controls (F-test, p=0.049) which is consistent with a homogeneous effect of physical activity in all C9ORF72-ALS patients. INTERPRETATION: Our MR approach suggests a positive causal relationship between ALS and physical exercise. Exercise is likely to cause motor neuron injury only in patients with a risk-genotype. Consistent with this we have shown that ALS risk genes are activated in response to exercise. 
In particular, we propose that G4C2-repeat expansion of C9ORF72 predisposes to exercise-induced ALS. FUNDING: We acknowledge support from the Wellcome Trust (JCK, 216596/Z/19/Z), NIHR (PJS, NF-SI-0617-10077; IS-BRC-1215-20017) and NIH (MPS, CEGS5P50HG00773504,1P50HL083800, 1R01HL101388, 1R01-HL122939, S10OD025212, P30DK116074, and UM1HG009442).
View details for DOI 10.1016/j.ebiom.2021.103397
View details for PubMedID 34051439
-
Association of HLA loss of heterozygosity with allele-specific neoantigen expansion in response to immunotherapy.
LIPPINCOTT WILLIAMS & WILKINS. 2021
View details for DOI 10.1200/JCO.2021.39.15_suppl.e18030
View details for Web of Science ID 000708120303080
-
Robust prediction of response to immunotherapy in a mixed cohort of previously treated and immunotherapy-naive melanoma patients.
LIPPINCOTT WILLIAMS & WILKINS. 2021
View details for DOI 10.1200/JCO.2021.39.15_suppl.e21548
View details for Web of Science ID 000708120305222
-
Integrated trajectories of the maternal metabolome, proteome, and immunome predict labor onset.
Science translational medicine
2021; 13 (592)
Abstract
Estimating the time of delivery is of high clinical importance because pre- and postterm deviations are associated with complications for the mother and her offspring. However, current estimations are inaccurate. As pregnancy progresses toward labor, major transitions occur in fetomaternal immune, metabolic, and endocrine systems that culminate in birth. The comprehensive characterization of maternal biology that precedes labor is key to understanding these physiological transitions and identifying predictive biomarkers of delivery. Here, a longitudinal study was conducted in 63 women who went into labor spontaneously. More than 7000 plasma analytes and peripheral immune cell responses were analyzed using untargeted mass spectrometry, aptamer-based proteomic technology, and single-cell mass cytometry in serial blood samples collected during the last 100 days of pregnancy. The high-dimensional dataset was integrated into a multiomic model that predicted the time to spontaneous labor [R = 0.85, 95% confidence interval (CI) [0.79 to 0.89], P = 1.2 × 10^-40, N = 53, training set; R = 0.81, 95% CI [0.61 to 0.91], P = 3.9 × 10^-7, N = 10, independent test set]. Coordinated alterations in maternal metabolome, proteome, and immunome marked a molecular shift from pregnancy maintenance to prelabor biology 2 to 4 weeks before delivery. A surge in steroid hormone metabolites and interleukin-1 receptor type 4 that preceded labor coincided with a switch from immune activation to regulation of inflammatory responses. Our study lays the groundwork for developing blood-based methods for predicting the day of labor, anchored in mechanisms shared in preterm and term pregnancies.
View details for DOI 10.1126/scitranslmed.abd9898
View details for PubMedID 33952678
-
A genome-wide atlas of co-essential modules assigns function to uncharacterized genes.
Nature genetics
2021
Abstract
A central question in the post-genomic era is how genes interact to form biological pathways. Measurements of gene dependency across hundreds of cell lines have been used to cluster genes into 'co-essential' pathways, but this approach has been limited by ubiquitous false positives. In the present study, we develop a statistical method that enables robust identification of gene co-essentiality and yields a genome-wide set of functional modules. This atlas recapitulates diverse pathways and protein complexes, and predicts the functions of 108 uncharacterized genes. Validating top predictions, we show that TMEM189 encodes plasmanylethanolamine desaturase, a key enzyme for plasmalogen synthesis. We also show that C15orf57 encodes a protein that binds the AP2 complex, localizes to clathrin-coated pits and enables efficient transferrin uptake. Finally, we provide an interactive webtool for the community to explore our results, which establish co-essentiality profiling as a powerful resource for biological pathway identification and discovery of new gene functions.
View details for DOI 10.1038/s41588-021-00840-z
View details for PubMedID 33859415
-
Population-scale tissue transcriptomics maps long non-coding RNAs to complex disease.
Cell
2021
Abstract
Long non-coding RNA (lncRNA) genes have well-established and important impacts on molecular and cellular functions. However, among the thousands of lncRNA genes, it is still a major challenge to identify the subset with disease or trait relevance. To systematically characterize these lncRNA genes, we used Genotype Tissue Expression (GTEx) project v8 genetic and multi-tissue transcriptomic data to profile the expression, genetic regulation, cellular contexts, and trait associations of 14,100 lncRNA genes across 49 tissues for 101 distinct complex genetic traits. Using these approaches, we identified 1,432 lncRNA gene-trait associations, 800 of which were not explained by stronger effects of neighboring protein-coding genes. This included associations between lncRNA quantitative trait loci and inflammatory bowel disease, type 1 and type 2 diabetes, and coronary artery disease, as well as rare variant associations to body mass index.
View details for DOI 10.1016/j.cell.2021.03.050
View details for PubMedID 33864768
-
iNetModels 2.0: an interactive visualization and database of multi-omics data.
Nucleic acids research
2021
Abstract
It is essential to reveal the associations between various omics data for a comprehensive understanding of the altered biological process in human wellness and disease. To date, very few studies have focused on collecting and exhibiting multi-omics associations in a single database. Here, we present iNetModels, an interactive database and visualization platform of Multi-Omics Biological Networks (MOBNs). This platform describes the associations between the clinical chemistry, anthropometric parameters, plasma proteomics, plasma metabolomics, as well as metagenomics for oral and gut microbiome obtained from the same individuals. Moreover, iNetModels includes tissue- and cancer-specific Gene Co-expression Networks (GCNs) for exploring the connections between the specific genes. This platform allows the user to interactively explore a single feature's association with other omics data and customize its particular context (e.g. male/female specific). The users can also register their data for sharing and visualization of the MOBNs and GCNs. Moreover, iNetModels allows users who do not have a bioinformatics background to facilitate human wellness and disease research. iNetModels can be accessed freely at https://inetmodels.com without any limitation.
View details for DOI 10.1093/nar/gkab254
View details for PubMedID 33849075
-
Inherited causes of clonal haematopoiesis in 97,691 whole genomes (vol 586 , pg 763, 2020)
NATURE
2021; 591 (7851): E27
View details for DOI 10.1038/s41586-021-03280
View details for Web of Science ID 000632177100002
-
ALDH1A3 Coordinates Metabolism with Gene Regulation in Pulmonary Arterial Hypertension.
Circulation
2021
Abstract
Background: Metabolic alterations provide substrates that influence chromatin structure to regulate gene expression that determines cell function in health and disease. Heightened proliferation of smooth muscle cells (SMC) leading to the formation of a neointima is a feature of pulmonary arterial hypertension (PAH) and systemic vascular disease. Increased glycolysis is linked to the proliferative phenotype of these SMC. Methods: RNA Sequencing was applied to pulmonary arterial (PA) SMC from PAH patients with and without a BMPR2 mutation vs. control PASMC to uncover genes required for their heightened proliferation and glycolytic metabolism. Assessment of differentially expressed genes established metabolism as a major pathway, and the most highly upregulated metabolic gene in PAH PASMC was aldehyde dehydrogenase family 1 member 3 (ALDH1A3), an enzyme previously linked to glycolysis and proliferation in cancer cells and systemic vascular SMC. We determined if these functions are ALDH1A3-dependent in PAH PASMC, and if ALDH1A3 is required for the development of pulmonary hypertension in a transgenic mouse. Nuclear localization of ALDH1A3 in PAH PASMC led us to determine whether and how this enzyme coordinately regulates gene expression and metabolism in PAH PASMC. Results: ALDH1A3 mRNA and protein were increased in PAH vs control PASMC, and ALDH1A3 was required for their highly proliferative and glycolytic properties. Mice with Aldh1a3 deleted in SMC did not develop hypoxia-induced PA muscularization or pulmonary hypertension. Nuclear ALDH1A3 converted acetaldehyde to acetate to produce acetyl-CoA to acetylate H3K27, marking active enhancers. This allowed for chromatin modification at nuclear factor Y (NFY)A binding sites via the acetyltransferase KAT2B and permitted NFY mediated transcription of cell cycle and metabolic genes that is required for ALDH1A3-dependent proliferation and glycolysis. 
Loss of BMPR2 in PAH SMC with or without a mutation upregulated ALDH1A3, and transcription of NFYA and ALDH1A3 in PAH PASMC was beta-catenin dependent. Conclusions: Our studies have uncovered a metabolic-transcriptional axis explaining how dividing cells use ALDH1A3 to coordinate their energy needs with the epigenetic and transcriptional regulation of genes required for SMC proliferation. They suggest that selectively disrupting the pivotal role of ALDH1A3 in PAH SMC, but not EC, is an important therapeutic consideration.
View details for DOI 10.1161/CIRCULATIONAHA.120.048845
View details for PubMedID 33764154
-
Understanding how biologic and social determinants affect disparities in preterm birth and outcomes of preterm infants in the NICU.
Seminars in perinatology
2021: 151408
Abstract
To understand the disparities in spontaneous preterm birth (sPTB) and/or its outcomes, biologic and social determinants as well as healthcare practice (such as those in neonatal intensive care units) should be considered. They have been largely intractable and remain obscure in most cases, despite a myriad of identified risk factors for and causes of sPTB. We still do not know how they might actually affect and lead to the different outcomes at different gestational ages and if they are independent of NICU practices. Here we describe an integrated approach to study the interplay between the genome and exposome, which may drive biochemistry and physiology, with health disparities.
View details for DOI 10.1016/j.semperi.2021.151408
View details for PubMedID 33875265
-
Early Detection of SARS-CoV-2 and other Infections in Solid Organ Transplant Recipients and Household Members using Wearable Devices.
Transplant international : official journal of the European Society for Organ Transplantation
2021
Abstract
The increasing global prevalence of SARS-CoV-2 and the resulting COVID-19 disease pandemic pose significant concerns for clinical management of solid organ transplant recipients (SOTR). Wearable devices that can measure physiologic changes in biometrics including heart rate, heart rate variability, body temperature, respiratory, activity (such as steps taken per day) and sleep patterns and blood oxygen saturation, show utility for the early detection of infection before clinical presentation of symptoms. Recent algorithms developed using preliminary wearable datasets show that SARS-CoV-2 is detectable before clinical symptoms in >80% of adults. Early detection of SARS-CoV-2, influenza, and other pathogens in SOTR, and their household members, could facilitate early interventions such as self-isolation and early clinical management of relevant infection(s). Ongoing studies testing the utility of wearable devices such as smartwatches for early detection of SARS-CoV-2 and other infections in the general population are reviewed here, along with the practical challenges to implementing these processes at scale in pediatric and adult SOTR, and their household members. The resources and logistics, including transplant specific analyses pipelines to account for confounders such as polypharmacy and comorbidities, required in studies of pediatric and adult SOTR for the robust early detection of SARS-CoV-2 and other infections are also reviewed.
View details for DOI 10.1111/tri.13860
View details for PubMedID 33735480
-
Hummingbird: Efficient Performance Prediction for Executing Genomic Applications in the Cloud.
Bioinformatics (Oxford, England)
2021
Abstract
MOTIVATION: A major drawback of executing genomic applications on cloud computing facilities is the lack of tools to predict which instance type is the most appropriate, often resulting in an over- or under-matching of resources. Determining the right configuration before actually running the applications will save money and time. Here, we introduce Hummingbird, a tool for predicting performance of computing instances with varying memory and CPU on multiple cloud platforms. RESULTS: Our experiments on three major genomic data pipelines, including GATK HaplotypeCaller, GATK MuTect2, and ENCODE ATAC-seq, showed that Hummingbird was able to address applications in command line specified in JSON format or workflow description language (WDL) format, and accurately predicted the fastest, the cheapest, and the most cost-efficient compute instances in an economic manner. AVAILABILITY: Hummingbird is available as an open source tool at: https://github.com/StanfordBioinformatics/Hummingbird. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
View details for DOI 10.1093/bioinformatics/btab161
View details for PubMedID 33693476
-
Response to Hulman and colleagues regarding "Glucotypes reveal new patterns of glucose dysregulation".
PLoS biology
2021; 19 (3): e3001092
Abstract
In a response to a Formal Comment critiquing their model for classifying individualized glucose patterns into glucotypes, these authors stand by their results and conclusions, which can be reproduced using their publicly available data, and maintain that improved algorithms for analyzing CGM data will continue to emerge and enrich the field.
View details for DOI 10.1371/journal.pbio.3001092
View details for PubMedID 33705379
-
Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays.
BMC bioinformatics
2021; 22 (1): 85
Abstract
BACKGROUND: Benchmarking the performance of complex analytical pipelines is an essential part of developing Lab Developed Tests (LDT). Reference samples and benchmark calls published by Genome in a Bottle (GIAB) consortium have enabled the evaluation of analytical methods. The performance of such methods is not uniform across the different genomic regions of interest and variant types. Several benchmarking methods such as hap.py, vcfeval, and vcflib are available to assess the analytical performance characteristics of variant calling algorithms. However, assessing the performance characteristics of an overall LDT assay still requires stringing together several such methods and experienced bioinformaticians to interpret the results. In addition, these methods are dependent on the hardware, operating system and other software libraries, making it impossible to reliably repeat the analytical assessment, when any of the underlying dependencies change in the assay. Here we present a scalable and reproducible, cloud-based benchmarking workflow that is independent of the laboratory and the technician executing the workflow, or the underlying compute hardware used to rapidly and continually assess the performance of LDT assays, across their regions of interest and reportable range, using a broad set of benchmarking samples. RESULTS: The benchmarking workflow was used to evaluate the performance characteristics for secondary analysis pipelines commonly used by Clinical Genomics laboratories in their LDT assays such as the GATK HaplotypeCaller v3.7 and the SpeedSeq workflow based on FreeBayes v0.9.10. Five reference sample truth sets generated by Genome in a Bottle (GIAB) consortium, six samples from the Personal Genome Project (PGP) and several samples with validated clinically relevant variants from the Centers for Disease Control were used in this work. 
The performance characteristics were evaluated and compared for multiple reportable ranges, such as whole exome and the clinical exome. CONCLUSIONS: We have implemented a benchmarking workflow for clinical diagnostic laboratories that generates metrics such as specificity, precision and sensitivity for germline SNPs and InDels within a reportable range using whole exome or genome sequencing data. Combining these benchmarking results with validation using known variants of clinical significance in publicly available cell lines, we were able to establish the performance of variant calling pipelines in a clinical setting.
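The abstract above names specificity, precision and sensitivity as the metrics the workflow reports. As a minimal sketch, assuming the standard confusion-matrix definitions (the abstract does not spell them out), these can be derived from counts of true/false positive and negative calls. The function `benchmark_metrics` and the counts below are hypothetical illustrations, not the published workflow or its results.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def benchmark_metrics(tp, fp, fn, tn):
    """Standard metrics for comparing variant calls against a truth set.

    tp: calls that match a truth-set variant
    fp: calls absent from the truth set
    fn: truth-set variants that were not called
    tn: reference positions correctly left uncalled
    """
    return {
        "sensitivity": _ratio(tp, tp + fn),  # also called recall
        "precision": _ratio(tp, tp + fp),
        "specificity": _ratio(tn, tn + fp),
    }


# Illustrative counts only; not taken from the study.
metrics = benchmark_metrics(tp=90, fp=10, fn=10, tn=890)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Returning `None` for an undefined ratio is a design choice made here so that an empty category (for example, a region with no truth-set variants) is reported explicitly rather than raising a division error.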
View details for DOI 10.1186/s12859-020-03934-3
View details for PubMedID 33627090
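As an illustration of the metrics named in the abstract above, the following minimal sketch computes precision, sensitivity, and specificity from counts of true and false positives and negatives. The class name, function name, and example counts are hypothetical and are not taken from the published workflow; in practice a comparison tool such as hap.py or vcfeval would supply the counts for each variant type and reportable range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Hypothetical container for variant-comparison counts."""
    tp: int  # calls that match the truth set
    fp: int  # calls absent from the truth set
    fn: int  # truth-set variants that were not called
    tn: int  # reference positions correctly left uncalled


def benchmark_metrics(c: Counts) -> dict:
    """Return precision, sensitivity, and specificity for one stratum."""
    def ratio(num: int, den: int) -> float:
        # Report NaN instead of raising when a stratum is empty.
        return num / den if den else float("nan")

    return {
        "precision": ratio(c.tp, c.tp + c.fp),
        "sensitivity": ratio(c.tp, c.tp + c.fn),
        "specificity": ratio(c.tn, c.tn + c.fp),
    }


# Illustrative counts only; these are not results from the paper.
snp_metrics = benchmark_metrics(Counts(tp=990, fp=10, fn=20, tn=8980))
```

Computing the same three ratios separately for SNPs and InDels, and separately for each reportable range, would give the kind of stratified comparison the abstract describes.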
-
An Integrated Sequencing Approach for Updating the Pseudorabies Virus Transcriptome.
Pathogens (Basel, Switzerland)
2021; 10 (2)
Abstract
In the last couple of years, the implementation of long-read sequencing (LRS) technologies for transcriptome profiling has uncovered an extreme complexity of viral gene expression. In this study, we carried out a systematic analysis on the pseudorabies virus transcriptome by combining our current data obtained by using Pacific Biosciences Sequel and Oxford Nanopore Technologies MinION sequencing with our earlier data generated by other LRS and short-read sequencing techniques. As a result, we identified a number of novel genes, transcripts, and transcript isoforms, including splice and length variants, and also confirmed earlier annotated RNA molecules. One of the major findings of this study is the discovery of a large number of 5'-truncations of larger putative mRNAs being 3'-co-terminal with canonical mRNAs of PRV. A large fraction of these putative RNAs contain in-frame ATGs, which might initiate translation of N-terminally truncated polypeptides. Our analyses indicate that CTO-S, a replication origin-associated RNA molecule is expressed at an extremely high level. This study demonstrates that the PRV transcriptome is much more complex than previously appreciated.
View details for DOI 10.3390/pathogens10020242
View details for PubMedID 33672563
-
Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program.
Nature
2021; 590 (7845): 290–99
Abstract
The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes)1. In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.
View details for DOI 10.1038/s41586-021-03205-y
View details for PubMedID 33568819
-
Decoding personal biotic and abiotic airborne exposome.
Nature protocols
2021
Abstract
The complexity and dynamics of human diseases are driven by the interactions between internal molecular activities and external environmental exposures. Although advances in omics technology have dramatically broadened the understanding of internal molecular and cellular mechanisms, understanding of the external environmental exposures, especially at the personal level, is still rudimentary in comparison. This is largely owing to our limited ability to efficiently collect the personal environmental exposome (PEE) and extract the nucleic acids and chemicals from PEE. Here we describe a protocol that integrates hardware and experimental pipelines to collect and decode biotic and abiotic external exposome at the individual level. The described protocol has several advantages over conventional approaches, such as exposome monitoring at the personal level, decontamination steps to increase sensitivity and simultaneous capture and high-throughput profiling of biotic and abiotic exposures. The protocol takes ~18 h of bench time over 2-3 d to prepare samples for high-throughput profiling and up to a couple of weeks of instrumental time to analyze, depending on the number of samples. Hundreds to thousands of species and organic compounds could be detected in the airborne particulate samples using this protocol. The composition and complexity of the biotic and abiotic substances are heavily influenced by the sampling spatiotemporal factors. Basic skillsets in molecular biology and analytical chemistry are required to carry out this protocol. This protocol could be modified to decode biotic and abiotic substances in other types of low or ultra-low input samples.
View details for DOI 10.1038/s41596-020-00451-8
View details for PubMedID 33437065
-
The gut microbiome: a key player in the complexity of amyotrophic lateral sclerosis (ALS).
BMC medicine
2021; 19 (1): 13
Abstract
Much progress has been made in mapping genetic abnormalities linked to amyotrophic lateral sclerosis (ALS), but the majority of cases still present with no known underlying cause. Furthermore, even in families with a shared genetic abnormality there is significant phenotypic variability, suggesting that non-genetic elements may modify pathogenesis. Identification of such disease-modifiers is important as they might represent new therapeutic targets. A growing body of research has begun to shed light on the role played by the gut microbiome in health and disease with a number of studies linking abnormalities to ALS. The microbiome refers to the genes belonging to the myriad different microorganisms that live within and upon us, collectively known as the microbiota. Most of these microbes are found in the intestines, where they play important roles in digestion and the generation of key metabolites including neurotransmitters. The gut microbiota is an important aspect of the environment in which our bodies operate and inter-individual differences may be key to explaining the different disease outcomes seen in ALS. Work has begun to investigate animal models of the disease, and the gut microbiomes of people living with ALS, revealing changes in the microbial communities of these groups. The current body of knowledge will be summarised in this review. Advances in microbiome sequencing methods will be highlighted, as their improved resolution now enables researchers to further explore differences at a functional level. Proposed mechanisms connecting the gut microbiome to neurodegeneration will also be considered, including direct effects via metabolites released into the host circulation and indirect effects on bioavailability of nutrients and even medications. Profiling of the gut microbiome has the potential to add an environmental component to rapidly advancing studies of ALS genetics and move research a step further towards personalised medicine for this disease. Moreover, should compelling evidence of upstream neurotoxicity or neuroprotection initiated by gut microbiota emerge, modification of the microbiome will represent a potential new avenue for disease modifying therapies. For an intractable condition with few current therapeutic options, further research into the ALS microbiome is of crucial importance.
View details for DOI 10.1186/s12916-020-01885-3
View details for PubMedID 33468103
-
The Exposome in the Era of the Quantified Self
ANNUAL REVIEW OF BIOMEDICAL DATA SCIENCE, VOL 4
2021; 4: 255-277
View details for DOI 10.1146/annurev-biodatasci-012721-122807
View details for Web of Science ID 000677831600013
-
Non-invasive wearables for remote monitoring of HbA1c and glucose variability: proof of concept
BMJ OPEN DIABETES RESEARCH & CARE
2021; 9 (1)
View details for DOI 10.1136/bmjdrc-2020-002027
View details for Web of Science ID 000662276300001
-
AdaReg: data adaptive robust estimation in linear regression with application in GTEx gene expressions.
Statistical applications in genetics and molecular biology
2021
Abstract
The Genotype-Tissue Expression (GTEx) project provides a valuable resource of large-scale gene expressions across multiple tissue types. Under various technical noise and unknown or unmeasured factors, how to robustly estimate the major tissue effect becomes challenging. Moreover, different genes exhibit heterogeneous expressions across different tissue types. Therefore, we need a robust method which adapts to the heterogeneities of gene expressions to improve the estimation for the tissue effect. We followed the approach of the robust estimation based on γ-density-power-weight in the works of Fujisawa, H. and Eguchi, S. (2008). Robust parameter estimation with a small bias against heavy contamination. J. Multivariate Anal. 99: 2053-2081 and Windham, M.P. (1995). Robustifying model fitting. J. Roy. Stat. Soc. B: 599-609, where γ is the exponent of density weight which controls the balance between bias and variance. As far as we know, our work is the first to propose a procedure to tune the parameter γ to balance the bias-variance trade-off under the mixture models. We constructed a robust likelihood criterion based on weighted densities in the mixture model of Gaussian population distribution mixed with unknown outlier distribution, and developed a data-adaptive γ-selection procedure embedded into the robust estimation. We provided a heuristic analysis on the selection criterion and found that our practical selection trend under various γ's in average performance has similar capability to capture minimizer γ as the inestimable mean squared error (MSE) trend from our simulation studies under a series of settings. Our data-adaptive robustifying procedure in the linear regression problem (AdaReg) showed a significant advantage in both simulation studies and real data application in estimating tissue effect of heart samples from the GTEx project, compared to the fixed γ procedure and other robust methods. 
At the end, the paper discussed some limitations on this method and future work.
View details for DOI 10.1515/sagmb-2020-0042
View details for PubMedID 34252998
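To make the role of γ in the abstract above concrete, the sketch below fits a Gaussian mean and standard deviation by iteratively reweighting each observation with its model density raised to the power γ, in the spirit of the Windham and Fujisawa-Eguchi estimators cited there. It uses a fixed, user-chosen γ and a plain location-scale model rather than linear regression; it does not implement AdaReg's data-adaptive γ selection, and the function name, defaults, and example values are assumptions made for illustration.

```python
import math
import statistics


def gamma_weighted_gaussian(xs, gamma=0.5, max_iter=200, tol=1e-10):
    """Robust Gaussian mean and sd using density-power weights (fixed gamma)."""
    xs = list(xs)
    # Start from robust initial values: median and scaled MAD.
    mu = statistics.median(xs)
    mad = statistics.median([abs(x - mu) for x in xs])
    sigma = 1.4826 * mad if mad > 0 else 1.0
    for _ in range(max_iter):
        # Weight = Gaussian density ** gamma; the normalizing constant cancels.
        weights = [math.exp(-gamma * (x - mu) ** 2 / (2.0 * sigma ** 2)) for x in xs]
        total = sum(weights)
        if total == 0.0:
            break  # every weight underflowed; keep the current estimates
        new_mu = sum(w * x for w, x in zip(weights, xs)) / total
        # The (1 + gamma) factor offsets the shrinkage caused by down-weighting the tails.
        new_var = (1.0 + gamma) * sum(
            w * (x - new_mu) ** 2 for w, x in zip(weights, xs)
        ) / total
        new_sigma = math.sqrt(new_var) if new_var > 0 else sigma
        converged = abs(new_mu - mu) < tol and abs(new_sigma - sigma) < tol
        mu, sigma = new_mu, new_sigma
        if converged:
            break
    return mu, sigma


# Illustrative data: six values near 10 and one gross outlier.
values = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 50.0]
robust_mu, robust_sigma = gamma_weighted_gaussian(values, gamma=0.5)
plain_mean = statistics.mean(values)
```

A larger γ discounts outlying observations more strongly at the cost of higher variance, which is the bias-variance balance that the abstract says the adaptive procedure tunes.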
-
Exposome-wide Association Study for Metabolic Syndrome.
Frontiers in genetics
2021; 12: 783930
View details for DOI 10.3389/fgene.2021.783930
View details for PubMedID 34950191
-
Adapting skills from genetic counseling to wearables technology research during the COVID-19 pandemic: Poised for the pivot.
Journal of genetic counseling
2021
Abstract
Genetic counselors have shown themselves to be adaptable in an evolving profession, with expansion into new sub-specialties, various non-clinical settings, and research roles. The COVID-19 pandemic caused a sudden and drastic shift in healthcare priorities. In an effort to contribute meaningfully to the COVID-19 crisis, and to adapt to a remote- and essential-only research environment, our workplace and thus our roles pivoted from genomics research to remote COVID-19 research using wearables technologies. With a deep understanding of genomic data, we were quickly able to apply similar concepts to wearables data including considering privacy implications, managing uncertain findings, and acknowledging the lack of ethnic diversity in many datasets. By sharing our own experience as an example, we hope individuals trained in genetic counseling may see opportunities for adaptation of their skills into expanding roles.
View details for DOI 10.1002/jgc4.1509
View details for PubMedID 34580951
-
Real-time Alerting System for COVID-19 Using Wearable Data.
medRxiv : the preprint server for health sciences
2021
Abstract
Early detection of infectious disease is crucial for reducing transmission and facilitating early intervention. We built a real-time smartwatch-based alerting system for the detection of aberrant physiological and activity signals (e.g. resting heart rate, steps) associated with early infection onset at the individual level. Upon applying this system to a cohort of 3,246 participants, we found that alerts were generated for pre-symptomatic and asymptomatic COVID-19 infections in 78% of cases, and pre-symptomatic signals were observed a median of three days prior to symptom onset. Furthermore, by examining over 100,000 survey annotations, we found that other respiratory infections as well as events not associated with COVID-19 (e.g. stress, alcohol consumption, travel) could trigger alerts, albeit at a lower mean period (1.9 days) than those observed in the COVID-19 cases (4.3 days). Thus this system has potential both for advanced warning of COVID-19 as well as a general system for measuring health via detection of physiological shifts from personal baselines. The system is open-source and scalable to millions of users, offering a personal health monitoring system that can operate in real time on a global scale.
View details for DOI 10.1101/2021.06.13.21258795
View details for PubMedID 34189532
View details for PubMedCentralID PMC8240687
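The idea of alerting on shifts from a personal baseline, as described in the abstract above, can be sketched with a rolling z-score on daily resting heart rate. This is a simplified illustration and not the authors' published algorithm: the function name, the 7-day baseline window, the z-score threshold, and the two-consecutive-day rule are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def rhr_alerts(daily_rhr, baseline_days=7, z_threshold=2.0, min_run=2):
    """Return indices of days on which an elevated-RHR alert would fire.

    Each day is compared with the mean and standard deviation of the
    preceding `baseline_days` days; an alert fires once the z-score has
    exceeded `z_threshold` on `min_run` consecutive days.
    """
    alerts = []
    run = 0
    for day in range(baseline_days, len(daily_rhr)):
        window = daily_rhr[day - baseline_days:day]
        mu, sd = mean(window), stdev(window)
        # A perfectly flat baseline has sd == 0; treat it as no deviation.
        z = (daily_rhr[day] - mu) / sd if sd > 0 else 0.0
        run = run + 1 if z > z_threshold else 0
        if run >= min_run:
            alerts.append(day)
    return alerts


# Illustrative series: a stable week followed by a sustained rise.
series = [60, 61, 59, 60, 62, 60, 61, 60, 68, 69, 70]
alert_days = rhr_alerts(series)
```

Because elevated days enter the rolling window and raise the baseline, this simple version stops alerting once a rise persists; a production system would likely handle the baseline differently.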
-
A DMS Shotgun Lipidomics Workflow Application to Facilitate High-Throughput, Comprehensive Lipidomics.
Journal of the American Society for Mass Spectrometry
2021
Abstract
Differential mobility spectrometry (DMS) is highly useful for shotgun lipidomic analysis because it overcomes difficulties in measuring isobaric species within a complex lipid sample and allows for acyl tail characterization of phospholipid species. Despite these advantages, the resulting workflow presents technical challenges, including the need to tune the DMS before every batch to update compensation voltage settings within the method. The Sciex Lipidyzer platform uses a Sciex 5500 QTRAP with a DMS (SelexION), an LC system configured for direct infusion experiments, an extensive set of standards designed for quantitative lipidomics, and a software package (Lipidyzer Workflow Manager) that facilitates the workflow and rapidly analyzes the data. Although the Lipidyzer platform remains very useful for DMS-based shotgun lipidomics, the software is no longer updated for current versions of Analyst and Windows. Furthermore, the software is fixed to a single workflow and cannot take advantage of new lipidomics standards or analyze additional lipid species. To address this multitude of issues, we developed Shotgun Lipidomics Assistant (SLA), a Python-based application that facilitates DMS-based lipidomics workflows. SLA provides the user with flexibility in adding and subtracting lipid and standard MRMs. It can report quantitative lipidomics results from raw data in minutes, comparable to the Lipidyzer software. We show that SLA facilitates an expanded lipidomics analysis that measures over 1450 lipid species across 17 (sub)classes. Lastly, we demonstrate that the SLA performs isotope correction, a feature that was absent from the original software.
View details for DOI 10.1021/jasms.1c00203
View details for PubMedID 34637296
-
Swarm: A federated cloud framework for large-scale variant analysis.
PLoS computational biology
2021; 17 (5): e1008977
Abstract
Genomic data analysis across multiple cloud platforms is an ongoing challenge, especially when large amounts of data are involved. Here, we present Swarm, a framework for federated computation that promotes minimal data motion and facilitates crosstalk between genomic datasets stored on various cloud platforms. We demonstrate its utility via common inquiries of genomic variants across BigQuery in the Google Cloud Platform (GCP), Athena in the Amazon Web Services (AWS), Apache Presto and MySQL. Compared to single-cloud platforms, the Swarm framework significantly reduced computational costs, run-time delays and risks of security breach and privacy violation.
View details for DOI 10.1371/journal.pcbi.1008977
View details for PubMedID 33979321
-
Precision medicine in women with epilepsy: The challenge, systematic review, and future direction.
Epilepsy & behavior : E&B
2021; 118: 107928
Abstract
Epilepsy is one of the most prevalent neurologic conditions, affecting almost 70 million people worldwide. In the United States, 1.3 million women with epilepsy (WWE) are in their active reproductive years. WWE face gender-specific challenges such as pregnancy, seizure exacerbation with hormonal pattern fluctuations, contraception, fertility, and menopause. Precision medicine, which applies state-of-the-art molecular profiling to diagnostic, prognostic, and therapeutic problems, has the potential to advance the care of WWE by precisely tailoring individualized management to each patient's needs. For example, antiseizure medications (ASMs) are among the most common teratogens prescribed to women of childbearing potential. Teratogens act in a dose-dependent manner on a susceptible genotype. However, the genotypes at risk for ASM-induced teratogenic deficits are unknown. Here we summarize current challenging issues for WWE, review the state-of-the-art tools for clinical precision medicine approaches, perform a systematic review of pharmacogenomic approaches in management for WWE, and discuss potential future directions in this field. We envision a future in which precision medicine enables a new practice style that puts focus on early detection, prediction, and targeted therapies for WWE.
View details for DOI 10.1016/j.yebeh.2021.107928
View details for PubMedID 33774354
-
Structured elements drive extensive circular RNA translation.
Molecular cell
2021
Abstract
The human genome encodes tens of thousands of circular RNAs (circRNAs) with mostly unknown functions. Circular RNAs require internal ribosome entry sites (IRES) if they are to undergo translation without a 5' cap. Here, we develop a high-throughput screen to systematically discover RNA sequences that can direct circRNA translation in human cells. We identify more than 17,000 endogenous and synthetic sequences as candidate circRNA IRES. 18S rRNA complementarity and a structured RNA element positioned on the IRES are important for driving circRNA translation. Ribosome profiling and peptidomic analyses show extensive IRES-ribosome association, hundreds of circRNA-encoded proteins with tissue-specific distribution, and antigen presentation. We find that circFGFR1p, a protein encoded by circFGFR1 that is downregulated in cancer, functions as a negative regulator of FGFR1 oncoprotein to suppress cell growth during stress. Systematic identification of circRNA IRES elements may provide important links among circRNA regulation, biological function, and disease.
View details for DOI 10.1016/j.molcel.2021.07.042
View details for PubMedID 34437836
-
The X chromosome from telomere to telomere: key achievements and future opportunities.
Faculty reviews
2021; 10: 63
Abstract
While the human genome represents the most accurate vertebrate reference assembly to date, it still contains numerous gaps, including centromeric and other large repeat-containing regions - often termed the "dark side" of the genome - many of which are of fundamental biological importance. Miga et al.1,2 present the first gapless assembly of the human X chromosome, with the help of ultra-long-read nanopore reads generated for the haploid complete hydatidiform mole (CHM13) genome. They reconstruct the ~3.1 megabase centromeric satellite DNA array and map DNA methylation patterns across complex tandem repeats and satellite arrays. This Telomere-to-Telomere assembly provides a superior human X chromosome reference enabling future sex-determination and X-linked disease research, and provides a path towards finishing the entire human genome sequence.
View details for DOI 10.12703/r-01-000001
View details for PubMedID 35088059
-
CTLA-4 expression by B-1a B cells is essential for immune tolerance.
Nature communications
2021; 12 (1): 525
Abstract
CTLA-4 is an important regulator of T-cell function. Here, we report that expression of this immune-regulator in mouse B-1a cells has a critical function in maintaining self-tolerance by regulating these early-developing B cells that express a repertoire enriched for auto-reactivity. Selective deletion of CTLA-4 from B cells results in mice that spontaneously develop autoantibodies, T follicular helper (Tfh) cells and germinal centers (GCs) in the spleen, and autoimmune pathology later in life. This impaired immune homeostasis results from B-1a cell dysfunction upon loss of CTLA-4. Therefore, CTLA-4-deficient B-1a cells up-regulate epigenetic and transcriptional activation programs and show increased self-replenishment. These activated cells further internalize surface IgM, differentiate into antigen-presenting cells and, when reconstituted in normal IgH-allotype congenic recipient mice, induce GCs and Tfh cells expressing a highly selected repertoire. These findings show that CTLA-4 regulation of B-1a cells is a crucial immune-regulatory mechanism.
View details for DOI 10.1038/s41467-020-20874-x
View details for PubMedID 33483505
-
Cell-free DNA (cfDNA) and Exosome Profiling from a Year-Long Human Spaceflight Reveals Circulating Biomarkers.
iScience
2020; 23 (12): 101844
Abstract
Liquid biopsies based on cell-free DNA (cfDNA) or exosomes provide a noninvasive approach to monitor human health and disease but have not been utilized for astronauts. Here, we profile cfDNA characteristics, including fragment size, cellular deconvolution, and nucleosome positioning, in an astronaut during a year-long mission on the International Space Station, compared to his identical twin on Earth and healthy donors. We observed a significant increase in the proportion of cell-free mitochondrial DNA (cf-mtDNA) inflight, and analysis of post-flight exosomes in plasma revealed a 30-fold increase in circulating exosomes and patient-specific protein cargo (including brain-derived peptides) after the year-long mission. This longitudinal analysis of astronaut cfDNA during spaceflight and the exosome profiles highlights their utility for astronaut health monitoring, as well as cf-mtDNA levels as a potential biomarker for physiological stress or immune system responses related to microgravity, radiation exposure, and the other unique environmental conditions of spaceflight.
View details for DOI 10.1016/j.isci.2020.101844
View details for PubMedID 33376973
-
Rare Variant Burden Analysis within Enhancers Identifies CAV1 as an ALS Risk Gene.
Cell reports
2020; 33 (9): 108456
Abstract
Amyotrophic lateral sclerosis (ALS) is an incurable neurodegenerative disease. CAV1 and CAV2 organize membrane lipid rafts (MLRs) important for cell signaling and neuronal survival, and overexpression of CAV1 ameliorates ALS phenotypes in vivo. Genome-wide association studies localize a large proportion of ALS risk variants within the non-coding genome, but further characterization has been limited by lack of appropriate tools. By designing and applying a pipeline to identify pathogenic genetic variation within enhancer elements responsible for regulating gene expression, we identify disease-associated variation within CAV1/CAV2 enhancers, which replicate in an independent cohort. Discovered enhancer mutations reduce CAV1/CAV2 expression and disrupt MLRs in patient-derived cells, and CRISPR-Cas9 perturbation proximate to a patient mutation is sufficient to reduce CAV1/CAV2 expression in neurons. Additional enrichment of ALS-associated mutations within CAV1 exons positions CAV1 as an ALS risk gene. We propose CAV1/CAV2 overexpression as a personalized medicine target for ALS.
View details for DOI 10.1016/j.celrep.2020.108456
View details for PubMedID 33264630
-
A Customizable Analysis Flow in Integrative Multi-Omics.
Biomolecules
2020; 10 (12)
Abstract
The number of researchers using multi-omics is growing. Though still expensive, every year it is cheaper to perform multi-omic studies, often exponentially so. In addition to its increasing accessibility, multi-omics reveals a view of systems biology to an unprecedented depth. Thus, multi-omics can be used to answer a broad range of biological questions in finer resolution than previous methods. We used six omic measurements, four nucleic acid based (i.e., genomic, epigenomic, transcriptomic, and metagenomic) and two mass spectrometry based (proteomics and metabolomics), to highlight an analysis workflow on this type of data, which is often vast. This workflow is not exhaustive of all the omic measurements or analysis methods, but it will provide an experienced or even a novice multi-omic researcher with the tools necessary to analyze their data. This review begins with analyzing a single ome and study design, and then synthesizes best practices in data integration techniques that include machine learning. Furthermore, we delineate methods to validate findings from multi-omic integration. Ultimately, multi-omic integration offers a window into the complexity of molecular interactions and a comprehensive view of systems biology.
View details for DOI 10.3390/biom10121606
View details for PubMedID 33260881
-
Metabolic Dynamics and Prediction of Gestational Age and Time to Delivery in Pregnant Women
OBSTETRICAL & GYNECOLOGICAL SURVEY
2020; 75 (11): 649–51
View details for DOI 10.1097/OGX.0000000000000864
View details for Web of Science ID 000594473400001
-
Inherited causes of clonal haematopoiesis in 97,691 whole genomes.
Nature
2020
Abstract
Age is the dominant risk factor for most chronic human diseases, but the mechanisms through which ageing confers this risk are largely unknown1. The age-related acquisition of somatic mutations that lead to clonal expansion in regenerating haematopoietic stem cell populations has recently been associated with both haematological cancer2-4 and coronary heart disease5; this phenomenon is termed clonal haematopoiesis of indeterminate potential (CHIP)6. Simultaneous analyses of germline and somatic whole-genome sequences provide the opportunity to identify root causes of CHIP. Here we analyse high-coverage whole-genome sequences from 97,691 participants of diverse ancestries in the National Heart, Lung, and Blood Institute Trans-omics for Precision Medicine (TOPMed) programme, and identify 4,229 individuals with CHIP. We identify associations with blood cell, lipid and inflammatory traits that are specific to different CHIP driver genes. Association of a genome-wide set of germline genetic variants enabled the identification of three genetic loci associated with CHIP status, including one locus at TET2 that was specific to individuals of African ancestry. In silico-informed in vitro evaluation of the TET2 germline locus enabled the identification of a causal variant that disrupts a TET2 distal enhancer, resulting in increased self-renewal of haematopoietic stem cells. Overall, we observe that germline genetic variation shapes haematopoietic stem cell function, leading to CHIP through mechanisms that are specific to clonal haematopoiesis as well as shared mechanisms that lead to somatic mutations across tissues.
View details for DOI 10.1038/s41586-020-2819-2
View details for PubMedID 33057201
-
Quality-control mechanisms targeting translationally stalled and C-terminally extended poly(GR) associated with ALS/FTD.
Proceedings of the National Academy of Sciences of the United States of America
2020
Abstract
Maintaining the fidelity of nascent peptide chain (NP) synthesis is essential for proteome integrity and cellular health. Ribosome-associated quality control (RQC) serves to resolve stalled translation, during which untemplated Ala/Thr residues are added C terminally to stalled peptide, as shown during C-terminal Ala and Thr addition (CAT-tailing) in yeast. The mechanism and biological effects of CAT-tailing-like activity in metazoans remain unclear. Here we show that CAT-tailing-like modification of poly(GR), a dipeptide repeat derived from amyotrophic lateral sclerosis with frontotemporal dementia (ALS/FTD)-associated GGGGCC (G4C2) repeat expansion in C9ORF72, contributes to disease. We find that poly(GR) can act as a mitochondria-targeting signal, causing some poly(GR) to be cotranslationally imported into mitochondria. However, poly(GR) translation on mitochondrial surface is frequently stalled, triggering RQC and CAT-tailing-like C-terminal extension (CTE). CTE promotes poly(GR) stabilization, aggregation, and toxicity. Our genetic studies in Drosophila uncovered an important role of the mitochondrial protease YME1L in clearing poly(GR), revealing mitochondria as major sites of poly(GR) metabolism. Moreover, the mitochondria-associated noncanonical Notch signaling pathway impinges on the RQC machinery to restrain poly(GR) accumulation, at least in part through the AKT/VCP axis. The conserved actions of YME1L and noncanonical Notch signaling in animal models and patient cells support their fundamental involvement in ALS/FTD.
View details for DOI 10.1073/pnas.2005506117
View details for PubMedID 32958650
-
The GTEx Consortium atlas of genetic regulatory effects across human tissues
SCIENCE
2020; 369 (6509): 1318-+
View details for DOI 10.1126/science.aaz1776
View details for Web of Science ID 000569840300041
-
Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale.
Nature genetics
2020
Abstract
Large-scale whole-genome sequencing studies have enabled the analysis of rare variants (RVs) associated with complex phenotypes. Commonly used RV association tests have limited scope to leverage variant functions. We propose STAAR (variant-set test for association using annotation information), a scalable and powerful RV association test method that effectively incorporates both variant categories and multiple complementary annotations using a dynamic weighting scheme. For the latter, we introduce 'annotation principal components', multidimensional summaries of in silico variant annotations. STAAR accounts for population structure and relatedness and is scalable for analyzing very large cohort and biobank whole-genome sequencing studies of continuous and dichotomous traits. We applied STAAR to identify RVs associated with four lipid traits in 12,316 discovery and 17,822 replication samples from the Trans-Omics for Precision Medicine Program. We discovered and replicated new RV associations, including disruptive missense RVs of NPC1L1 and an intergenic region near APOC1P1 associated with low-density lipoprotein cholesterol.
View details for DOI 10.1038/s41588-020-0676-4
View details for PubMedID 32839606
-
Multi-faceted epigenetic dysregulation of gene expression promotes esophageal squamous cell carcinoma.
Nature communications
2020; 11 (1): 3675
Abstract
Epigenetic landscapes can shape physiologic and disease phenotypes. We used integrative, high resolution multi-omics methods to delineate the methylome landscape and characterize the oncogenic drivers of esophageal squamous cell carcinoma (ESCC). We found 98% of CpGs are hypomethylated across the ESCC genome. Hypo-methylated regions are enriched in areas with heterochromatin binding markers (H3K9me3, H3K27me3), while hyper-methylated regions are enriched in polycomb repressive complex (EZH2/SUZ12) recognizing regions. Altered methylation in promoters, enhancers, and gene bodies, as well as in polycomb repressive complex occupancy and CTCF binding sites are associated with cancer-specific gene dysregulation. Epigenetic-mediated activation of non-canonical WNT/beta-catenin/MMP signaling and a YY1/lncRNA ESCCAL-1/ribosomal protein network are uncovered and validated as potential novel ESCC driver alterations. This study advances our understanding of how epigenetic landscapes shape cancer pathogenesis and provides a resource for biomarker and target discovery.
View details for DOI 10.1038/s41467-020-17227-z
View details for PubMedID 32699215
-
Prevention of Severe Intestinal Barrier Dysfunction Through a Single-Species Probiotics Is Associated With the Activation of Microbiome-Mediated Glutamate-Glutamine Biosynthesis.
Shock (Augusta, Ga.)
2020
Abstract
INTRODUCTION: Intra-abdominal hypertension (IAH), the leading complication in the intensive care unit, significantly disturbs the gut microbial composition by decreasing the relative abundance of Lactobacillus and increasing the relative abundance of opportunistic infectious bacteria. METHODS: To evaluate the preventative effect of Lactobacillus-based probiotics on IAH-induced intestinal barrier damage, a single-species probiotics (L92) and a multi-species probiotics (VSL#3) were introduced orally to Sprague-Dawley rats for 7 days before inducing IAH. The intestinal histology and permeability to macromolecules (fluorescein isothiocyanate, FITC-dextran; N = 8 for each group), the parameters of immunomodulatory and oxidative responses [monocyte chemotactic protein 1 (MCP-1), interleukin-1beta (IL-1beta), interleukin-4 (IL-4), interleukin-10 (IL-10), malonaldehyde (MDA), glutathione peroxidase (GSH-Px), catalase (CAT), and superoxide dismutase (SOD); N = 4 for each group], and the microbiome profiling (N = 4 for each group) were analyzed. RESULTS: 7-day pretreatments of L92 significantly alleviated the IAH-induced increase in intestinal permeability to FITC-dextran and histological damage (P < 0.0001), accompanied by the suppression of inflammatory and oxidative activation. The increases in MCP-1 and IL-1beta were significantly inhibited (P < 0.05); the anti-inflammatory cytokines IL-4 and IL-10 were maintained at high levels; and the suppression of CAT (P < 0.05) was significantly reversed when pretreated with L92. On the contrary, no significant protective effects were observed in the VSL#3-pretreated group.
Among the 84 identified species, 260 MetaCyc pathways, and 217 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, the protective effects of L92 were correlated with an increased relative abundance of Bacteroides finegoldii, Odoribacter splanchnicus, and the global activation of amino acid biosynthesis pathways, especially the glutamate-glutamine biosynthesis pathway. CONCLUSIONS: 7-day pretreatment with a single-species probiotics can prevent IAH-induced severe intestinal barrier dysfunction, potentially through microbial modulation.
View details for DOI 10.1097/SHK.0000000000001593
View details for PubMedID 32694391
-
Physiological blood-brain transport is impaired with age by a shift in transcytosis.
Nature
2020
Abstract
The vascular interface of the brain, known as the blood-brain barrier (BBB), is understood to maintain brain function in part via its low transcellular permeability (refs. 1-3). Yet, recent studies have demonstrated that brain ageing is sensitive to circulatory proteins (refs. 4,5). Thus, it is unclear whether permeability to individually injected exogenous tracers, as is standard in BBB studies, fully represents blood-to-brain transport. Here we label hundreds of proteins constituting the mouse blood plasma proteome, and upon their systemic administration, study the BBB with its physiological ligand. We find that plasma proteins readily permeate the healthy brain parenchyma, with transport maintained by BBB-specific transcriptional programmes. Unlike IgG antibody, plasma protein uptake diminishes in the aged brain, driven by an age-related shift in transport from ligand-specific receptor-mediated to non-specific caveolar transcytosis. This age-related shift occurs alongside a specific loss of pericyte coverage. Pharmacological inhibition of the age-upregulated phosphatase ALPL, a predicted negative regulator of transport, enhances brain uptake of therapeutically relevant transferrin, transferrin receptor antibody and plasma. These findings reveal the extent of physiological protein transcytosis to the healthy brain, a mechanism of widespread BBB dysfunction with age and a strategy for enhanced drug delivery.
View details for DOI 10.1038/s41586-020-2453-z
View details for PubMedID 32612231
-
Molecular Transducers of Physical Activity Consortium (MoTrPAC): Mapping the Dynamic Responses to Exercise.
Cell
2020; 181 (7): 1464–74
Abstract
Exercise provides a robust physiological stimulus that evokes cross-talk among multiple tissues that when repeated regularly (i.e., training) improves physiological capacity, benefits numerous organ systems, and decreases the risk for premature mortality. However, a gap remains in identifying the detailed molecular signals induced by exercise that benefits health and prevents disease. The Molecular Transducers of Physical Activity Consortium (MoTrPAC) was established to address this gap and generate a molecular map of exercise. Preclinical and clinical studies will examine the systemic effects of endurance and resistance exercise across a range of ages and fitness levels by molecular probing of multiple tissues before and after acute and chronic exercise. From this multi-omic and bioinformatic analysis, a molecular map of exercise will be established. Altogether, MoTrPAC will provide a public database that is expected to enhance our understanding of the health benefits of exercise and to provide insight into how physical activity mitigates disease.
View details for DOI 10.1016/j.cell.2020.06.004
View details for PubMedID 32589957
-
Towards personalized medicine in maternal and child health: integrating biologic and social determinants.
Pediatric research
2020
View details for DOI 10.1038/s41390-020-0981-8
View details for PubMedID 32454518
-
The Human Tumor Atlas Network: Charting Tumor Transitions across Space and Time at Single-Cell Resolution.
Cell
2020; 181 (2): 236–49
Abstract
Crucial transitions in cancer, including tumor initiation, local expansion, metastasis, and therapeutic resistance, involve complex interactions between cells within the dynamic tumor ecosystem. Transformative single-cell genomics technologies and spatial multiplex in situ methods now provide an opportunity to interrogate this complexity at unprecedented resolution. The Human Tumor Atlas Network (HTAN), part of the National Cancer Institute (NCI) Cancer Moonshot Initiative, will establish a clinical, experimental, computational, and organizational framework to generate informative and accessible three-dimensional atlases of cancer transitions for a diverse set of tumor types. This effort complements both ongoing efforts to map healthy organs and previous large-scale cancer genomics approaches focused on bulk sequencing at a single point in time. Generating single-cell, multiparametric, longitudinal atlases and integrating them with clinical outcomes should help identify novel predictive biomarkers and features as well as therapeutically relevant cell types, cell states, and cellular interactions across transitions. The resulting tumor atlases should have a profound impact on our understanding of cancer biology and have the potential to improve cancer detection, prevention, and therapeutic discovery for better precision-medicine treatments of cancer patients and those at risk for cancer.
View details for DOI 10.1016/j.cell.2020.03.053
View details for PubMedID 32302568
-
Humans Are Selectively Exposed to Pneumocystis jirovecii.
mBio
2020; 11 (2)
Abstract
Environmental exposure has a significant impact on human health. While some airborne fungi can cause life-threatening infections, the impact of environment on fungal spore dispersal and transmission is poorly understood. The democratization of shotgun metagenomics allows us to explore important questions about fungal propagation. We focus on Pneumocystis, a genus of host-specific fungi that infect mammals via airborne particles. In humans, Pneumocystis jirovecii causes lethal infections in immunocompromised patients if untreated, although its environmental reservoir and transmission route remain unclear. Here, we attempt to clarify, by analyzing human exposome metagenomic data sets, whether humans are exposed to different Pneumocystis species present in the air but only P. jirovecii cells are able to replicate or whether they are selectively exposed to P. jirovecii. Our analysis supports the latter hypothesis, which is consistent with a local transmission model. These data also suggest that healthy carriers are a major driver for the transmission.
View details for DOI 10.1128/mBio.03138-19
View details for PubMedID 32156824
-
Systematic Identification of Regulators of Oxidative Stress Reveals Non-canonical Roles for Peroxisomal Import and the Pentose Phosphate Pathway.
Cell reports
2020; 30 (5): 1417
Abstract
Reactive oxygen species (ROS) play critical roles in metabolism and disease, yet a comprehensive analysis of the cellular response to oxidative stress is lacking. To systematically identify regulators of oxidative stress, we conducted genome-wide Cas9/CRISPR and shRNA screens. This revealed a detailed picture of diverse pathways that control oxidative stress response, ranging from the TCA cycle and DNA repair machineries to iron transport, trafficking, and metabolism. Paradoxically, disrupting the pentose phosphate pathway (PPP) at the level of phosphogluconate dehydrogenase (PGD) protects cells against ROS. This dramatically alters metabolites in the PPP, consistent with rewiring of upper glycolysis to promote antioxidant production. In addition, disruption of peroxisomal import unexpectedly increases resistance to oxidative stress by altering the localization of catalase. Together, these studies provide insights into the roles of peroxisomal matrix import and the PPP in redox biology and represent a rich resource for understanding the cellular response to oxidative stress.
View details for DOI 10.1016/j.celrep.2020.01.013
View details for PubMedID 32023459
-
The MEK5-ERK5 kinase axis controls lipid metabolism in small cell lung cancer.
Cancer research
2020
Abstract
Small cell lung cancer (SCLC) is an aggressive form of lung cancer with dismal survival rates. While kinases often play key roles driving tumorigenesis, there are strikingly few kinases known to promote the development of SCLC. Here we investigated the contribution of the MAP kinase module MEK5/ERK5 to SCLC growth. MEK5 and ERK5 were required for optimal survival and expansion of SCLC cell lines in vitro and in vivo. Transcriptomics analyses identified a role for the MEK5-ERK5 axis in the metabolism of SCLC cells, including lipid metabolism. In-depth lipidomics analyses showed that loss of MEK5/ERK5 perturbs several lipid metabolism pathways, including the mevalonate pathway that controls cholesterol synthesis. Notably, depletion of MEK5/ERK5 sensitized SCLC cells to pharmacological inhibition of the mevalonate pathway by statins. These data identify a new MEK5-ERK5-lipid metabolism axis that promotes the growth of SCLC.
View details for DOI 10.1158/0008-5472.CAN-19-1027
View details for PubMedID 31969375
-
RobNorm: Model-Based Robust Normalization Method for Labeled Quantitative Mass Spectrometry Proteomics Data.
Bioinformatics (Oxford, England)
2020
Abstract
Data normalization is an important step in processing proteomics data generated in mass spectrometry (MS) experiments, which aims to reduce sample-level variation and facilitate comparisons of samples. Previously published methods for normalization primarily depend on the assumption that the distribution of protein expression is similar across all samples. However, this assumption fails when the protein expression data is generated from heterogeneous samples, such as from various tissue types. This led us to develop a novel data-driven method for improved normalization that corrects the systematic bias while maintaining underlying biological heterogeneity. To robustly correct the systematic bias, we used the density-power-weight method to down-weight outliers and extended the one-dimensional robust fitting method described in previous work (Windham, 1995; Fujisawa and Eguchi, 2008) to our structured data. We then constructed a robustness criterion and developed a new normalization algorithm, called RobNorm. In simulation studies and analysis of real data from the genotype-tissue expression (GTEx) project, we compared and evaluated the performance of RobNorm against other normalization methods. We found that the RobNorm approach exhibits the greatest reduction in systematic bias while maintaining across-tissue variation, especially for datasets from highly heterogeneous samples. Availability: https://github.com/mwgrassgreen/RobNorm.
View details for DOI 10.1093/bioinformatics/btaa904
View details for PubMedID 33098413
-
Cumulative Lifetime Burden of Cardiovascular Disease From Early Exposure to Air Pollution.
Journal of the American Heart Association
2020; 9 (6): e014944
Abstract
The disease burden associated with air pollution continues to grow. The World Health Organization (WHO) estimates ≈7 million people worldwide die yearly from exposure to polluted air, half of which (3.3 million) are attributable to cardiovascular disease (CVD), greater than from major modifiable CVD risks including smoking, hypertension, hyperlipidemia, and diabetes mellitus. This serious and growing health threat is attributed to increasing urbanization of the world's populations with consequent exposure to polluted air. Especially vulnerable are the elderly, patients with pre-existing CVD, and children. The cumulative lifetime burden in children is particularly of concern because their rapidly developing cardiopulmonary systems are more susceptible to damage and they spend more time outdoors and therefore inhale more pollutants. The WHO estimates that 93% of the world's children aged <15 years (1.8 billion children) breathe air that puts their health and development at risk. Here, we present growing scientific evidence, including from our own group, that chronic exposure to air pollution early in life is directly linked to development of major CVD risks, including obesity, hypertension, and metabolic disorders. In this review, we surveyed the literature for current knowledge of how pollution exposure early in life adversely impacts cardiovascular phenotypes, and lay the foundation for early intervention and other strategies that can help prevent this damage. We also discuss the need for better guidelines and additional research to validate exposure metrics and interventions that will ultimately help healthcare providers reduce the growing burden of CVD from pollution.
View details for DOI 10.1161/JAHA.119.014944
View details for PubMedID 32174249
-
A limited set of transcriptional programs define major cell types.
Genome research
2020; 30 (7): 1047–59
Abstract
We have produced RNA sequencing data for 53 primary cells from different locations in the human body. The clustering of these primary cells reveals that most cells in the human body share a few broad transcriptional programs, which define five major cell types: epithelial, endothelial, mesenchymal, neural, and blood cells. These act as basic components of many tissues and organs. Based on gene expression, these cell types redefine the basic histological types by which tissues have been traditionally classified. We identified genes whose expression is specific to these cell types, and from these genes, we estimated the contribution of the major cell types to the composition of human tissues. We found this cellular composition to be a characteristic signature of tissues and to reflect tissue morphological heterogeneity and histology. We identified changes in cellular composition in different tissues associated with age and sex, and found that departures from the normal cellular composition correlate with histological phenotypes associated with disease.
View details for DOI 10.1101/gr.263186.120
View details for PubMedID 32759341
-
Multiomics Characterization of Preterm Birth in Low- and Middle-Income Countries.
JAMA network open
2020; 3 (12): e2029655
Abstract
Worldwide, preterm birth (PTB) is the single largest cause of deaths in the perinatal and neonatal period and is associated with increased morbidity in young children. The cause of PTB is multifactorial, and the development of generalizable biological models may enable early detection and guide therapeutic studies. To investigate the ability of transcriptomics and proteomics profiling of plasma and metabolomics analysis of urine to identify early biological measurements associated with PTB. This diagnostic/prognostic study analyzed plasma and urine samples collected from May 2014 to June 2017 from pregnant women in 5 biorepository cohorts in low- and middle-income countries (LMICs; ie, Matlab, Bangladesh; Lusaka, Zambia; Sylhet, Bangladesh; Karachi, Pakistan; and Pemba, Tanzania). These cohorts were established to study maternal and fetal outcomes and were supported by the Alliance for Maternal and Newborn Health Improvement and the Global Alliance to Prevent Prematurity and Stillbirth biorepositories. Data were analyzed from December 2018 to July 2019. Blood and urine specimens that were collected early during pregnancy (median sampling time of 13.6 weeks of gestation, according to ultrasonography) were processed, stored, and shipped to the laboratories under uniform protocols. Plasma samples were assayed for targeted measurement of proteins and untargeted cell-free ribonucleic acid profiling; urine samples were assayed for metabolites. The PTB phenotype was defined as the delivery of a live infant before completing 37 weeks of gestation. Of the 81 pregnant women included in this study, 39 had PTBs (48.1%) and 42 had term pregnancies (51.9%) (mean [SD] age of 24.8 [5.3] years). Univariate analysis demonstrated functional biological differences across the 5 cohorts. A cohort-adjusted machine learning algorithm was applied to each biological data set, and then a higher-level machine learning modeling combined the results into a final integrative model.
The integrated model was more accurate, with an area under the receiver operating characteristic curve (AUROC) of 0.83 (95% CI, 0.72-0.91) compared with the models derived for each independent biological modality (transcriptomics AUROC, 0.73 [95% CI, 0.61-0.83]; metabolomics AUROC, 0.59 [95% CI, 0.47-0.72]; and proteomics AUROC, 0.75 [95% CI, 0.64-0.85]). Primary features associated with PTB included an inflammatory module as well as a metabolomic module measured in urine associated with the glutamine and glutamate metabolism and valine, leucine, and isoleucine biosynthesis pathways. This study found that, in LMICs and high PTB settings, major biological adaptations during term pregnancy follow a generalizable model and the predictive accuracy for PTB was augmented by combining various omics data sets, suggesting that PTB is a condition that manifests within multiple biological systems. These data sets, with machine learning partnerships, may be a key step in developing valuable predictive tests and intervention candidates for preventing PTB.
View details for DOI 10.1001/jamanetworkopen.2020.29655
View details for PubMedID 33337494
-
Remodeling of active endothelial enhancers is associated with aberrant gene-regulatory networks in pulmonary arterial hypertension.
Nature communications
2020; 11 (1): 1673
Abstract
Environmental and epigenetic factors often play an important role in polygenic disorders. However, how such factors affect disease-specific tissues at the molecular level remains to be understood. Here, we address this in pulmonary arterial hypertension (PAH). We obtain pulmonary arterial endothelial cells (PAECs) from lungs of patients and controls (n = 19), and perform chromatin, transcriptomic and interaction profiling. Overall, we observe extensive remodeling at active enhancers in PAH PAECs and identify hundreds of differentially active TFs, yet find very few transcriptomic changes at steady state. We devise a disease-specific enhancer-gene regulatory network and predict that primed enhancers in PAH PAECs are activated by the differentially active TFs, resulting in an aberrant response to endothelial signals, which could lead to disturbed angiogenesis and endothelial-to-mesenchymal-transition. We validate these predictions for a selection of target genes in PAECs stimulated with TGF-β, VEGF or serotonin. Our study highlights the role of chromatin state and enhancers in disease-relevant cell types of PAH.
View details for DOI 10.1038/s41467-020-15463-x
View details for PubMedID 32245974
-
Classifying non-small cell lung cancer types and transcriptomic subtypes using convolutional neural networks.
Journal of the American Medical Informatics Association : JAMIA
2020; 27 (5): 757–69
Abstract
Non-small cell lung cancer is a leading cause of cancer death worldwide, and histopathological evaluation plays the primary role in its diagnosis. However, the morphological patterns associated with the molecular subtypes have not been systematically studied. To bridge this gap, we developed a quantitative histopathology analytic framework to identify the types and gene expression subtypes of non-small cell lung cancer objectively. We processed whole-slide histopathology images of lung adenocarcinoma (n = 427) and lung squamous cell carcinoma patients (n = 457) in the Cancer Genome Atlas. We built convolutional neural networks to classify histopathology images, evaluated their performance by the areas under the receiver-operating characteristic curves (AUCs), and validated the results in an independent cohort (n = 125). To establish neural networks for quantitative image analyses, we first built convolutional neural network models to identify tumor regions from adjacent dense benign tissues (AUCs > 0.935) and recapitulated expert pathologists' diagnosis (AUCs > 0.877), with the results validated in an independent cohort (AUCs = 0.726-0.864). We further demonstrated that quantitative histopathology morphology features identified the major transcriptomic subtypes of both adenocarcinoma and squamous cell carcinoma (P < .01). Our study is the first to classify the transcriptomic subtypes of non-small cell lung cancer using fully automated machine learning methods. Our approach does not rely on prior pathology knowledge and can discover novel clinically relevant histopathology patterns objectively. The developed procedure is generalizable to other tumor types or diseases.
View details for DOI 10.1093/jamia/ocz230
View details for PubMedID 32364237
-
Long-read assays shed new light on the transcriptome complexity of a viral pathogen.
Scientific reports
2020; 10 (1): 13822
Abstract
Characterization of global transcriptomes using conventional short-read sequencing is challenging due to the insensitivity of these platforms to transcript isoforms, multigenic RNA molecules, and transcriptional overlaps. Long-read sequencing (LRS) can overcome these limitations by reading full-length transcripts. Employment of these technologies has led to the redefinition of transcriptional complexities in reported organisms. In this study, we applied LRS platforms from Pacific Biosciences and Oxford Nanopore Technologies to profile the vaccinia virus (VACV) transcriptome. We performed cDNA and direct RNA sequencing analyses and revealed an extremely complex transcriptional landscape of this virus. In particular, VACV genes produce large numbers of transcript isoforms that vary in their start and termination sites. A significant fraction of VACV transcripts start or end within coding regions of neighbouring genes. This study provides new insights into the transcriptomic profile of this viral pathogen.
View details for DOI 10.1038/s41598-020-70794-5
View details for PubMedID 32796917
-
Research on the Human Proteome Reaches a Major Milestone: >90% of Predicted Human Proteins Now Credibly Detected, According to the HUPO Human Proteome Project.
Journal of proteome research
2020
Abstract
According to the 2020 Metrics of the HUPO Human Proteome Project (HPP), expression has now been detected at the protein level for >90% of the 19 773 predicted proteins coded in the human genome. The HPP annually reports on progress made throughout the world toward credibly identifying and characterizing the complete human protein parts list and promoting proteomics as an integral part of multiomics studies in medicine and the life sciences. NeXtProt release 2020-01 classified 17 874 proteins as PE1, having strong protein-level evidence, up 180 from 17 694 one year earlier. These represent 90.4% of the 19 773 predicted coding genes (all PE1,2,3,4 proteins in neXtProt). Conversely, the number of neXtProt PE2,3,4 proteins, termed the "missing proteins" (MPs), was reduced by 230 from 2129 to 1899 since the neXtProt 2019-01 release. PeptideAtlas is the primary source of uniform reanalysis of raw mass spectrometry data for neXtProt, supplemented this year with extensive data from MassIVE. PeptideAtlas 2020-01 added 362 canonical proteins between 2019 and 2020 and MassIVE contributed 84 more, many of which converted PE1 entries based on non-MS evidence to the MS-based subgroup. The 19 Biology and Disease-driven B/D-HPP teams continue to pursue the identification of driver proteins that underlie disease states, the characterization of regulatory mechanisms controlling the functions of these proteins, their proteoforms, and their interactions, and the progression of transitions from correlation to coexpression to causal networks after system perturbations. And the Human Protein Atlas published Blood, Brain, and Metabolic Atlases.
View details for DOI 10.1021/acs.jproteome.0c00485
View details for PubMedID 32931287
-
iPSC Modeling of RBM20-Deficient DCM Identifies Upregulation of RBM20 as a Therapeutic Strategy.
Cell reports
2020; 32 (10): 108117
Abstract
Recent advances in induced pluripotent stem cell (iPSC) technology and directed differentiation of iPSCs into cardiomyocytes (iPSC-CMs) make it possible to model genetic heart disease in vitro. We apply CRISPR/Cas9 genome editing technology to introduce three RBM20 mutations in iPSCs and differentiate them into iPSC-CMs to establish an in vitro model of RBM20 mutant dilated cardiomyopathy (DCM). In iPSC-CMs harboring a known causal RBM20 variant, the splicing of RBM20 target genes, calcium handling, and contractility are impaired, consistent with the disease manifestation in patients. A variant (Pro633Leu) identified by exome sequencing of patient genomes displays the same disease phenotypes, thus establishing this variant as disease causing. We find that all-trans retinoic acid upregulates RBM20 expression and reverts the splicing, calcium handling, and contractility defects in iPSC-CMs with different causal RBM20 mutations. These results suggest that pharmacological upregulation of RBM20 expression is a promising therapeutic strategy for DCM patients with a heterozygous mutation in RBM20.
View details for DOI 10.1016/j.celrep.2020.108117
View details for PubMedID 32905764
-
A high-stringency blueprint of the human proteome.
Nature communications
2020; 11 (1): 5301
Abstract
The Human Proteome Organization (HUPO) launched the Human Proteome Project (HPP) in 2010, creating an international framework for global collaboration, data sharing, quality assurance and enhancing accurate annotation of the genome-encoded proteome. During the subsequent decade, the HPP established collaborations, developed guidelines and metrics, and undertook reanalysis of previously deposited community data, continuously increasing the coverage of the human proteome. On the occasion of the HPP's tenth anniversary, we here report a 90.4% complete high-stringency human proteome blueprint. This knowledge is essential for discerning molecular processes in health and disease, as we demonstrate by highlighting potential roles the human proteome plays in our understanding, diagnosis and treatment of cancers, cardiovascular and infectious diseases.
View details for DOI 10.1038/s41467-020-19045-9
View details for PubMedID 33067450
-
Longitudinal Analysis of Serum Cytokine Levels and Gut Microbial Abundance Links IL-17/IL-22 with Clostridia and Insulin Sensitivity in Humans.
Diabetes
2020
Abstract
Recent studies using mouse models suggest that interaction between the gut microbiome and IL-17/IL-22 producing cells plays a role in the development of metabolic diseases. We investigated this relationship in humans using data from the prediabetes study of the Integrated Human Microbiome Project (iHMP). Specifically, we addressed the hypothesis that early in the onset of metabolic diseases there is a decline in serum levels of IL-17/IL-22, with concomitant changes in the gut microbiome. Clustering iHMP study participants on the basis of longitudinal IL-17/IL-22 profiles identified discrete groups. Individuals distinguished by low levels of IL-17/IL-22 were linked to established markers of metabolic disease, including insulin sensitivity. These individuals also displayed gut microbiome dysbiosis, characterized by decreased diversity, and IL-17/IL-22-related declines in the phylum Firmicutes, class Clostridia, and order Clostridiales. This ancillary analysis of the iHMP data therefore supports a link between the gut microbiome, IL-17/IL-22 and the onset of metabolic diseases. This raises the possibility for novel, microbiome-related therapeutic targets that may effectively alleviate metabolic diseases in humans as they do in animal models.
View details for DOI 10.2337/db19-0592
View details for PubMedID 32366680
-
Immunologic effects of forest fire exposure show increases in IL-1β and CRP.
Allergy
2020
View details for DOI 10.1111/all.14251
View details for PubMedID 32112439
-
Human-engineered Treg-like cells suppress FOXP3-deficient T cells but preserve adaptive immune responses in vivo.
Clinical & translational immunology
2020; 9 (11): e1214
Abstract
Genetic or acquired defects in FOXP3+ regulatory T cells (Tregs) play a key role in many immune-mediated diseases including immune dysregulation polyendocrinopathy, enteropathy, X-linked (IPEX) syndrome. Previously, we demonstrated CD4+ T cells from healthy donors and IPEX patients can be converted into functional Treg-like cells by lentiviral transfer of FOXP3 (CD4LVFOXP3). These CD4LVFOXP3 cells have potent regulatory function, suggesting their potential as an innovative therapeutic. Here, we present molecular and preclinical in vivo data supporting CD4LVFOXP3 cell clinical progression. The molecular characterisation of CD4LVFOXP3 cells included flow cytometry, qPCR, RNA-seq and TCR-seq. The in vivo suppressive function of CD4LVFOXP3 cells was assessed in xenograft-versus-host disease (xeno-GvHD) and FOXP3-deficient IPEX-like humanised mouse models. The safety of CD4LVFOXP3 cells was evaluated using peripheral blood (PB) humanised (hu)-mice testing their impact on immune response against pathogens, and immune surveillance against tumor antigens. We demonstrate that the conversion of CD4+ T cells to CD4LVFOXP3 cells leads to specific transcriptional changes as compared to CD4+ T-cell transduction in the absence of FOXP3, including upregulation of Treg-related genes. Furthermore, we observe specific preservation of a polyclonal TCR repertoire during in vitro cell production. Both allogeneic and autologous CD4LVFOXP3 cells protect from xeno-GvHD after two sequential infusions of effector T cells. CD4LVFOXP3 cells prevent hyper-proliferation of CD4+ memory T cells in the FOXP3-deficient IPEX-like hu-mice. CD4LVFOXP3 cells do not impede in vivo expansion of antigen-primed T cells or tumor clearance in the PB hu-mice. These data support the clinical readiness of CD4LVFOXP3 cells to treat IPEX syndrome and other immune-mediated diseases caused by insufficient or dysfunctional FOXP3+ Tregs.
View details for DOI 10.1002/cti2.1214
View details for PubMedID 33304583
View details for PubMedCentralID PMC7688376
-
Meta-analytic approach for transcriptome profiling of herpes simplex virus type 1.
Scientific data
2020; 7 (1): 223
Abstract
In this meta-analysis, we re-analysed and compared herpes simplex virus type 1 transcriptomic data generated by eight studies using various short- and long-read sequencing techniques and different library preparation methods. We identified a large number of novel mRNAs, non-coding RNAs and transcript isoforms, and validated many previously published transcripts. Here, we present the most complete HSV-1 transcriptome to date. Furthermore, we also demonstrate that various sequencing techniques, including both cDNA and direct RNA sequencing approaches, are error-prone, which can be circumvented by using integrated approaches. This work draws attention to the need for using multiple sequencing approaches and meta-analyses in transcriptome profiling studies to obtain reliable results.
View details for DOI 10.1038/s41597-020-0558-8
View details for PubMedID 32647284
-
Chromosome-level de novo assembly of the pig-tailed macaque genome using linked-read sequencing and HiC proximity scaffolding.
GigaScience
2020; 9 (7)
Abstract
Macaque species share >93% genome homology with humans and develop many disease phenotypes similar to those of humans, making them valuable animal models for the study of human diseases (e.g., HIV and neurodegenerative diseases). However, the quality of genome assembly and annotation for several macaque species lags behind the human genome effort. To close this gap and enhance functional genomics approaches, we used a combination of de novo linked-read assembly and scaffolding using proximity ligation assay (HiC) to assemble the pig-tailed macaque (Macaca nemestrina) genome. This combinatorial method yielded large scaffolds at chromosome level with a scaffold N50 of 127.5 Mb; the 23 largest scaffolds covered 90% of the entire genome. This assembly revealed large-scale rearrangements between pig-tailed macaque chromosomes 7, 12, and 13 and human chromosomes 2, 14, and 15. We subsequently annotated the genome using transcriptome and proteomics data from personalized induced pluripotent stem cells derived from the same animal. Reconstruction of the evolutionary tree using whole-genome annotation and orthologous comparisons among 3 macaque species, human, and mouse genomes revealed extensive homology between human and pig-tailed macaques with regards to both pluripotent stem cell genes and innate immune gene pathways. Our results confirm that rhesus and cynomolgus macaques exhibit a closer evolutionary distance to each other than either species exhibits to humans or pig-tailed macaques. These findings demonstrate that pig-tailed macaques can serve as an excellent animal model for the study of many human diseases particularly with regards to pluripotency and innate immune pathways.
View details for DOI 10.1093/gigascience/giaa069
View details for PubMedID 32649757
-
PPARγ-p53-Mediated Vasculoregenerative Program to Reverse Pulmonary Hypertension.
Circulation research
2020
Abstract
Rationale: In pulmonary arterial hypertension (PAH), endothelial dysfunction and obliterative vascular disease are associated with DNA damage and impaired signaling of bone morphogenetic protein type 2 receptor (BMPR2) via two downstream transcription factors, PPARγ and p53. Objective: We investigated the vasculoprotective and regenerative potential of a newly identified PPARγ-p53 transcription factor complex in the pulmonary endothelium. Methods and Results: In this study, we identified a pharmacologically inducible vasculoprotective mechanism in pulmonary arterial (PA) and lung microvascular (MV) endothelial cells (EC) in response to DNA damage and oxidant stress regulated in part by a BMPR2 dependent transcription factor complex between PPARγ and p53. Chromatin immunoprecipitation (ChIP) sequencing (seq) and RNA-seq established an inducible PPARγ-p53 mediated regenerative program regulating 19 genes involved in lung EC survival, angiogenesis and DNA repair, including EPHA2, FHL2, JAG1, SULF2 and TIGAR. Expression of these genes was partially impaired when the PPARγ-p53 complex was pharmacologically disrupted or when BMPR2 was reduced in PAEC subjected to oxidative stress. In EC-Bmpr2-knockout mice unable to stabilize p53 in ECs under oxidative stress, Nutlin-3 rescued endothelial p53 and PPARγ-p53 complex formation and induced target genes, such as APLN and JAG1, to regenerate pulmonary microvessels and reverse pulmonary hypertension. In PAEC from BMPR2 mutant PAH patients, pharmacological induction of p53 and PPARγ-p53 genes repaired damaged DNA utilizing genes from the nucleotide excision repair pathway without provoking PAEC apoptosis. Conclusions: We identified a novel therapeutic strategy that activates a vasculoprotective gene regulation program in PAEC downstream of dysfunctional BMPR2 to rehabilitate PAH PAEC, regenerate pulmonary microvessels and reverse disease. Our studies pave the way for p53-based vasculoregenerative therapies for PAH by extending the therapeutic focus to PAEC dysfunction and to DNA damage associated with PAH progression.
View details for DOI 10.1161/CIRCRESAHA.119.316339
View details for PubMedID 33322916
-
Obesity Drives Delayed Infarct Expansion, Inflammation, and Distinct Gene Networks in a Mouse Stroke Model.
Translational stroke research
2020
Abstract
Obesity is associated with chronic peripheral inflammation, is a risk factor for stroke, and causes increased infarct sizes. To characterize how obesity increases infarct size, we fed a high-fat diet to wild-type C57BL/6J mice for either 6 weeks or 15 weeks and then induced distal middle cerebral artery strokes. We found that infarct expansion happened late after stroke. There were no differences in cortical neuroinflammation (astrogliosis, microgliosis, or pro-inflammatory cytokines) either prior to or 10 h after stroke, and also no differences in stroke size at 10 h. However, by 3 days after stroke, animals fed a high-fat diet had a dramatic increase in microgliosis and astrogliosis that was associated with larger strokes and worsened functional recovery. RNA sequencing revealed a dramatic increase in inflammatory genes in the high-fat diet-fed animals 3 days after stroke that were not present prior to stroke. Genetic pathways unique to diet-induced obesity were primarily related to adaptive immunity, extracellular matrix components, cell migration, and vasculogenesis. The late appearance of neuroinflammation and infarct expansion indicates that there may be a therapeutic window between 10 and 36 h after stroke where inflammation and obesity-specific transcriptional programs could be targeted to improve outcomes in people with obesity and stroke.
View details for DOI 10.1007/s12975-020-00826-9
View details for PubMedID 32588199
-
Multiomic immune clockworks of pregnancy.
Seminars in immunopathology
2020
Abstract
Preterm birth is the leading cause of mortality in children under the age of five worldwide. Despite major efforts, we still lack the ability to accurately predict and effectively prevent preterm birth. While multiple factors contribute to preterm labor, dysregulation of immunological adaptations required for the maintenance of a healthy pregnancy is at its pathophysiological core. Consequently, a precise understanding of these chronologically paced immune adaptations and of the biological pacemakers that synchronize the pregnancy "immune clock" is a critical first step towards identifying deviations that are hallmarks of preterm birth. Here, we will review key elements of the fetal, placental, and maternal pacemakers that program the immune clock of pregnancy. We will then emphasize multiomic studies that enable a more integrated view of pregnancy-related immune adaptations. Such multiomic assessments can strengthen the biological plausibility of immunological findings and increase the power of biological signatures predictive of preterm birth.
View details for DOI 10.1007/s00281-019-00772-1
View details for PubMedID 32020337
-
Template-switching artifacts resemble alternative polyadenylation.
BMC genomics
2019; 20 (1): 824
Abstract
BACKGROUND: Alternative polyadenylation is commonly examined using cDNA sequencing, which is known to be affected by template-switching artifacts. However, the effects of such template-switching artifacts on alternative polyadenylation are generally disregarded, while alternative polyadenylation artifacts are attributed to internal priming. RESULTS: Here, we analyzed both long-read cDNA sequencing and direct RNA sequencing data of two organisms, generated by different sequencing platforms. We developed a filtering algorithm which takes into consideration that template-switching can be a source of artifactual polyadenylation when filtering out spurious polyadenylation sites. The algorithm outperformed the conventional internal priming filters based on comparison to direct RNA sequencing data. We also showed that the polyadenylation artifacts arise in cDNA sequencing at consecutive stretches of as few as three adenines. There was no substantial difference between the lengths of poly(A) tails at the artifactual and the true transcriptional end sites even though it is expected that internal priming artifacts have shorter poly(A) tails than genuine polyadenylated reads. CONCLUSIONS: Our findings suggest that template switching plays an important role in the generation of spurious polyadenylation and support the need for more rigorous filtering of artifactual polyadenylation sites in cDNA data, or that alternative polyadenylation should be annotated using native RNA sequencing.
View details for DOI 10.1186/s12864-019-6199-7
View details for PubMedID 31703623
-
Big data and health
LANCET DIGITAL HEALTH
2019; 1 (6): E252–E254
View details for DOI 10.1016/S2589-7500(19)30109-8
View details for Web of Science ID 000525871300006
-
Genome-wide effects of social status on DNA methylation in the brain of a cichlid fish, Astatotilapia burtoni.
BMC genomics
2019; 20 (1): 699
Abstract
BACKGROUND: Successful social behavior requires real-time integration of information about the environment, internal physiology, and past experience. The molecular substrates of this integration are poorly understood, but likely modulate neural plasticity and gene regulation. In the cichlid fish species Astatotilapia burtoni, male social status can shift rapidly depending on the environment, causing fast behavioral modifications and a cascade of changes in gene transcription, the brain, and the reproductive system. These changes can be permanent but are also reversible, implying the involvement of a robust but flexible mechanism that regulates plasticity based on internal and external conditions. One candidate mechanism is DNA methylation, which has been linked to social behavior in many species, including A. burtoni. However, the extent of its effects after A. burtoni social change was previously unknown. RESULTS: We performed the first genome-wide search for DNA methylation patterns associated with social status in the brains of male A. burtoni, identifying hundreds of Differentially Methylated genomic Regions (DMRs) in dominant versus non-dominant fish. Most DMRs were inside genes supporting neural development, synapse function, and other processes relevant to neural plasticity, and DMRs could affect gene expression in multiple ways. DMR genes were more likely to be transcription factors, have a duplicate elsewhere in the genome, have an anti-sense lncRNA, and have more splice variants than other genes. Dozens of genes had multiple DMRs that were often seemingly positioned to regulate specific splice variants. CONCLUSIONS: Our results revealed genome-wide effects of A. burtoni social status on DNA methylation in the brain and strongly suggest a role for methylation in modulating plasticity across multiple biological levels. They also suggest many novel hypotheses to address in mechanistic follow-up studies, and will be a rich resource for identifying the relationships between behavioral, neural, and transcriptional plasticity in the context of social status.
View details for DOI 10.1186/s12864-019-6047-9
View details for PubMedID 31506062
-
Systematic Identification of Host Cell Regulators of Legionella pneumophila Pathogenesis Using a Genome-wide CRISPR Screen.
Cell host & microbe
2019
Abstract
During infection, Legionella pneumophila translocates over 300 effector proteins into the host cytosol, allowing the pathogen to establish an endoplasmic reticulum (ER)-like Legionella-containing vacuole (LCV) that supports bacterial replication. Here, we perform a genome-wide CRISPR-Cas9 screen and secondary targeted screens in U937 human monocyte/macrophage-like cells to systematically identify host factors that regulate killing by L. pneumophila. The screens reveal known host factors hijacked by L. pneumophila, as well as genes spanning diverse trafficking and signaling pathways previously not linked to L. pneumophila pathogenesis. We further characterize C1orf43 and KIAA1109 as regulators of phagocytosis and show that RAB10 and its chaperone RABIF are required for optimal L. pneumophila replication and ER recruitment to the LCV. Finally, we show that Rab10 protein is recruited to the LCV and ubiquitinated by the effectors SidC/SdcA. Collectively, our results provide a wealth of previously undescribed insights into L. pneumophila pathogenesis and mammalian cell function.
View details for DOI 10.1016/j.chom.2019.08.017
View details for PubMedID 31540829
-
Large-Scale Analyses of Human Microbiomes Reveal Thousands of Small, Novel Genes.
Cell
2019
Abstract
Small proteins are traditionally overlooked due to computational and experimental difficulties in detecting them. To systematically identify small proteins, we carried out a comparative genomics study on 1,773 human-associated metagenomes from four different body sites. We describe >4,000 conserved protein families, the majority of which are novel; 30% of these protein families are predicted to be secreted or transmembrane. Over 90% of the small protein families have no known domain and almost half are not represented in reference genomes. We identify putative housekeeping, mammalian-specific, defense-related, and protein families that are likely to be horizontally transferred. We provide evidence of transcription and translation for a subset of these families. Our study suggests that small proteins are highly abundant and those of the human microbiome, in particular, may perform diverse functions that have not been previously reported.
View details for DOI 10.1016/j.cell.2019.07.016
View details for PubMedID 31402174
-
Simultaneous RNA purification and size selection using on-chip isotachophoresis with an ionic spacer.
Lab on a chip
2019
Abstract
We present an on-chip method for the extraction of RNA within a specific size range from low-abundance samples. We use isotachophoresis (ITP) with an ionic spacer and a sieving matrix to enable size-selection with a high yield of RNA in the target size range. The spacer zone separates two concentrated ITP peaks, the first containing unwanted single nucleotides and the second focusing RNA of the target size range (2-35 nt). Our ITP method excludes >90% of single nucleotides and >65% of longer RNAs (>35 nt). Compared to size selection using gel electrophoresis, ITP-based size-selection yields a 2.2-fold increase in the amount of extracted RNAs within the target size range. We also demonstrate compatibility of the ITP-based size-selection with downstream next generation sequencing. On-chip ITP-prepared samples reveal higher reproducibility of transcript-specific measurements compared to samples size-selected by gel electrophoresis. Our method offers an attractive alternative to conventional sample preparation for sequencing with shorter assay time, higher extraction efficiency and reproducibility. Potential applications of ITP-based size-selection include sequencing-based analyses of small RNAs from low-abundance samples such as rare cell types, samples from fluorescence activated cell sorting (FACS), or limited clinical samples.
View details for DOI 10.1039/c9lc00311h
View details for PubMedID 31328753
-
MISTERMINATE Mechanistically Links Mitochondrial Dysfunction with Proteostasis Failure.
Molecular cell
2019
Abstract
Mitochondrial dysfunction and proteostasis failure frequently coexist as hallmarks of neurodegenerative disease. How these pathologies are related is not well understood. Here, we describe a phenomenon termed MISTERMINATE (mitochondrial-stress-induced translational termination impairment and protein carboxyl terminal extension), which mechanistically links mitochondrial dysfunction with proteostasis failure. We show that mitochondrial dysfunction impairs translational termination of nuclear-encoded mitochondrial mRNAs, including complex-I 30kD subunit (C-I30) mRNA, occurring on the mitochondrial surface in Drosophila and mammalian cells. Ribosomes stalled at the normal stop codon continue to add to the C terminus of C-I30 certain amino acids non-coded by mRNA template. C-terminally extended C-I30 is toxic when assembled into C-I and forms aggregates in the cytosol. Enhancing co-translational quality control prevents C-I30 C-terminal extension and rescues mitochondrial and neuromuscular degeneration in a Parkinson's disease model. These findings emphasize the importance of efficient translation termination and reveal an unexpected link between mitochondrial health and proteome homeostasis mediated by MISTERMINATE.
View details for DOI 10.1016/j.molcel.2019.06.031
View details for PubMedID 31378462
-
Matrix stiffness induces a tumorigenic phenotype in mammary epithelium through changes in chromatin accessibility.
Nature biomedical engineering
2019
Abstract
In breast cancer, the increased stiffness of the extracellular matrix is a key driver of malignancy. Yet little is known about the epigenomic changes that underlie the tumorigenic impact of extracellular matrix mechanics. Here, we show in a three-dimensional culture model of breast cancer that stiff extracellular matrix induces a tumorigenic phenotype through changes in chromatin state. We found that increased stiffness yielded cells with more wrinkled nuclei and with increased lamina-associated chromatin, that cells cultured in stiff matrices displayed more accessible chromatin sites, which exhibited footprints of Sp1 binding, and that this transcription factor acts along with the histone deacetylases 3 and 8 to regulate the induction of stiffness-mediated tumorigenicity. Just as cell culture on soft environments or in them rather than on tissue-culture plastic better recapitulates the acinar morphology observed in mammary epithelium in vivo, mammary epithelial cells cultured on soft microenvironments or in them also more closely replicate the in vivo chromatin state. Our results emphasize the importance of culture conditions for epigenomic studies, and reveal that chromatin state is a critical mediator of mechanotransduction.
View details for DOI 10.1038/s41551-019-0420-5
View details for PubMedID 31285581
-
Long-Read Sequencing - A Powerful Tool in Viral Transcriptome Research
TRENDS IN MICROBIOLOGY
2019; 27 (7): 578–92
View details for DOI 10.1016/j.tim.2019.01.010
View details for Web of Science ID 000470969400005
-
Comment on 'AIRE-deficient patients harbor unique high-affinity disease-ameliorating autoantibodies'.
eLife
2019; 8
Abstract
The AIRE gene plays a key role in the development of central immune tolerance by promoting thymic presentation of tissue-specific molecules. Patients with AIRE-deficiency develop multiple autoimmune manifestations and display autoantibodies against the affected tissues. In 2016 it was reported that: i) the spectrum of autoantibodies in patients with AIRE-deficiency is much broader than previously appreciated; ii) neutralizing autoantibodies to type I interferons (IFNs) could provide protection against type 1 diabetes in these patients (Meyer et al., 2016). We attempted to replicate these new findings using a similar experimental approach in an independent patient cohort, and found no evidence for either conclusion.
View details for DOI 10.7554/eLife.43578
View details for PubMedID 31244471
-
Engineering Genetic Predisposition in Human Neuroepithelial Stem Cells Recapitulates Medulloblastoma Tumorigenesis.
Cell stem cell
2019
Abstract
Human neural stem cell cultures provide progenitor cells that are potential cells of origin for brain cancers. However, the extent to which genetic predisposition to tumor formation can be faithfully captured in stem cell lines is uncertain. Here, we evaluated neuroepithelial stem (NES) cells, representative of cerebellar progenitors. We transduced NES cells with MYCN, observing medulloblastoma upon orthotopic implantation in mice. Significantly, transcriptomes and patterns of DNA methylation from xenograft tumors were globally more representative of human medulloblastoma compared to a MYCN-driven genetically engineered mouse model. Orthotopic transplantation of NES cells generated from Gorlin syndrome patients, who are predisposed to medulloblastoma due to germline-mutated PTCH1, also generated medulloblastoma. We engineered candidate cooperating mutations in Gorlin NES cells, with mutation of DDX3X or loss of GSE1 both accelerating tumorigenesis. These findings demonstrate that human NES cells provide a potent experimental resource for dissecting genetic causation in medulloblastoma.
View details for DOI 10.1016/j.stem.2019.05.013
View details for PubMedID 31204176
-
Novel mutations in PIEZO1 cause an autosomal recessive generalized lymphatic dysplasia with non-immune hydrops fetalis (vol 6, 8035, 2015)
NATURE COMMUNICATIONS
2019; 10
View details for DOI 10.1038/s41467-019-09905-4
View details for Web of Science ID 000465838600003
-
Analysis of the Complete Genome Sequence of a Novel, Pseudorabies Virus Strain Isolated in Southeast Europe
CANADIAN JOURNAL OF INFECTIOUS DISEASES & MEDICAL MICROBIOLOGY
2019; 2019
View details for DOI 10.1155/2019/1806842
View details for Web of Science ID 000482132600001
-
Much ado about nothing: A qualitative study of the experiences of an average-risk population receiving results of exome sequencing
JOURNAL OF GENETIC COUNSELING
2019; 28 (2): 428–37
View details for DOI 10.1002/jgc4.1096
View details for Web of Science ID 000463993600026
-
Much ado about nothing: A qualitative study of the experiences of an average-risk population receiving results of exome sequencing.
Journal of genetic counseling
2019
Abstract
The increasing availability of exome sequencing to the general ("healthy") population raises questions about the implications of genomic testing for individuals without suspected Mendelian diseases. Little is known about this population's motivations for undergoing exome sequencing, their expectations, reactions, and perceptions of utility. In order to address these questions, we conducted in-depth semi-structured interviews with 12 participants recruited from a longitudinal multi-omics profiling study that included exome sequencing. Participants were interviewed after receiving exome results, which included Mendelian disease-associated pathogenic and likely pathogenic variants, pharmacogenetic variants, and risk assessments for multifactorial diseases such as type 2 diabetes. The primary motivation driving participation in exome sequencing was personal curiosity. While they reported feeling validation and relief, participants were frequently underwhelmed by the results and described having expected more from exome sequencing. All participants reported discussing the results with at least some family, friends, and healthcare providers. Participants' recollection of the results returned to them was sometimes incorrect or incomplete, in many cases aligning with their perceptions of their health risks when entering the study. These results underscore the need for different genetic counseling approaches for generally healthy patients undergoing exome sequencing, in particular the need to provide anticipatory guidance to moderate participants' expectations. They also provide a preview of potential challenges clinicians may face as genomic sequencing continues to scale-up in the general population despite a lack of full understanding of its impact.
View details for PubMedID 30835913
-
Multi-Omics Profiling, Microscopic Cervical Remodeling, and Parturition: Insights from the Smart Diaphragm Study.
SAGE PUBLICATIONS INC. 2019: 216A
View details for Web of Science ID 000459610400452
-
Applying circulating tumor DNA methylation in the diagnosis of lung cancer.
Precision clinical medicine
2019; 2 (1): 45-56
Abstract
Lung cancer is the leading cause of cancer-related deaths worldwide. Low dose computed tomography (LDCT) is commonly used for disease screening, with identified candidate cancerous regions further diagnosed using tissue biopsy. However, existing techniques are all invasive and unavoidably cause multiple complications. In contrast, liquid biopsy is a noninvasive, ideal surrogate for tissue biopsy that can identify circulating tumor DNA (ctDNA) containing tumorigenic signatures. It has been successfully implemented to assist treatment decisions and disease outcome prediction. ctDNA methylation, a type of liquid biopsy that profiles critical epigenetic alterations occurring during carcinogenesis, has gained increasing attention. Indeed, aberrant ctDNA methylation occurs at early stages in lung malignancy and therefore can be used as an alternative for the early diagnosis of lung cancer. In this review, we give a brief synopsis of the biological basis and detecting techniques of ctDNA methylation. We then summarize the latest progress in use of ctDNA methylation as a diagnosis biomarker. Lastly, we discuss the major issues that limit application of ctDNA methylation in the clinic, and propose possible solutions to enhance its usage.
View details for DOI 10.1093/pcmedi/pbz003
View details for PubMedID 35694699
View details for PubMedCentralID PMC8985769
-
Windows Into Human Health Through Wearables Data Analytics.
Current opinion in biomedical engineering
2019; 9: 28–46
Abstract
Background: Wearable sensors (wearables) have been commonly integrated into a wide variety of commercial products and are increasingly being used to collect and process raw physiological parameters into salient digital health information. The data collected by wearables are currently being investigated across a broad set of clinical domains and patient populations. There is significant research occurring in the domain of algorithm development, with the aim of translating raw sensor data into fitness- or health-related outcomes of interest for users, patients, and health care providers. Objectives: The aim of this review is to highlight a selected group of fitness- and health-related indicators from wearables data and to describe several algorithmic approaches used to generate these higher order indicators. Methods: A systematic search of the Pubmed database was performed with the following search terms (number of records in parentheses): Fitbit algorithm (18), Apple Watch algorithm (3), Garmin algorithm (5), Microsoft Band algorithm (8), Samsung Gear algorithm (2), Xiaomi MiBand algorithm (1), Huawei Band (Watch) algorithm (2), photoplethysmography algorithm (465), accelerometry algorithm (966), ECG algorithm (8287), continuous glucose monitor algorithm (343). The search terms chosen for this review are focused on algorithms for wearable devices that dominated the commercial wearables market between 2014-2017 and that were highly represented in the biomedical literature. A second set of search terms included categories of algorithms for fitness-related and health-related indicators that are commonly used in wearable devices (e.g. accelerometry, PPG, ECG). These papers covered the following domain areas: fitness; exercise; movement; physical activity; step count; walking; running; swimming; energy expenditure; atrial fibrillation; arrhythmia; cardiovascular; autonomic nervous system; neuropathy; heart rate variability; fall detection; trauma; behavior change; diet; eating; stress detection; serum glucose monitoring; continuous glucose monitoring; diabetes mellitus type 1; diabetes mellitus type 2. All studies uncovered through this search on commercially available device algorithms and pivotal studies on sensor algorithm development were summarized, and a summary table was constructed using references generated by the literature review as described (Table 1). Conclusions: Wearable health technologies aim to collect and process raw physiological or environmental parameters into salient digital health information. Much of the current and future utility of wearables lies in the signal processing steps and algorithms used to analyze large volumes of data. Continued algorithmic development and advances in machine learning techniques will further increase analytic capabilities. In the context of these advances, our review aims to highlight a range of advances in fitness- and other health-related indicators provided by current wearable technologies.
View details for DOI 10.1016/j.cobme.2019.01.001
View details for PubMedID 31832566
-
Lifelong physical activity is associated with promoter hypomethylation of genes involved in metabolism, myogenesis, contractile properties and oxidative stress resistance in aged human skeletal muscle.
Scientific reports
2019; 9 (1): 3272
Abstract
Lifelong regular physical activity is associated with reduced risk of type 2 diabetes (T2D), maintenance of muscle mass and increased metabolic capacity. However, little is known about epigenetic mechanisms that might contribute to these beneficial effects in aged individuals. We investigated the effect of lifelong physical activity on global DNA methylation patterns in skeletal muscle of healthy aged men, who had either performed regular exercise or remained sedentary their entire lives (average age 62 years). DNA methylation was significantly lower in 714 promoters of the physically active than inactive men while methylation of introns, exons and CpG islands was similar in the two groups. Promoters for genes encoding critical insulin-responsive enzymes in glycogen metabolism, glycolysis and TCA cycle were hypomethylated in active relative to inactive men. Hypomethylation was also found in promoters of myosin light chain, dystrophin, actin polymerization, PAK regulatory genes and oxidative stress response genes. A cluster of genes regulated by GSK3beta-TCF7L2 also displayed promoter hypomethylation. Together, our results suggest that lifelong physical activity is associated with DNA methylation patterns that potentially allow for increased insulin sensitivity and a higher expression of genes in energy metabolism, myogenesis, contractile properties and oxidative stress resistance in skeletal muscle of aged individuals.
View details for PubMedID 30824849
-
Long-Read Sequencing - A Powerful Tool in Viral Transcriptome Research.
Trends in microbiology
2019
Abstract
Long-read sequencing (LRS) has become increasingly popular due to its strengths in de novo assembly and in resolving complex DNA regions as well as in determining full-length RNA molecules. Two important LRS technologies have been developed during the past few years, including single-molecule, real-time sequencing by Pacific Biosciences, and nanopore sequencing by Oxford Nanopore Technologies. Although current LRS methods produce lower coverage, and are more error prone than short-read sequencing, these methods continue to be superior in identifying transcript isoforms including multispliced RNAs and transcript-length variants as well as overlapping transcripts and alternative polycistronic RNA molecules. Viruses have small, compact genomes and therefore these organisms are ideal subjects for transcriptome analysis with the relatively low-throughput LRS techniques. Recent LRS studies have multiplied the number of previously known transcripts and have revealed complex networks of transcriptional overlaps in the examined viruses.
View details for PubMedID 30824172
-
2017 NIH-wide workshop report on "The Human Microbiome: Emerging Themes at the Horizon of the 21st Century"
MICROBIOME
2019; 7: 32
Abstract
The National Institutes of Health (NIH) organized a three-day human microbiome research workshop, August 16-18, 2017, to highlight the accomplishments of the 10-year Human Microbiome Project program, the outcomes of the investments made by the 21 NIH Institutes and Centers which now fund this area, and the technical challenges and knowledge gaps which will need to be addressed in order for this field to advance over the next 10 years. This report summarizes the key points in the talks, round table discussions, and Joint Agency Panel from this workshop.
View details for DOI 10.1186/s40168-019-0627-4
View details for Web of Science ID 000459927100002
View details for PubMedID 30808401
View details for PubMedCentralID PMC6391828
-
Whole-exome sequencing data of suicide victims who had suffered from major depressive disorder.
Scientific data
2019; 6: 190010
Abstract
Suicide is one of the leading causes of mortality worldwide; it causes the death of more than one million patients each year. Suicide is a complex, multifactorial phenotype with environmental and genetic factors contributing to the risk of the forthcoming suicide. These factors first generally lead to mental disorders, such as depression, schizophrenia and bipolar disorder, which then become the direct cause of suicide. Here we present a high quality dataset (including processed BAM and VCF files) gained from the high-throughput whole-exome Illumina sequencing of 23 suicide victims - all of whom had suffered from major depressive disorder - and 21 control patients to a depth of at least 40-fold coverage in both cohorts. We identified ~130,000 variants per sample and altogether 442,270 unique variants in the cohort of 44 samples. To our best knowledge, this is the first whole-exome sequencing dataset from suicide victims. We expect that this dataset provides useful information for genomic studies of suicide and depression, and also for the analysis of the Hungarian population.
View details for PubMedID 30720799
-
Whole-exome sequencing data of suicide victims who had suffered from major depressive disorder
SCIENTIFIC DATA
2019; 6
View details for DOI 10.1038/sdata.2019.10
View details for Web of Science ID 000458496800002
-
Smooth Muscle Contact Drives Endothelial Regeneration by BMPR2-Notch1-Mediated Metabolic and Epigenetic Changes
CIRCULATION RESEARCH
2019; 124 (2): 211–24
View details for DOI 10.1161/CIRCRESAHA.118.313374
View details for Web of Science ID 000469341100015
-
Activation of PDGF pathway links LMNA mutation to dilated cardiomyopathy.
Nature
2019
Abstract
Lamin A/C (LMNA) is one of the most frequently mutated genes associated with dilated cardiomyopathy (DCM). DCM related to mutations in LMNA is a common inherited cardiomyopathy that is associated with systolic dysfunction and cardiac arrhythmias. Here we modelled the LMNA-related DCM in vitro using patient-specific induced pluripotent stem cell-derived cardiomyocytes (iPSC-CMs). Electrophysiological studies showed that the mutant iPSC-CMs displayed aberrant calcium homeostasis that led to arrhythmias at the single-cell level. Mechanistically, we show that the platelet-derived growth factor (PDGF) signalling pathway is activated in mutant iPSC-CMs compared to isogenic control iPSC-CMs. Conversely, pharmacological and molecular inhibition of the PDGF signalling pathway ameliorated the arrhythmic phenotypes of mutant iPSC-CMs in vitro. Taken together, our findings suggest that the activation of the PDGF pathway contributes to the pathogenesis of LMNA-related DCM and point to PDGF receptor-β (PDGFRB) as a potential therapeutic target.
View details for DOI 10.1038/s41586-019-1406-x
View details for PubMedID 31316208
-
Phenotypically-Silent Bone Morphogenetic Protein Receptor 2 (Bmpr2) Mutations Predispose Rats to Inflammation-Induced Pulmonary Arterial Hypertension by Enhancing The Risk for Neointimal Transformation.
Circulation
2019
Abstract
Bmpr2 mutations are critical risk factors for hereditary pulmonary arterial hypertension (hPAH) with approximately 20% of carriers developing disease. There is an unmet medical need to understand how environmental factors, such as inflammation, render Bmpr2 mutants susceptible to PAH. Overexpressing 5-lipoxygenase (5-LO) provokes lung inflammation and transient PAH in Bmpr2+/- mice. Accordingly, 5-LO and its metabolite, leukotriene B4 (LTB4), are candidates for the 'second hit'. The purpose of this study was to determine how 5-LO-mediated pulmonary inflammation synergized with phenotypically-silent Bmpr2 defects to elicit significant pulmonary vascular disease in rats. Monoallelic Bmpr2 mutant rats were generated and found phenotypically normal for up to one year of observation. To evaluate whether a second hit would elicit disease, animals were exposed to 5-LO-expressing adenovirus (AdAlox5), monocrotaline, SU5416, SU5416 with chronic hypoxia or chronic hypoxia alone. Bmpr2-mutant hPAH patient samples were assessed for neointimal 5-LO expression. Pulmonary artery endothelial cells (PAECs) with impaired BMPR2 signaling were exposed to increased 5-LO-mediated inflammation and were assessed for phenotypic and transcriptomic changes. Lung inflammation, induced by intratracheal delivery of AdAlox5, elicited severe PAH with intimal remodeling in Bmpr2+/- rats but not in their wild-type littermates. Neointimal lesions in the diseased Bmpr2+/- rats gained endogenous 5-LO expression associated with elevated LTB4 biosynthesis. Bmpr2-mutant hPAH patients similarly expressed 5-LO in the neointimal cells. In vitro, BMPR2 deficiency, compounded by 5-LO-mediated inflammation, generated apoptosis-resistant, and proliferative PAECs with mesenchymal characteristics. These transformed cells expressed nuclear envelope-localized 5-LO consistent with induced LTB4 production, as well as a transcriptomic signature similar to clinical disease, including upregulated NF-κB, IL-6, and TGF-β signaling pathways. The reversal of PAH and vasculopathy in Bmpr2 mutants by TGF-β antagonism suggests that TGF-β is critical for neointimal transformation. In a new 'two-hit' model of disease, lung inflammation induced severe PAH pathology in Bmpr2+/- rats. Endothelial transformation required the activation of canonical and noncanonical TGF-β signaling pathways and was characterized by 5-LO nuclear envelope translocation with enhanced LTB4 production. This study offers one explanation of how an environmental injury unleashes the destructive potential of an otherwise-silent genetic mutation.
View details for DOI 10.1161/CIRCULATIONAHA.119.040629
View details for PubMedID 31462075
-
Macrophage de novo NAD(+) synthesis specifies immune function in aging and inflammation
NATURE IMMUNOLOGY
2019; 20 (1): 50-+
View details for DOI 10.1038/s41590-018-0255-3
View details for Web of Science ID 000456273900014
-
Progress on Identifying and Characterizing the Human Proteome: 2019 Metrics from the HUPO Human Proteome Project.
Journal of proteome research
2019
Abstract
The Human Proteome Project (HPP) annually reports on progress made throughout the field in credibly identifying and characterizing the complete human protein parts list and making proteomics an integral part of multiomics studies in medicine and the life sciences. NeXtProt release 2019-01-11 contains 17 694 proteins with strong protein-level evidence (PE1), compliant with HPP Guidelines for Interpretation of MS Data v2.1; these represent 89% of all 19 823 neXtProt predicted coding genes (all PE1,2,3,4 proteins), up from 17 470 one year earlier. Conversely, the number of neXtProt PE2,3,4 proteins, termed the "missing proteins" (MPs), has been reduced from 2949 to 2129 since 2016 through efforts throughout the community, including the chromosome-centric HPP. PeptideAtlas is the source of uniformly reanalyzed raw mass spectrometry data for neXtProt; PeptideAtlas added 495 canonical proteins between 2018 and 2019, especially from studies designed to detect hard-to-identify proteins. Meanwhile, the Human Protein Atlas has released version 18.1 with immunohistochemical evidence of expression of 17 000 proteins and survival plots as part of the Pathology Atlas. Many investigators apply multiplexed SRM-targeted proteomics for quantitation of organ-specific popular proteins in studies of various human diseases. The 19 teams of the Biology and Disease-driven B/D-HPP published a total of 160 publications in 2018, bringing proteomics to a broad array of biomedical research.
View details for DOI 10.1021/acs.jproteome.9b00434
View details for PubMedID 31430157
-
MACHINE LEARNING ANALYSIS OF ULTRA-DEEP WHOLE-GENOME SEQUENCING IN HUMAN BRAIN REVEALS SOMATIC GENOMIC RETROTRANSPOSITION IN GLIA AS WELL AS IN NEURONS
ELSEVIER. 2019: 1240
View details for DOI 10.1016/j.euroneuro.2018.08.316
View details for Web of Science ID 000477708400398
-
Global metabolic profiling to model biological processes of aging in twins.
Aging cell
2019: e13073
Abstract
Aging is intimately linked to system-wide metabolic changes that can be captured in blood. Understanding biological processes of aging in humans could help maintain a healthy aging trajectory and promote longevity. We performed untargeted plasma metabolomics quantifying 770 metabolites on a cross-sectional cohort of 268 healthy individuals including 125 twin pairs covering human lifespan (from 6 months to 82 years). Unsupervised clustering of metabolic profiles revealed 6 main aging trajectories throughout life that were associated with key metabolic pathways such as progestin steroids, xanthine metabolism, and long-chain fatty acids. A random forest (RF) model was successful to predict age in adult subjects (≥16 years) using 52 metabolites (R² = .97). Another RF model selected 54 metabolites to classify pediatric and adult participants (out-of-bag error = 8.58%). These RF models in combination with correlation network analysis were used to explore biological processes of healthy aging. The models highlighted established metabolites, like steroids, amino acids, and free fatty acids as well as novel metabolites and pathways. Finally, we show that metabolic profiles of twins become more dissimilar with age which provides insights into nongenetic age-related variability in metabolic profiles in response to environmental exposure.
View details for DOI 10.1111/acel.13073
View details for PubMedID 31746094
-
Smart Diaphragm Study: Multi-omics profiling and cervical device measurements during pregnancy
MOSBY-ELSEVIER. 2019: S649
View details for DOI 10.1016/j.ajog.2018.11.1033
View details for Web of Science ID 000454249403169
-
Personalized Metabolomics.
Methods in molecular biology (Clifton, N.J.)
2019; 1978: 447–56
Abstract
The human metabolome is the cumulative product of ingested metabolites and those produced by the body and its microbiota. Together these metabolites can dynamically report on the health and disease state of an individual, as well as their response to drug treatments and other external perturbations. Profiling metabolites in human body fluids provides an opportunity to identify biomarkers and stratify patients for personalized treatments but requires the development of high-throughput approaches compatible with large cohort and longitudinal studies. Here we review in detail sample preparation and analytical liquid chromatography-mass spectrometry (LC-MS) methods to measure the broad chemical diversity of metabolites found in human plasma and urine.
View details for DOI 10.1007/978-1-4939-9236-2_27
View details for PubMedID 31119679
-
Analysis of the Complete Genome Sequence of a Novel, Pseudorabies Virus Strain Isolated in Southeast Europe.
The Canadian journal of infectious diseases & medical microbiology = Journal canadien des maladies infectieuses et de la microbiologie medicale
2019; 2019: 1806842
Abstract
Pseudorabies virus (PRV) is the causative agent of Aujeszky's disease giving rise to significant economic losses worldwide. Many countries have implemented national programs for the eradication of this virus. In this study, long-read sequencing was used to determine the nucleotide sequence of the genome of a novel PRV strain (PRV-MdBio) isolated in Serbia. In this study, a novel PRV strain was isolated and characterized. PRV-MdBio was found to exhibit similar growth properties to those of another wild-type PRV, the strain Kaplan. Single-molecule real-time (SMRT) sequencing has revealed that the new strain differs significantly in base composition even from strain Kaplan, to which it otherwise exhibits the highest similarity. We compared the genetic composition of PRV-MdBio to strain Kaplan and the China reference strain Ea and obtained that radical base replacements were the most common point mutations preceding conservative and silent mutations. We also found that the adaptation of PRV to cell culture does not lead to any tendentious genetic alteration in the viral genome. PRV-MdBio is a wild-type virus, which differs in base composition from other PRV strains to a relatively large extent.
View details for PubMedID 31093307
-
Heterogeneity in old fibroblasts is linked to variability in reprogramming and wound healing.
Nature
2019; 574 (7779): 553–58
Abstract
Age-associated chronic inflammation (inflammageing) is a central hallmark of ageing, but its influence on specific cells remains largely unknown. Fibroblasts are present in most tissues and contribute to wound healing. They are also the most widely used cell type for reprogramming to induced pluripotent stem (iPS) cells, a process that has implications for regenerative medicine and rejuvenation strategies. Here we show that fibroblast cultures from old mice secrete inflammatory cytokines and exhibit increased variability in the efficiency of iPS cell reprogramming between mice. Variability between individuals is emerging as a feature of old age, but the underlying mechanisms remain unknown. To identify drivers of this variability, we performed multi-omics profiling of fibroblast cultures from young and old mice that have different reprogramming efficiencies. This approach revealed that fibroblast cultures from old mice contain 'activated fibroblasts' that secrete inflammatory cytokines, and that the proportion of activated fibroblasts in a culture correlates with the reprogramming efficiency of that culture. Experiments in which conditioned medium was swapped between cultures showed that extrinsic factors secreted by activated fibroblasts underlie part of the variability between mice in reprogramming efficiency, and we have identified inflammatory cytokines, including TNF, as key contributors. Notably, old mice also exhibited variability in wound healing rate in vivo. Single-cell RNA-sequencing analysis identified distinct subpopulations of fibroblasts with different cytokine expression and signalling in the wounds of old mice with slow versus fast healing rates. Hence, a shift in fibroblast composition, and the ratio of inflammatory cytokines that they secrete, may drive the variability between mice in reprogramming in vitro and influence wound healing rate in vivo. This variability may reflect distinct stochastic ageing trajectories between individuals, and could help in developing personalized strategies to improve iPS cell generation and wound healing in elderly individuals.
View details for DOI 10.1038/s41586-019-1658-5
View details for PubMedID 31645721
-
Multiomics modeling of the immunome, transcriptome, microbiome, proteome and metabolome adaptations during human pregnancy.
Bioinformatics (Oxford, England)
2019; 35 (1): 95–103
Abstract
Motivation: Multiple biological clocks govern a healthy pregnancy. These biological mechanisms produce immunologic, metabolomic, proteomic, genomic and microbiomic adaptations during the course of pregnancy. Modeling the chronology of these adaptations during full-term pregnancy provides the frameworks for future studies examining deviations implicated in pregnancy-related pathologies including preterm birth and preeclampsia. Results: We performed a multiomics analysis of 51 samples from 17 pregnant women, delivering at term. The datasets included measurements from the immunome, transcriptome, microbiome, proteome and metabolome of samples obtained simultaneously from the same patients. Multivariate predictive modeling using the Elastic Net (EN) algorithm was used to measure the ability of each dataset to predict gestational age. Using stacked generalization, these datasets were combined into a single model. This model not only significantly increased predictive power by combining all datasets, but also revealed novel interactions between different biological modalities. Future work includes expansion of the cohort to preterm-enriched populations and in vivo analysis of immune-modulating interventions based on the mechanisms identified. Availability and implementation: Datasets and scripts for reproduction of results are available through: https://nalab.stanford.edu/multiomics-pregnancy/. Supplementary information: Supplementary data are available at Bioinformatics online.
View details for PubMedID 30561547
-
A machine-compiled database of genome-wide association studies.
Nature communications
2019; 10 (1): 3341
Abstract
Tens of thousands of genotype-phenotype associations have been discovered to date, yet not all of them are easily accessible to scientists. Here, we describe GWASkb, a machine-compiled knowledge base of genetic associations collected from the scientific literature using automated information extraction algorithms. Our information extraction system helps curators by automatically collecting over 6,000 associations from open-access publications with an estimated recall of 60-80% and with an estimated precision of 78-94% (measured relative to existing manually curated knowledge bases). This system represents a fully automated GWAS curation effort and is made possible by a paradigm for constructing machine learning systems called data programming. Our work represents a step towards making the curation of scientific literature more efficient using automated systems.
View details for DOI 10.1038/s41467-019-11026-x
View details for PubMedID 31350405
-
Multiomics modeling of the immunome, transcriptome, microbiome, proteome and metabolome adaptations during human pregnancy
BIOINFORMATICS
2019; 35 (1): 95–103
View details for DOI 10.1093/bioinformatics/bty537
View details for Web of Science ID 000459313900012
-
Mitigation of off-target toxicity in CRISPR-Cas9 screens for essential non-coding elements.
Nature communications
2019; 10 (1): 4063
Abstract
Pooled CRISPR-Cas9 screens are a powerful method for functionally characterizing regulatory elements in the non-coding genome, but off-target effects in these experiments have not been systematically evaluated. Here, we investigate Cas9, dCas9, and CRISPRi/a off-target activity in screens for essential regulatory elements. The sgRNAs with the largest effects in genome-scale screens for essential CTCF loop anchors in K562 cells were not single guide RNAs (sgRNAs) that disrupted gene expression near the on-target CTCF anchor. Rather, these sgRNAs had high off-target activity that, while only weakly correlated with absolute off-target site number, could be predicted by the recently developed GuideScan specificity score. Screens conducted in parallel with CRISPRi/a, which do not induce double-stranded DNA breaks, revealed that a distinct set of off-targets also cause strong confounding fitness effects with these epigenome-editing tools. Promisingly, filtering of CRISPRi libraries using GuideScan specificity scores removed these confounded sgRNAs and enabled identification of essential regulatory elements.
View details for DOI 10.1038/s41467-019-11955-7
View details for PubMedID 31492858
-
Understanding health disparities.
Journal of perinatology : official journal of the California Perinatal Association
2018
Abstract
Based upon our recent insights into the determinants of preterm birth, which is the leading cause of death in children under five years of age worldwide, we describe potential analytic frameworks that provide both a common understanding and, ultimately, the basis for effective, ameliorative action. Our research on preterm birth serves as an example that the framing of any human health condition is a result of complex interactions between the genome and the exposome. New discoveries of the basic biology of pregnancy, such as the complex immunological and signaling processes that dictate the health and length of gestation, have revealed a complexity in the interactions (current and ancestral) between genetic and environmental forces. Understanding of these relationships may help reduce disparities in preterm birth and guide productive research endeavors and, ultimately, effective clinical and public health interventions.
View details for PubMedID 30560947
-
Cross-Platform Comparison of Untargeted and Targeted Lipidomics Approaches on Aging Mouse Plasma.
Scientific reports
2018; 8 (1): 17747
Abstract
Lipidomics - the global assessment of lipids - can be performed using a variety of mass spectrometry (MS)-based approaches. However, choosing the optimal approach in terms of lipid coverage, robustness and throughput can be a challenging task. Here, we compare a novel targeted quantitative lipidomics platform known as the Lipidyzer to a conventional untargeted liquid chromatography (LC)-MS approach. We find that both platforms are efficient in profiling more than 300 lipids across 11 lipid classes in mouse plasma with precision and accuracy below 20% for most lipids. While the untargeted and targeted platforms detect similar numbers of lipids, the former identifies a broader range of lipid classes and can unambiguously identify all three fatty acids in triacylglycerols (TAG). Quantitative measurements from both approaches exhibit a median correlation coefficient (r) of 0.99 using a dilution series of deuterated internal standards and 0.71 using endogenous plasma lipids in the context of aging. Application of both platforms to plasma from aging mouse reveals similar changes in total lipid levels across all major lipid classes and in specific lipid species. Interestingly, TAG is the lipid class that exhibits the most changes with age, suggesting that TAG metabolism is particularly sensitive to the aging process in mice. Collectively, our data show that the Lipidyzer platform provides comprehensive profiling of the most prevalent lipids in plasma in a simple and automated manner.
View details for PubMedID 30532037
-
Progress on Identifying and Characterizing the Human Proteome: 2018 Metrics from the HUPO Human Proteome Project
JOURNAL OF PROTEOME RESEARCH
2018; 17 (12): 4031–41
Abstract
The Human Proteome Project (HPP) annually reports on progress throughout the field in credibly identifying and characterizing the human protein parts list and making proteomics an integral part of multiomics studies in medicine and the life sciences. NeXtProt release 2018-01-17, the baseline for this sixth annual HPP special issue of the Journal of Proteome Research, contains 17 470 PE1 proteins, 89% of all neXtProt predicted PE1-4 proteins, up from 17 008 in release 2017-01-23 and 13 975 in release 2012-02-24. Conversely, the number of neXtProt PE2,3,4 missing proteins has been reduced from 2949 to 2579 to 2186 over the past two years. Of the PE1 proteins, 16 092 are based on mass spectrometry results, and 1378 on other kinds of protein studies, notably protein-protein interaction findings. PeptideAtlas has 15 798 canonical proteins, up 625 over the past year, including 269 from SUMOylation studies. The largest reason for missing proteins is low abundance. Meanwhile, the Human Protein Atlas has released its Cell Atlas, Pathology Atlas, and updated Tissue Atlas, and is applying recommendations from the International Working Group on Antibody Validation. Finally, there is progress using the quantitative multiplex organ-specific popular proteins targeted proteomics approach in various disease categories.
View details for PubMedID 30099871
-
Transcriptomic study of Herpes simplex virus type-1 using full-length sequencing techniques.
Scientific data
2018; 5: 180266
Abstract
Herpes simplex virus type-1 (HSV-1) is a human pathogenic member of the Alphaherpesvirinae subfamily of herpesviruses. The HSV-1 genome is a large double-stranded DNA specifying about 85 protein coding genes. The latest surveys have demonstrated that the HSV-1 transcriptome is much more complex than it had been thought before. Here, we provide a long-read sequencing dataset, which was generated by using the RSII and Sequel systems from Pacific Biosciences (PacBio), as well as MinION sequencing system from Oxford Nanopore Technologies (ONT). This dataset contains 39,096 reads of inserts (ROIs) mapped to the HSV-1 genome (X14112) in RSII sequencing, while Sequel sequencing yielded 77,851 ROIs. The MinION cDNA sequencing altogether resulted in 158,653 reads, while the direct RNA-seq produced 16,516 reads. This dataset can be utilized for the identification of novel HSV RNAs and transcript isoforms, as well as for the comparison of the quality and length of the sequencing reads derived from the currently available long-read sequencing platforms. The various library preparation approaches can also be compared with each other.
View details for PubMedID 30480662
-
Transcriptomic study of Herpes simplex virus type-1 using full-length sequencing techniques
SCIENTIFIC DATA
2018; 5
View details for DOI 10.1038/sdata.2018.266
View details for Web of Science ID 000451565600001
-
Macrophage de novo NAD+ synthesis specifies immune function in aging and inflammation.
Nature immunology
2018
Abstract
Recent advances highlight a pivotal role for cellular metabolism in programming immune responses. Here, we demonstrate that cell-autonomous generation of nicotinamide adenine dinucleotide (NAD+) via the kynurenine pathway (KP) regulates macrophage immune function in aging and inflammation. Isotope tracer studies revealed that macrophage NAD+ derives substantially from KP metabolism of tryptophan. Genetic or pharmacological blockade of de novo NAD+ synthesis depleted NAD+, suppressed mitochondrial NAD+-dependent signaling and respiration, and impaired phagocytosis and resolution of inflammation. Innate immune challenge triggered upstream KP activation but paradoxically suppressed cell-autonomous NAD+ synthesis by limiting the conversion of downstream quinolinate to NAD+, a profile recapitulated in aging macrophages. Increasing de novo NAD+ generation in immune-challenged or aged macrophages restored oxidative phosphorylation and homeostatic immune responses. Thus, KP-derived NAD+ operates as a metabolic switch to specify macrophage effector responses. Breakdown of de novo NAD+ synthesis may underlie declining NAD+ levels and rising innate immune dysfunction in aging and age-associated diseases.
View details for PubMedID 30478397
-
Dynamic Transcriptome Profiling Dataset of Vaccinia Virus Obtained from Long-read Sequencing Techniques.
GigaScience
2018
Abstract
Background: Poxviruses are large DNA viruses infecting humans and animals. Vaccinia virus (VACV) has been applied as a live vaccine for immunization against smallpox, which was eradicated by 1980 as a result of worldwide vaccination. VACV is the prototype of poxviruses in the investigation of the molecular pathogenesis of the virus. Short-read sequencing methods have revolutionized transcriptomics; but, they are not efficient in distinguishing between the RNA isoforms and transcript overlaps. Long-read sequencing (LRS) is much better suited to solve these problems, and also allow direct RNA sequencing. Despite the scientific relevance of VACV, no LRS data have been generated for the viral transcriptome so far. Findings: For the deep characterization of the VACV RNA profile, various LRS platforms and library preparation approaches were applied. The raw reads were mapped to the VACV reference genome and also to the host (Chlorocebus sabaeus) genome. In this study, we applied the Pacific Biosciences RSII and Sequel platforms, which altogether resulted in 937,531 mapped reads of inserts (1.42 Gb), while we obtained 2,160,348 aligned reads (1.75 Gb) from the different library preparation methods, using the MinION device from Oxford Nanopore Technologies. Conclusions: By applying cutting-edge technologies, we were able to generate a large dataset that can serve as a valuable resource for the investigation of the dynamic VACV transcriptome, the virus-host interactions and RNA base modifications. These data can provide useful information for novel gene annotations in the VACV genome. Our dataset can also be applied for analyzing the currently available LRS platforms, library preparation methods and bioinformatics pipelines.
View details for PubMedID 30476066
-
Smooth Muscle Contact Drives Endothelial Regeneration by BMPR2-Notch1 Mediated Metabolic and Epigenetic Changes.
Circulation research
2018
Abstract
RATIONALE: Maintaining endothelial cells (EC) as a monolayer in the vessel wall depends on their metabolic state and gene expression profile, features influenced by contact with neighboring cells such as pericytes and smooth muscle cells (SMC). Failure to regenerate a normal EC monolayer in response to injury can result in occlusive neointima formation in diseases such as atherosclerosis and pulmonary arterial hypertension. OBJECTIVE: We investigated the nature and functional importance of contact-dependent communication between SMC and EC to maintain EC integrity. METHODS AND RESULTS: We found that in SMC and EC contact co-cultures, bone morphogenetic protein receptor 2 (BMPR2) is required by both cell types to produce collagen IV to activate integrin-linked kinase. This enzyme directs phospho c-Jun N-terminal kinase (p-JNK) to the EC membrane, where it stabilizes presenilin1 and releases Notch1 intracellular domain (N1ICD) to promote EC proliferation. This response is necessary for EC regeneration following carotid artery injury. It is deficient in EC-SMC Bmpr2 double heterozygous mice in association with reduced collagen IV production, decreased N1ICD and attenuated EC proliferation, but can be rescued by targeting N1ICD to EC. Deletion of EC-Notch1 in transgenic mice worsens hypoxia-induced pulmonary hypertension, in association with impaired EC regenerative function associated with loss of pre-capillary arteries. We further determined that N1ICD maintains EC proliferative capacity by increasing mitochondrial mass and by inducing the phosphofructokinase PFKFB3. ChIP-seq analyses showed that PFKFB3 is required for citrate-dependent histone acetylation (H3K27) at enhancer sites of genes regulated by the acetyl transferase p300, and by N1ICD or the N1ICD target MYC and necessary for EC proliferation and homeostasis. CONCLUSIONS: Thus, SMC-EC contact is required for activation of Notch1 by BMPR2, to coordinate metabolism with chromatin remodeling of genes that enable EC regeneration, to maintain monolayer integrity and vascular homeostasis in response to injury.
View details for PubMedID 30582451
-
Identification of phagocytosis regulators using magnetic genome-wide CRISPR screens.
Nature genetics
2018
Abstract
Phagocytosis is required for a broad range of physiological functions, from pathogen defense to tissue homeostasis, but the mechanisms required for phagocytosis of diverse substrates remain incompletely understood. Here, we developed a rapid magnet-based phenotypic screening strategy, and performed eight genome-wide CRISPR screens in human cells to identify genes regulating phagocytosis of distinct substrates. After validating select hits in focused miniscreens, orthogonal assays and primary human macrophages, we show that (1) the previously uncharacterized gene NHLRC2 is a central player in phagocytosis, regulating RhoA-Rac1 signaling cascades that control actin polymerization and filopodia formation, (2) very-long-chain fatty acids are essential for efficient phagocytosis of certain substrates and (3) the previously uncharacterized Alzheimer's disease-associated gene TM2D3 can preferentially influence uptake of amyloid-beta aggregates. These findings illuminate new regulators and core principles of phagocytosis, and more generally establish an efficient method for unbiased identification of cellular uptake mechanisms across diverse physiological and pathological contexts.
View details for PubMedID 30397336
-
Systematic Screening For Environmental And Behavioral Determinants Identifies Factors Detrimental to Skeletal Health
WILEY. 2018: 279
View details for Web of Science ID 000450475401405
-
Evaluation of whole exome sequencing as an alternative to BeadChip and whole genome sequencing in human population genetic analysis.
BMC genomics
2018; 19 (1): 778
Abstract
BACKGROUND: Understanding the underlying genetic structure of human populations is of fundamental interest to both biological and social sciences. Advances in high-throughput genotyping technology have markedly improved our understanding of global patterns of human genetic variation. The most widely used methods for collecting variant information at the DNA-level include whole genome sequencing, which remains costly, and the more economical solution of array-based techniques, as these are capable of simultaneously genotyping a pre-selected set of variable DNA sites in the human genome. The largest publicly accessible set of human genomic sequence data available today originates from exome sequencing that comprises around 1.2% of the whole genome (approximately 30 million base pairs). RESULTS: To unbiasedly compare the effect of SNP selection strategies in population genetic analysis we subsampled the variants of the same highly curated 1K Genome dataset to mimic genome, exome sequencing and array data in order to eliminate the effect of different chemistry and error profiles of these different approaches. Next we compared the application of the exome dataset to the array-based dataset and to the gold standard whole genome dataset using the same population genetic analysis methods. CONCLUSIONS: Our results draw attention to some of the inherent problems that arise from using pre-selected SNP sets for population genetic analysis. Additionally, we demonstrate that exome sequencing provides a better alternative to the array-based methods for population genetic analysis. In this study, we propose a strategy for unbiased variant collection from exome data and offer a bioinformatics protocol for proper data processing.
View details for PubMedID 30373510
-
Precision Medicine: Role of Proteomics in Changing Clinical Management and Care.
Journal of proteome research
2018
Abstract
It is now possible to collect large amounts of health-related data, which have the potential to transform healthcare. Proteomics, with its central position as downstream of genetics and epigenetic inputs and upstream of biochemical outputs and integrators of environmental signals, is well-positioned to contribute to health discoveries and management. We present our perspective on the role of proteomics and other Omics in precision health and medicine.
View details for PubMedID 30296097
-
Wearables and the medical revolution.
Personalized medicine
2018
Abstract
Wearable sensors are already impacting healthcare and medicine by enabling health monitoring outside of the clinic and prediction of health events. This paper reviews current and prospective wearable technologies and their progress toward clinical application. We describe technologies underlying common, commercially available wearable sensors and early-stage devices and outline research, when available, to support the use of these devices in healthcare. We cover applications in the following health areas: metabolic, cardiovascular and gastrointestinal monitoring; sleep, neurology, movement disorders and mental health; maternal, pre- and neo-natal care; and pulmonary health and environmental exposures. Finally, we discuss challenges associated with the adoption of wearable sensors in the current healthcare ecosystem and discuss areas for future research and development.
View details for PubMedID 30259801
-
Dual Platform Long-Read RNA-Sequencing Dataset of the Human Cytomegalovirus Lytic Transcriptome
FRONTIERS IN GENETICS
2018; 9
View details for DOI 10.3389/fgene.2018.00432
View details for Web of Science ID 000445797500001
-
Dual Platform Long-Read RNA-Sequencing Dataset of the Human Cytomegalovirus Lytic Transcriptome.
Frontiers in genetics
2018; 9: 432
View details for DOI 10.3389/fgene.2018.00432
View details for PubMedID 30319694
View details for PubMedCentralID PMC6170618
-
Disruption of mesoderm formation during cardiac differentiation due to developmental exposure to 13-cis-retinoic acid.
Scientific reports
2018; 8 (1): 12960
Abstract
13-cis-retinoic acid (isotretinoin, INN) is an oral pharmaceutical drug used for the treatment of skin acne, and is also a known teratogen. In this study, the molecular mechanisms underlying INN-induced developmental toxicity during early cardiac differentiation were investigated using both human induced pluripotent stem cells (hiPSCs) and human embryonic stem cells (hESCs). Pre-exposure of hiPSCs and hESCs to a sublethal concentration of INN did not influence cell proliferation and pluripotency. However, mesodermal differentiation was disrupted when INN was included in the medium during differentiation. Transcriptomic profiling by RNA-seq revealed that INN exposure leads to aberrant expression of genes involved in several signaling pathways that control early mesoderm differentiation, such as TGF-beta signaling. In addition, genome-wide chromatin accessibility profiling by ATAC-seq suggested that INN-exposure leads to enhanced DNA-binding of specific transcription factors (TFs), including HNF1B, SOX10 and NFIC, often in close spatial proximity to genes that are dysregulated in response to INN treatment. Altogether, these results identify potential molecular mechanisms underlying INN-induced perturbation during mesodermal differentiation in the context of cardiac development. This study further highlights the utility of human stem cells as an alternative system for investigating congenital diseases of newborns that arise as a result of maternal drug exposure during pregnancy.
View details for PubMedID 30154523
-
A Cloud-Based Metabolite and Chemical Prioritization System for the Biology/Disease-Driven Human Proteome Project.
Journal of proteome research
2018
Abstract
Targeted metabolomics and biochemical studies complement the ongoing investigations led by the Human Proteome Organization (HUPO) Biology/Disease-Driven Human Proteome Project (B/D-HPP). However, it is challenging to identify and prioritize metabolite and chemical targets. Literature-mining-based approaches have been proposed for target proteomics studies, but text mining methods for metabolite and chemical prioritization are hindered by a large number of synonyms and nonstandardized names of each entity. In this study, we developed a cloud-based literature mining and summarization platform that maps metabolites and chemicals in the literature to unique identifiers and summarizes the copublication trends of metabolites/chemicals and B/D-HPP topics using Protein Universal Reference Publication-Originated Search Engine (PURPOSE) scores. We successfully prioritized metabolites and chemicals associated with the B/D-HPP targeted fields and validated the results by checking against expert-curated associations and enrichment analyses. Compared with existing algorithms, our system achieved better precision and recall in retrieving chemicals related to B/D-HPP focused areas. Our cloud-based platform enables queries on all biological terms in multiple species, which will contribute to B/D-HPP and targeted metabolomics/chemical studies.
View details for PubMedID 30094994
-
Long-Read Sequencing Revealed an Extensive Transcript Complexity in Herpesviruses
FRONTIERS IN GENETICS
2018; 9
View details for DOI 10.3389/fgene.2018.00259
View details for Web of Science ID 000438979700003
-
Long-Read Sequencing Revealed an Extensive Transcript Complexity in Herpesviruses.
Frontiers in genetics
2018; 9: 259
Abstract
Long-read sequencing (LRS) techniques are very recent advancements, but they have already been used for transcriptome research in all of the three subfamilies of herpesviruses. These techniques have multiplied the number of known transcripts in each of the examined viruses. Meanwhile, they have revealed a so far hidden complexity of the herpesvirus transcriptome with the discovery of a large number of novel RNA molecules, including coding and non-coding RNAs, as well as transcript isoforms, and polycistronic RNAs. Additionally, LRS techniques have uncovered an intricate meshwork of transcriptional overlaps between adjacent and distally located genes. Here, we review the contribution of LRS to herpesvirus transcriptomics and present the complexity revealed by this technology, while also discussing the functional significance of this phenomenon.
View details for DOI 10.3389/fgene.2018.00259
View details for PubMedID 30065753
View details for PubMedCentralID PMC6056645
-
An integrated global regulatory network of hematopoietic precursor cell self-renewal and differentiation
INTEGRATIVE BIOLOGY
2018; 10 (7): 390–405
Abstract
Systematic study of the regulatory mechanisms of Hematopoietic Stem Cell and Progenitor Cell (HSPC) self-renewal is fundamentally important for understanding hematopoiesis and for manipulating HSPCs for therapeutic purposes. Previously, we have characterized gene expression and identified important transcription factors (TFs) regulating the switch between self-renewal and differentiation in a multipotent Hematopoietic Progenitor Cell (HPC) line, EML (Erythroid, Myeloid, and Lymphoid) cells. Herein, we report binding maps for additional TFs (SOX4 and STAT3) by using chromatin immunoprecipitation (ChIP)-Sequencing, to address the underlying mechanisms regulating self-renewal properties of lineage-CD34+ subpopulation (Lin-CD34+ EML cells). Furthermore, we applied the Assay for Transposase Accessible Chromatin (ATAC)-Sequencing to globally identify the open chromatin regions associated with TF binding in the self-renewing Lin-CD34+ EML cells. Mass spectrometry (MS) was also used to quantify protein relative expression levels. Finally, by integrating the protein-protein interaction database, we built an expanded transcriptional regulatory and interaction network. We found that MAPK (Mitogen-activated protein kinase) pathway and TGF-β/SMAD signaling pathway components were highly enriched among the binding targets of these TFs in Lin-CD34+ EML cells. The present study integrates regulatory information at multiple levels to paint a more comprehensive picture of the HSPC self-renewal mechanisms.
View details for PubMedID 29892750
-
High Throughput Sequencing and Assessing Disease Risk.
Cold Spring Harbor perspectives in medicine
2018
Abstract
High-throughput sequencing has dramatically improved our ability to determine and diagnose the underlying causes of human disease. The use of whole-genome and whole-exome sequencing has facilitated faster and more cost-effective identification of new genes implicated in Mendelian disease. It has also improved our ability to identify disease-causing mutations for Mendelian diseases whose associated genes are already known. These benefits apply not only in cases in which the objective is to assess genetic disease risk in adults and children, but also for prenatal genetic testing and embryonic testing. High-throughput sequencing has also impacted our ability to assess risk for complex diseases and will likely continue to influence this area of disease research as more and more individuals undergo sequencing and we better understand the significance of variation, both rare and common, across the genome. Through these activities, high-throughput sequencing has the potential to revolutionize medicine.
View details for PubMedID 29959131
-
Transcriptome-wide survey of pseudorabies virus using next- and third-generation sequencing platforms
SCIENTIFIC DATA
2018; 5: 180119
Abstract
Pseudorabies virus (PRV) is an alphaherpesvirus of swine. PRV has a large double-stranded DNA genome and, as the latest investigations have revealed, a very complex transcriptome. Here, we present a large RNA-Seq dataset, derived from both short- and long-read sequencing. The dataset contains 1.3 million 100 bp paired-end reads that were obtained from the Illumina random-primed libraries, as well as 10 million 50 bp single-end reads generated by the Illumina polyA-seq. The Pacific Biosciences RSII non-amplified method yielded 57,021 reads of inserts (ROIs) aligned to the viral genome, the amplified method resulted in 158,396 PRV-specific ROIs, while we obtained 12,555 ROIs using the Sequel platform. The Oxford Nanopore's MinION device generated 44,006 reads using their regular cDNA-sequencing method, whereas 29,832 and 120,394 reads were produced by using the direct RNA-sequencing and the Cap-selection protocols, respectively. The raw reads were aligned to the PRV reference genome (KJ717942.1). Our provided dataset can be used to compare different sequencing approaches, library preparation methods, as well as for validation and testing bioinformatic pipelines.
View details for PubMedID 29917014
-
Integrative omics for health and disease
NATURE REVIEWS GENETICS
2018; 19 (5): 299–310
Abstract
Advances in omics technologies - such as genomics, transcriptomics, proteomics and metabolomics - have begun to enable personalized medicine at an extraordinarily detailed molecular level. Individually, these technologies have contributed medical advances that have begun to enter clinical practice. However, each technology individually cannot capture the entire biological complexity of most human diseases. Integration of multiple technologies has emerged as an approach to provide a more comprehensive view of biology and disease. In this Review, we discuss the potential for combining diverse types of data and the utility of this approach in human health and disease. We provide examples of data integration to understand, diagnose and inform treatment of diseases, including rare and common diseases as well as cancer and transplant biology. Finally, we discuss technical and other challenges to clinical implementation of integrative omics.
View details for PubMedID 29479082
View details for PubMedCentralID PMC5990367
-
Personal Omics for Precision Health
CIRCULATION RESEARCH
2018; 122 (9): 1169–71
View details for PubMedID 29700064
-
Fast Metagenomic Binning via Hashing and Bayesian Clustering
JOURNAL OF COMPUTATIONAL BIOLOGY
2018
Abstract
We introduce GATTACA, a framework for fast unsupervised binning of metagenomic contigs. Similar to recent approaches, GATTACA clusters contigs based on their coverage profiles across a large cohort of metagenomic samples; however, unlike previous methods that rely on read mapping, GATTACA quickly estimates these profiles from kmer counts stored in a compact index. This approach can result in over an order of magnitude speedup, while matching the accuracy of earlier methods on synthetic and real data benchmarks. It also provides a way to index metagenomic samples (e.g., from public repositories such as the Human Microbiome Project) offline once and reuse them across experiments; furthermore, the small size of the sample indices allows them to be easily transferred and stored. Leveraging the MinHash technique, GATTACA also provides an efficient way to identify publicly available metagenomic data that can be incorporated into the set of reference metagenomes to further improve binning accuracy. Thus, enabling easy indexing and reuse of publicly available metagenomic data sets, GATTACA makes accurate metagenomic analyses accessible to a much wider range of researchers.
View details for PubMedID 29658784
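The abstract above mentions the MinHash technique for finding related metagenomic data. As a rough illustration only, the following sketch estimates the Jaccard similarity of two sequences from the k-mer sets they contain. It is not GATTACA's implementation: the function names, the choice of hash function, and the default values of k and the signature length are all assumptions made for this example.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def _hash(kmer, seed):
    """64-bit hash of a k-mer; the seed simulates independent hash functions."""
    data = f"{seed}:{kmer}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(seq, k=4, num_hashes=64):
    """For each seeded hash, keep the minimum value over the k-mer set.

    Assumes len(seq) >= k; an empty k-mer set raises ValueError.
    """
    ks = kmers(seq, k)
    return [min(_hash(km, seed) for km in ks) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature slots estimates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def exact_jaccard(a, b, k=4):
    """Exact Jaccard similarity of the two k-mer sets, for comparison."""
    ka, kb = kmers(a, k), kmers(b, k)
    return len(ka & kb) / len(ka | kb)
```

Comparing `estimate_jaccard(minhash_signature(a), minhash_signature(b))` with `exact_jaccard(a, b)` on a few sequences shows the trade-off: the signature has a fixed size regardless of sequence length, and the estimate becomes more accurate as `num_hashes` grows.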
-
Distinct transcriptomic and exomic abnormalities within myelodysplastic syndrome marrow cells.
Leukemia & lymphoma
2018: 1-11
Abstract
To provide biologic insights into mechanisms underlying myelodysplastic syndromes (MDS) we evaluated the CD34+ marrow cells transcriptome using high-throughput RNA sequencing (RNA-Seq). We demonstrated significant differential gene expression profiles (GEPs) between MDS and normal and identified 41 disease classifier genes. Additionally, two main clusters of GEPs distinguished patients based on their major clinical features, particularly between those whose disease remained stable versus patients who transformed into acute myeloid leukemia within 12 months. The genes whose expression was associated with disease outcome were involved in functional pathways and biologic processes highly relevant for MDS. Combined with exomic analysis we identified differential isoform usage of genes in MDS mutational subgroups, with consequent dysregulation of distinct biologic functions. This combination of clinical, transcriptomic and exomic findings provides valuable understanding of mechanisms underlying MDS and its progression to a more aggressive stage and also facilitates prognostic characterization of MDS patients.
View details for DOI 10.1080/10428194.2018.1452210
View details for PubMedID 29616851
-
A global transcriptional network connecting noncoding mutations to changes in tumor gene expression
NATURE GENETICS
2018; 50 (4): 613-+
Abstract
Although cancer genomes are replete with noncoding mutations, the effects of these mutations remain poorly characterized. Here we perform an integrative analysis of 930 tumor whole genomes and matched transcriptomes, identifying a network of 193 noncoding loci in which mutations disrupt target gene expression. These 'somatic eQTLs' (expression quantitative trait loci) are frequently mutated in specific cancer tissues, and the majority can be validated in an independent cohort of 3,382 tumors. Among these, we find that the effects of noncoding mutations on DAAM1, MTG2 and HYI transcription are recapitulated in multiple cancer cell lines and that increasing DAAM1 expression leads to invasive cell migration. Collectively, the noncoding loci converge on a set of core pathways, permitting a classification of tumors into pathway-based subtypes. The somatic eQTL network is disrupted in 88% of tumors, suggesting widespread impact of noncoding mutations in cancer.
View details for PubMedID 29610481
View details for PubMedCentralID PMC5893414
-
NF90/ILF3 is a transcription factor that promotes proliferation over differentiation by hierarchical regulation in K562 erythroleukemia cells
PLOS ONE
2018; 13 (3): e0193126
Abstract
NF90 and splice variant NF110 are DNA- and RNA-binding proteins encoded by the Interleukin enhancer-binding factor 3 (ILF3) gene that have been established to regulate RNA splicing, stabilization and export. The roles of NF90 and NF110 in regulating transcription as chromatin-interacting proteins have not been comprehensively characterized. Here, chromatin immunoprecipitation followed by deep sequencing (ChIP-seq) identified 9,081 genomic sites specifically occupied by NF90/NF110 in K562 cells. One third of NF90/NF110 peaks occurred at promoters of annotated genes. NF90/NF110 occupancy colocalized with chromatin marks associated with active promoters and strong enhancers. Comparison with 150 ENCODE ChIP-seq experiments revealed that NF90/NF110 clustered with transcription factors exhibiting preference for promoters over enhancers (POLR2A, MYC, YY1). Differential gene expression analysis following shRNA knockdown of NF90/NF110 in K562 cells revealed that NF90/NF110 activates transcription factors that drive growth and proliferation (EGR1, MYC), while attenuating differentiation along the erythroid lineage (KLF1). NF90/NF110 associates with chromatin to hierarchically regulate transcription factors that promote proliferation and suppress differentiation.
View details for PubMedID 29590119
-
Circular DNA elements of chromosomal origin are common in healthy human somatic tissue
NATURE COMMUNICATIONS
2018; 9: 1069
Abstract
The human genome is generally organized into stable chromosomes, and only tumor cells are known to accumulate kilobase (kb)-sized extrachromosomal circular DNA elements (eccDNAs). However, it must be expected that kb eccDNAs exist in normal cells as a result of mutations. Here, we purify and sequence eccDNAs from muscle and blood samples from 16 healthy men, detecting ~100,000 unique eccDNA types from 16 million nuclei. Half of these structures carry genes or gene fragments and the majority are smaller than 25 kb. Transcription from eccDNAs suggests that eccDNAs reside in nuclei and recurrence of certain eccDNAs in several individuals implies DNA circularization hotspots. Gene-rich chromosomes contribute to more eccDNAs per megabase and the most transcribed protein-coding gene in muscle, TTN (titin), provides the most eccDNAs per gene. Thus, somatic genomes are rich in chromosome-derived eccDNAs that may influence phenotypes through altered gene copy numbers and transcription of full-length or truncated genes.
View details for PubMedID 29540679
-
An Integrated Understanding of the Rapid Metabolic Benefits of a Carbohydrate-Restricted Diet on Hepatic Steatosis in Humans
CELL METABOLISM
2018; 27 (3): 559-+
Abstract
A carbohydrate-restricted diet is a widely recommended intervention for non-alcoholic fatty liver disease (NAFLD), but a systematic perspective on the multiple benefits of this diet is lacking. Here, we performed a short-term intervention with an isocaloric low-carbohydrate diet with increased protein content in obese subjects with NAFLD and characterized the resulting alterations in metabolism and the gut microbiota using a multi-omics approach. We observed rapid and dramatic reductions of liver fat and other cardiometabolic risk factors paralleled by (1) marked decreases in hepatic de novo lipogenesis; (2) large increases in serum β-hydroxybutyrate concentrations, reflecting increased mitochondrial β-oxidation; and (3) rapid increases in folate-producing Streptococcus and serum folate concentrations. Liver transcriptomic analysis on biopsy samples from a second cohort revealed downregulation of the fatty acid synthesis pathway and upregulation of folate-mediated one-carbon metabolism and fatty acid oxidation pathways. Our results highlight the potential of exploring diet-microbiota interactions for treating NAFLD.
View details for PubMedID 29456073
-
Biallelic Mutations in ATP5F1D, which Encodes a Subunit of ATP Synthase, Cause a Metabolic Disorder
AMERICAN JOURNAL OF HUMAN GENETICS
2018; 102 (3): 494–504
Abstract
ATP synthase, H+ transporting, mitochondrial F1 complex, δ subunit (ATP5F1D; formerly ATP5D) is a subunit of mitochondrial ATP synthase and plays an important role in coupling proton translocation and ATP production. Here, we describe two individuals, each with homozygous missense variants in ATP5F1D, who presented with episodic lethargy, metabolic acidosis, 3-methylglutaconic aciduria, and hyperammonemia. Subject 1, homozygous for c.245C>T (p.Pro82Leu), presented with recurrent metabolic decompensation starting in the neonatal period, and subject 2, homozygous for c.317T>G (p.Val106Gly), presented with acute encephalopathy in childhood. Cultured skin fibroblasts from these individuals exhibited impaired assembly of F1FO ATP synthase and subsequent reduced complex V activity. Cells from subject 1 also exhibited a significant decrease in mitochondrial cristae. Knockdown of Drosophila ATPsynδ, the ATP5F1D homolog, in developing eyes and brains caused a near complete loss of the fly head, a phenotype that was fully rescued by wild-type human ATP5F1D. In contrast, expression of the ATP5F1D c.245C>T and c.317T>G variants rescued the head-size phenotype but recapitulated the eye and antennae defects seen in other genetic models of mitochondrial oxidative phosphorylation deficiency. Our data establish c.245C>T (p.Pro82Leu) and c.317T>G (p.Val106Gly) in ATP5F1D as pathogenic variants leading to a Mendelian mitochondrial disease featuring episodic metabolic decompensation.
View details for PubMedID 29478781
-
Full Genome Sequence of the Western Reserve Strain of Vaccinia Virus Determined by Third-Generation Sequencing
MICROBIOLOGY RESOURCE ANNOUNCEMENTS
2018; 6 (11)
Abstract
The vaccinia virus is a large, complex virus belonging to the Poxviridae family. Here, we report the complete, annotated genome sequence of the neurovirulent Western Reserve laboratory strain of this virus, which was sequenced on the Pacific Biosciences RS II and Oxford Nanopore MinION platforms.
View details for PubMedID 29545308
-
Applying genomics in heart transplantation
TRANSPLANT INTERNATIONAL
2018; 31 (3): 278–90
Abstract
While advances in patient care and immunosuppressive pharmacotherapies have increased the lifespan of heart allograft recipients, there are still significant comorbidities post-transplantation, and 5-year survival rates remain at approximately 70%. The last decade has seen massive strides in genomics and other omics fields, including transcriptomics, with many of these advances now starting to impact heart transplant clinical care. This review summarizes a number of the key advances in genomics which are relevant for heart transplant outcomes, and we highlight the translational potential that such knowledge may bring to patient care within the next decade.
View details for PubMedID 29363220
View details for PubMedCentralID PMC5990370
-
Multiplatform next-generation sequencing identifies novel RNA molecules and transcript isoforms of the endogenous retrovirus isolated from cultured cells
FEMS MICROBIOLOGY LETTERS
2018; 365 (5)
Abstract
In this study, we applied short- and long-read RNA sequencing techniques, as well as PCR analysis, to investigate the transcriptome of the porcine endogenous retrovirus (PERV) expressed from the cultured porcine kidney cell line PK-15. This analysis has revealed six novel transcripts and eight transcript isoforms, including five length and three splice variants. We were able to establish whether a deletion in a transcript is the result of the splicing of mRNAs or of genomic deletion in one of the PERV clones. Additionally, we re-annotated the formerly identified RNA molecules. Our analysis revealed a higher complexity of the PERV transcriptome than was previously believed.
View details for PubMedID 29361122
-
Multi-Platform Sequencing Approach Reveals a Novel Transcriptome Profile in Pseudorabies Virus
FRONTIERS IN MICROBIOLOGY
2018; 8
View details for DOI 10.3389/fmicb.2017.02708
View details for Web of Science ID 000422957600001
-
Omics AnalySIs System for PRecision Oncology (OASISPRO): a web-based omics analysis tool for clinical phenotype prediction
BIOINFORMATICS
2018; 34 (2): 319–20
Abstract
Precision oncology is an approach that accounts for individual differences to guide cancer management. Omics signatures have been shown to predict clinical traits for cancer patients. However, the vast amount of omics information poses an informatics challenge in systematically identifying patterns associated with health outcomes, and no general-purpose data-mining tool exists for physicians, medical researchers, and citizen scientists without significant training in programming and bioinformatics. To bridge this gap, we built the Omics AnalySIs System for PRecision Oncology (OASISPRO), a web-based system to mine the quantitative omics information from The Cancer Genome Atlas (TCGA). This system effectively visualizes patients' clinical profiles, executes machine-learning algorithms of choice on the omics data, and evaluates the prediction performance using held-out test sets. With this tool, we successfully identified genes strongly associated with tumor stage, and accurately predicted patients' survival outcomes in many cancer types, including mesothelioma and adrenocortical carcinoma. By identifying the links between omics and clinical phenotypes, this system will facilitate omics studies on precision cancer medicine and contribute to establishing personalized cancer treatment plans. This web-based tool is available at http://tinyurl.com/oasispro; source codes are available at http://tinyurl.com/oasisproSourceCode.
View details for PubMedID 28968749
View details for PubMedCentralID PMC5860203
-
Value of Circulating Cytokine Profiling During Submaximal Exercise Testing in Myalgic Encephalomyelitis/Chronic Fatigue Syndrome.
Scientific reports
2018; 8 (1): 2779
Abstract
Myalgic Encephalomyelitis or Chronic Fatigue Syndrome (ME/CFS) is a heterogeneous syndrome in which patients often experience severe fatigue and malaise following exertion. Immune and cardiovascular dysfunction have been postulated to play a role in the pathophysiology. We therefore examined whether cytokine profiling or cardiovascular testing following exercise would differentiate patients with ME/CFS. Twenty-four ME/CFS patients were matched to 24 sedentary controls and underwent cardiovascular and circulating immune profiling. Cardiovascular analysis included echocardiography, cardiopulmonary exercise and endothelial function testing. Cytokine and growth factor profiles were analyzed using a 51-plex Luminex bead kit at baseline and 18 hours following exercise. Cardiac structure and exercise capacity were similar between groups. Sparse partial least square discriminant analyses of cytokine profiles 18 hours post exercise offered the most reliable discrimination between ME/CFS and controls (κ = 0.62 (0.34, 0.84)). The most discriminatory cytokines post exercise were CD40L, platelet activator inhibitor, interleukin 1-β, interferon-α and CXCL1. In conclusion, cytokine profiling following exercise may help differentiate patients with ME/CFS from sedentary controls.
View details for PubMedID 29426834
-
Functional regulatory mechanism of smooth muscle cell-restricted LMOD1 coronary artery disease locus.
PLoS genetics
2018; 14 (11): e1007755
Abstract
Recent genome-wide association studies (GWAS) have identified multiple new loci which appear to alter coronary artery disease (CAD) risk via arterial wall-specific mechanisms. One of the annotated genes encodes LMOD1 (Leiomodin 1), a member of the actin filament nucleator family that is highly enriched in smooth muscle-containing tissues such as the artery wall. However, it is still unknown whether LMOD1 is the causal gene at this locus and also how the associated variants alter LMOD1 expression/function and CAD risk. Using epigenomic profiling we recently identified a non-coding regulatory variant, rs34091558, which is in tight linkage disequilibrium (LD) with the lead CAD GWAS variant, rs2820315. Herein we demonstrate, through expression quantitative trait loci (eQTL) and statistical fine-mapping in GTEx, STARNET, and human coronary artery smooth muscle cell (HCASMC) datasets, that rs34091558 is the top regulatory variant for LMOD1 in vascular tissues. Position weight matrix (PWM) analyses identify the protective allele rs34091558-TA to form a conserved Forkhead box O3 (FOXO3) binding motif, which is disrupted by the risk allele rs34091558-A. FOXO3 chromatin immunoprecipitation and reporter assays show reduced FOXO3 binding and LMOD1 transcriptional activity by the risk allele, consistent with effects of FOXO3 downregulation on LMOD1. LMOD1 knockdown results in increased proliferation and migration and decreased cell contraction in HCASMC, and immunostaining in atherosclerotic lesions in the SMC lineage tracing reporter mouse supports a key role for LMOD1 in maintaining the differentiated SMC phenotype. These results provide compelling functional evidence that genetic variation is associated with dysregulated LMOD1 expression/function in SMCs, together contributing to the heritable risk for CAD.
View details for PubMedID 30444878
-
SETD7 Drives Cardiac Lineage Commitment through Stage-Specific Transcriptional Activation.
Cell stem cell
2018; 22 (3): 428–44.e5
Abstract
Cardiac development requires coordinated and large-scale rearrangements of the epigenome. The roles and precise mechanisms through which specific epigenetic modifying enzymes control cardiac lineage specification, however, remain unclear. Here we show that the H3K4 methyltransferase SETD7 controls cardiac differentiation by reading H3K36 marks independently of its enzymatic activity. Through chromatin immunoprecipitation sequencing (ChIP-seq), we found that SETD7 targets distinct sets of genes to drive their stage-specific expression during cardiomyocyte differentiation. SETD7 associates with different co-factors at these stages, including SWI/SNF chromatin-remodeling factors during mesodermal formation and the transcription factor NKX2.5 in cardiac progenitors to drive their differentiation. Further analyses revealed that SETD7 binds methylated H3K36 in the bodies of its target genes to facilitate RNA polymerase II (Pol II)-dependent transcription. Moreover, abnormal SETD7 expression impairs functional attributes of terminally differentiated cardiomyocytes. Together, these results reveal how SETD7 acts at sequential steps in cardiac lineage commitment, and they provide insights into crosstalk between dynamic epigenetic marks and chromatin-modifying enzymes.
View details for PubMedID 29499155
-
How many human proteoforms are there?
Nature chemical biology
2018; 14 (3): 206–14
Abstract
Despite decades of accumulated knowledge about proteins and their post-translational modifications (PTMs), numerous questions remain regarding their molecular composition and biological function. One of the most fundamental queries is the extent to which the combinations of DNA-, RNA- and PTM-level variations explode the complexity of the human proteome. Here, we outline what we know from current databases and measurement strategies including mass spectrometry-based proteomics. In doing so, we examine prevailing notions about the number of modifications displayed on human proteins and how they combine to generate the protein diversity underlying health and disease. We frame central issues regarding determination of protein-level variation and PTMs, including some paradoxes present in the field today. We use this framework to assess existing data and to ask the question, "How many distinct primary structures of proteins (proteoforms) are created from the 20,300 human genes?" We also explore prospects for improving measurements to better regularize protein-level biology and efficiently associate PTMs to function and phenotype.
View details for PubMedID 29443976
-
Distinct Transcriptomic and Exomic Abnormalities Within Myelodysplastic Syndrome Marrow Cells
Leukemia & Lymphoma
2018: 1-11
View details for DOI 10.1080/10428194.2018.1452210
-
Long-read sequencing of the human cytomegalovirus transcriptome with the Pacific Biosciences RSII platform
SCIENTIFIC DATA
2017; 4: 170194
Abstract
Long-read RNA sequencing allows for the precise characterization of full-length transcripts, which makes it an indispensable tool in transcriptomics. The human cytomegalovirus (HCMV) genome was first sequenced in 1989, and although short-read sequencing studies have uncovered much of the complexity of its transcriptome, only a few of its transcripts have been fully annotated. We hereby present a long-read RNA sequencing dataset of HCMV-infected human lung fibroblast cells sequenced by the Pacific Biosciences RSII platform. Seven SMRT cells were sequenced using oligo(dT) primers to reverse transcribe poly(A)-selected RNA molecules and one library was prepared using random primers for the reverse transcription of the rRNA-depleted sample. Our dataset contains 122,636 human and 33,086 viral (HCMV strain Towne) reads. The described data include raw and processed sequencing files, and combined with other datasets, they can be used to validate transcriptome analysis tools, to compare library preparation methods, to test base calling algorithms or to identify genetic variants.
View details for PubMedID 29257134
-
Challenges and recommendations for epigenomics in precision health
NATURE BIOTECHNOLOGY
2017; 35 (12): 1128–32
View details for PubMedID 29220033
-
Cloud-based interactive analytics for terabytes of genomic variants data.
Bioinformatics (Oxford, England)
2017; 33 (23): 3709-3715
Abstract
Large scale genomic sequencing is now widely used to decipher questions in diverse realms such as biological function, human diseases, evolution, ecosystems, and agriculture. With the quantity and diversity these data harbor, a robust and scalable data handling and analysis solution is desired. We present interactive analytics using a cloud-based columnar database built on Dremel to perform information compression, comprehensive quality controls, and biological information retrieval in large volumes of genomic data. We demonstrate such Big Data computing paradigms can provide orders of magnitude faster turnaround for common genomic analyses, transforming long-running batch jobs submitted via a Linux shell into questions that can be asked from a web browser in seconds. Using this method, we assessed a study population of 475 deeply sequenced human genomes for genomic call rate, genotype and allele frequency distribution, variant density across the genome, and pharmacogenomic information. Our analysis framework is implemented in Google Cloud Platform and BigQuery. Codes are available at https://github.com/StanfordBioinformatics/mvp_aaa_codelabs. Contact: cuiping@stanford.edu or ptsao@stanford.edu. Supplementary data are available at Bioinformatics online.
View details for DOI 10.1093/bioinformatics/btx468
View details for PubMedID 28961771
View details for PubMedCentralID PMC5860318
-
Long-Read Sequencing of Human Cytomegalovirus Transcriptome Reveals RNA Isoforms Carrying Distinct Coding Potentials
SCIENTIFIC REPORTS
2017; 7: 15989
Abstract
The human cytomegalovirus (HCMV) is a ubiquitous, human pathogenic herpesvirus. The complete viral genome is transcriptionally active during infection; however, a large part of its transcriptome has yet to be annotated. In this work, we applied the amplified isoform sequencing technique from Pacific Biosciences to characterize the lytic transcriptome of HCMV strain Towne varS. We developed a pipeline for transcript annotation using long-read sequencing data. We identified 248 transcriptional start sites, 116 transcriptional termination sites and 80 splicing events. Using this information, we have annotated 291 previously undescribed or only partially annotated transcript isoforms, including eight novel antisense transcripts and their isoforms, as well as a novel transcript (RS2) in the short repeat region, partially antisense to RS1. Similarly to other organisms, we discovered a high transcriptional diversity in HCMV, with many transcripts only slightly differing from one another. Comparing our transcriptome profiling results to an earlier ribosome footprint analysis, we have concluded that the majority of the transcripts contain multiple translationally active ORFs, and also that most isoforms contain unique combinations of ORFs. Based on these results, we propose that one important function of this transcriptional diversity may be to provide a regulatory mechanism at the level of translation.
View details for PubMedID 29167532
-
Transcriptomic and epigenomic differences in human induced pluripotent stem cells generated from six reprogramming methods
NATURE BIOMEDICAL ENGINEERING
2017; 1 (10): 826–37
Abstract
Many reprogramming methods can generate human induced pluripotent stem cells (hiPSCs) that closely resemble human embryonic stem cells (hESCs). This has led to assessments of how similar hiPSCs are to hESCs, by evaluating differences in gene expression, epigenetic marks and differentiation potential. However, all previous studies were performed using hiPSCs acquired from different laboratories, passage numbers, culturing conditions, genetic backgrounds and reprogramming methods, all of which may contribute to the reported differences. Here, by using high-throughput sequencing under standardized cell culturing conditions and passage number, we compare the epigenetic signatures (H3K4me3, H3K27me3 and HDAC2 ChIP-seq profiles) and transcriptome differences (by RNA-seq) of hiPSCs generated from the same primary fibroblast population by using six different reprogramming methods. We found that the reprogramming method impacts the resulting transcriptome and that all hiPSC lines could terminally differentiate, regardless of the reprogramming method. Moreover, by comparing the differences between the hiPSC and hESC lines, we observed a significant proportion of differentially expressed genes that could be attributed to polycomb repressive complex targets.
View details for DOI 10.1038/s41551-017-0141-6
View details for Web of Science ID 000418860600003
View details for PubMedID 30263871
View details for PubMedCentralID PMC6155993
-
Long-Read Sequencing Reveals a GC Pressure during the Evolution of Porcine Endogenous Retrovirus
MICROBIOLOGY RESOURCE ANNOUNCEMENTS
2017; 5 (40)
Abstract
Here, we present the complete genome sequence of a porcine endogenous retrovirus determined by Pacific Biosciences sequencing. A comparison of the genome of this isolate with those of other strains revealed the operation of a mechanism resulting in the selective accumulation of G and C bases in the viral DNA.
View details for PubMedID 28982996
-
Novel nonsense gain-of-function NFKB2 mutations associated with a combined immunodeficiency phenotype
BLOOD
2017; 130 (13): 1553–64
Abstract
NF-κB signaling through its NFKB1-dependent canonical and NFKB2-dependent noncanonical pathways plays distinctive roles in a diverse range of immune processes. Recently, mutations in these 2 genes have been associated with common variable immunodeficiency (CVID). While studying patients with genetically uncharacterized primary immunodeficiencies, we detected 2 novel nonsense gain-of-function (GOF) NFKB2 mutations (E418X and R635X) in 3 patients from 2 families, and a novel missense change (S866R) in another patient. Their immunophenotype was assessed by flow cytometry and protein expression; activation of canonical and noncanonical pathways was examined in peripheral blood mononuclear cells and transfected HEK293T cells through immunoblotting, immunohistochemistry, luciferase activity, real-time polymerase chain reaction, and multiplex assays. The S866R change disrupted a C-terminal NF-κΒ2 critical site affecting protein phosphorylation and nuclear translocation, resulting in CVID with adrenocorticotropic hormone deficiency, growth hormone deficiency, and mild ectodermal dysplasia as previously described. In contrast, the nonsense mutations E418X and R635X observed in 3 patients led to constitutive nuclear localization and activation of both canonical and noncanonical NF-κΒ pathways, resulting in a combined immunodeficiency (CID) without endocrine or ectodermal manifestations. These changes were also found in 2 asymptomatic relatives. Thus, these novel NFKB2 GOF mutations produce a nonfully penetrant CID phenotype through a different pathophysiologic mechanism than previously described for mutations in NFKB2.
View details for PubMedID 28778864
View details for PubMedCentralID PMC5620416
-
Evaluation of the impact of ul54 gene-deletion on the global transcription and DNA replication of pseudorabies virus
ARCHIVES OF VIROLOGY
2017; 162 (9): 2679–94
Abstract
Pseudorabies virus (PRV) is an animal alphaherpesvirus with a wide host range. PRV has 67 protein-coding genes and several non-coding RNA molecules, which can be classified into three temporal groups, immediate early, early and late classes. The ul54 gene of PRV and its homolog icp27 of herpes simplex virus have a multitude of functions, including the regulation of viral DNA synthesis and the control of the gene expression. Therefore, abrogation of PRV ul54 function was expected to exert a significant effect on the global transcriptome and on DNA replication. Real-time PCR and real-time RT-PCR platforms were used to investigate these presumed effects. Our analyses revealed a drastic impact of the ul54 mutation on the genome-wide expression of PRV genes, especially on the transcription of the true late genes. A more than two hour delay was observed in the onset of DNA replication, and the amount of synthesized DNA molecules was significantly decreased in comparison to the wild-type virus. Furthermore, in this work, we were able to successfully demonstrate the utility of long-read SMRT sequencing for genotyping of mutant viruses.
View details for PubMedID 28577213
View details for PubMedCentralID PMC5927779
-
High-Coverage Whole-Exome Sequencing Identifies Candidate Genes for Suicide in Victims with Major Depressive Disorder
SCIENTIFIC REPORTS
2017; 7: 7106
Abstract
We carried out whole-exome ultra-high throughput sequencing in brain samples of suicide victims who had suffered from major depressive disorder and control subjects who had died from other causes. This study aimed to reveal the selective accumulation of rare variants in the coding and the UTR sequences within the genes of suicide victims. We also analysed the potential effect of STR and CNV variations, as well as the infection of the brain with neurovirulent viruses in this behavioural disorder. As a result, we have identified several candidate genes, among others three calcium channel genes that may potentially contribute to completed suicide. We also explored the potential implication of the TGF-β signalling pathway in the pathogenesis of suicidal behaviour. To our best knowledge, this is the first study that uses whole-exome sequencing for the investigation of suicide.
View details for PubMedID 28769055
-
Network analyses identify liver-specific targets for treating liver diseases
MOLECULAR SYSTEMS BIOLOGY
2017; 13 (8): 938
Abstract
We performed integrative network analyses to identify targets that can be used for effectively treating liver diseases with minimal side effects. We first generated co-expression networks (CNs) for 46 human tissues and liver cancer to explore the functional relationships between genes and examined the overlap between functional and physical interactions. Since increased de novo lipogenesis is a characteristic of nonalcoholic fatty liver disease (NAFLD) and hepatocellular carcinoma (HCC), we investigated the liver-specific genes co-expressed with fatty acid synthase (FASN). CN analyses predicted that inhibition of these liver-specific genes decreases FASN expression. Experiments in human cancer cell lines, mouse liver samples, and primary human hepatocytes validated our predictions by demonstrating functional relationships between these liver genes, and showing that their inhibition decreases cell growth and liver fat content. In conclusion, we identified liver-specific genes linked to NAFLD pathogenesis, such as pyruvate kinase liver and red blood cell (PKLR), or to HCC pathogenesis, such as PKLR, patatin-like phospholipase domain containing 3 (PNPLA3), and proprotein convertase subtilisin/kexin type 9 (PCSK9), all of which are potential targets for drug development.
View details for PubMedID 28827398
-
A Droplet Microfluidics Based Platform for Mining Metagenomic Libraries for Natural Compounds
MICROMACHINES
2017; 8 (8)
Abstract
Historically, microbes from the environment have been a reliable source for novel bio-active compounds. Cloning and expression of metagenomic DNA in heterologous strains of bacteria has broadened the range of potential compounds accessible. However, such metagenomic libraries have been under-exploited for applications in mammalian cells because of a lack of integrated methods. We present an innovative platform to systematically mine natural resources for pro-apoptotic compounds that relies on the combination of bacterial delivery and droplet microfluidics. Using the violacein operon from C. violaceum as a model, we demonstrate that E. coli modified to be invasive can serve as an efficient delivery vehicle of natural compounds. This approach permits the seamless screening of metagenomic libraries with mammalian cell assays and alleviates the need for laborious extraction of natural compounds. In addition, we leverage the unique properties of droplet microfluidics to amplify bacterial clones and perform clonal screening at high-throughput in place of one-compound-per-well assays in multi-well format. We also use droplet microfluidics to establish a cell aggregate strategy that overcomes the issue of background apoptosis. Altogether, this work forms the foundation of a versatile platform to efficiently mine the metagenome for compounds with therapeutic potential.
View details for PubMedID 30400422
-
Discovery of Novel Human Gene Regulatory Modules from Gene Co-expression and Promoter Motif Analysis
SCIENTIFIC REPORTS
2017; 7: 5557
Abstract
Deciphering gene regulatory networks requires identification of gene expression modules. We describe a novel bottom-up approach to identify gene modules regulated by cis-regulatory motifs from a human gene co-expression network. Target genes of a cis-regulatory motif were identified from the network via the motif's enrichment or biased distribution towards transcription start sites in the promoters of co-expressed genes. A gene sub-network containing the target genes was extracted and used to derive gene modules. The analysis revealed known and novel gene modules regulated by the NF-Y motif. The binding of NF-Y proteins to these modules' gene promoters was verified using ENCODE ChIP-Seq data. The analyses also identified 8,048 Sp1 motif target genes, many of which, interestingly, were not detected by ENCODE ChIP-Seq. These target genes assemble into house-keeping, tissue-specific developmental, and immune response modules. Integration of Sp1 modules with genomic and epigenomic data indicates epigenetic control of Sp1 targets' expression in a cell/tissue-specific manner. Finally, known and novel target genes and modules regulated by the YY1, RFX1, IRF1, and 34 other motifs were also identified. The study described here provides a valuable resource to understand transcriptional regulation of various human developmental, disease, or immunity pathways.
View details for PubMedID 28717181
-
Gaining comprehensive biological insight into the transcriptome by performing a broad-spectrum RNA-seq analysis
NATURE COMMUNICATIONS
2017; 8: 59
Abstract
RNA-sequencing (RNA-seq) is an essential technique for transcriptome studies, and hundreds of analysis tools have been developed since its debut. Although recent efforts have attempted to assess the latest available tools, they have not evaluated the analysis workflows comprehensively to unleash the power within RNA-seq. Here we conduct an extensive study analysing a broad spectrum of RNA-seq workflows. Surpassing the expression analysis scope, our work also includes assessment of RNA variant-calling, RNA editing and RNA fusion detection techniques. Specifically, we examine both short- and long-read RNA-seq technologies, 39 analysis tools resulting in ~120 combinations, and ~490 analyses involving 15 samples with a variety of germline, cancer and stem cell data sets. We report the performance and propose a comprehensive RNA-seq analysis protocol, named RNACocktail, along with a computational pipeline achieving high accuracy. Validation on different samples reveals that our proposed protocol could help researchers extract more biologically relevant predictions by broad analysis of the transcriptome. RNA-seq is widely used for transcriptome analysis. Here, the authors analyse a wide spectrum of RNA-seq workflows and present a comprehensive analysis protocol named RNACocktail as well as a computational pipeline leveraging the widely used tools for accurate RNA-seq analysis.
View details for PubMedID 28680106
-
Long-Read Isoform Sequencing Reveals a Hidden Complexity of the Transcriptional Landscape of Herpes Simplex Virus Type 1
FRONTIERS IN MICROBIOLOGY
2017; 8: 1079
Abstract
In this study, we used the amplified isoform sequencing technique from Pacific Biosciences to characterize the poly(A)+ fraction of the lytic transcriptome of the herpes simplex virus type 1 (HSV-1). Our analysis detected 34 formerly unidentified protein-coding genes, 10 non-coding RNAs, as well as 17 polycistronic and complex transcripts. This work also led us to identify many transcript isoforms, including 13 splice and 68 transcript end variants, as well as several transcript overlaps. Additionally, we determined previously unascertained transcriptional start and polyadenylation sites. We analyzed the transcriptional activity from the complementary DNA strand in five convergent HSV gene pairs with quantitative RT-PCR and detected antisense RNAs in each gene. This part of the study revealed an inverse correlation between the expressions of convergent partners. Our work adds new insights for understanding the complexity of the pervasive transcriptional overlaps by suggesting that there is a crosstalk between adjacent and distal genes through interaction between their transcription apparatuses. We also identified transcripts overlapping the HSV replication origins, which may indicate an interplay between the transcription and replication machineries. The relative abundance of HSV-1 transcripts has also been established by using a novel method based on the calculation of sequencing reads for the analysis.
View details for DOI 10.3389/fmicb.2017.01079
View details for Web of Science ID 000403758800001
View details for PubMedID 28676792
View details for PubMedCentralID PMC5476775
-
Bisulfite-independent analysis of CpG island methylation enables genome-scale stratification of single cells
NUCLEIC ACIDS RESEARCH
2017; 45 (10): e77
Abstract
Conventional DNA bisulfite sequencing has been extended to single cell level, but the coverage consistency is insufficient for parallel comparison. Here we report a novel method for genome-wide CpG island (CGI) methylation sequencing for single cells (scCGI-seq), combining methylation-sensitive restriction enzyme digestion and multiple displacement amplification for selective detection of methylated CGIs. We applied this method to analyzing single cells from two types of hematopoietic cells, K562 and GM12878, and small populations of fibroblasts and induced pluripotent stem cells. The method detected 21 798 CGIs (76% of all CGIs) per cell, and the number of CGIs consistently detected from all 16 profiled single cells was 20 864 (72.7%), with 12 961 promoters covered. This coverage represents a substantial improvement over results obtained using single cell reduced representation bisulfite sequencing, with a 66-fold increase in the fraction of consistently profiled CGIs across individual cells. Single cells of the same type were more similar to each other than to other types, but also displayed epigenetic heterogeneity. The method was further validated by comparing the CpG methylation pattern, methylation profile of CGIs/promoters and repeat regions and 41 classes of known regulatory markers to the ENCODE data. Although not all minor methylation differences between cells are detectable, scCGI-seq provides a solid tool for unsupervised stratification of a heterogeneous cell population.
View details for PubMedID 28126923
View details for PubMedCentralID PMC5605247
-
Isolated Congenital Anosmia and CNGA2 Mutation.
Scientific reports
2017; 7 (1): 2667
Abstract
Isolated congenital anosmia (ICA) is a rare condition that is associated with life-long inability to smell. Here we report a genetic characterization of a large Iranian family segregating ICA. Whole exome sequencing in five affected family members and five healthy members revealed a stop gain mutation in CNGA2 (OMIM 300338) (chrX:150,911,102; CNGA2 c.577C>T; p.Arg193*). The mutation segregates in an X-linked pattern, as all the affected family members are hemizygotes, whereas healthy family members are either heterozygous or homozygous for the reference allele. Cnga2 knockout mice are congenitally anosmic and have abnormal olfactory system physiology; additionally, Karstensen et al. recently reported two anosmic brothers sharing a CNGA2 truncating variant. Our study, in concert with these findings, provides strong support for a role of the CNGA2 gene in the pathogenicity of ICA in humans. Together, these results indicate that mutations in key olfactory signaling pathway genes are responsible for human disease.
View details for DOI 10.1038/s41598-017-02947-y
View details for PubMedID 28572688
-
Succinate and its G-protein-coupled receptor stimulates osteoclastogenesis.
Nature communications
2017; 8: 15621
Abstract
The mechanism underlying bone impairment in patients with diabetes mellitus, a metabolic disorder characterized by chronic hyperglycaemia and dysregulation in metabolism, is unclear. Here we show the difference in the metabolomics of bone marrow stromal cells (BMSCs) derived from hyperglycaemic (type 2 diabetes mellitus, T2D) and normoglycaemic mice. One hundred and forty-two metabolites are substantially regulated in BMSCs from T2D mice, with the tricarboxylic acid (TCA) cycle being one of the primary metabolic pathways impaired by hyperglycaemia. Importantly, succinate, an intermediate metabolite in the TCA cycle, is increased by 24-fold in BMSCs from T2D mice. Succinate functions as an extracellular ligand through binding to its specific receptor on osteoclastic lineage cells and stimulates osteoclastogenesis in vitro and in vivo. Strategies targeting the receptor activation inhibit osteoclastogenesis. This study reveals a metabolite-mediated mechanism of osteoclastogenesis modulation that contributes to bone dysregulation in metabolic disorders.
View details for DOI 10.1038/ncomms15621
View details for PubMedID 28561074
-
Multi-platform analysis reveals a complex transcriptome architecture of a circovirus.
Virus research
2017; 237: 37-46
Abstract
In this study, we used Pacific Biosciences RS II long-read and Illumina HiScanSQ short-read sequencing technologies for the characterization of porcine circovirus type 1 (PCV-1) transcripts. Our aim was to identify novel RNA molecules and transcript isoforms, as well as to determine the exact 5'- and 3'-end sequences of previously described transcripts with single base-pair accuracy. We discovered a novel 3'-UTR length isoform of the Cap transcript, and a non-spliced Cap transcript variant. Additionally, our analysis has revealed a 3'-UTR isoform of Rep and two 5'-UTR isoforms of Rep' transcripts, and a novel splice variant of the longer Rep' transcript. We also explored two novel long transcripts, one with a previously identified splice site, and a formerly undetected mRNA of ORF3. Altogether, our methods have identified nine novel RNA molecules, doubling the size of PCV-1 transcriptome that had been known before. Additionally, our investigations revealed an intricate pattern of transcript overlapping, which might produce transcriptional interference between the transcriptional machineries of adjacent genes, and thereby may potentially play a role in the regulation of gene expression in circoviruses.
View details for DOI 10.1016/j.virusres.2017.05.010
View details for PubMedID 28549855
-
Non-equivalence of Wnt and R-spondin ligands during Lgr5(+) intestinal stem-cell self-renewal
NATURE
2017; 545 (7653): 238-242
Abstract
The canonical Wnt/β-catenin signalling pathway governs diverse developmental, homeostatic and pathological processes. Palmitoylated Wnt ligands engage cell-surface frizzled (FZD) receptors and LRP5 and LRP6 co-receptors, enabling β-catenin nuclear translocation and TCF/LEF-dependent gene transactivation. Mutations in Wnt downstream signalling components have revealed diverse functions thought to be carried out by Wnt ligands themselves. However, redundancy between the 19 mammalian Wnt proteins and 10 FZD receptors and Wnt hydrophobicity have made it difficult to attribute these functions directly to Wnt ligands. For example, individual mutations in Wnt ligands have not revealed homeostatic phenotypes in the intestinal epithelium, an archetypal canonical Wnt pathway-dependent, rapidly self-renewing tissue, the regeneration of which is fueled by proliferative crypt Lgr5(+) intestinal stem cells (ISCs). R-spondin ligands (RSPO1-RSPO4) engage distinct LGR4-LGR6, RNF43 and ZNRF3 receptor classes, markedly potentiate canonical Wnt/β-catenin signalling, and induce intestinal organoid growth in vitro and Lgr5(+) ISCs in vivo. However, the interchangeability, functional cooperation and relative contributions of Wnt versus RSPO ligands to in vivo canonical Wnt signalling and ISC biology remain unknown. Here we identify the functional roles of Wnt and RSPO ligands in the intestinal crypt stem-cell niche. We show that the default fate of Lgr5(+) ISCs is to differentiate, unless both RSPO and Wnt ligands are present. However, gain-of-function studies using RSPO ligands and a new non-lipidated Wnt analogue reveal that these ligands have qualitatively distinct, non-interchangeable roles in ISCs. Wnt proteins are unable to induce Lgr5(+) ISC self-renewal, but instead confer a basal competency by maintaining RSPO receptor expression that enables RSPO ligands to actively drive and specify the extent of stem-cell expansion. This functionally non-equivalent yet cooperative interaction between Wnt and RSPO ligands establishes a molecular precedent for regulation of mammalian stem cells by distinct priming and self-renewal factors, with broad implications for precise control of tissue regeneration.
View details for DOI 10.1038/nature22313
View details for Web of Science ID 000400963800037
View details for PubMedID 28467820
-
Histone variant H2A.J accumulates in senescent cells and promotes inflammatory gene expression
NATURE COMMUNICATIONS
2017; 8
Abstract
The senescence of mammalian cells is characterized by a proliferative arrest in response to stress and the expression of an inflammatory phenotype. Here we show that histone H2A.J, a poorly studied H2A variant found only in mammals, accumulates in human fibroblasts in senescence with persistent DNA damage. H2A.J also accumulates in mice with aging in a tissue-specific manner and in human skin. Knock-down of H2A.J inhibits the expression of inflammatory genes that contribute to the senescence-associated secretory phenotype (SASP), and overexpression of H2A.J increases the expression of some of these genes in proliferating cells. H2A.J accumulation may thus promote the signalling of senescent cells to the immune system, and it may contribute to chronic inflammation and the development of aging-associated diseases.
View details for DOI 10.1038/ncomms14995
View details for Web of Science ID 000400886800001
View details for PubMedID 28489069
-
Genome-scale measurement of off-target activity using Cas9 toxicity in high-throughput screens
NATURE COMMUNICATIONS
2017; 8
Abstract
CRISPR-Cas9 screens are powerful tools for high-throughput interrogation of genome function, but can be confounded by nuclease-induced toxicity at both on- and off-target sites, likely due to DNA damage. Here, to test potential solutions to this issue, we design and analyse a CRISPR-Cas9 library with 10 variable-length guides per gene and thousands of negative controls targeting non-functional, non-genic regions (termed safe-targeting guides), in addition to non-targeting controls. We find this library has excellent performance in identifying genes affecting growth and sensitivity to the ricin toxin. The safe-targeting guides allow for proper control of toxicity from on-target DNA damage. Using this toxicity as a proxy to measure off-target cutting, we demonstrate with tens of thousands of guides both the nucleotide position-dependent sensitivity to single mismatches and the reduction of off-target cutting using truncated guides. Our results demonstrate a simple strategy for high-throughput evaluation of target specificity and nuclease toxicity in Cas9 screens.
View details for DOI 10.1038/ncomms15178
View details for PubMedID 28474669
-
A Case Report of Hypoglycemia and Hypogammaglobulinemia: DAVID syndrome in a patient with a novel NFKB2 mutation.
Journal of clinical endocrinology and metabolism
2017
Abstract
DAVID syndrome (Deficient Anterior pituitary with Variable Immune Deficiency) is a rare disorder in which children present with symptomatic ACTH deficiency preceded by hypogammaglobulinemia from B-cell dysfunction with recurrent infections, termed common variable immunodeficiency (CVID). Subsequent whole exome sequencing studies have revealed germline heterozygous C-terminal mutations of NFKB2 as either a cause of DAVID syndrome or of CVID without clinical hypopituitarism. However, to the best of our knowledge there have been no cases in which the endocrinopathy has presented in the absence of a prior clinical history of CVID. A previously healthy 7 year-old boy with no history of clinical immunodeficiency, presented with profound hypoglycemia and seizures. He was found to have secondary adrenal insufficiency and was started on glucocorticoid replacement. An evaluation for autoimmune disease, including for anti-pituitary antibodies, was negative. Evaluation unexpectedly revealed hypogammaglobulinemia (decreased IgG, IgM, and IgA). He had moderately reduced serotype-specific IgG responses following pneumococcal polysaccharide vaccine. Subsequently, he was found to have growth hormone (GH) deficiency. Six years after initial presentation, whole exome sequencing revealed a novel de novo heterozygous NFKB2 missense mutation c.2596A>C (p.Ser866Arg) in the C-terminal region predicted to abrogate the processing of the p100 NFKB2 protein to its active p52 form. Isolated early-onset ACTH deficiency is rare and C-terminal region NFKB2 mutations should be considered as an etiology even in the absence of a clinical history of CVID. Early immunologic evaluation is indicated in the diagnosis and management of isolated ACTH deficiency.
View details for DOI 10.1210/jc.2017-00341
View details for PubMedID 28472507
-
Patient-Specific iPSC-Derived Endothelial Cells Uncover Pathways that Protect against Pulmonary Hypertension in BMPR2 Mutation Carriers
CELL STEM CELL
2017; 20 (4): 490-?
View details for DOI 10.1016/j.stem.2016.08.019
View details for Web of Science ID 000398350800013
-
Gpr124 is essential for blood-brain barrier integrity in central nervous system disease
NATURE MEDICINE
2017; 23 (4): 450-?
Abstract
Although blood-brain barrier (BBB) compromise is central to the etiology of diverse central nervous system (CNS) disorders, endothelial receptor proteins that control BBB function are poorly defined. The endothelial G-protein-coupled receptor (GPCR) Gpr124 has been reported to be required for normal forebrain angiogenesis and BBB function in mouse embryos, but the role of this receptor in adult animals is unknown. Here Gpr124 conditional knockout (CKO) in the endothelia of adult mice did not affect homeostatic BBB integrity, but resulted in BBB disruption and microvascular hemorrhage in mouse models of both ischemic stroke and glioblastoma, accompanied by reduced cerebrovascular canonical Wnt-β-catenin signaling. Constitutive activation of Wnt-β-catenin signaling fully corrected the BBB disruption and hemorrhage defects of Gpr124-CKO mice, with rescue of the endothelial gene tight junction, pericyte coverage and extracellular-matrix deficits. We thus identify Gpr124 as an endothelial GPCR specifically required for endothelial Wnt signaling and BBB integrity under pathological conditions in adult mice. This finding implicates Gpr124 as a potential therapeutic target for human CNS disorders characterized by BBB disruption.
View details for DOI 10.1038/nm.4309
View details for PubMedID 28288111
-
Induced Pluripotent Stem Cell Model of Pulmonary Arterial Hypertension Reveals Novel Gene Expression and Patient Specificity
AMERICAN JOURNAL OF RESPIRATORY AND CRITICAL CARE MEDICINE
2017; 195 (7): 930-941
View details for DOI 10.1164/rccm.201606-1200OC
View details for Web of Science ID 000398017200016
-
Characterization of the Dynamic Transcriptome of a Herpesvirus with Long-read Single Molecule Real-Time Sequencing.
Scientific reports
2017; 7: 43751-?
Abstract
Herpesvirus gene expression is co-ordinately regulated and sequentially ordered during productive infection. The viral genes can be classified into three distinct kinetic groups: immediate-early, early, and late classes. In this study, a massively parallel sequencing technique that is based on PacBio Single Molecule Real-time sequencing platform, was used for quantifying the poly(A) fraction of the lytic transcriptome of pseudorabies virus (PRV) throughout a 12-hour interval of productive infection on PK-15 cells. Other approaches, including microarray, real-time RT-PCR and Illumina sequencing are capable of detecting only the aggregate transcriptional activity of particular genomic regions, but not individual herpesvirus transcripts. However, SMRT sequencing allows for a distinction between transcript isoforms, including length- and splice variants, as well as between overlapping polycistronic RNA molecules. The non-amplified Isoform Sequencing (Iso-Seq) method was used to analyse the kinetic properties of the lytic PRV transcripts and to then classify them accordingly. Additionally, the present study demonstrates the general utility of long-read sequencing for the time-course analysis of global gene expression in practically any organism.
View details for DOI 10.1038/srep43751
View details for PubMedID 28256586
View details for PubMedCentralID PMC5335617
-
Association of AHSG with alopecia and mental retardation (APMR) syndrome.
Human genetics
2017; 136 (3): 287-296
Abstract
Alopecia with mental retardation syndrome (APMR) is a very rare autosomal recessive condition that is associated with total or partial absence of hair from the scalp and other parts of the body as well as variable intellectual disability. Here we present whole-exome sequencing results of a large consanguineous family segregating APMR syndrome with seven affected family members. Our study revealed a novel predicted pathogenic, homozygous missense mutation in the AHSG (OMIM 138680) gene (AHSG: NM_001622:exon7:c.950G>A:p.Arg317His). The variant is predicted to affect a region of the protein required for protein processing and disrupts a phosphorylation motif. In addition, the altered protein migrates with an aberrant size relative to that of healthy individuals. Consistent with the phenotype, AHSG maps within APMR linkage region 1 (APMR 1) as reported before, and falls within runs of homozygosity (ROH). Previous families with APMR syndrome have been studied through linkage analyses, and the linkage resolution did not allow a single candidate gene to be pinpointed. Our study is the first report to identify a homozygous missense mutation for APMR syndrome through whole-exome sequencing.
View details for DOI 10.1007/s00439-016-1756-5
View details for PubMedID 28054173
-
A common class of transcripts with 5'-intron depletion, distinct early coding sequence features, and N-1-methyladenosine modification
RNA
2017; 23 (3): 270-283
Abstract
Introns are found in 5' untranslated regions (5'UTRs) for 35% of all human transcripts. These 5'UTR introns are not randomly distributed: Genes that encode secreted, membrane-bound and mitochondrial proteins are less likely to have them. Curiously, transcripts lacking 5'UTR introns tend to harbor specific RNA sequence elements in their early coding regions. To model and understand the connection between coding-region sequence and 5'UTR intron status, we developed a classifier that can predict 5'UTR intron status with >80% accuracy using only sequence features in the early coding region. Thus, the classifier identifies transcripts with 5' proximal-intron-minus-like-coding regions ("5IM" transcripts). Unexpectedly, we found that the early coding sequence features defining 5IM transcripts are widespread, appearing in 21% of all human RefSeq transcripts. The 5IM class of transcripts is enriched for non-AUG start codons, more extensive secondary structure both preceding the start codon and near the 5' cap, greater dependence on eIF4E for translation, and association with ER-proximal ribosomes. 5IM transcripts are bound by the exon junction complex (EJC) at noncanonical 5' proximal positions. Finally, N(1)-methyladenosines are specifically enriched in the early coding regions of 5IM transcripts. Taken together, our analyses point to the existence of a distinct 5IM class comprising ∼20% of human transcripts. This class is defined by depletion of 5' proximal introns, presence of specific RNA sequence features associated with low translation efficiency, N(1)-methyladenosines in the early coding region, and enrichment for noncanonical binding by the EJC.
View details for DOI 10.1261/rna.059105.116
View details for Web of Science ID 000394467500002
View details for PubMedCentralID PMC5311483
-
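A common class of transcripts with 5'-intron depletion, distinct early coding sequence features, and N-1-methyladenosine modification.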
RNA (New York, N.Y.)
2017; 23 (3): 270-283
View details for DOI 10.1261/rna.059105.116
View details for PubMedID 27994090
View details for PubMedCentralID PMC5311483
-
Single cell transcriptomics reveals unanticipated features of early hematopoietic precursors.
Nucleic acids research
2017; 45 (3): 1281-1296
Abstract
Molecular changes underlying stem cell differentiation are of fundamental interest. scRNA-seq on murine hematopoietic stem cells (HSC) and their progeny MPP1 separated the cells into 3 main clusters with distinct features: active, quiescent, and an un-characterized cluster. Induction of anemia resulted in mobilization of the quiescent to the active cluster and of the early to later stage of cell cycle, with marked increase in expression of certain transcription factors (TFs) while maintaining expression of interferon response genes. Cells with surface markers of long term HSC increased the expression of a group of TFs expressed highly in normal cycling MPP1 cells. However, at least Id1 and Hes1 were significantly activated in both HSC and MPP1 cells in anemic mice. Lineage-specific genes were differently expressed between cells, and correlated with the cell cycle stages with a specific augmentation of erythroid related genes in the G2/M phase. Most lineage specific TFs were stochastically expressed in the early precursor cells, but a few, such as Klf1, were detected only at very low levels in few precursor cells. The activation of these factors may correlate with stages of differentiation. This study reveals effects of cell cycle progression on the expression of lineage specific genes in precursor cells, and suggests that hematopoietic stress changes the balance of renewal and differentiation in these homeostatic cells.
View details for DOI 10.1093/nar/gkw1214
View details for PubMedID 28003475
View details for PubMedCentralID PMC5388401
-
Single cell transcriptomics reveals unanticipated features of early hematopoietic precursors
NUCLEIC ACIDS RESEARCH
2017; 45 (3): 1281-1296
View details for DOI 10.1093/nar/gkw1214
View details for Web of Science ID 000397008000025
View details for PubMedCentralID PMC5388401
-
Genetic Adaptation of Porcine Circovirus Type 1 to Cultured Porcine Kidney Cells Revealed by Single-Molecule Long-Read Sequencing Technology
MICROBIOLOGY RESOURCE ANNOUNCEMENTS
2017; 5 (5)
Abstract
Porcine circovirus type 1 (PCV1) is a nonpathogenic circovirus, and a contaminant of the porcine kidney (PK-15) cell line. We present the complete and annotated genome sequence of strain Szeged of PCV1, determined by Pacific Biosciences RSII long-read sequencing platform.
View details for PubMedID 28153895
-
Multi-Platform Sequencing Approach Reveals a Novel Transcriptome Profile in Pseudorabies Virus.
Frontiers in microbiology
2017; 8: 2708
Abstract
Third-generation sequencing is an emerging technology that is capable of solving several problems that earlier approaches were not able to, including the identification of transcript isoforms and overlapping transcripts. In this study, we used long-read sequencing for the analysis of the pseudorabies virus (PRV) transcriptome, including Oxford Nanopore Technologies MinION, PacBio RS-II, and Illumina HiScanSQ platforms. We also used data from our previous short-read and long-read sequencing studies for the comparison of the results and in order to confirm the obtained data. Our investigations identified 19 formerly unknown putative protein-coding