Systematically assessing microbiome-disease associations identifies drivers of inconsistency in metagenomic research
2022; 20 (3): e3001556
Evaluating the relationship between the human gut microbiome and disease requires computing reliable statistical associations. Here, using millions of different association modeling strategies, we evaluated the consistency (or robustness) of microbiome-based disease indicators for 6 prevalent and well-studied phenotypes (across 15 public cohorts and 2,343 individuals). We were able to discriminate between analytically robust and nonrobust results. In many cases, different models yielded contradictory associations for the same taxon-disease pairing, some showing positive correlations and others negative. When querying a subset of 581 microbe-disease associations that have been previously reported in the literature, 1 out of 3 taxa demonstrated substantial inconsistency in association sign. Notably, >90% of published findings for type 1 diabetes (T1D) and type 2 diabetes (T2D) were particularly nonrobust in this regard. We additionally quantified how potential confounders (for example, sequencing depth, glucose levels, cholesterol, and body mass index) influenced associations, analyzing how these variables affect the ostensible correlation between Faecalibacterium prausnitzii abundance and a healthy gut. Overall, we propose our approach as a method to maximize confidence when prioritizing findings that emerge from microbiome association studies.
View details for DOI 10.1371/journal.pbio.3001556
View details for Web of Science ID 000774196600002
View details for PubMedID 35235560
View details for PubMedCentralID PMC8890741
Leveraging vibration of effects analysis for robust discovery in observational biomedical data science
2021; 19 (9): e3001398
Hypothesis generation in observational biomedical data science often starts with computing an association, that is, identifying the statistical relationship between a dependent and an independent variable. However, the outcome of this process depends fundamentally on modeling strategy, with differing strategies generating what can be called "vibration of effects" (VoE). VoE is defined as variation in associations across modeling strategies, which often leads to contradictory results. Here, we present a computational tool capable of modeling VoE in biomedical data by fitting millions of different models and comparing their output. We execute a VoE analysis on a series of widely reported associations (e.g., carrot intake and eyesight) with an additional focus on lifestyle exposures (e.g., physical activity) and components of the Framingham Risk Score for cardiovascular health (e.g., blood pressure). We leveraged our tool for potential confounder identification, investigating which adjusting variables are responsible for conflicting models. We propose modeling VoE as a critical step in navigating discovery in observational data, discerning robust associations, and cataloging adjusting variables that impact model output.
View details for DOI 10.1371/journal.pbio.3001398
View details for PubMedID 34555021
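The core idea in the abstract above (fit many models that differ in their adjusting variables, then compare the resulting associations) can be illustrated with a minimal sketch. This is not the authors' tool: the function names `vibration_of_effects` and `sign_summary`, the use of ordinary least squares, and the synthetic data are all assumptions made for illustration. The example fits one model per subset of adjusters and reports how often the exposure coefficient is positive or negative.

```python
from itertools import combinations

import numpy as np


def vibration_of_effects(y, x, adjusters):
    """Fit one OLS model per subset of adjusters (hypothetical helper).

    Returns a list of (adjuster_subset, exposure_coefficient) pairs.
    """
    names = list(adjusters)
    results = []
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            # Design matrix: intercept, exposure, then the chosen adjusters.
            cols = [np.ones_like(x), x] + [adjusters[n] for n in subset]
            design = np.column_stack(cols)
            beta, *_ = np.linalg.lstsq(design, y, rcond=None)
            results.append((subset, float(beta[1])))
    return results


def sign_summary(results):
    """Summarize how often the exposure coefficient is positive or negative."""
    coefs = np.array([coef for _, coef in results])
    return {
        "n_models": len(coefs),
        "frac_positive": float(np.mean(coefs > 0)),
        "frac_negative": float(np.mean(coefs < 0)),
    }


# Synthetic data (an assumption for illustration): z confounds x and y,
# while w is unrelated noise. The true effect of x on y is negative.
rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=n)
w = rng.normal(size=n)
x = z + rng.normal(size=n)
y = -0.5 * x + 2.0 * z + rng.normal(size=n)

results = vibration_of_effects(y, x, {"z": z, "w": w})
summary = sign_summary(results)
for subset, coef in results:
    print(subset, round(coef, 2))
print(summary)
```

In this synthetic setup, models that omit `z` estimate a positive association and models that include it estimate a negative one, so the sign of the association depends on the adjustment set. Enumerating every subset grows exponentially with the number of adjusters, so a real analysis over many variables would need sampling or another way to limit the model space.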
Gene-level metagenomic architectures across diseases yield high-resolution microbiome diagnostic indicators
2021; 12 (1): 2907
We propose microbiome disease "architectures": linking >1 million microbial features (species, pathways, and genes) to 7 host phenotypes from 13 cohorts using a pipeline designed to identify associations that are robust to analytical model choice. Here, we quantify conservation and heterogeneity in microbiome-disease associations, using gene-level analysis to identify strain-specific, cross-disease, positive and negative associations. We find coronary artery disease, inflammatory bowel diseases, and liver cirrhosis to share gene-level signatures ascribed to the Streptococcus genus. Type 2 diabetes, by comparison, has a distinct metagenomic signature not linked to any one specific species or genus. We additionally find that, at the species level, the previously reported connection between Solobacterium moorei and colorectal cancer is not consistently identified across models; however, our gene-level analysis unveils a group of robust, strain-specific gene associations. Finally, we validate our findings regarding colorectal cancer and inflammatory bowel diseases in independent cohorts and identify that features inversely associated with disease tend to be less reproducible than features enriched in disease. Overall, our work is not only a step towards gene-based, cross-disease microbiome diagnostic indicators, but it also illuminates the nuances of the genetic architecture of the human microbiome, including tension between gene- and species-level associations.
View details for DOI 10.1038/s41467-021-23029-8
View details for Web of Science ID 000655481800009
View details for PubMedID 34006865
View details for PubMedCentralID PMC8131609