Honors & Awards
Stanford RISE seed grant ($50,000), Stanford ChEM-H (2020)
Best Talk award at ISMB'20 (TransMed track), International Society for Computational Biology (ISCB) (2020)
Geographic finalist for best paper award, IEEE EMBC, IEEE EMBS (2019)
2nd place grand prize award at Stanford Health++ Hackathon, Stanford (2018)
American Thoracic Society Scholarship (Pediatric Assembly), ATS (2018)
Impact Scholar, UIC (2018)
Predoctoral Education for Clinical and Translational Scientists Fellowship (UL1TR002003) ($63,000), NIH (2017)
2nd place poster winner, GLBIO'17, ISCB (2017)
Scientific Excellence Award, Department of Medicine, UIC (2017)
Chancellor's Graduate Research Award ($8,000), UIC (2015)
1st place award at Research Forum among participants from Engineering, CS, Math, and Statistics, UIC (2016)
Travel awards, NSF, ACM, IEEE, ISCB, UIC, UIUC, ICTP, LinkSCEEM (2013-2018)
Distinction with honors and ranked 1st in undergraduate class, Cairo University (2010)
Helmholtz-Zentrum Geesthacht Fellowship, DAAD, Germany (2009)
Boards, Advisory Committees, Professional Organizations
Board member and Global Student Representative, IEEE Engineering in Medicine and Biology Society (EMBS) (2017 - 2019)
Member, IEEE Engineering in Medicine and Biology Society (EMBS) (2009 - Present)
Member, American Thoracic Society (ATS) (2017 - Present)
Member, Association for Computing Machinery (ACM) (2015 - Present)
Member, International Society for Computational Biology (ISCB) (2013 - Present)
Education
Ph.D. in Bioinformatics, University of Illinois at Chicago (2018)
M.Sc. in Computer Science, University of Illinois at Chicago (2018)
M.Sc. in Biomedical Engineering, Cairo University (2014)
B.Sc. in Biomedical Engineering, Cairo University (2010)
Michael Snyder, Postdoctoral Faculty Sponsor
Publications
Pediatric lung transplantation: Dynamics of the microbiome and bronchiolitis obliterans in cystic fibrosis.
The Journal of heart and lung transplantation : the official publication of the International Society for Heart Transplantation
BACKGROUND: Compositional changes in the microbiome are associated with the development of bronchiolitis obliterans (BO) after lung transplantation (LTx) in adults with cystic fibrosis (CF). The association between the lower airway bacterial community and BO after LTx in children with CF remains largely unexplored and is possibly influenced by frequent antibiotic therapy. The objectives of this study were to examine the relationship between bacterial community dynamics and the development of BO and analyze antibiotic resistance trends in children after LTx for CF. METHODS: For 3 years from the time of transplant, 12 LTx recipients were followed longitudinally, with 5 subjects developing BO during the study period. A total of 82 longitudinal bronchoalveolar lavage samples were collected during standard of care bronchoscopies. Metagenomic shotgun sequencing was performed on the extracted microbial DNA from bronchoalveolar lavage specimens. Taxonomic profiling was constructed using WEVOTE pipeline. The longitudinal association between development of BO and temporal changes in bacterial diversity and abundance were evaluated with MetaLonDA. The analysis of antibiotic resistance genes was performed with the ARGs-OAP v2.0 pipeline. RESULTS: All recipients demonstrated a Proteobacteria-predominant lower airways community. Temporal reduction in bacterial diversity was significantly associated with the development of BO and associated with neutrophilia and antibiotic therapy. Conversely, an increasing abundance of the phylum Actinobacteria and the orders Neisseriales and Pseudonocardiales in the lower airways was significantly associated with resilience to BO. A more diverse bacterial community was related to a higher expression of multidrug resistance genes and increased proteobacterial abundance. CONCLUSIONS: Decreased diversity within bacterial communities may suggest a contribution to pediatric lung allograft rejection in CF.
View details for DOI 10.1016/j.healun.2020.04.016
View details for PubMedID 32580896
Molecular Choreography of Acute Exercise.
2020; 181 (5): 1112–30.e16
Acute physical activity leads to several changes in metabolic, cardiovascular, and immune pathways. Although studies have examined selected changes in these pathways, the system-wide molecular response to an acute bout of exercise has not been fully characterized. We performed longitudinal multi-omic profiling of plasma and peripheral blood mononuclear cells including metabolome, lipidome, immunome, proteome, and transcriptome from 36 well-characterized volunteers, before and after a controlled bout of symptom-limited exercise. Time-series analysis revealed thousands of molecular changes and an orchestrated choreography of biological processes involving energy metabolism, oxidative stress, inflammation, tissue repair, and growth factor response, as well as regulatory pathways. Most of these processes were dampened and some were reversed in insulin-resistant participants. Finally, we discovered biological pathways involved in cardiopulmonary exercise response and developed prediction models revealing potential resting blood-based biomarkers of peak oxygen consumption.
View details for DOI 10.1016/j.cell.2020.04.043
View details for PubMedID 32470399
Longitudinal multi-omics of host-microbe dynamics in prediabetes.
2019; 569 (7758): 663–71
Type 2 diabetes mellitus (T2D) is a growing health problem, but little is known about its early disease stages, its effects on biological processes or the transition to clinical T2D. To understand the earliest stages of T2D better, we obtained samples from 106 healthy individuals and individuals with prediabetes over approximately four years and performed deep profiling of transcriptomes, metabolomes, cytokines, and proteomes, as well as changes in the microbiome. This rich longitudinal data set revealed many insights: first, healthy profiles are distinct among individuals while displaying diverse patterns of intra- and/or inter-personal variability. Second, extensive host and microbial changes occur during respiratory viral infections and immunization, and immunization triggers potentially protective responses that are distinct from responses to respiratory viral infections. Moreover, during respiratory viral infections, insulin-resistant participants respond differently than insulin-sensitive participants. Third, global co-association analyses among the thousands of profiled molecules reveal specific host-microbe interactions that differ between insulin-resistant and insulin-sensitive individuals. Last, we identified early personal molecular signatures in one individual that preceded the onset of T2D, including the inflammation markers interleukin-1 receptor agonist (IL-1RA) and high-sensitivity C-reactive protein (CRP) paired with xenobiotic-induced immune signalling. Our study reveals insights into pathways and responses that differ between glucose-dysregulated and healthy individuals during health and disease and provides an open-access data resource to enable further research into healthy, prediabetic and T2D states.
View details for DOI 10.1038/s41586-019-1236-x
View details for PubMedID 31142858
Utilizing longitudinal microbiome taxonomic profiles to predict food allergy via Long Short-Term Memory networks
PLOS COMPUTATIONAL BIOLOGY
2019; 15 (2): e1006693
Food allergy is usually difficult to diagnose in early life, and the inability to diagnose patients with atopic diseases at an early age may lead to severe complications. Numerous studies have suggested an association between the infant gut microbiome and development of allergy. In this work, we investigated the capacity of Long Short-Term Memory (LSTM) networks to predict food allergies in early life (0-3 years) from subjects' longitudinal gut microbiome profiles. Using the DIABIMMUNE dataset, we show an increase in predictive power using our model compared to Hidden Markov Model, Multi-Layer Perceptron Neural Network, Support Vector Machine, Random Forest, and LASSO regression. We further evaluated whether the training of LSTM networks benefits from reduced representations of microbial features. We considered sparse autoencoder for extraction of potential latent representations in addition to standard feature selection procedures based on Minimum Redundancy Maximum Relevance (mRMR) and variance prior to the training of LSTM networks. The comprehensive evaluation reveals that LSTM networks with the mRMR selected features achieve significantly better performance compared to the other tested machine learning models.
View details for DOI 10.1371/journal.pcbi.1006693
View details for Web of Science ID 000460276500013
View details for PubMedID 30716085
View details for PubMedCentralID PMC6361419
Identifying Appropriate Probabilistic Models for Sparse Discrete Omics Data
View details for Web of Science ID 000508002200123
Protein Subcellular Localization Prediction Based on Internal Micro-similarities of Markov Chains
IEEE. 2019: 1355–58
Elucidating protein subcellular localization is an essential topic in proteomics research due to its importance in the process of drug discovery. Unfortunately, experimentally uncovering protein subcellular targets is an arduous process that may not result in a successful localization. In contrast, computational methods can rapidly predict protein subcellular targets and are an efficient alternative to experimental methods for unannotated proteins. In this work, we introduce a new method to predict protein subcellular localization which increases the predictive power of generative probabilistic models while preserving their explanatory benefit. Our method exploits Markov models to produce a feature vector that records micro-similarities between the underlying probability distributions of a given sequence and their counterparts in reference models. Compared to ordinary Markov chain inference, we show that our method improves overall accuracy by 10% under 10-fold cross-validation on a dataset consisting of 10 subcellular locations. The source code is publicly available on https://github.com/aametwally/MC MicroSimilarities.
View details for Web of Science ID 000557295301182
View details for PubMedID 31946144
Taxonomic Classification at the Strain Level using a Species-of-Interest k-mer Database
View details for Web of Science ID 000508002200084
MetaLonDA: a flexible R package for identifying time intervals of differentially abundant features in metagenomic longitudinal studies
2018; 6: 32
Microbial longitudinal studies are powerful experimental designs utilized to classify diseases, determine prognosis, and analyze microbial systems dynamics. In longitudinal studies, only identifying differential features between two phenotypes does not provide sufficient information to determine whether a change in the relative abundance is short-term or continuous. Furthermore, sample collection in longitudinal studies suffers from all forms of variability such as a different number of subjects per phenotypic group, a different number of samples per subject, and samples not collected at consistent time points. These inconsistencies are common in studies that collect samples from human subjects. We present MetaLonDA, an R package that is capable of identifying significant time intervals of differentially abundant microbial features. MetaLonDA is flexible such that it can perform differential abundance tests despite inconsistencies associated with sample collection. Extensive experiments on simulated datasets quantitatively demonstrate the effectiveness of MetaLonDA with significant improvement over alternative methods. We applied MetaLonDA to the DIABIMMUNE cohort ( https://pubs.broadinstitute.org/diabimmune ) substantiating significant early lifetime intervals of exposure to Bacteroides and Bifidobacterium in Finnish and Russian infants. Additionally, we established significant time intervals during which novel differentially relative abundant microbial genera may contribute to aberrant immunogenicity and development of autoimmune disease. MetaLonDA is computationally efficient and can be run on desktop machines. The identified differentially abundant features and their time intervals have the potential to distinguish microbial biomarkers that may be used for microbial reconstitution through bacteriotherapy, probiotics, or antibiotics. Moreover, MetaLonDA can be applied to any longitudinal count data such as metagenomic sequencing, 16S rRNA gene sequencing, or RNAseq. MetaLonDA is publicly available on CRAN ( https://CRAN.R-project.org/package=MetaLonDA ).
View details for DOI 10.1186/s40168-018-0402-y
View details for Web of Science ID 000425442200001
View details for PubMedID 29439731
View details for PubMedCentralID PMC5812052
A Circulating MicroRNA Signature Serves as a Diagnostic and Prognostic Indicator in Sarcoidosis
AMERICAN JOURNAL OF RESPIRATORY CELL AND MOLECULAR BIOLOGY
2018; 58 (1): 40–54
Micro-RNAs (miRNAs) act as post-transcriptional regulators of gene expression. In sarcoidosis, aberrant miRNA expression may enhance immune responses mounted against an unknown antigenic agent. We tested whether a distinct miRNA signature functions as a diagnostic biomarker and explored its role as an immune modulator in sarcoidosis. Expression of miRNAs from peripheral blood mononuclear cells from subjects that met clinical and histopathologic criteria for sarcoidosis were compared to those from matched controls in the ACCESS study. Signature miRNAs were determined by miRNA microarray analysis and validated by quantitative reverse transcription PCR (RT-qPCR). Microarray analysis identified 54 differentially expressed feature mature human miRNAs between groups. Significant feature miRNAs that distinguish sarcoidosis from controls were selected by use of probabilistic models adjusted for clinical variables. Eight signature miRNAs were chosen to verify the diagnosis of sarcoidosis in a validation cohort and distinguished sarcoidosis from controls with a positive predictive value of 88%. We identified both novel and previously described genes and molecular pathways associated with sarcoidosis as targets of these signature miRNAs. Additionally, we demonstrate that signature miRNAs (hsa-miR-150-3p and hsa-miR-342-5p) are significantly associated with reduced lymphocytes and airflow limitations, known markers of poor prognosis. Together, these findings suggest that a circulating miRNA signature serves as a non-invasive biomarker that supports the diagnosis of sarcoidosis. Future studies will test the miRNA signature as a prognostication tool associated with poor clinical outcomes in sarcoidosis.
View details for DOI 10.1165/rcmb.2017-0207OC
View details for Web of Science ID 000419123200009
View details for PubMedID 28812922
View details for PubMedCentralID PMC5941311
A review on probabilistic models used in microbiome studies
COMMUNICATIONS IN INFORMATION AND SYSTEMS
2018; 18 (3): 173–91
Bronchiolitis obliterans syndrome susceptibility and the pulmonary microbiome.
The Journal of heart and lung transplantation : the official publication of the International Society for Heart Transplantation
Lung transplantation outcomes remain complicated by bronchiolitis obliterans syndrome (BOS), a major cause of mortality and retransplantation for patients. A variety of factors linking inflammation and BOS have emerged, meriting further exploration of the microbiome as a source of inflammation. In this analysis, we determined features of the pulmonary microbiome associated with BOS susceptibility. Bronchoalveolar lavage (BAL) samples were collected from 25 patients during standard of care bronchoscopies before BOS onset. Microbial DNA was isolated from BAL fluid and prepared for metagenomics shotgun sequencing. Patient microbiomes were phenotyped using k-means clustering and compared to determine effects on BOS-free survival. Clustering identified 3 microbiome phenotypes: Actinobacteria dominant (AD), mixed, and Proteobacteria dominant. AD microbiomes, distinguished by enrichment with Gram-positive organisms, conferred reduced odds and risks for patients to develop acute rejection and BOS compared with non-AD microbiomes. These findings were independent of treatment models. Microbiome findings were correlated with BAL cell counts and polymorphonuclear cell percentages. In some populations, features of the microbiome may be used to assess BOS susceptibility. Namely, a Gram-positive enriched pulmonary microbiome may predict resilience to BOS.
View details for DOI 10.1016/j.healun.2018.04.007
View details for PubMedID 29929823
Detection of Differential Abundance Intervals in Longitudinal Metagenomic Data Using Negative Binomial Smoothing Spline ANOVA
ASSOC COMPUTING MACHINERY. 2017: 295–304
Using Convolutional Neural Networks to Explore the Microbiome
IEEE. 2017: 4269–72
The microbiome has been shown to have an impact on the development of various diseases in the host. Being able to make an accurate prediction of the phenotype of a genomic sample based on its microbial taxonomic abundance profile is an important problem for personalized medicine. In this paper, we examine the potential of using a deep learning framework, a convolutional neural network (CNN), for such a prediction. To facilitate the CNN learning, we explore the structure of abundance profiles by creating the phylogenetic tree and by designing a scheme to embed the tree in a matrix that retains the spatial relationship of nodes in the tree and their quantitative characteristics. The proposed CNN framework is highly accurate, achieving 99.47% accuracy in an evaluation on a dataset of 1967 samples of three phenotypes. Our results demonstrate the feasibility and promise of CNNs for classifying sample phenotypes.
View details for Web of Science ID 000427085304175
View details for PubMedID 29060840
WEVOTE: Weighted Voting Taxonomic Identification Method of Microbial Sequences
2016; 11 (9): e0163527
Metagenome shotgun sequencing presents opportunities to identify organisms that may prevent or promote disease. The analysis of sample diversity is achieved by taxonomic identification of metagenomic reads followed by generating an abundance profile. Numerous tools have been developed based on different design principles. Tools achieving high precision can lack sensitivity in some applications. Conversely, tools with high sensitivity can suffer from low precision and require long computation time. In this paper, we present WEVOTE (WEighted VOting Taxonomic idEntification), a method that classifies metagenome shotgun sequencing DNA reads based on an ensemble of existing methods using k-mer-based, marker-based, and naive-similarity based approaches. Our evaluation on fourteen benchmarking datasets shows that WEVOTE improves the classification precision by reducing false positive annotations while preserving a high level of sensitivity. WEVOTE is an efficient and automated tool that combines multiple individual taxonomic identification methods to produce more precise and sensitive microbial profiles. WEVOTE is developed primarily to identify reads generated by MetaGenome Shotgun sequencing. It is expandable and has the potential to incorporate additional tools to produce a more accurate taxonomic profile. WEVOTE was implemented using C++ and shell scripting and is available at www.github.com/aametwally/WEVOTE.
View details for DOI 10.1371/journal.pone.0163527
View details for Web of Science ID 000384171400044
View details for PubMedID 27683082
View details for PubMedCentralID PMC5040256
Analysis of the microbiome: Advantages of whole genome shotgun versus 16S amplicon sequencing
BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS
2016; 469 (4): 967–77
The human microbiome has emerged as a major player in regulating human health and disease. Translational studies of the microbiome have the potential to indicate clinical applications such as fecal transplants and probiotics. However, one major issue is accurate identification of microbes constituting the microbiota. Studies of the microbiome have frequently utilized sequencing of the conserved 16S ribosomal RNA (rRNA) gene. We present a comparative study of an alternative approach using whole genome shotgun sequencing (WGS). In the present study, we analyzed the human fecal microbiome compiling a total of 194.1 × 10^6 reads from a single sample using multiple sequencing methods and platforms. Specifically, after establishing the reproducibility of our methods with extensive multiplexing, we compared: 1) The 16S rRNA amplicon versus the WGS method, 2) the Illumina HiSeq versus MiSeq platforms, 3) the analysis of reads versus de novo assembled contigs, and 4) the effect of shorter versus longer reads. Our study demonstrates that whole genome shotgun sequencing has multiple advantages compared with the 16S amplicon method including enhanced detection of bacterial species, increased detection of diversity and increased prediction of genes. In addition, increased length, either due to longer reads or the assembly of contigs, improved the accuracy of species detection.
View details for DOI 10.1016/j.bbrc.2015.12.083
View details for Web of Science ID 000369353000029
View details for PubMedID 26718401
View details for PubMedCentralID PMC4830092
Distributed Suffix Array Construction Algorithms: Comparison of Two Algorithms
IEEE. 2016: 27–30
View details for Web of Science ID 000400192500007
Cloud-based Parallel Suffix Array Construction based on MPI
IEEE. 2014: 334–37
View details for Web of Science ID 000353351700079