Honors & Awards
Travel Award, Journal Metabolites (2018)
Student Travel Award, The International Metabolomics Society (2018)
Doctor of Philosophy, Chinese Academy Of Sciences (2019)
BS, Inner Mongolia University (2013)
Metabolic reaction network-based recursive metabolite annotation for untargeted metabolomics
2019; 10: 1516
Large-scale metabolite annotation is a challenge in liquid chromatography-mass spectrometry (LC-MS)-based untargeted metabolomics. Here, we develop a metabolic reaction network (MRN)-based recursive algorithm (MetDNA) that expands metabolite annotations without the need for a comprehensive standard spectral library. MetDNA is based on the rationale that seed metabolites and their reaction-paired neighbors tend to share structural similarities resulting in similar MS2 spectra. MetDNA characterizes initial seed metabolites using a small library of MS2 spectra, and utilizes their experimental MS2 spectra as surrogate spectra to annotate their reaction-paired neighbor metabolites, which subsequently serve as the basis for recursive analysis. Using different LC-MS platforms, data acquisition methods, and biological samples, we showcase the utility and versatility of MetDNA and demonstrate that about 2000 metabolites can cumulatively be annotated from one experiment. Our results demonstrate that MetDNA substantially expands metabolite annotation, enabling quantitative assessment of metabolic pathways and facilitating integrative multi-omics analysis.
View details for DOI 10.1038/s41467-019-09550-x
View details for Web of Science ID 000463171300005
View details for PubMedID 30944337
View details for PubMedCentralID PMC6447530
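The recursive annotation idea in the MetDNA abstract above can be illustrated with a short sketch. This is not the MetDNA implementation: the cosine score, the `tol` and `threshold` values, and the function and parameter names are assumptions chosen for illustration. Annotated seeds lend their experimental MS2 spectra as surrogates for their reaction-paired neighbors, and each newly annotated neighbor becomes a seed for the next round.

```python
from collections import deque
from math import sqrt


def cosine_similarity(spec_a, spec_b, tol=0.01):
    """Cosine score between two MS2 spectra given as lists of (m/z, intensity).

    Peaks are paired greedily when their m/z values differ by at most `tol`
    (an assumed tolerance, in Da).
    """
    dot = 0.0
    used = set()
    for mz_a, int_a in spec_a:
        for j, (mz_b, int_b) in enumerate(spec_b):
            if j not in used and abs(mz_a - mz_b) <= tol:
                dot += int_a * int_b
                used.add(j)
                break
    norm_a = sqrt(sum(i * i for _, i in spec_a))
    norm_b = sqrt(sum(i * i for _, i in spec_b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recursive_annotation(seeds, network, candidates, threshold=0.5):
    """Propagate annotations through a reaction network.

    seeds:      metabolite -> experimental MS2 spectrum of an annotated seed
    network:    metabolite -> list of reaction-paired neighbor metabolites
    candidates: metabolite -> experimental MS2 spectra of unannotated peaks
                whose m/z is consistent with that metabolite
    threshold:  assumed minimum similarity for accepting an annotation
    """
    annotated = dict(seeds)
    queue = deque(seeds)
    while queue:
        current = queue.popleft()
        surrogate = annotated[current]  # seed spectrum used as surrogate
        for neighbor in network.get(current, []):
            if neighbor in annotated:
                continue
            best = max(
                candidates.get(neighbor, []),
                key=lambda s: cosine_similarity(surrogate, s),
                default=None,
            )
            if best is not None and cosine_similarity(surrogate, best) >= threshold:
                annotated[neighbor] = best
                queue.append(neighbor)  # the neighbor seeds the next round
    return annotated
```

The recursion is written as a breadth-first queue so that each round of newly annotated metabolites is processed before the metabolites they in turn annotate.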
LipidIMMS Analyzer: integrating multi-dimensional information to support lipid identification in ion mobility-mass spectrometry based lipidomics
2019; 35 (4): 698–700
Ion mobility-mass spectrometry (IM-MS) has shown great application potential for lipidomics. However, IM-MS based lipidomics is significantly restricted by the available software for lipid structural identification. Here, we developed a software tool, namely, LipidIMMS Analyzer, to support the accurate identification of lipids in IM-MS. For the first time, the software incorporates a large-scale database covering over 260 000 lipids and four-dimensional structural information for each lipid [i.e. m/z, retention time (RT), collision cross-section (CCS) and MS/MS spectra]. Therefore, multi-dimensional information can be readily integrated to support lipid identifications, and significantly improve the coverage and confidence of identification. Currently, the software supports different IM-MS instruments and data acquisition approaches. The software is freely available at: http://imms.zhulab.cn/LipidIMMS/. Supplementary data are available at Bioinformatics online.
View details for DOI 10.1093/bioinformatics/bty661
View details for Web of Science ID 000459316300028
View details for PubMedID 30052780
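As a rough illustration of how the multi-dimensional information described in the abstract above can narrow down candidates, the sketch below filters a small library by m/z, RT, and CCS tolerances. It is not the LipidIMMS Analyzer code: the `LipidEntry` type, the tolerance defaults, and the library values are hypothetical, and MS/MS spectral matching (the fourth dimension) is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class LipidEntry:
    name: str
    mz: float   # mass-to-charge ratio
    rt: float   # retention time, seconds
    ccs: float  # collision cross-section, square angstroms


def match_lipids(feature, library, mz_ppm=10.0, rt_tol=30.0, ccs_pct=2.0):
    """Return names of library lipids that match the feature in all three dimensions.

    All tolerance defaults are assumed values for illustration only.
    """
    hits = []
    for entry in library:
        mz_ok = abs(feature.mz - entry.mz) / entry.mz * 1e6 <= mz_ppm
        rt_ok = abs(feature.rt - entry.rt) <= rt_tol
        ccs_ok = abs(feature.ccs - entry.ccs) / entry.ccs * 100 <= ccs_pct
        if mz_ok and rt_ok and ccs_ok:
            hits.append(entry.name)
    return hits


# Hypothetical library: lipid_a and lipid_b are isomers with identical m/z.
library = [
    LipidEntry("lipid_a", 760.5851, 300.0, 285.0),
    LipidEntry("lipid_b", 760.5851, 305.0, 300.0),
    LipidEntry("lipid_c", 782.5670, 300.0, 286.0),
]
feature = LipidEntry("unknown", 760.5860, 302.0, 286.5)
hits = match_lipids(feature, library)
```

In this toy case the two isomers cannot be separated by m/z or RT alone; only the CCS filter excludes `lipid_b`.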
Development of a Correlative Strategy To Discover Colorectal Tumor Tissue Derived Metabolite Biomarkers in Plasma Using Untargeted Metabolomics
2019; 91 (3): 2401–8
The metabolic profiling of biofluids using untargeted metabolomics provides a promising choice to discover metabolite biomarkers for clinical cancer diagnosis. However, metabolite biomarkers discovered in biofluids may not necessarily reflect the pathological status of tumor tissue, which makes these biomarkers difficult to reproduce. In this study, we developed a new analysis strategy by integrating the univariate and multivariate correlation analysis approach to discover tumor tissue derived (TTD) metabolites in plasma samples. Specifically, untargeted metabolomics was first used to profile a set of paired tissue and plasma samples from 34 colorectal cancer (CRC) patients. Next, univariate correlation analysis was used to select correlative metabolite pairs between tissue and plasma, and a random forest regression model was utilized to define 243 TTD metabolites in plasma samples. The TTD metabolites in CRC plasma were demonstrated to accurately reflect the pathological status of tumor tissue and have great potential for metabolite biomarker discovery. Accordingly, we conducted a clinical study using a set of 146 plasma samples from CRC patients and gender-matched polyp controls to discover metabolite biomarkers from TTD metabolites. As a result, eight metabolites were selected as potential biomarkers for CRC diagnosis with high sensitivity and specificity. For CRC patients after surgery, the survival risk score defined by metabolite biomarkers also performed well in predicting overall survival time (p = 0.022) and progression-free survival time (p = 0.002). In conclusion, we developed a new analysis strategy which effectively discovers tumor tissue related metabolite biomarkers in plasma for cancer diagnosis and prognosis.
View details for DOI 10.1021/acs.analchem.8b05177
View details for Web of Science ID 000458220300103
View details for PubMedID 30580524
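The univariate step of the strategy described above can be sketched as follows. The abstract does not state which correlation coefficient or cutoff was used, so the Pearson coefficient, the `min_r` threshold, and the function names are assumptions; the subsequent random forest regression step is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def select_correlated(tissue, plasma, min_r=0.5):
    """Keep metabolites whose paired tissue and plasma levels correlate positively.

    tissue, plasma: metabolite name -> intensities, one value per patient,
                    in the same patient order in both dictionaries
    min_r:          assumed minimum correlation for keeping a metabolite
    """
    selected = {}
    for name in tissue.keys() & plasma.keys():
        r = pearson(tissue[name], plasma[name])
        if r >= min_r:
            selected[name] = r
    return selected
```

Metabolites that pass this filter would then be candidates for the multivariate modeling step.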
MetFlow: An Interactive and Integrated Workflow for Metabolomics Data Cleaning and Differential Metabolite Discovery
View details for DOI 10.1093/bioinformatics/bty1066
LipidCCS: Prediction of Collision Cross-Section Values for Lipids with High Precision To Support Ion Mobility-Mass Spectrometry-Based Lipidomics
2017; 89 (17): 9559–66
The use of collision cross-section (CCS) values derived from ion mobility-mass spectrometry (IM-MS) has been proven to facilitate lipid identifications. Its utility is restricted by the limited availability of CCS values. Recently, machine-learning-based prediction (e.g., MetCCS) has been reported to generate CCS values on a large scale. However, the prediction precision is not sufficient to differentiate lipids due to their high structural similarities and subtle differences in CCS values. To address this challenge, we developed a new approach, namely, LipidCCS, to precisely predict lipid CCS values. In LipidCCS, a set of molecular descriptors was optimized using bioinformatic approaches to comprehensively describe the subtle structural differences among lipids. The use of optimized molecular descriptors together with a large set of standard CCS values for lipids (458 in total) to build the prediction model significantly improved the precision. The prediction precision of LipidCCS was externally validated with median relative errors (MRE) of ∼1% using independent data sets across different instruments (Agilent DTIM-MS and Waters TWIM-MS) and laboratories. We also demonstrated that the improved precision in the predicted LipidCCS database (15 646 lipids and 63 434 CCS values in total) could effectively reduce false-positive identifications of lipids. Users can freely access our LipidCCS web server for the following: (1) the prediction of lipid CCS values directly from SMILES structure; (2) database search; and (3) lipid match and identification. We believe LipidCCS will be a valuable tool to support IM-MS-based lipidomics. The web server is freely available on the Internet (http://www.metabolomics-shanghai.org/LipidCCS/).
View details for DOI 10.1021/acs.analchem.7b02625
View details for Web of Science ID 000410014900133
View details for PubMedID 28764323
Large-Scale Prediction of Collision Cross-Section Values for Metabolites in Ion Mobility-Mass Spectrometry
2016; 88 (22): 11084–91
The rapid development of metabolomics has significantly advanced health and disease related research. However, metabolite identification remains a major analytical challenge for untargeted metabolomics. While the use of collision cross-section (CCS) values obtained in ion mobility-mass spectrometry (IM-MS) effectively increases identification confidence of metabolites, it is restricted by the limited number of available CCS values for metabolites. Here, we demonstrated the use of a machine-learning algorithm called support vector regression (SVR) to develop a prediction method that utilized 14 common molecular descriptors to predict CCS values for metabolites. In this work, we first experimentally measured CCS values (ΩN2) of ∼400 metabolites in nitrogen buffer gas and used these values as training data to optimize the prediction method. The high prediction precision of this method was externally validated using an independent set of metabolites with a median relative error (MRE) of ∼3%, better than conventional theoretical calculation. Using the SVR based prediction method, a large-scale predicted CCS database was generated for 35 203 metabolites in the Human Metabolome Database (HMDB). For each metabolite, five different ion adducts in positive and negative modes were predicted, accounting for 176 015 CCS values in total. Finally, improved metabolite identification accuracy was demonstrated using real biological samples. In conclusion, our results showed that the SVR based prediction method can accurately predict nitrogen CCS values (ΩN2) of metabolites from molecular descriptors and effectively improve identification accuracy and efficiency in untargeted metabolomics. The predicted CCS database, namely, MetCCS, is freely available on the Internet.
View details for DOI 10.1021/acs.analchem.6b03091
View details for Web of Science ID 000388154700045
View details for PubMedID 27768289
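The median relative error (MRE) used in the two CCS prediction abstracts above to report precision can be computed as in this minimal sketch. The function name and the example values are hypothetical; only the definition, the median of |predicted − measured| / measured expressed in percent, is taken as given.

```python
from statistics import median


def median_relative_error(predicted, measured):
    """Median of the per-compound relative errors, in percent."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return median(abs(p - m) / m * 100 for p, m in zip(predicted, measured))


# Hypothetical predicted and measured CCS values (square angstroms).
predicted_ccs = [101.0, 198.0, 309.0]
measured_ccs = [100.0, 200.0, 300.0]
mre = median_relative_error(predicted_ccs, measured_ccs)
```

Because the median is used rather than the mean, a few poorly predicted compounds do not dominate the reported precision.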
- Serum metabolomics for early diagnosis of esophageal squamous cell carcinoma by UHPLC-QTOF/MS. Metabolomics 2016; 12 (7)
- Normalization and integration of large-scale metabolomics data using support vector regression. Metabolomics 2016; 12 (5)