Bio


Clinical Focus
• Big data analytics for quality improvement and clinical effectiveness
• Disease biomarker discovery through multi-omics-based analyses

Academic Appointments
• Assistant Professor, Surgery

Professional Education
• B.S., Biochemistry, Fudan University, China (1990)
• M.A., Molecular and Developmental Biology, UCLA, US (1994)
• Ph.D., Biological Chemistry, UCLA, US (1996)
• Postdoctoral training, Medicine/Oncology/Computer Science, Stanford University, US (1996-1998)
• Business administration, Leavey School of Business, Santa Clara University, US (2000-2001)

Current Research and Scholarly Interests


A significant focus of my career is the use of AI to decode real-world datasets of electronic health records, high-resolution LCMS-based liquid/tissue biopsy proteomics/metabolomics, and multiple modality medical imaging.

For population health management, we use tens of millions of real-world state-wide EMRs to develop risk surveillance systems that forecast aspects like disease progression, resource utilization, and mortality across a diverse patient demographic. This prompts timely clinical actions by simplifying intervention orders and crafting care strategies tailored to address modifiable patient risk components.

For first-in-class molecular diagnostics, we have developed unique LCMS-based multi-omic approaches that allow the simultaneous absolute quantification of thousands of metabolites and proteins in blood and FFPE pathological slides to predict clinical outcomes. Our collaborations with key opinion leaders in pregnancy disorders and pediatric diseases, such as Kawasaki disease, have been productive and have helped to fill critical unmet medical needs.

For computer-aided pathology (CAP) and computer-aided medical imaging analytics (CAMIA), we have developed deep learning-based computational solutions to decode clinical outcome-correlating signals in pathological whole slide images and echocardiograms. Our multi-modality and multi-omics approaches synergize to promise the next generation of disease diagnostics and risk stratification solutions.

All Publications


  • Correction: Exploring the feasibility of using long-term stored newborn dried blood spots to identify metabolic features for congenital heart disease screening. Biomarker research Ceresnak, S. R., Zhang, Y., Ling, X. B., Su, K. J., Tang, Q., Jin, B., Schilling, J., Chou, C. J., Han, Z., Floyd, B. J., Whitin, J. C., Hwa, K. Y., Sylvester, K. G., Chubb, H., Luo, R. Y., Tian, L., Cohen, H. J., McElhinney, D. B. 2023; 11 (1): 101

    View details for DOI 10.1186/s40364-023-00546-w

    View details for PubMedID 37993911

  • Exploring the feasibility of using long-term stored newborn dried blood spots to identify metabolic features for congenital heart disease screening. Biomarker research Ceresnak, S. R., Zhang, Y., Ling, X. B., Su, K. J., Tang, Q., Jin, B., Schilling, J., Chou, C. J., Han, Z., Floyd, B. J., Whitin, J. C., Hwa, K. Y., Sylvester, K. G., Chubb, H., Luo, R. Y., Tian, L., Cohen, H. J., McElhinney, D. B. 2023; 11 (1): 97

    Abstract

    Congenital heart disease (CHD) represents a significant contributor to both morbidity and mortality in neonates and children. There's currently no analogous dried blood spot (DBS) screening for CHD immediately after birth. This study was set to assess the feasibility of using DBS to identify reliable metabolite biomarkers with clinical relevance, with the aim to screen and classify CHD utilizing the DBS. We assembled a cohort of DBS datasets from the California Department of Public Health (CDPH) Biobank, encompassing both normal controls and three pre-defined CHD categories. A DBS-based quantitative metabolomics method was developed using liquid chromatography with tandem mass spectrometry (LC-MS/MS). We conducted a correlation analysis comparing the absolute quantitated metabolite concentration in DBS against the CDPH NBS records to verify the reliability of metabolic profiling. For hydrophilic and hydrophobic metabolites, we executed significant pathway and metabolite analyses respectively. Logistic and LightGBM models were established to aid in CHD discrimination and classification. Consistent and reliable quantification of metabolites was demonstrated in DBS samples stored for up to 15 years. We discerned dysregulated metabolic pathways in CHD patients, including deviations in lipid and energy metabolism, as well as oxidative stress pathways. Furthermore, we identified three metabolites and twelve metabolites as potential biomarkers for CHD assessment and subtype classification, respectively. This study is the first to confirm the feasibility of validating metabolite profiling results using long-term stored DBS samples. Our findings highlight the potential clinical applications of our DBS-based methods for CHD screening and subtype classification.

    View details for DOI 10.1186/s40364-023-00536-y

    View details for PubMedID 37957758

    View details for PubMedCentralID PMC10644604

  • Altered expression of the L-arginine/nitric oxide pathway in ovarian cancer: metabolic biomarkers and biological implications. BMC cancer Chen, L., Tang, Q., Zhang, K., Huang, Q., Ding, Y., Jin, B., Liu, S., Hwa, K., Chou, C. J., Zhang, Y., Thyparambil, S., Liao, W., Han, Z., Mortensen, R., Schilling, J., Li, Z., Heaton, R., Tian, L., Cohen, H. J., Sylvester, K. G., Arent, R. C., Zhao, X., McElhinney, D. B., Wu, Y., Bai, W., Ling, X. B. 2023; 23 (1): 844

    Abstract

    Ovarian cancer (OC) is a highly lethal gynecological malignancy. Extensive research has shown that OC cells undergo significant metabolic alterations during tumorigenesis. In this study, we aim to leverage these metabolic changes as potential biomarkers for assessing ovarian cancer. A functional module-based approach was utilized to identify key gene expression pathways that distinguish different stages of ovarian cancer (OC) within a tissue biopsy cohort. This cohort consisted of control samples (n = 79), stage I/II samples (n = 280), and stage III/IV samples (n = 1016). To further explore these altered molecular pathways, minimal spanning tree (MST) analysis was applied, leading to the formulation of metabolic biomarker hypotheses for OC liquid biopsy. To validate, a multiple reaction monitoring (MRM) based quantitative LC-MS/MS method was developed. This method allowed for the precise quantification of targeted metabolite biomarkers using an OC blood cohort comprising control samples (n = 464), benign samples (n = 3), and OC samples (n = 13). Eleven functional modules were identified as significant differentiators (false discovery rate, FDR < 0.05) between normal and early-stage, or early-stage and late-stage ovarian cancer (OC) tumor tissues. MST analysis revealed that the metabolic L-arginine/nitric oxide (L-ARG/NO) pathway was reprogrammed, and the modules related to "DNA replication" and "DNA repair and recombination" served as anchor modules connecting the other nine modules. Based on this analysis, symmetric dimethylarginine (SDMA) and arginine were proposed as potential liquid biopsy biomarkers for OC assessment. Our quantitative LC-MS/MS analysis on our OC blood cohort provided direct evidence supporting the use of the SDMA-to-arginine ratio as a liquid biopsy panel to distinguish between normal and OC samples, with an area under the ROC curve (AUC) of 98.3%. Our comprehensive analysis of tissue genomics and blood quantitative LC-MS/MS metabolic data sheds light on the metabolic reprogramming underlying OC pathophysiology. These findings offer new insights into the potential diagnostic utility of the SDMA-to-arginine ratio for OC assessment. Further validation studies using adequately powered OC cohorts are warranted to fully establish the clinical effectiveness of this diagnostic test.
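
    As an illustration of how a ratio biomarker such as the SDMA-to-arginine ratio can be scored with an area under the ROC curve, the following is a minimal sketch using the rank-based (Mann-Whitney) definition of AUC. The function names and the concentration values are hypothetical and made up for illustration; they are not data or code from the study.

```python
def roc_auc(case_scores, control_scores):
    """Rank-based AUC: probability that a random case scores higher
    than a random control, counting ties as one half."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def biomarker_ratio(sdma, arginine):
    """Ratio of two measured concentrations (same units assumed)."""
    return sdma / arginine


# Made-up (SDMA, arginine) pairs, NOT study data.
cases = [(0.9, 60.0), (0.8, 50.0), (0.7, 70.0)]
controls = [(0.4, 80.0), (0.5, 90.0), (0.6, 50.0)]

case_ratios = [biomarker_ratio(s, a) for s, a in cases]
control_ratios = [biomarker_ratio(s, a) for s, a in controls]

# 8 of the 9 case/control pairs are ordered correctly in this toy data.
auc = roc_auc(case_ratios, control_ratios)
print(f"AUC = {auc:.3f}")
```

    An AUC of 1.0 would mean every case has a higher ratio than every control; 0.5 corresponds to chance-level separation.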

    View details for DOI 10.1186/s12885-023-11192-8

    View details for PubMedID 37684587

    View details for PubMedCentralID 8192829

  • Personalized therapy for head and neck squamous carcinoma (HNSCC) utilizing tissue proteomics profiling. Thyparambil, S. P., Liao, W., Strasbaugh, A., Melkie, M., Ghafourian, N., Heaton, R., Ling, X. LIPPINCOTT WILLIAMS & WILKINS. 2023

  • Development of a Urine Metabolomics Biomarker-Based Prediction Model for Preeclampsia during Early Pregnancy. Metabolites Zhang, Y., Sylvester, K. G., Jin, B., Wong, R. J., Schilling, J., Chou, C. J., Han, Z., Luo, R. Y., Tian, L., Ladella, S., Mo, L., Maric, I., Blumenfeld, Y. J., Darmstadt, G. L., Shaw, G. M., Stevenson, D. K., Whitin, J. C., Cohen, H. J., McElhinney, D. B., Ling, X. B. 2023; 13 (6)

    Abstract

    Preeclampsia (PE) is a condition that poses a significant risk of maternal mortality and multiple organ failure during pregnancy. Early prediction of PE can enable timely surveillance and interventions, such as low-dose aspirin administration. In this study, conducted at Stanford Health Care, we examined a cohort of 60 pregnant women and collected 478 urine samples between gestational weeks 8 and 20 for comprehensive metabolomic profiling. By employing liquid chromatography mass spectrometry (LCMS/MS), we identified the structures of seven out of 26 metabolomics biomarkers detected. Utilizing the XGBoost algorithm, we developed a predictive model based on these seven metabolomics biomarkers to identify individuals at risk of developing PE. The performance of the model was evaluated using 10-fold cross-validation, yielding an area under the receiver operating characteristic curve of 0.856. Our findings suggest that measuring urinary metabolomics biomarkers offers a noninvasive approach to assess the risk of PE prior to its onset.

    View details for DOI 10.3390/metabo13060715

    View details for PubMedID 37367874

  • Engineered Living Intestinal Muscle Patch Produces Macroscopic Contractions that can Mix and Break down Artificial Intestinal Contents. Advanced materials (Deerfield Beach, Fla.) Wang, Q., Wang, J., Tokhtaeva, E., Li, Z., Martín, M. G., Ling, X. B., Dunn, J. C. 2023: e2207255

    Abstract

    The intestinal muscle layers execute various gut wall movements to achieve controlled propulsion and mixing of intestinal content. Engineering intestinal muscle layers with complex contractile function is critical for developing bioartificial intestinal tissue to treat patients with short bowel syndrome. Here we report the first demonstration of a living intestinal muscle patch capable of generating three distinct motility patterns and displaying multiple digesta manipulations. Assessment of cell contractility, cellular morphology, and transcriptome profile reveals that successful generation of the contracting intestinal muscle patch relies on both biological factors in a serum-free medium and environmental cues from an elastic electrospun gelatin scaffold. By comparing gene-expression patterns among samples, we show that biological factors from the medium strongly affect ion transport activities, while the scaffold unexpectedly regulates cell-cell communication. Analysis of the ligand-receptor interactome identifies the scaffold-driven changes in intercellular communication, and 78% of the upregulated ligand-receptor interactions are involved in the development and function of enteric neurons. Our discoveries highlight the importance of combining biomolecular and biomaterial approaches for tissue engineering. The living intestinal muscle patch represents a pivotal advancement for building functional replacement intestinal tissue. It offers a more physiological model for studying GI motility and for preclinical drug discovery.

    View details for DOI 10.1002/adma.202207255

    View details for PubMedID 36779454

  • Early prediction and longitudinal modeling of preeclampsia from multiomics. Patterns (New York, N.Y.) Maric, I., Contrepois, K., Moufarrej, M. N., Stelzer, I. A., Feyaerts, D., Han, X., Tang, A., Stanley, N., Wong, R. J., Traber, G. M., Ellenberger, M., Chang, A. L., Fallahzadeh, R., Nassar, H., Becker, M., Xenochristou, M., Espinosa, C., De Francesco, D., Ghaemi, M. S., Costello, E. K., Culos, A., Ling, X. B., Sylvester, K. G., Darmstadt, G. L., Winn, V. D., Shaw, G. M., Relman, D. A., Quake, S. R., Angst, M. S., Snyder, M. P., Stevenson, D. K., Gaudilliere, B., Aghaeepour, N. 2022; 3 (12): 100655

    Abstract

    Preeclampsia is a complex disease of pregnancy whose physiopathology remains unclear. We developed machine-learning models for early prediction of preeclampsia (first 16 weeks of pregnancy) and over gestation by analyzing six omics datasets from a longitudinal cohort of pregnant women. For early pregnancy, a prediction model using nine urine metabolites had the highest accuracy and was validated on an independent cohort (area under the receiver-operating characteristic curve [AUC] = 0.88, 95% confidence interval [CI] [0.76, 0.99] cross-validated; AUC = 0.83, 95% CI [0.62, 1] validated). Univariate analysis demonstrated statistical significance of identified metabolites. An integrated multiomics model further improved accuracy (AUC = 0.94). Several biological pathways were identified including tryptophan, caffeine, and arachidonic acid metabolisms. Integration with immune cytometry data suggested novel associations between immune and proteomic dynamics. While further validation in a larger population is necessary, these encouraging results can serve as a basis for a simple, early diagnostic test for preeclampsia.

    View details for DOI 10.1016/j.patter.2022.100655

    View details for PubMedID 36569558

  • Single center blind testing of a US multi-center validated diagnostic algorithm for Kawasaki disease in Taiwan. Frontiers in immunology Kuo, H. C., Hao, S., Jin, B., Chou, C. J., Han, Z., Chang, L. S., Huang, Y. H., Hwa, K., Whitin, J. C., Sylvester, K. G., Reddy, C. D., Chubb, H., Ceresnak, S. R., Kanegaye, J. T., Tremoulet, A. H., Burns, J. C., McElhinney, D., Cohen, H. J., Ling, X. B. 2022; 13: 1031387

    Abstract

    Kawasaki disease (KD) is the leading cause of acquired heart disease in children. The major challenge in KD diagnosis is that it shares clinical signs with other childhood febrile control (FC) subjects. We sought to determine if our algorithmic approach applied to a Taiwan cohort. A single center (Chang Gung Memorial Hospital in Taiwan) cohort of patients suspected with acute KD were prospectively enrolled by local KD specialists for KD analysis. Our previously single-center developed computer-based two-step algorithm was further tested by a five-center validation in US. This first blinded multi-center trial validated our approach, with sufficient sensitivity and positive predictive value, to identify most patients with KD diagnosed at centers across the US. This study involved 418 KDs and 259 FCs from the Chang Gung Memorial Hospital in Taiwan. Our diagnostic algorithm retained sensitivity (379 of 418; 90.7%), specificity (223 of 259; 86.1%), PPV (379 of 409; 92.7%), and NPV (223 of 247; 90.3%) comparable to previous US 2016 single-center and US 2020 five-center results. Only 4.7% (15 of 418) of KD and 2.3% (6 of 259) of FC patients were identified as indeterminate. The algorithm identified 18 of 50 (36%) KD patients who presented 2 or 3 principal criteria. Of 418 KD patients, 157 were infants younger than one year and 89.2% (140 of 157) were classified correctly. Of the 44 patients with KD who had coronary artery abnormalities, our diagnostic algorithm correctly identified 43 (97.7%) including all patients with dilated coronary artery but one who was found to resolve in 8 weeks. This work demonstrates the applicability of our algorithmic approach and diagnostic portability in Taiwan.
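
    The sensitivity, specificity, PPV, and NPV quoted above follow directly from the reported counts. The sketch below recomputes them; the function name is hypothetical, and the false-positive and false-negative counts are not stated in the abstract but are derived here from the reported denominators (409 - 379 and 247 - 223), on the assumption that indeterminate patients are excluded from the PPV and NPV denominators.

```python
def diagnostic_metrics(tp, fn, tn, fp, n_pos, n_neg):
    """Standard diagnostic accuracy measures.

    n_pos / n_neg are the full cohort sizes (including indeterminate
    results), used as denominators for sensitivity and specificity.
    """
    return {
        "sensitivity": tp / n_pos,
        "specificity": tn / n_neg,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Counts from the abstract: 379 of 418 KD and 223 of 259 FC classified
# correctly. fp and fn are derived from the reported PPV/NPV
# denominators (an assumption, see lead-in).
metrics = diagnostic_metrics(
    tp=379, fn=247 - 223, tn=223, fp=409 - 379, n_pos=418, n_neg=259
)

for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```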

    View details for DOI 10.3389/fimmu.2022.1031387

    View details for PubMedID 36263040

    View details for PubMedCentralID PMC9575935

  • A machine-learning algorithm for diagnosis of multisystem inflammatory syndrome in children and Kawasaki disease in the USA: a retrospective model development and validation study. LANCET DIGITAL HEALTH Lam, J. Y., Shimizu, C., Tremoulet, A. H., Bainto, E., Roberts, S. C., Sivilay, N., Gardiner, M. A., Kanegaye, J. T., Hogan, A. H., Salazar, J. C., Mohandas, S., Szmuszkovicz, J. R., Mahanta, S., Dionne, A., Newburger, J. W., Ansusinha, E., DeBiasi, R. L., Hao, S., Ling, X. B., Cohen, H. J., Nemati, S., Burns, J. C., Pediat Emergency Med Kawasaki Dis, CHARMS Study Grp 2022; 4 (10): E717-E726

  • Machine Learning for Pediatric Echocardiographic Mitral Regurgitation Detection. Journal of the American Society of Echocardiography : official publication of the American Society of Echocardiography Edwards, L. A., Feng, F., Iqbal, M., Fu, Y., Sanyahumbi, A., Hao, S., McElhinney, D. B., Ling, X. B., Sable, C., Luo, J. 2022

    Abstract

    Echocardiography-based screening for valvular disease in at-risk asymptomatic children can result in early diagnosis. These screening programs, however, are resource intensive, and may not be feasible in many resource-limited settings. Automated echocardiographic diagnosis may enable more widespread echocardiographic screening, early diagnosis, and improved outcomes. In this feasibility study, we sought to build a machine learning model capable of identifying mitral regurgitation (MR) on echocardiogram. Echocardiograms were labeled by clip for view and by frame for the presence of MR. The labeled data were used to build two convolutional neural networks (CNNs) to perform the stepwise tasks of classifying the clips 1) by view and 2) by the presence of any MR, including physiologic, in parasternal long axis color Doppler views (PLAX-C). We developed the view classification model using 66,330 frames and evaluated model performance using a hold-out testing dataset with 45 echocardiograms (11,730 frames). We developed the MR detection model using 938 frames and evaluated model performance using a hold-out testing dataset with 42 echocardiograms (182 frames). Metrics to evaluate model performance included accuracy, precision, recall, F1 score (harmonic mean of precision and recall, 0 to 1 with 1 suggesting perfect precision and recall), and receiver-operating characteristic analysis. For the PLAX-C view, the view classification CNN achieved an F1 score of 0.97. The MR detection CNN achieved a testing accuracy of 0.86 and an area under the receiver operating characteristic curve of 0.91. A machine learning model is capable of discerning MR on transthoracic echocardiography. This is an encouraging step toward machine learning-based diagnosis of valvular heart disease on pediatric echocardiograms.
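
    For reference, the F1 score named among the evaluation metrics is conventionally defined as the harmonic mean of precision and recall. The sketch below shows that calculation; the function name and the counts are hypothetical and chosen only for illustration, not taken from the study.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 (harmonic mean of the two) from counts.

    Returns 0.0 for any measure whose denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical frame-level counts, NOT study data.
precision, recall, f1 = precision_recall_f1(tp=80, fp=20, fn=10)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

    Because it is a harmonic mean, F1 is pulled toward the lower of the two values, so a model cannot reach a high F1 by maximizing precision or recall alone.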

    View details for DOI 10.1016/j.echo.2022.09.017

    View details for PubMedID 36191670

  • Progressive Metabolic Abnormalities Associated with the Development of Neonatal Bronchopulmonary Dysplasia. Nutrients Ye, C., Wu, J., Reiss, J. D., Sinclair, T. J., Stevenson, D. K., Shaw, G. M., Chace, D. H., Clark, R. H., Prince, L. S., Ling, X. B., Sylvester, K. G. 2022; 14 (17)

    Abstract

    Objective: To assess the longitudinal metabolic patterns during the evolution of bronchopulmonary dysplasia (BPD) development. Methods: A case-control dataset of preterm infants (<32-week gestation) was obtained from a multicenter database, including 355 BPD cases and 395 controls. A total of 72 amino acid (AA) and acylcarnitine (AC) variables, along with infants' calorie intake and growth outcomes, were measured on day of life 1, 7, 28, and 42. Logistic regression, clustering methods, and random forest statistical modeling were utilized to identify metabolic variables significantly associated with BPD development and to investigate their longitudinal patterns that are associated with BPD development. Results: A panel of 27 metabolic variables were observed to be longitudinally associated with BPD development. The involved metabolites increased from 1 predominant different AC by day 7 to 19 associated AA and AC compounds by day 28 and 16 metabolic features by day 42. Citrulline, alanine, glutamate, tyrosine, propionylcarnitine, free carnitine, acetylcarnitine, hydroxybutyrylcarnitine, and most medium-chain ACs (C5:C10) were the most associated metabolites down-regulated in BPD babies over the early days of life, whereas phenylalanine, methionine, and hydroxypalmitoylcarnitine were observed to be up-regulated in BPD babies. Most calorie intake and growth outcomes revealed similar longitudinal patterns between BPD cases and controls over the first 6 weeks of life, after gestational adjustment. When combining with birth weight, the derived metabolic-based discriminative model observed some differences between those with and without BPD development, with c-statistics of 0.869 and 0.841 at day 7 and 28 of life on the test data. Conclusions: The metabolic panel we describe identified some metabolic differences in the blood associated with BPD pathogenesis. Further work is needed to determine whether these compounds could facilitate the monitoring and/or investigation of early-life metabolic status in the lung and other tissues for the prevention and management of BPD.

    View details for DOI 10.3390/nu14173547

    View details for PubMedID 36079804

  • Clinical survey of Trop2 antibody drug conjugate target and payload biomarkers in multiple cancer indications using multiplex mass spectrometry. Thyparambil, S. P., Liao, W., Heaton, R., Zhang, G., Strasbaugh, A., Melkie, M., Ling, X. B. AMER ASSOC CANCER RESEARCH. 2022

  • Multi-Omics Signatures Link to Ticagrelor Effects on Vascular Function in Patients With Acute Coronary Syndrome. Arteriosclerosis, thrombosis, and vascular biology Tam, C. F., Chan, Y., Wong, Y., Li, Z., Zhu, X., Su, K., Ganguly, A., Hwa, K., Ling, X. B., Tse, H. 2022: 101161ATVBAHA121317513

    Abstract

    BACKGROUND: Long-term antiplatelet agents including the potent P2Y12 antagonist ticagrelor are indicated in patients with a previous history of acute coronary syndrome. We sought to compare the effect of ticagrelor with that of aspirin monotherapy on vascular endothelial function in patients with prior acute coronary syndrome. METHODS: This was a prospective, single center, parallel group, investigator-blinded randomized controlled trial. We randomized 200 patients on long-term aspirin monotherapy with prior acute coronary syndrome in a 1:1 fashion to receive ticagrelor 60 mg BD (n=100) or aspirin 100 mg OD (n=100). The primary end point was change from baseline in brachial artery flow-mediated dilation at 12 weeks. Secondary end points were changes to platelet activation marker (CD41_62p) and endothelial progenitor cell (CD34/133) count measured by flow cytometry, plasma level of adenosine, IL-6 (interleukin-6) and EGF (epidermal growth factor), and multi-omics profiling at 12 weeks. RESULTS: After 12 weeks, brachial flow-mediated dilation was significantly increased in the ticagrelor group compared with the aspirin group (ticagrelor: 3.48±3.48% versus aspirin: -1.26±2.85%, treatment effect 4.73 [95% CI, 3.85-5.62], P<0.001). Nevertheless ticagrelor treatment for 12 weeks had no significant effect on platelet activation markers, circulating endothelial progenitor cell count or plasma level of adenosine, IL-6, and EGF (all P>0.05). Multi-omics pathway assessment revealed that changes in the metabolism and biosynthesis of amino acids (cysteine and methionine metabolism; phenylalanine, tyrosine, and tryptophan biosynthesis) and phospholipids (glycerophosphoethanolamines and glycerophosphoserines) were associated with improved brachial artery flow-mediated dilation in the ticagrelor group. CONCLUSIONS: In patients with prior acute coronary syndrome, ticagrelor 60 mg BD monotherapy significantly improved brachial flow-mediated dilation compared with aspirin monotherapy and was associated with significant changes in metabolomic and lipidomic signatures. REGISTRATION: URL: https://www.clinicaltrials.gov; Unique identifier: NCT03881943.

    View details for DOI 10.1161/ATVBAHA.121.317513

    View details for PubMedID 35387483

  • Serological Phenotyping Analysis Uncovers a Unique Metabolomic Pattern Associated With Early Onset of Type 2 Diabetes Mellitus. Frontiers in molecular biosciences Zhu, L., Huang, Q., Li, X., Jin, B., Ding, Y., Chou, C. J., Su, K., Zhang, Y., Chen, X., Hwa, K. Y., Thyparambil, S., Liao, W., Han, Z., Mortensen, R., Jin, Y., Li, Z., Schilling, J., Li, Z., Sylvester, K. G., Sun, X., Ling, X. B. 2022; 9: 841209

    Abstract

    Background: Type 2 diabetes mellitus (T2DM) is a multifaceted disorder that has reached epidemic proportions globally. Defective insulin secretion by pancreatic beta-cells and the inability of insulin-sensitive tissues to respond effectively to insulin are the underlying biology of T2DM. However, circulating biomarkers indicative of early diabetic onset at the asymptomatic stage have not been well described. We hypothesized that global and targeted mass spectrometry (MS) based metabolomic discovery can identify novel serological metabolic biomarkers specifically associated with T2DM. We further hypothesized that these markers can have a unique pattern associated with latent or early asymptomatic stage, promising an effective liquid biopsy approach for population T2DM risk stratification and screening. Methods: Four independent cohorts were assembled for the study. The T2DM cohort included sera from 25 patients with T2DM and 25 healthy individuals for the biomarker discovery and sera from 15 patients with T2DM and 15 healthy controls for the testing. The Pre-T2DM cohort included sera from 76 patients with prediabetes and 62 healthy controls for the model training and sera from 35 patients with prediabetes and 27 healthy controls for the model testing. Both global and targeted (amino acid, acylcarnitine, and fatty acid) approaches were used to deep phenotype the serological metabolome by high performance liquid chromatography-high resolution mass spectrometry. Different machine learning approaches (Random Forest, XGBoost, and ElasticNet) were applied to model the unique T2DM/Pre-T2DM metabolic patterns and contrasted by their effectiveness in differentiating T2DM/Pre-T2DM from controls. Results: The univariate analysis identified a unique panel of metabolites (n = 22) significantly associated with T2DM. Global metabolomics and subsequent structure determination led to the identification of 8 T2DM biomarkers while targeted LCMS profiling discovered 14 T2DM biomarkers. Our panel can effectively differentiate T2DM (ROC AUC = 1.00) or Pre-T2DM (ROC AUC = 0.84) from the controls in the respective testing cohort. Conclusion: Our serological metabolite panel can be utilized to identify asymptomatic populations at risk of T2DM, which may provide utility in identifying populations at risk at an early stage of diabetic development to allow for clinical intervention. This early detection would guide enhanced levels of care and accelerate development of clinical strategies to prevent T2DM.

    View details for DOI 10.3389/fmolb.2022.841209

    View details for PubMedID 35463946

  • Early-pregnancy prediction of risk for pre-eclampsia using maternal blood leptin/ceramide ratio: discovery and confirmation. BMJ open Huang, Q., Hao, S., You, J., Yao, X., Li, Z., Schilling, J., Thyparambil, S., Liao, W., Zhou, X., Mo, L., Ladella, S., Davies-Balch, S. R., Zhao, H., Fan, D., Whitin, J. C., Cohen, H. J., McElhinney, D. B., Wong, R. J., Shaw, G. M., Stevenson, D. K., Sylvester, K. G., Ling, X. B. 2021; 11 (11): e050963

    Abstract

    OBJECTIVE: This study aimed to develop a blood test for the prediction of pre-eclampsia (PE) early in gestation. We hypothesised that the longitudinal measurements of circulating adipokines and sphingolipids in maternal serum over the course of pregnancy could identify novel prognostic biomarkers that are predictive of impending event of PE early in gestation. STUDY DESIGN: Retrospective discovery and longitudinal confirmation. SETTING: Maternity units from two US hospitals. PARTICIPANTS: Six previously published studies of placental tissue (78 PE and 95 non-PE) were compiled for genomic discovery, maternal sera from 15 women (7 non-PE and 8 PE) enrolled at ProMedDx were used for sphingolipidomic discovery, and maternal sera from 40 women (20 non-PE and 20 PE) enrolled at Stanford University were used for longitudinal observation. OUTCOME MEASURES: Biomarker candidates from discovery were longitudinally confirmed and compared in parallel to the ratio of placental growth factor (PlGF) and soluble fms-like tyrosine kinase (sFlt-1) using the same cohort. The datasets were generated by enzyme-linked immunosorbent and liquid chromatography-tandem mass spectrometric assays. RESULTS: Our discovery integrating genomic and sphingolipidomic analysis identified leptin (Lep) and ceramide (Cer) (d18:1/25:0) as novel biomarkers for early gestational assessment of PE. Our longitudinal observation revealed a marked elevation of Lep/Cer (d18:1/25:0) ratio in maternal serum at a median of 23 weeks' gestation among women with impending PE as compared with women with uncomplicated pregnancy. The Lep/Cer (d18:1/25:0) ratio significantly outperformed the established sFlt-1/PlGF ratio in predicting impending event of PE with superior sensitivity (85% vs 20%) and area under curve (0.92 vs 0.52) from 5 to 25 weeks of gestation. CONCLUSIONS: Our study demonstrated the longitudinal measurement of maternal Lep/Cer (d18:1/25:0) ratio allows the non-invasive assessment of PE to identify pregnancy at high risk in early gestation, outperforming the established sFlt-1/PlGF ratio test.

    View details for DOI 10.1136/bmjopen-2021-050963

    View details for PubMedID 34824115

  • Multi-omics longitudinal analyses in stages I to III CRC patients: Surveillance liquid biopsy test to predict early recurrence and enable risk-stratified postoperative CRC management. Liu, X., Zhang, Y., Zhu, X., Thyparambil, S. P., Liao, W., Zheng, X., You, J., Masood, A., Li, Z., Yang, G., Yao, X., Hao, S., Heaton, R., Schilling, J., Sylvester, K. G., Liao, J., Gao, F., Lan, P., Ling, X., Wu, X. LIPPINCOTT WILLIAMS & WILKINS. 2021

  • Integrated trajectories of the maternal metabolome, proteome, and immunome predict labor onset. Science translational medicine Stelzer, I. A., Ghaemi, M. S., Han, X., Ando, K., Hedou, J. J., Feyaerts, D., Peterson, L. S., Rumer, K. K., Tsai, E. S., Ganio, E. A., Gaudilliere, D. K., Tsai, A. S., Choisy, B., Gaigne, L. P., Verdonk, F., Jacobsen, D., Gavasso, S., Traber, G. M., Ellenberger, M., Stanley, N., Becker, M., Culos, A., Fallahzadeh, R., Wong, R. J., Darmstadt, G. L., Druzin, M. L., Winn, V. D., Gibbs, R. S., Ling, X. B., Sylvester, K., Carvalho, B., Snyder, M. P., Shaw, G. M., Stevenson, D. K., Contrepois, K., Angst, M. S., Aghaeepour, N., Gaudilliere, B. 2021; 13 (592)

    Abstract

    Estimating the time of delivery is of high clinical importance because pre- and postterm deviations are associated with complications for the mother and her offspring. However, current estimations are inaccurate. As pregnancy progresses toward labor, major transitions occur in fetomaternal immune, metabolic, and endocrine systems that culminate in birth. The comprehensive characterization of maternal biology that precedes labor is key to understanding these physiological transitions and identifying predictive biomarkers of delivery. Here, a longitudinal study was conducted in 63 women who went into labor spontaneously. More than 7000 plasma analytes and peripheral immune cell responses were analyzed using untargeted mass spectrometry, aptamer-based proteomic technology, and single-cell mass cytometry in serial blood samples collected during the last 100 days of pregnancy. The high-dimensional dataset was integrated into a multiomic model that predicted the time to spontaneous labor [R = 0.85, 95% confidence interval (CI) [0.79 to 0.89], P = 1.2 × 10⁻⁴⁰, N = 53, training set; R = 0.81, 95% CI [0.61 to 0.91], P = 3.9 × 10⁻⁷, N = 10, independent test set]. Coordinated alterations in maternal metabolome, proteome, and immunome marked a molecular shift from pregnancy maintenance to prelabor biology 2 to 4 weeks before delivery. A surge in steroid hormone metabolites and interleukin-1 receptor type 4 that preceded labor coincided with a switch from immune activation to regulation of inflammatory responses. Our study lays the groundwork for developing blood-based methods for predicting the day of labor, anchored in mechanisms shared in preterm and term pregnancies.

    View details for DOI 10.1126/scitranslmed.abd9898

    View details for PubMedID 33952678

  • Understanding how biologic and social determinants affect disparities in preterm birth and outcomes of preterm infants in the NICU. Seminars in perinatology Stevenson, D. K., Aghaeepour, N., Maric, I., Angst, M. S., Darmstadt, G. L., Druzin, M. L., Gaudilliere, B., Ling, X. B., Moufarrej, M. N., Peterson, L. S., Quake, S. R., Relman, D. A., Snyder, M. P., Sylvester, K. G., Shaw, G. M., Wong, R. J. 2021: 151408

    Abstract

    To understand the disparities in spontaneous preterm birth (sPTB) and/or its outcomes, biologic and social determinants as well as healthcare practice (such as those in neonatal intensive care units) should be considered. They have been largely intractable and remain obscure in most cases, despite a myriad of identified risk factors for and causes of sPTB. We still do not know how they might actually affect and lead to the different outcomes at different gestational ages and if they are independent of NICU practices. Here we describe an integrated approach to study the interplay between the genome and exposome, which may drive biochemistry and physiology, with health disparities.

    View details for DOI 10.1016/j.semperi.2021.151408

    View details for PubMedID 33875265

  • Electronic Health Record-Based Prediction of 1-Year Risk of Incident Cardiac Dysrhythmia: Prospective Case-Finding Algorithm Development and Validation Study. JMIR medical informatics Zhang, Y., Han, Y., Gao, P., Mo, Y., Hao, S., Huang, J., Ye, F., Li, Z., Zheng, L., Yao, X., Li, Z., Li, X., Wang, X., Huang, C., Jin, B., Zhang, Y., Yang, G., Alfreds, S. T., Kanov, L., Sylvester, K. G., Widen, E., Li, L., Ling, X. 2021; 9 (2): e23606

    Abstract

    BACKGROUND: Cardiac dysrhythmia is currently an extremely common disease. Severe arrhythmias often cause a series of complications, including congestive heart failure, fainting or syncope, stroke, and sudden death. OBJECTIVE: The aim of this study was to predict incident arrhythmia prospectively within a 1-year period to provide early warning of impending arrhythmia. METHODS: Retrospective (1,033,856 individuals enrolled between October 1, 2016, and October 1, 2017) and prospective (1,040,767 individuals enrolled between October 1, 2017, and October 1, 2018) cohorts were constructed from integrated electronic health records in Maine, United States. An ensemble learning workflow was built through multiple machine learning algorithms. Differentiating features, including acute and chronic diseases, procedures, health status, laboratory tests, prescriptions, clinical utilization indicators, and socioeconomic determinants, were compiled for incident arrhythmia assessment. The predictive model was retrospectively trained and calibrated using an isotonic regression method and was prospectively validated. Model performance was evaluated using the area under the receiver operating characteristic curve (AUROC). RESULTS: The cardiac dysrhythmia case-finding algorithm (retrospective: AUROC 0.854; prospective: AUROC 0.827) stratified the population into 5 risk groups: 53.35% (555,233/1,040,767), 44.83% (466,594/1,040,767), 1.76% (18,290/1,040,767), 0.06% (623/1,040,767), and 0.003% (27/1,040,767) were in the very low-risk, low-risk, medium-risk, high-risk, and very high-risk groups, respectively; 51.85% (14/27) of patients in the very high-risk subgroup were confirmed to have incident cardiac dysrhythmia within the subsequent 1 year. CONCLUSIONS: Our case-finding algorithm is promising for prospectively predicting 1-year incident cardiac dysrhythmias in a general population, and we believe that our case-finding algorithm can serve as an early warning system to allow statewide population-level screening and surveillance to improve cardiac dysrhythmia care.
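
    The risk-group percentages reported in this abstract can be re-derived from the counts it gives. The following is a minimal Python sketch of that arithmetic only; the variable and function names are illustrative assumptions, and nothing here reproduces the authors' model or code.

```python
# Counts per risk group in the prospective cohort, copied from the abstract.
TOTAL = 1_040_767
groups = {
    "very low": 555_233,
    "low": 466_594,
    "medium": 18_290,
    "high": 623,
    "very high": 27,
}


def share(count, total):
    """Return count / total expressed as a percentage."""
    return 100.0 * count / total


# Percentage of the cohort falling in each risk group.
shares = {name: share(n, TOTAL) for name, n in groups.items()}

# Fraction of the very high-risk group with confirmed incident dysrhythmia.
confirmed_rate = share(14, groups["very high"])

for name, pct in shares.items():
    print(f"{name:>9}: {pct:.3f}%")
print(f"confirmed in very high-risk group: {confirmed_rate:.2f}%")
```

    The five group counts add up to the full prospective cohort, so the strata are mutually exclusive and exhaustive as reported.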

    View details for DOI 10.2196/23606

    View details for PubMedID 33595452

  • Targeted multiplex proteomics (TMP) and genomics of early-onset colorectal cancer (EO-CRC) Kam, A. E., Khaliq, A. M., Alam, N., Hayden, D., Bhama, A. R., Govekar, H., Pappas, S., Ritz, E. M., Singh, A., Thyparambil, S. P., Liao, W., Bhalkikar, A., Ling, X. B., Levy, M., Kuzel, T., Masood, A. LIPPINCOTT WILLIAMS & WILKINS. 2021
  • Proteomic signatures predict preeclampsia in individual cohorts but not across cohorts - implications for clinical biomarker studies. The journal of maternal-fetal & neonatal medicine : the official journal of the European Association of Perinatal Medicine, the Federation of Asia and Oceania Perinatal Societies, the International Society of Perinatal Obstetricians Ghaemi, M. S., Tarca, A. L., Romero, R., Stanley, N., Fallahzadeh, R., Tanada, A., Culos, A., Ando, K., Han, X., Blumenfeld, Y. J., Druzin, M. L., El-Sayed, Y. Y., Gibbs, R. S., Winn, V. D., Contrepois, K., Ling, X. B., Wong, R. J., Shaw, G. M., Stevenson, D. K., Gaudilliere, B., Aghaeepour, N., Angst, M. S. 2021: 1–8

    Abstract

    Early identification of pregnant women at risk for preeclampsia (PE) is important, as it will enable targeted interventions ahead of clinical manifestations. The quantitative analyses of plasma proteins feature prominently among molecular approaches used for risk prediction. However, derivation of protein signatures of sufficient predictive power has been challenging. The recent availability of platforms simultaneously assessing over 1000 plasma proteins offers broad examinations of the plasma proteome, which may enable the extraction of proteomic signatures with improved prognostic performance in prenatal care. The primary aim of this study was to examine the generalizability of proteomic signatures predictive of PE in two cohorts of pregnant women whose plasma proteome was interrogated with the same highly multiplexed platform. Establishing generalizability, or lack thereof, is critical to devise strategies facilitating the development of clinically useful predictive tests. A second aim was to examine the generalizability of protein signatures predictive of gestational age (GA) in uncomplicated pregnancies in the same cohorts to contrast physiological and pathological pregnancy outcomes. Serial blood samples were collected during the first, second, and third trimesters in 18 women who developed PE and 18 women with uncomplicated pregnancies (Stanford cohort). The second cohort (Detroit), used for comparative analysis, consisted of 76 women with PE and 90 women with uncomplicated pregnancies. Multivariate analyses were applied to infer predictive and cohort-specific proteomic models, which were then tested in the alternate cohort. Gene ontology (GO) analysis was performed to identify biological processes that were over-represented among top-ranked proteins associated with PE. The model derived in the Stanford cohort was highly significant (p = 3.9E-15) and predictive (AUC = 0.96), but failed validation in the Detroit cohort (p = 9.7E-01, AUC = 0.50). Similarly, the model derived in the Detroit cohort was highly significant (p = 1.0E-21, AUC = 0.73), but failed validation in the Stanford cohort (p = 7.3E-02, AUC = 0.60). By contrast, proteomic models predicting GA were readily validated across the Stanford (p = 1.1E-454, R = 0.92) and Detroit cohorts (p = 1.1E-92, R = 0.92), indicating that the proteomic assay performed well enough to infer a generalizable model across studied cohorts, which makes it less likely that technical aspects of the assay, including batch effects, accounted for observed differences. Results point to a broader issue relevant for proteomic and other omic discovery studies in patient cohorts suffering from a clinical syndrome, such as PE, driven by heterogeneous pathophysiologies. While novel technologies including highly multiplex proteomic arrays and adapted computational algorithms allow for novel discoveries for a particular study cohort, they may not readily generalize across cohorts. A likely reason is that the prevalence of pathophysiologic processes leading up to the "same" clinical syndrome can be distributed differently in different and smaller-sized cohorts. Signatures derived in individual cohorts may simply capture different facets of the spectrum of pathophysiologic processes driving a syndrome. Our findings have important implications for the design of omic studies of a syndrome like PE. They highlight the need for performing such studies in diverse and well-phenotyped patient populations that are large enough to characterize subsets of patients with shared pathophysiologies to then derive subset-specific signatures of sufficient predictive power.

    View details for DOI 10.1080/14767058.2021.1888915

    View details for PubMedID 33653202

  • Identification of patients at risk of new onset heart failure: Utilizing a large statewide health information exchange to train and validate a risk prediction model. PloS one Duong, S. Q., Zheng, L., Xia, M., Jin, B., Liu, M., Li, Z., Hao, S., Alfreds, S. T., Sylvester, K. G., Widen, E., Teuteberg, J. J., McElhinney, D. B., Ling, X. B. 2021; 16 (12): e0260885

    Abstract

    BACKGROUND: New-onset heart failure (HF) is associated with poor prognosis and high healthcare utilization. Early identification of patients at increased risk of incident-HF may allow for focused allocation of preventative care resources. Health information exchange (HIE) data span the entire spectrum of clinical care, but there are no HIE-based clinical decision support tools for diagnosis of incident-HF. We applied machine-learning methods to model the one-year risk of incident-HF from the Maine statewide-HIE. METHODS AND RESULTS: We included subjects aged ≥ 40 years without prior HF ICD9/10 codes during a three-year period from 2015 to 2018, and incident-HF defined as assignment of two outpatient or one inpatient code in a year. A tree-boosting algorithm was used to model the probability of incident-HF in year two from data collected in year one, and then validated in year three. 5,668 of 521,347 patients (1.09%) developed incident-HF in the validation cohort. In the validation cohort, the model c-statistic was 0.824 and at a clinically predetermined risk threshold, 10% of patients identified by the model developed incident-HF and 29% of all incident-HF cases in the state of Maine were identified. CONCLUSIONS: Utilizing machine learning modeling techniques on passively collected clinical HIE data, we developed and validated an incident-HF prediction tool that performs on par with other models that require proactively collected clinical data. Our algorithm could be integrated into other HIEs to leverage the EMR resources to provide individuals, systems, and payors with a risk stratification tool to allow for targeted resource allocation to reduce incident-HF disease burden on individuals and health care systems.

    View details for DOI 10.1371/journal.pone.0260885

    View details for PubMedID 34890438

  • Maternal metabolic profiling to assess fetal gestational age and predict preterm delivery: a two-centre retrospective cohort study in the US. BMJ open Sylvester, K. G., Hao, S., You, J., Zheng, L., Tian, L., Yao, X., Mo, L., Ladella, S., Wong, R. J., Shaw, G. M., Stevenson, D. K., Cohen, H. J., Whitin, J. C., McElhinney, D. B., Ling, X. B. 2020; 10 (12): e040647

    Abstract

    OBJECTIVES: The aim of this study was to develop a single blood test that could determine gestational age and estimate the risk of preterm birth by measuring serum metabolites. We hypothesised that serial metabolic modelling of serum analytes throughout pregnancy could be used to describe fetal gestational age and project preterm birth with a high degree of precision. STUDY DESIGN: A retrospective cohort study. SETTING: Two medical centres from the USA. PARTICIPANTS: Thirty-six patients (20 full-term, 16 preterm) enrolled at Stanford University were used to develop gestational age and preterm birth risk algorithms; 22 patients (9 full-term, 13 preterm) enrolled at the University of Alabama were used to validate the algorithms. OUTCOME MEASURES: Maternal blood was collected serially throughout pregnancy. Metabolic datasets were generated using mass spectrometry. RESULTS: A model to determine gestational age was developed (R² = 0.98) and validated (R² = 0.81). 66.7% of the estimates fell within ±1 week of ultrasound results during model validation. Significant disruptions from full-term pregnancy metabolic patterns were observed in preterm pregnancies (R² = -0.68). A separate algorithm to predict preterm birth was developed using a set of 10 metabolic pathways that resulted in an area under the curve of 0.96 and 0.92, a sensitivity of 0.88 and 0.86, and a specificity of 0.96 and 0.92 during development and validation testing, respectively. CONCLUSIONS: In this study, metabolic profiling was used to develop and test a model for determining gestational age during full-term pregnancy progression, and to determine risk of preterm birth. With additional patient validation studies, these algorithms may be used to identify at-risk pregnancies prompting alterations in clinical care, and to gain biological insights into the pathophysiology of preterm birth. Metabolic pathway-based pregnancy modelling is a novel modality for investigation and clinical application development.

    View details for DOI 10.1136/bmjopen-2020-040647

    View details for PubMedID 33268420

  • Towards personalized medicine in maternal and child health: integrating biologic and social determinants. Pediatric research Stevenson, D. K., Wong, R. J., Aghaeepour, N., Maric, I., Angst, M. S., Contrepois, K., Darmstadt, G. L., Druzin, M. L., Eisenberg, M. L., Gaudilliere, B., Gibbs, R. S., Gotlib, I. H., Gould, J. B., Lee, H. C., Ling, X. B., Mayo, J. A., Moufarrej, M. N., Quaintance, C. C., Quake, S. R., Relman, D. A., Sirota, M., Snyder, M. P., Sylvester, K. G., Hao, S., Wise, P. H., Shaw, G. M., Katz, M. 2020

    View details for DOI 10.1038/s41390-020-0981-8

    View details for PubMedID 32454518

  • Deviation from the precisely timed phenomic ageotypes can assist in early CRC screening and reveal underlying pathophysiology. Thyparambil, S. P., You, J., Liu, K., Sun, H., Peng, J., Cai, S., Li, Y., Fu, C., Bao, P., Li, Q., Hao, S., Zhang, Y., Li, Z., Yang, J., Yin, Z., Yao, X., Zhu, X., Schilling, J., Sylvester, K. G., Ling, X. B. LIPPINCOTT WILLIAMS & WILKINS. 2020
  • Progressive Metabolic Dysfunction and Nutritional Variability Precedes Necrotizing Enterocolitis. Nutrients Sinclair, T. J., Ye, C., Chen, Y., Zhang, D., Li, T., Ling, X. B., Cohen, H. J., Shaw, G. M., Stevenson, D. K., Chace, D., Clark, R. H., Sylvester, K. G. 2020; 12 (5)

    Abstract

    Necrotizing Enterocolitis (NEC) is associated with prematurity, enteral feedings, and enteral dysbiosis. Accordingly, we hypothesized that along with nutritional variability, metabolic dysfunction would be associated with NEC onset. Methods: We queried a multicenter longitudinal database that included 995 preterm infants (<32 weeks gestation) and included 73 cases of NEC. Dried blood spot samples were obtained on day of life 1, 7, 28, and 42. Metabolite data from each time point included 72 amino acid (AA) and acylcarnitine (AC) measures. Nutrition data were averaged at each of the same time points. Odds ratios and 95% confidence intervals were calculated using samples obtained prior to NEC diagnosis and adjusted for potential confounding variables. Nutritional and metabolic data were plotted longitudinally to determine relationship to NEC onset. Results: Day 1 analyte levels of alanine, phenylalanine, free carnitine, C16, arginine, C14:1/C16, and citrulline/phenylalanine were associated with the subsequent development of NEC. Over time, differences in individual analyte levels associated with NEC onset shifted from predominantly AAs at birth to predominantly ACs by day 42. Subjects who developed NEC received significantly lower weight-adjusted total calories (p < 0.001) overall, a trend that emerged by day of life 7 (p = 0.020), and persisted until day of life 28 (p < 0.001) and 42 (p < 0.001). Conclusion: Premature infants demonstrate metabolic differences at birth. Metabolite abnormalities progress in parallel to significant differences in nutritional delivery signifying metabolic dysfunction in premature newborns prior to NEC onset. These observations provide new insights to potential contributing pathophysiology of NEC and opportunity for clinical care-based prevention.

    View details for DOI 10.3390/nu12051275

    View details for PubMedID 32365850

  • High-throughput quantitation of serological ceramides/dihydroceramides by LC/MS/MS: Pregnancy baseline biomarkers and potential metabolic messengers. Journal of pharmaceutical and biomedical analysis Huang, Q., Hao, S., Yao, X., You, J., Li, X., Lai, D., Han, C., Schilling, J., Hwa, K. Y., Thyparambil, S., Whitin, J., Cohen, H. J., Chubb, H., Ceresnak, S. R., McElhinney, D. B., Wong, R. J., Shaw, G. M., Stevenson, D. K., Sylvester, K. G., Ling, X. B. 2020; 192: 113639

    Abstract

    Ceramides and dihydroceramides are sphingolipids that are present in abundance at the cellular membrane of eukaryotes. Although their metabolic dysregulation has been implicated in many diseases, our knowledge about circulating ceramide changes during pregnancy remains limited. In this study, we present the development and validation of a high-throughput liquid chromatography-tandem mass spectrometric method for simultaneous quantification of 16 ceramides and 10 dihydroceramides in human serum within 5 min, using stable isotope-labeled ceramides as internal standards. This method employs protein precipitation for high-throughput sample preparation, reverse-phase isocratic elution for chromatographic separation, and Multiple Reaction Monitoring for mass spectrometric detection. To qualify for clinical applications, our assay has been validated against the FDA guidelines for Lower Limit of Quantitation (1 nM), linearity (R² > 0.99), precision (imprecision < 15%), accuracy (inaccuracy < 15%), extraction recovery (> 90%), stability (> 85%), and carryover (< 0.01%). With enhanced sensitivity and specificity from this method, we have, for the first time, determined the serological levels of ceramides and dihydroceramides to reveal unique temporal gestational patterns. Our approach could have value in providing insights into disorders of pregnancy.

    View details for DOI 10.1016/j.jpba.2020.113639

    View details for PubMedID 33017796

  • Multicentre validation of a computer-based tool for differentiation of acute Kawasaki disease from clinically similar febrile illnesses. Archives of disease in childhood Hao, S., Ling, X. B., Kanegaye, J. T., Bainto, E., Dominguez, S. R., Heizer, H., Jone, P. N., Anderson, M. S., Jaggi, P., Baker, A., Son, M. B., Newberger, J. W., Ashouri, N., McElhinney, D. B., Burns, J. C., Whitin, J. C., Cohen, H. J., Tremoulet, A. H. 2020

    Abstract

    The clinical features of Kawasaki disease (KD) overlap with those of other paediatric febrile illnesses. A missed or delayed diagnosis increases the risk of coronary artery damage. Our computer algorithm for KD and febrile illness differentiation had a sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of 94.8%, 70.8%, 93.7% and 98.3%, respectively, in a single-centre validation study. We sought to determine the performance of this algorithm with febrile children from multiple institutions across the USA. We used our previously published 18-variable panel that includes illness day, the five KD clinical criteria and readily available laboratory values. We applied this two-step algorithm using a linear discriminant analysis-based clinical model followed by a random forest-based algorithm to a cohort of 1059 acute KD and 282 febrile control patients from five children's hospitals across the USA. The algorithm correctly classified 970 of 1059 patients with KD and 163 of 282 febrile controls resulting in a sensitivity of 91.6%, specificity of 57.8% and PPV and NPV of 95.4% and 93.1%, respectively. The algorithm also correctly identified 218 of the 232 KD patients (94.0%) with abnormal echocardiograms. The expectation is that the predictive accuracy of the algorithm will be reduced in a real-world setting in which patients with KD are rare and febrile controls are common. However, the results of the current analysis suggest that this algorithm warrants a prospective, multicentre study to evaluate its potential utility as a physician support tool.
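
    The sensitivity and specificity quoted above follow directly from the reported counts, and the abstract's caveat about real-world settings reflects how positive predictive value depends on disease prevalence. The Python sketch below illustrates both points under stated assumptions: it uses the textbook single-test Bayes formula for PPV, the prevalence values are hypothetical, and it does not attempt to reproduce the reported PPV and NPV, which may depend on details of the two-step classification that the abstract does not spell out.

```python
def proportion(hits, total):
    """Return hits / total as a fraction between 0 and 1."""
    return hits / total


# Counts copied from the abstract.
sensitivity = proportion(970, 1059)   # KD patients correctly classified
specificity = proportion(163, 282)    # febrile controls correctly classified
abnormal_echo_detected = proportion(218, 232)


def ppv(sens, spec, prevalence):
    """Textbook positive predictive value for a single binary test."""
    true_pos = sens * prevalence
    false_pos = (1.0 - spec) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


print(f"sensitivity: {100 * sensitivity:.1f}%")
print(f"specificity: {100 * specificity:.1f}%")
print(f"KD with abnormal echo identified: {100 * abnormal_echo_detected:.1f}%")

# Hypothetical prevalence values, chosen only to show the trend.
for prevalence in (0.50, 0.10, 0.01):
    value = ppv(sensitivity, specificity, prevalence)
    print(f"prevalence {prevalence:.2f} -> PPV {100 * value:.1f}%")
```

    With sensitivity and specificity held fixed, the PPV from this formula falls as prevalence falls, which is the direction of the effect the authors anticipate for settings where KD is rare.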

    View details for DOI 10.1136/archdischild-2019-317980

    View details for PubMedID 32139365

  • Kinetics of SARS-CoV-2 positivity of infected and recovered patients from a single center. Scientific reports Huang, J., Zheng, L., Li, Z., Hao, S., Ye, F., Chen, J., Gans, H. A., Yao, X., Liao, J., Wang, S., Zeng, M., Qiu, L., Li, C., Whitin, J. C., Tian, L., Chubb, H., Hwa, K. Y., Ceresnak, S. R., Zhang, W., Lu, Y., Maldonado, Y. A., McElhinney, D. B., Sylvester, K. G., Cohen, H. J., Liu, L., Ling, X. B. 2020; 10 (1): 18629

    Abstract

    Recurrence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) positive detection in infected but recovered individuals has been reported. Patients who have recovered from coronavirus disease 2019 (COVID-19) could profoundly impact the health care system. We sought to define the kinetics and relevance of PCR-positive recurrence during recovery from acute COVID-19 to better understand risks for prolonged infectivity and reinfection. A series of 414 patients with confirmed SARS-CoV-2 infection was studied at The Second Affiliated Hospital of Southern University of Science and Technology in Shenzhen, China, from January 11 to April 23, 2020. Statistical analyses were performed on the clinical, laboratory, radiologic imaging, medical treatment, and admission/quarantine/readmission clinical course data, and a recurrence predictive algorithm was developed. Among recovered patients, 16.7% had PCR positivity recur one to three times, despite being in strict quarantine. Younger patients with mild pulmonary respiratory syndrome had a higher risk of PCR positivity recurrence. The recurrence prediction model had an area under the ROC curve of 0.786. This case series provides characteristics of patients with recurrent SARS-CoV-2 positivity. Use of a prediction algorithm may identify patients at high risk of recurrent SARS-CoV-2 positivity and help to establish protocols for health policy.

    View details for DOI 10.1038/s41598-020-75629-x

    View details for PubMedID 33122706

  • Changes in pregnancy-related serum biomarkers early in gestation are associated with later development of preeclampsia. PloS one Hao, S., You, J., Chen, L., Zhao, H., Huang, Y., Zheng, L., Tian, L., Maric, I., Liu, X., Li, T., Bianco, Y. K., Winn, V. D., Aghaeepour, N., Gaudilliere, B., Angst, M. S., Zhou, X., Li, Y. M., Mo, L., Wong, R. J., Shaw, G. M., Stevenson, D. K., Cohen, H. J., McElhinney, D. B., Sylvester, K. G., Ling, X. B. 2020; 15 (3): e0230000

    Abstract

    Placental protein expression plays a crucial role during pregnancy. We hypothesized that: (1) circulating levels of pregnancy-associated, placenta-related proteins throughout gestation reflect the temporal progression of the uncomplicated, full-term pregnancy, and can effectively estimate gestational ages (GAs); and (2) preeclampsia (PE) is associated with disruptions in these protein levels early in gestation, and these disruptions can identify impending PE. We also compared gestational profiles of proteins in the human and mouse, using pregnant heme oxygenase-1 (HO-1) heterozygote (Het) mice, a mouse model reflecting PE-like symptoms. Serum levels of six placenta-related proteins, namely leptin (LEP), chorionic somatomammotropin hormone like 1 (CSHL1), elabela (ELA), activin A, soluble fms-like tyrosine kinase 1 (sFlt-1), and placental growth factor (PlGF), were quantified by ELISA in blood serially collected throughout human pregnancies (20 normal subjects with 66 samples, and 20 subjects who developed PE with 61 samples). Multivariate analysis was performed to estimate the GA in normal pregnancy. Mean-squared errors of GA estimations were used to identify impending PE. The human protein profiles were then compared with those in the pregnant HO-1 Het mice. An elastic net-based gestational dating model was developed (R² = 0.76) and validated (R² = 0.61) using serum levels of the 6 proteins measured at various GAs from women with normal uncomplicated pregnancies. In women who developed PE, the model was not associated with GA (R² = -0.17). Deviations from the model estimations were observed in women who developed PE (P = 0.01). The model developed with 5 proteins (ELA excluded) performed similarly on sera from normal human (R² = 0.68) and WT mouse (R² = 0.85) pregnancies. Disruptions of this model were observed in both human PE-associated (R² = 0.27) and mouse HO-1 Het (R² = 0.30) pregnancies. LEP outperformed sFlt-1 and PlGF in differentiating impending PE at early human and late mouse GAs. Serum placenta-related protein profiles are temporally regulated throughout normal pregnancies and significantly disrupted in women who develop PE. LEP changes earlier than the well-established biomarkers (sFlt-1 and PlGF). There may be evidence of a causative action of HO-1 deficiency in LEP upregulation in a PE-like murine model.
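
    The negative R² reported for the PE group can look surprising. A plausible reading, assuming R² here denotes the coefficient of determination computed on samples the model was not fitted to (the abstract does not state the formula), is that the model's gestational age estimates were worse than simply predicting the mean. The Python sketch below illustrates that behaviour with hypothetical numbers; none of the values come from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical gestational ages in weeks (illustrative only).
actual = [10.0, 20.0, 30.0, 40.0]
close_estimates = [11.0, 19.0, 31.0, 39.0]   # tracks the true values
poor_estimates = [35.0, 12.0, 38.0, 15.0]    # unrelated to the true values
mean_only = [25.0, 25.0, 25.0, 25.0]         # always predicts the mean

print("close estimates:", r_squared(actual, close_estimates))
print("mean only:      ", r_squared(actual, mean_only))
print("poor estimates: ", r_squared(actual, poor_estimates))
```

    Under this definition a score of 0 corresponds to always predicting the mean, so any negative score indicates estimates that carry less information than the mean alone.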

    View details for DOI 10.1371/journal.pone.0230000

    View details for PubMedID 32126118

  • Identification of elders at higher risk for fall with statewide electronic health records and a machine learning algorithm. International journal of medical informatics Ye, C., Li, J., Hao, S., Liu, M., Jin, H., Zheng, L., Xia, M., Jin, B., Zhu, C., Alfreds, S. T., Stearns, F., Kanov, L., Sylvester, K. G., Widen, E., McElhinney, D., Ling, X. B. 2020; 137: 104105

    Abstract

    Predicting the risk of falls in advance can benefit the quality of care and potentially reduce mortality and morbidity in the older population. The aim of this study was to construct and validate an electronic health record-based fall risk predictive tool to identify elders at a higher risk of falls. The one-year fall prediction model was developed using the machine-learning-based algorithm, XGBoost, and tested on an independent validation cohort. The data were collected from electronic health records (EHR) of Maine from 2016 to 2018, comprising 265,225 older patients (≥65 years of age). This model attained a validated C-statistic of 0.807, where 50% of the identified high-risk true positives were confirmed to fall during the first 94 days of next year. The model also captured in advance 58.01% and 54.93% of falls that happened within the first 30 and 30-60 days of next year. The identified high-risk patients of fall showed conditions of severe disease comorbidities, an enrichment of fall-increasing cardiovascular and mental medication prescriptions and increased historical clinical utilization, revealing the complexity of the underlying fall etiology. The XGBoost algorithm captured 157 impactful predictors into the final predictive model, where cognitive disorders, abnormalities of gait and balance, Parkinson's disease, fall history and osteoporosis were identified as the top-5 strongest predictors of the future fall event. By using the EHR data, this risk assessment tool attained an improved discriminative ability and can be immediately deployed in the health system to provide automatic early warnings to older adults with increased fall risk and identify their personalized risk factors to facilitate customized fall interventions.

    View details for DOI 10.1016/j.ijmedinf.2020.104105

    View details for PubMedID 32193089

  • Development of an early-warning system for high-risk patients for suicide attempt using deep learning and electronic health records. Translational psychiatry Zheng, L., Wang, O., Hao, S., Ye, C., Liu, M., Xia, M., Sabo, A. N., Markovic, L., Stearns, F., Kanov, L., Sylvester, K. G., Widen, E., McElhinney, D. B., Zhang, W., Liao, J., Ling, X. B. 2020; 10 (1): 72

    Abstract

    Suicide is the tenth leading cause of death in the United States (US). An early-warning system (EWS) for suicide attempt could prove valuable for identifying those at risk of suicide attempts, and analyzing the contribution of repeated attempts to the risk of eventual death by suicide. In this study we sought to develop an EWS for high-risk suicide attempt patients through the development of a population-based risk stratification surveillance system. Advanced machine-learning algorithms and deep neural networks were utilized to build models with the data from electronic health records (EHRs). A final risk score was calculated for each individual and calibrated to indicate the probability of a suicide attempt in the following 1-year time period. Risk scores were subjected to individual-level analysis in order to aid in the interpretation of the results for health-care providers managing the at-risk cohorts. The 1-year suicide attempt risk model attained an area under the curve (AUC ROC) of 0.792 and 0.769 in the retrospective and prospective cohorts, respectively. The suicide attempt rate in the "very high risk" category was 60 times greater than the population baseline when tested in the prospective cohorts. Mental health disorders including depression, bipolar disorders and anxiety, along with substance abuse, impulse control disorders, clinical utilization indicators, and socioeconomic determinants were recognized as significant features associated with incident suicide attempt.

    View details for DOI 10.1038/s41398-020-0684-2

    View details for PubMedID 32080165

  • A Real-Time Early Warning System for Monitoring Inpatient Mortality Risk: Prospective Study Using Electronic Medical Record Data. Journal of medical Internet research Ye, C., Wang, O., Liu, M., Zheng, L., Xia, M., Hao, S., Jin, B., Jin, H., Zhu, C., Huang, C. J., Gao, P., Ellrodt, G., Brennan, D., Stearns, F., Sylvester, K. G., Widen, E., McElhinney, D. B., Ling, X. 2019; 21 (7): e13719

    Abstract

    BACKGROUND: The rapid deterioration observed in the condition of some hospitalized patients can be attributed to either disease progression or imperfect triage and level of care assignment after their admission. An early warning system (EWS) to identify patients at high risk of subsequent intrahospital death can be an effective tool for ensuring patient safety and quality of care and reducing avoidable harm and costs. OBJECTIVE: The aim of this study was to prospectively validate a real-time EWS designed to predict patients at high risk of inpatient mortality during their hospital episodes. METHODS: Data were collected from the system-wide electronic medical record (EMR) of two acute Berkshire Health System hospitals, comprising 54,246 inpatient admissions from January 1, 2015, to September 30, 2017, of which 2.30% (1248/54,246) resulted in intrahospital deaths. Multiple machine learning methods (linear and nonlinear) were explored and compared. The tree-based random forest method was selected to develop the predictive application for the intrahospital mortality assessment. After constructing the model, we prospectively validated the algorithms as a real-time inpatient EWS for mortality. RESULTS: The EWS algorithm scored patients' daily and long-term risk of inpatient mortality probability after admission and stratified them into distinct risk groups. In the prospective validation, the EWS prospectively attained a c-statistic of 0.884, where 99 encounters were captured in the highest risk group, 69% (68/99) of whom died during the episodes. It accurately predicted the possibility of death for the top 13.3% (34/255) of the patients at least 40.8 hours before death. Important clinical utilization features, together with coded diagnoses, vital signs, and laboratory test results were recognized as impactful predictors in the final EWS. CONCLUSIONS: In this study, we prospectively demonstrated the capability of the newly-designed EWS to monitor and alert clinicians about patients at high risk of in-hospital death in real time, thereby providing opportunities for timely interventions. This real-time EWS is able to assist clinical decision making and enable more actionable and effective individualized care for patients' better health outcomes in target medical facilities.

    View details for DOI 10.2196/13719

    View details for PubMedID 31278734

  • Prediction of the 1-Year Risk of Incident Lung Cancer: Prospective Study Using Electronic Health Records from the State of Maine JOURNAL OF MEDICAL INTERNET RESEARCH Wang, X., Zhang, Y., Hao, S., Zheng, L., Liao, J., Ye, C., Xia, M., Wang, O., Liu, M., Weng, C., Duong, S. Q., Jin, B., Alfreds, S. T., Stearns, F., Kanov, L., Sylvester, K. G., Widen, E., McElhinney, D. B., Ling, X. B. 2019; 21 (5)

    View details for DOI 10.2196/13260

    View details for Web of Science ID 000468102900001

  • A proteomic clock for malignant gliomas: The role of the environment in tumorigenesis at the presymptomatic stage. PloS one Zheng, L. n., Zhang, Y. n., Hao, S. n., Chen, L. n., Sun, Z. n., Yan, C. n., Whitin, J. C., Jang, T. n., Merchant, M. n., McElhinney, D. B., Sylvester, K. G., Cohen, H. J., Recht, L. n., Yao, X. n., Ling, X. B. 2019; 14 (10): e0223558

    Abstract

    Malignant gliomas remain incurable, with a poor prognosis despite aggressive treatment. We have been studying the development of brain tumors in a glioma rat model, where rats develop brain tumors after prenatal exposure to ethylnitrosourea (ENU), and there is a sizable interval between when the first pathological changes are noted and when tumors become detectable with MRI. Our aim was to define a molecular timeline through proteomic profiling of the cerebrospinal fluid (CSF) such that brain tumor commitment can be revealed earlier than at the presymptomatic stage. A comparative proteomic approach was applied to profile CSF collected serially before, at, and after the time MRI becomes positive. Elastic net (EN) based models were developed to infer the timeline of normal or tumor development, respectively, mirroring a chronology of precisely timed, "clocked" adaptations. These CSF changes were later quantified by longitudinal entropy analyses of the EN predictive metric. False discovery rates (FDR) were computed to control the expected proportion of the EN models that are due to multiple hypothesis testing. Our ENU rat brain tumor dating EN model indicated that protein content in CSF is programmed even before tumor MRI detection. The finding of precisely timed CSF tumor microenvironment changes at presymptomatic stages, deviating from the normal development timeline, may provide the groundwork for understanding the adaptation of the brain environment in tumorigenesis and for devising effective brain tumor management strategies.

    View details for DOI 10.1371/journal.pone.0223558

    View details for PubMedID 31600288

  • Prediction of the 1-Year Risk of Incident Lung Cancer: Prospective Study Using Electronic Health Records from the State of Maine. Journal of medical Internet research Wang, X. n., Zhang, Y. n., Hao, S. n., Zheng, L. n., Liao, J. n., Ye, C. n., Xia, M. n., Wang, O. n., Liu, M. n., Weng, C. H., Duong, S. Q., Jin, B. n., Alfreds, S. T., Stearns, F. n., Kanov, L. n., Sylvester, K. G., Widen, E. n., McElhinney, D. B., Ling, X. B. 2019; 21 (5): e13260

    Abstract

    Lung cancer is the leading cause of cancer death worldwide. Early detection of individuals at risk of lung cancer is critical to reduce the mortality rate. The aim of this study was to develop and validate a prospective risk prediction model to identify patients at risk of new incident lung cancer within the next 1 year in the general population. Data from individual patient electronic health records (EHRs) were extracted from the Maine Health Information Exchange network. The study population consisted of patients with at least one EHR between April 1, 2016, and March 31, 2018, who had no history of lung cancer. A retrospective cohort (N=873,598) and a prospective cohort (N=836,659) were formed for model construction and validation. An Extreme Gradient Boosting (XGBoost) algorithm was adopted to build the model. It assigned a score to each individual to quantify the probability of a new incident lung cancer diagnosis from October 1, 2016, to September 30, 2017. The model was trained with the clinical profile in the retrospective cohort from the preceding 6 months and validated with the prospective cohort to predict the risk of incident lung cancer from April 1, 2017, to March 31, 2018. The model had an area under the curve (AUC) of 0.881 (95% CI 0.873-0.889) in the prospective cohort. Two thresholds of 0.0045 and 0.01 were applied to the predictive scores to stratify the population into low-, medium-, and high-risk categories. The incidence of lung cancer in the high-risk category (579/53,922, 1.07%) was 7.7 times higher than that in the overall cohort (1167/836,659, 0.14%). Age, a history of pulmonary diseases and other chronic diseases, medications for mental disorders, and social disparities were found to be associated with new incident lung cancer. We retrospectively developed and prospectively validated an accurate risk prediction model of new incident lung cancer occurring in the next 1 year. Through statistical learning from the statewide EHR data in the preceding 6 months, our model was able to identify statewide high-risk patients, which will benefit the population health through establishment of preventive interventions or more intensive surveillance.

    View details for PubMedID 31099339
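
    The stratification arithmetic reported in the lung cancer abstract above can be checked with a short Python sketch. The counts and the two score thresholds are taken from the abstract. The function names, and the choice to place a score that equals a threshold in the higher category, are assumptions made for illustration and are not described in the source.

```python
# Counts reported in the abstract (prospective cohort).
HIGH_RISK_CASES, HIGH_RISK_TOTAL = 579, 53_922
ALL_CASES, ALL_TOTAL = 1_167, 836_659

# Score thresholds reported in the abstract.
LOW_MEDIUM_CUTOFF = 0.0045
MEDIUM_HIGH_CUTOFF = 0.01


def incidence(cases: int, total: int) -> float:
    """Return incidence as a fraction of the cohort."""
    return cases / total


def risk_category(score: float) -> str:
    """Map a predictive score to a risk category.

    Assumption: a score equal to a cutoff falls into the higher
    category; the abstract does not say how ties are handled.
    """
    if score >= MEDIUM_HIGH_CUTOFF:
        return "high"
    if score >= LOW_MEDIUM_CUTOFF:
        return "medium"
    return "low"


high = incidence(HIGH_RISK_CASES, HIGH_RISK_TOTAL)
overall = incidence(ALL_CASES, ALL_TOTAL)
ratio = high / overall

print(f"high-risk incidence: {high:.2%}")
print(f"overall incidence:   {overall:.2%}")
print(f"incidence ratio:     {ratio:.1f}")
```

    Run as written, the sketch reproduces the reported 1.07%, 0.14%, and 7.7-fold figures, which suggests the ratio was computed from the unrounded incidences.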

  • Validation of a novel automated signal analysis tool for ablation of Wolff-Parkinson-White Syndrome. PloS one Ceresnak, S. R., Pass, R. H., Dubin, A. M., Yang, L., Motonaga, K. S., Hedlin, H., Avasarala, K., Trela, A., McElhinney, D. B., Janson, C., Nappo, L., Ling, X. B., Gates, G. J. 2019; 14 (6): e0217282

    Abstract

    BACKGROUND: In previous pilot work we demonstrated that a novel automated signal analysis tool could accurately identify successful ablation sites during Wolff-Parkinson-White (WPW) ablation at a single center. OBJECTIVE: We sought to validate and refine this signal analysis tool in a larger multi-center cohort of children with WPW. METHODS: A retrospective review was performed of signal data from children with WPW who underwent ablation at two pediatric arrhythmia centers from 2008-2015. All patients with WPW ≤ 21 years who underwent invasive electrophysiology study and ablation with ablation signals available for review were included. Signals were excluded if temperature or power delivery was inadequate or lesion time was < 5 seconds. Ablation lesions were reviewed for each patient. Signals were classified as successful if there was loss of antegrade and retrograde accessory pathway (AP) conduction or unsuccessful if ablation did not eliminate AP conduction. Custom signal analysis software analyzed intracardiac electrograms for amplitudes, high and low frequency components, integrated area, and signal timing components to create a signal score. We validated the previously published signal score threshold of 3.1 in this larger, more diverse cohort and explored additional scoring options. Logistic regression with lasso regularization, using Youden's index criterion and a cost-benefit criterion to identify thresholds, was considered as a refinement to this score. RESULTS: 347 signals (141 successful, 206 unsuccessful) in 144 patients were analyzed [mean age 13.2 ± 3.9 years, 96 (67%) male, 66 (45%) left sided APs]. The software correctly identified the signals as successful or unsuccessful in 276/347 (80%) at a threshold of 3.1. The performance of other thresholds did not significantly improve the predictive ability. A signal score threshold of 3.1 provided the following diagnostic accuracy for distinguishing a successful from unsuccessful signal: sensitivity 83%, specificity 77%, PPV 71%, NPV 87%. CONCLUSIONS: An automated signal analysis software tool reliably distinguished successful versus unsuccessful ablation electrograms in children with WPW when validated in a large, diverse cohort. Refining the tool using an alternative threshold and statistical method did not improve the original signal score at a threshold of 3.1. This software was effective across two centers and multiple operators and may be an effective tool for ablation of WPW.

    View details for DOI 10.1371/journal.pone.0217282

    View details for PubMedID 31242221
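
    The diagnostic accuracy figures in the WPW abstract above follow from a 2x2 confusion matrix, sketched below in Python. The abstract reports only the class totals (141 successful, 206 unsuccessful) and the 276/347 correct classifications. The four cell counts used here are not reported in the source: they are back-calculated assumptions, chosen because they are consistent with the totals and with every rounded percentage in the abstract.

```python
# Reported in the abstract: 141 successful and 206 unsuccessful signals,
# with 276 of 347 classified correctly at a signal score threshold of 3.1.
# Assumption: the cell counts below are back-calculated, not reported.
TP, FN = 117, 24  # successful signals scored as successful / unsuccessful
TN, FP = 159, 47  # unsuccessful signals scored as unsuccessful / successful


def percent(numerator: int, denominator: int) -> int:
    """Return a proportion as a whole-number percentage."""
    return round(100 * numerator / denominator)


sensitivity = percent(TP, TP + FN)  # share of successful signals detected
specificity = percent(TN, TN + FP)  # share of unsuccessful signals rejected
ppv = percent(TP, TP + FP)          # predicted successes that were successful
npv = percent(TN, TN + FN)          # predicted failures that were unsuccessful
accuracy = percent(TP + TN, TP + FN + TN + FP)

print(f"sensitivity {sensitivity}%, specificity {specificity}%")
print(f"PPV {ppv}%, NPV {npv}%, accuracy {accuracy}%")
```

    Under these assumed counts the sketch yields the same sensitivity, specificity, PPV, NPV, and overall accuracy as the abstract. It is an internal consistency check only and does not recover the study's actual data.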

  • Improved detection of prostate cancer using a magneto-nanosensor assay for serum circulating autoantibodies. PloS one Xu, L., Lee, J., Hao, S., Ling, X. B., Brooks, J. D., Wang, S. X., Gambhir, S. S. 2019; 14 (8): e0221051

    Abstract

    PURPOSE: To develop a magneto-nanosensor (MNS) based multiplex assay to measure protein and autoantibody biomarkers from human serum for prostate cancer (CaP) diagnosis. MATERIALS AND METHODS: A 4-panel MNS autoantibody assay and an MNS protein assay were developed and optimized in our labs. Using these assays, serum concentrations of six biomarkers were analyzed: prostate-specific antigen (PSA) protein, free/total PSA ratio, and four autoantibodies against Parkinson disease 7 (PARK7), TAR DNA-binding protein 43 (TARDBP), Talin 1 (TLN1), and Caldesmon 1 (CALD1). Human serum samples from 99 patients (50 with non-cancer and 49 with clinically localized CaP) were evaluated. RESULTS: The MNS assay showed excellent performance characteristics and no cross-reactivity. All autoantibody assays showed a statistically significant difference between CaP and non-cancer samples except for PARK7. The most significant difference was obtained with the combination of the four autoantibodies as a panel in addition to the free/total PSA ratio. This combination had the highest area under the curve (AUC), 0.916, in ROC analysis. CONCLUSIONS: Our results suggest that this autoantibody panel, along with PSA and free PSA, has the potential to segregate patients without cancer from those with prostate cancer with higher sensitivity and specificity than PSA alone.

    View details for DOI 10.1371/journal.pone.0221051

    View details for PubMedID 31404106

  • Prediction of Incident Hypertension Within the Next Year: Prospective Study Using Statewide Electronic Health Records and Machine Learning. Journal of medical Internet research Ye, C. n., Fu, T. n., Hao, S. n., Zhang, Y. n., Wang, O. n., Jin, B. n., Xia, M. n., Liu, M. n., Zhou, X. n., Wu, Q. n., Guo, Y. n., Zhu, C. n., Li, Y. M., Culver, D. S., Alfreds, S. T., Stearns, F. n., Sylvester, K. G., Widen, E. n., McElhinney, D. n., Ling, X. n. 2018; 20 (1): e22

    Abstract

    As a high-prevalence health condition, hypertension is clinically costly, difficult to manage, and often leads to severe and life-threatening diseases such as cardiovascular disease (CVD) and stroke. The aim of this study was to develop and validate prospectively a risk prediction model of incident essential hypertension within the following year. Data from individual patient electronic health records (EHRs) were extracted from the Maine Health Information Exchange network. Retrospective (N=823,627, calendar year 2013) and prospective (N=680,810, calendar year 2014) cohorts were formed. A machine learning algorithm, XGBoost, was adopted in the process of feature selection and model building. It generated an ensemble of classification trees and assigned a final predictive risk score to each individual. The 1-year incident hypertension risk model attained areas under the curve (AUCs) of 0.917 and 0.870 in the retrospective and prospective cohorts, respectively. Risk scores were calculated and stratified into five risk categories, with 4526 out of 381,544 patients (1.19%) in the lowest risk category (score 0-0.05) and 21,050 out of 41,329 patients (50.93%) in the highest risk category (score 0.4-1) receiving a diagnosis of incident hypertension in the following 1 year. Type 2 diabetes, lipid disorders, CVDs, mental illness, clinical utilization indicators, and socioeconomic determinants were recognized as driving or associated features of incident essential hypertension. The very high risk population mainly comprised elderly (age>50 years) individuals with multiple chronic conditions, especially those receiving medications for mental disorders. Disparities were also found in social determinants, including some community-level factors associated with higher risk and others that were protective against hypertension. With statewide EHR datasets, our study prospectively validated an accurate 1-year risk prediction model for incident essential hypertension. Our real-time predictive analytic model has been deployed in the state of Maine, providing implications in interventions for hypertension and related diseases and hopefully enhancing hypertension care.

    View details for PubMedID 29382633

  • Shorten Bipolarity Checklist for the Differentiation of Subtypes of Bipolar Disorder Using Machine Learning Feng, C., Gao, H., Ling, X. B., Ji, J., Ma, Y. ASSOC COMPUTING MACHINERY. 2018: 162–66
  • Gene expression network analysis in aneuploid human trophoblast progenitor cells (TBPC) reveals modular structures Leon-Martinez, D., Ling, X., Hao, S., Sylvester, K., Bianco, K. MOSBY-ELSEVIER. 2018: S163–S164
  • Assessing Statewide All-Cause Future One-Year Mortality: Prospective Study With Implications for Quality of Life, Resource Utilization, and Medical Futility. Journal of medical Internet research Guo, Y. n., Zheng, G. n., Fu, T. n., Hao, S. n., Ye, C. n., Zheng, L. n., Liu, M. n., Xia, M. n., Jin, B. n., Zhu, C. n., Wang, O. n., Wu, Q. n., Culver, D. S., Alfreds, S. T., Stearns, F. n., Kanov, L. n., Bhatia, A. n., Sylvester, K. G., Widen, E. n., McElhinney, D. B., Ling, X. B. 2018; 20 (6): e10311

    Abstract

    For many elderly patients, a disproportionate amount of health care resources and expenditures is spent during the last year of life, despite the discomfort and reduced quality of life associated with many aggressive medical approaches. However, few prognostic tools have focused on predicting all-cause 1-year mortality among elderly patients at a statewide level, an issue that has implications for improving quality of life while distributing scarce resources fairly. Using data from a statewide elderly population (aged ≥65 years), we sought to prospectively validate an algorithm to identify patients at risk for dying in the next year for the purpose of minimizing decision uncertainty, improving quality of life, and reducing futile treatment. Analysis was performed using electronic medical records from the Health Information Exchange in the state of Maine, which covered records of nearly 95% of the statewide population. The model was developed from 125,896 patients aged at least 65 years who were discharged from any care facility in the Health Information Exchange network from September 5, 2013, to September 4, 2015. Validation was conducted using 153,199 patients with the same inclusion and exclusion criteria from September 5, 2014, to September 4, 2016. Patients were stratified into risk groups. The association between all-cause 1-year mortality and risk factors was screened by chi-squared test and manually reviewed by 2 clinicians. We calculated risk scores for individual patients using a gradient tree-based boost algorithm, which measured the probability of mortality within the next year based on the preceding 1-year clinical profile. The development sample included 125,896 patients (72,572 women, 57.64%; mean 74.2 [SD 7.7] years). The final validation cohort included 153,199 patients (88,177 women, 57.56%; mean 74.3 [SD 7.8] years). The c-statistic for discrimination was 0.96 (95% CI 0.93-0.98) in the development group and 0.91 (95% CI 0.90-0.94) in the validation cohort. The mortality was 0.99% in the low-risk group, 16.75% in the intermediate-risk group, and 72.12% in the high-risk group. A total of 99 independent risk factors for mortality were identified (reported as odds ratios; 95% CI). Age was at the top of the list (1.41; 1.06-1.48); congestive heart failure (20.90; 15.41-28.08) and different tumor sites were also recognized as driving risk factors, such as cancer of the ovaries (14.42; 2.24-53.04), colon (14.07; 10.08-19.08), and stomach (13.64; 3.26-86.57). Disparities were also found in patients' social determinants like respiratory hazard index (1.24; 0.92-1.40) and unemployment rate (1.18; 0.98-1.24). Among high-risk patients who expired in our dataset, cerebrovascular accident, amputation, and type 1 diabetes were the top 3 diseases in terms of average cost in the last year of life. Our study prospectively validated an accurate 1-year risk prediction model and stratification for the elderly population (≥65 years) at risk of mortality with statewide electronic medical record datasets. It should be a valuable adjunct for helping patients to make better quality-of-life choices and alerting caregivers to target high-risk elderly for appropriate care and discussions, thus cutting back on futile treatment.

    View details for PubMedID 29866643

  • Estimating One-Year Risk of Incident Chronic Kidney Disease: Retrospective Development and Validation Study Using Electronic Medical Record Data From the State of Maine. JMIR medical informatics Hao, S. n., Fu, T. n., Wu, Q. n., Jin, B. n., Zhu, C. n., Hu, Z. n., Guo, Y. n., Zhang, Y. n., Yu, Y. n., Fouts, T. n., Ng, P. n., Culver, D. S., Alfreds, S. T., Stearns, F. n., Sylvester, K. G., Widen, E. n., McElhinney, D. B., Ling, X. B. 2017; 5 (3): e21

    Abstract

    Chronic kidney disease (CKD) is a major public health concern in the United States with high prevalence, growing incidence, and serious adverse outcomes. We aimed to develop and validate a model to identify patients at risk of receiving a new diagnosis of CKD (incident CKD) during the next 1 year in a general population. The study population consisted of patients who had visited any care facility in the Maine Health Information Exchange network any time between January 1, 2013, and December 31, 2015, and had no history of CKD diagnosis. Two retrospective cohorts of electronic medical records (EMRs) were constructed for model derivation (N=1,310,363) and validation (N=1,430,772). The model was derived using a gradient tree-based boost algorithm to assign a score to each individual that measured the probability of receiving a new diagnosis of CKD from January 1, 2014, to December 31, 2014, based on the preceding 1-year clinical profile. A feature selection process was conducted to reduce the dimension of the data from 14,680 EMR features to 146 as predictors in the final model. Relative risk was calculated by the model to gauge the risk ratio of the individual to the population mean of receiving a CKD diagnosis in the next 1 year. The model was tested on the validation cohort to predict risk of CKD diagnosis in the period from January 1, 2015, to December 31, 2015, using the preceding 1-year clinical profile. The final model had a c-statistic of 0.871 in the validation cohort. It stratified patients into low-risk (score 0-0.005), intermediate-risk (score 0.005-0.05), and high-risk (score ≥ 0.05) levels. The incidence of CKD in the high-risk patient group was 7.94%, 13.7 times higher than the incidence in the overall cohort (0.58%). Survival analysis showed that patients in the 3 risk categories had significantly different CKD outcomes as a function of time (P<.001), indicating an effective classification of patients by the model. We developed and validated a model that is able to identify patients at high risk of having CKD in the next 1 year by statistically learning from the EMR-based clinical history in the preceding 1 year. Identification of these patients indicates care opportunities such as monitoring and adopting intervention plans that may benefit the quality of care and outcomes in the long term.

    View details for PubMedID 28747298

  • Disturbance Propagation in Power System Based on an Epidemic Model Wu, Q., Zhang, D., Liu, D., Liu, F., Ling, X., Li, Z. IEEE. 2017
  • Defining and characterizing the critical transition state prior to the type 2 diabetes disease. PloS one Jin, B. n., Liu, R. n., Hao, S. n., Li, Z. n., Zhu, C. n., Zhou, X. n., Chen, P. n., Fu, T. n., Hu, Z. n., Wu, Q. n., Liu, W. n., Liu, D. n., Yu, Y. n., Zhang, Y. n., McElhinney, D. B., Li, Y. M., Culver, D. S., Alfreds, S. T., Stearns, F. n., Sylvester, K. G., Widen, E. n., Ling, X. B. 2017; 12 (7): e0180937

    Abstract

    Type 2 diabetes mellitus (T2DM), with increased risk of serious long-term complications, currently affects 8.3% of the adult population. We hypothesized that a critical transition state prior to new-onset T2DM can be revealed through longitudinal electronic medical record (EMR) analysis. We applied the transition-based network entropy methodology, which previously identified a dynamic driver network (DDN) underlying the critical T2DM transition at the tissue molecular biological level. To profile pre-disease phenotypical changes that indicated a critical transition state, a cohort of 7,334 patients was assembled from the Maine State Health Information Exchange (HIE). These patients all had their first confirmative diagnosis of T2DM between January 1, 2013 and June 30, 2013. The cohort's EMRs from the 24 months preceding their date of first T2DM diagnosis were extracted. Analysis of these patients' pre-disease clinical history identified a dynamic driver network (DDN) and an associated critical transition state six months prior to their first confirmative T2DM state. This 6-month window before the disease state provides an early warning of the impending T2DM, warranting an opportunity to apply proactive interventions to prevent or delay the new onset of T2DM.

    View details for PubMedID 28686739

  • Unique Molecular Patterns Uncovered in Kawasaki Disease Patients with Elevated Serum Gamma Glutamyl Transferase Levels: Implications for Intravenous Immunoglobulin Responsiveness PLOS ONE Wang, Y., Li, Z., Hu, G., Hao, S., Deng, X., Huang, M., Ren, M., Jiang, X., Kanegaye, J. T., Ha, K., Lee, J., Li, X., Jiang, X., Yu, Y., Tremoulet, A. H., Burns, J. C., Whitin, J. C., Shin, A. Y., Sylvester, K. G., McElhinney, D. B., Cohen, H. J., Ling, X. B. 2016; 11 (12)

    Abstract

    Resistance to intravenous immunoglobulin (IVIG) occurs in 10-20% of patients with Kawasaki disease (KD). The risk of resistance is about two-fold higher in patients with elevated gamma glutamyl transferase (GGT) levels. We sought to understand the biological mechanisms underlying IVIG resistance in patients with elevated GGT levels. We explored the association between elevated GGT levels and IVIG-resistance with a cohort of 686 KD patients (Cohort I). Gene expression data from 130 children with acute KD (Cohort II) were analyzed using the R square statistic and false discovery analysis to identify genes that were differentially represented in patients with elevated GGT levels with regard to IVIG responsiveness. Two additional KD cohorts (Cohort III and IV) were used to test the hypothesis that sialylation and GGT may be involved in IVIG resistance through neutrophil apoptosis. Thirty-six genes were identified that significantly explained the variations of both GGT levels and IVIG responsiveness in KD patients. After Bonferroni correction, significant associations with IVIG resistance persisted for 12 out of 36 genes among patients with elevated GGT levels and none among patients with normal GGT levels. With the discovery of ST6GALNAC3, a sialyltransferase, as the most differentially expressed gene, we hypothesized that sialylation and GGT are involved in IVIG resistance through neutrophil apoptosis. We then confirmed that in Cohort III and IV there was significantly less reduction in neutrophil count in IVIG non-responders. Gene expression analyses combining molecular and clinical datasets support the hypotheses that: (1) neutrophil apoptosis induced by IVIG may be a mechanism of action of IVIG in KD; (2) changes in sialylation and GGT level in KD patients may contribute synergistically to IVIG resistance through blocking IVIG-induced neutrophil apoptosis. These findings have implications for understanding the mechanism of action in IVIG resistance, and possibly for development of novel therapeutics.

    View details for DOI 10.1371/journal.pone.0167434

    View details for Web of Science ID 000392853100008

    View details for PubMedID 28002448

    View details for PubMedCentralID PMC5176264

  • Web-based Real-Time Case Finding for the Population Health Management of Patients With Diabetes Mellitus: A Prospective Validation of the Natural Language Processing-Based Algorithm With Statewide Electronic Medical Records. JMIR medical informatics Zheng, L., Wang, Y., Hao, S., Shin, A. Y., Jin, B., Ngo, A. D., Jackson-Browne, M. S., Feller, D. J., Fu, T., Zhang, K., Zhou, X., Zhu, C., Dai, D., Yu, Y., Zheng, G., Li, Y., McElhinney, D. B., Culver, D. S., Alfreds, S. T., Stearns, F., Sylvester, K. G., Widen, E., Ling, X. B. 2016; 4 (4)

    Abstract

    Diabetes case finding based on structured medical records does not fully identify diabetic patients whose medical histories related to diabetes are available in the form of free text. Manual chart reviews have been used but involve high labor costs and long latency. This study developed and tested a Web-based diabetes case finding algorithm using both structured and unstructured electronic medical records (EMRs). This study was based on the health information exchange (HIE) EMR database that covers almost all health facilities in the state of Maine, United States. Using narrative clinical notes, a Web-based natural language processing (NLP) case finding algorithm was retrospectively (July 1, 2012, to June 30, 2013) developed with a random subset of HIE-associated facilities, which was then blind tested with the remaining facilities. The NLP-based algorithm was subsequently integrated into the HIE database and validated prospectively (July 1, 2013, to June 30, 2014). Of the 935,891 patients in the prospective cohort, 64,168 diabetes cases were identified using diagnosis codes alone. Our NLP-based case finding algorithm prospectively found an additional 5756 uncodified cases (5756/64,168, 8.97% increase) with a positive predictive value of .90. Of the 21,720 diabetic patients identified by both methods, 6616 patients (6616/21,720, 30.46%) were identified by the NLP-based algorithm before a diabetes diagnosis was noted in the structured EMR (mean time difference = 48 days). The online NLP algorithm was effective in identifying uncodified diabetes cases in real time, leading to a significant improvement in diabetes case finding. The successful integration of the NLP-based case finding algorithm into the Maine HIE database indicates a strong potential for application of this novel method to achieve a more complete ascertainment of diagnoses of diabetes mellitus.

    View details for PubMedID 27836816

  • A Classification Tool for Differentiation of Kawasaki Disease from Other Febrile Illnesses. Journal of pediatrics Hao, S., Jin, B., Tan, Z., Li, Z., Ji, J., Hu, G., Wang, Y., Deng, X., Kanegaye, J. T., Tremoulet, A. H., Burns, J. C., Cohen, H. J., Ling, X. B. 2016; 176: 114-120 e8

    Abstract

    To develop and validate a novel decision tree-based clinical algorithm to differentiate Kawasaki disease (KD) from other pediatric febrile illnesses that share common clinical characteristics. Using clinical and laboratory data from 801 subjects with acute KD (533 for development, and 268 for validation) and 479 febrile control subjects (318 for development, and 161 for validation), we developed a stepwise KD diagnostic algorithm combining our previously developed linear discriminant analysis (LDA)-based model with a newly developed tree-based algorithm. The primary model (LDA) stratified the 1280 subjects into febrile controls (n = 276), indeterminate (n = 247), and KD (n = 757) subgroups. The subsequent model (decision trees) further classified the indeterminate group into febrile controls (n = 103) and KD (n = 58) subgroups, leaving only 29 of 801 KD (3.6%) and 57 of 479 febrile control (11.9%) subjects indeterminate. The 2-step algorithm had a sensitivity of 96.0% and a specificity of 78.5%, and correctly classified all subjects with KD who later developed coronary artery aneurysms. The addition of a decision tree step increased sensitivity and specificity in the classification of subjects with KD and febrile controls over our previously described LDA model. A multicenter trial is needed to prospectively determine its utility as a point of care diagnostic test for KD.

    View details for DOI 10.1016/j.jpeds.2016.05.060

    View details for PubMedID 27344221

    View details for PubMedCentralID PMC5003696

  • Prehypertension During Normotensive Pregnancy and Postpartum Clustering of Cardiometabolic Risk Factors: A Prospective Cohort Study. Hypertension Lei, Q., Zhou, X., Zhou, Y., Mai, C., Hou, M., Lv, L., Duan, D., Wen, J., Lin, X., Wang, P. P., Ling, X. B., Li, Y., Niu, J. 2016; 68 (2): 455-463

    Abstract

    The nonstratification of blood pressure (BP) levels may underestimate future cardiovascular risk in pregnant women who present with BP levels in the range of prehypertension (120-139/80-89 mm Hg). We prospectively evaluated the relationship between multiple antepartum BP measurements (from 11(+0) to 13(+6) weeks' gestation to term) and the occurrence of postpartum metabolic syndrome in 507 normotensive pregnant women after a live birth. By using latent class growth modeling, we identified the following 3 distinctive diastolic BP (DBP) trajectory groups: the low-J-shaped group (34.2%; DBP from 62.5±5.8 to 65.0±6.8 mm Hg), the moderate-U-shaped group (52.6%; DBP from 71.0±5.9 to 69.8±6.2 mm Hg), and the elevated-J-shaped group (13.2%; DBP from 76.2±6.7 to 81.8±4.8 mm Hg). Notably, the elevated-J-shaped trajectory group had mean DBP and systolic BP levels within the range of prehypertension from 37(+0) and 26(+0) weeks of pregnancy, respectively. Among the 309 women who completed the ≈1.6 years of postpartum follow-up, the women in the elevated-J-shaped group had greater odds of developing postpartum metabolic syndrome (adjusted odds ratio, 6.55; 95% confidence interval, 1.79-23.92; P=0.004) than the low-J-shaped group. Moreover, a parsimonious model incorporating DBP (membership in the elevated-J-shaped group but not in the DBP prehypertension group as identified by a single measurement) and elevated levels of fasting glucose (>4.99 mmol/L) and triglycerides (>3.14 mmol/L) at term was developed, with good discrimination and calibration for postpartum metabolic syndrome (c-statistic, 0.764; 95% confidence interval, 0.674-0.855; P<0.001). Therefore, prehypertension identified by DBP trajectories throughout pregnancy is an independent risk factor for predicting postpartum metabolic syndrome in normotensive pregnant women.

    View details for DOI 10.1161/HYPERTENSIONAHA.116.07261

    View details for PubMedID 27354425

  • A Novel Truncated Form of Serum Amyloid A in Kawasaki Disease PLOS ONE Whitin, J. C., Yu, T. T., Ling, X. B., Kanegaye, J. T., Burns, J. C., Cohen, H. J. 2016; 11 (6)

    Abstract

    Kawasaki disease (KD) is an acute vasculitis in children that can cause coronary artery abnormalities. Its diagnosis is challenging, and many cytokines, chemokines, acute phase reactants, and growth factors have failed evaluation as specific biomarkers to distinguish KD from other febrile illnesses. We performed protein profiling, comparing plasma from children with KD with febrile control (FC) subjects to determine if there were specific proteins or peptides that could distinguish the two clinical states. Plasma from three independent cohorts from the blood of 68 KD and 61 FC subjects was fractionated by anion exchange chromatography, followed by surface-enhanced laser desorption ionization (SELDI) mass spectrometry of the fractions. The mass spectra of KD and FC plasma samples were analyzed for peaks that were statistically significantly different. A mass spectrometry peak with a mass of 7,860 Da had high intensity in acute KD subjects compared to subacute KD (p = 0.0003) and FC (p = 7.9 × 10^-10) subjects. We identified this peak as a novel truncated form of serum amyloid A with N-terminal at Lys-34 of the circulating form and validated its identity using a hybrid mass spectrum immunoassay technique. The truncated form of serum amyloid A was present in plasma of KD subjects when blood was collected in tubes containing protease inhibitors. This peak disappeared when the patients were examined after their symptoms resolved. Intensities of this peptide did not correlate with KD-associated laboratory values or with other mass spectrum peaks from the plasma of these KD subjects. Using SELDI mass spectrometry, we have discovered a novel truncated form of serum amyloid A that is elevated in the plasma of KD when compared with FC subjects. Future studies will evaluate its relevance as a diagnostic biomarker and its potential role in the pathophysiology of KD.

    View details for DOI 10.1371/journal.pone.0157024

    View details for Web of Science ID 000377560200029

    View details for PubMedID 27271757

    View details for PubMedCentralID PMC4894573

  • Exploring the Role of Polycythemia in Patients With Cyanosis After Palliative Congenital Heart Surgery. Pediatric critical care medicine Siehr, S. L., Shi, S., Hao, S., Hu, Z., Jin, B., Hanley, F., Reddy, V. M., McElhinney, D. B., Ling, X. B., Shin, A. Y. 2016; 17 (3): 216-222

    Abstract

    Objective: To understand the relationship between polycythemia and clinical outcome in patients with hypoplastic left heart syndrome following the Norwood operation. Design: A retrospective, single-center cohort study. Setting: Pediatric cardiovascular ICU, university-affiliated children's hospital. Patients: Infants with hypoplastic left heart syndrome admitted to our medical center from September 2009 to December 2012 undergoing stage 1/Norwood operation. Interventions: None. Measurements and Main Results: Baseline demographic and clinical information including first recorded postoperative hematocrit and subsequent mean, median, and nadir hematocrits during the first 72 hours postoperatively were recorded. The primary outcomes were in-hospital mortality and length of hospitalization. Thirty-two patients were included in the analysis. Patients did not differ by operative factors (cardiopulmonary bypass time and cross-clamp time) or traditional markers of severity of illness (vasoactive inotrope score, lactate, saturation, and PaO2/FIO2 ratio). Early polycythemia (hematocrit value > 49%) was associated with longer cardiovascular ICU stay (51.0 [± 38.6] vs 21.4 [± 16.2] d; p < 0.01) and total hospital length of stay (65.0 [± 46.5] vs 36.1 [± 20.0] d; p = 0.03). In a multivariable analysis, polycythemia remained independently associated with the length of hospitalization after controlling for the amount of RBC transfusion (weight, 4.36 [95% CI, 1.35-7.37]; p < 0.01). No difference in in-hospital mortality rates was detected between the two groups (17.6% vs 20%). Conclusions: Early polycythemia following the Norwood operation is associated with longer length of hospitalization even after controlling for blood cell transfusion practices. We hypothesize that polycythemia may be caused by hemoconcentration and used as an early marker of capillary leak syndrome.

    View details for DOI 10.1097/PCC.0000000000000654

    View details for PubMedID 26825044

  • A Multi-Omics Analysis of Human Nucleus-Coded Mitochondrial Genes with Mouse Extraembryonic Tissue/Placenta Phenotypes: Implications in Mitochondria-Mediated Maternal and Fetal Complications. Hu, G., Chen, R., Deng, X., Li, Z., Mo, L., Hao, S., Shaw, G. M., Stevenson, D. K., Cohen, H. J., Jiang, X., Sylvester, K. G., Ling, X. B. SAGE PUBLICATIONS INC. 2016: 320A
  • Urinary Colorimetric Sensor Array and Algorithm to Distinguish Kawasaki Disease from Other Febrile Illnesses PLOS ONE Li, Z., Tan, Z., Hao, S., Jin, B., Deng, X., Hu, G., Liu, X., Zhang, J., Jin, H., Huang, M., Kanegaye, J. T., Tremoulet, A. H., Burns, J. C., Wu, J., Cohen, H. J., Ling, X. B. 2016; 11 (2)

    Abstract

    Kawasaki disease (KD) is an acute pediatric vasculitis of infants and young children with unknown etiology and no specific laboratory-based test to identify it. A specific molecular diagnostic test is urgently needed to support the clinical decision of proper medical intervention, preventing subsequent complications of coronary artery aneurysms. We used a simple and low-cost colorimetric sensor array to address the lack of a specific diagnostic test to differentiate KD from febrile control (FC) patients with similar rash/fever illnesses. Demographic and clinical data were prospectively collected for subjects with KD and FCs under standard protocol. After screening using a genetic algorithm, eleven compounds, including metalloporphyrins, pH indicators, redox indicators and solvatochromic dye categories, were selected from our chromatic compound library (n = 190) to construct a colorimetric sensor array for diagnosing KD. Quantitative color difference analysis led to a decision-tree-based KD diagnostic algorithm. This KD sensing array allowed the identification of 94% of KD subjects (receiver operating characteristic [ROC] area under the curve [AUC] 0.981) in the training set (33 KD, 33 FC) and 94% of KD subjects (ROC AUC: 0.873) in the testing set (16 KD, 17 FC). Color difference maps reconstructed from the digital images of the sensing compounds demonstrated distinctive patterns differentiating KD from FC patients. The colorimetric sensor array, composed of commonly used chemical compounds, is an easily accessible, low-cost method to discriminate subjects with KD from those with other febrile illnesses.

    View details for DOI 10.1371/journal.pone.0146733

    View details for Web of Science ID 000370038400003

    View details for PubMedID 26859297

    View details for PubMedCentralID PMC4747548

  • Precision test for precision medicine: opportunities, challenges and perspectives regarding pre-eclampsia as an intervention window for future cardiovascular disease. American journal of translational research Zhou, X. n., Niu, J. M., Ji, W. J., Zhang, Z. n., Wang, P. P., Ling, X. B., Li, Y. M. 2016; 8 (5): 1920–34

    Abstract

    Hypertensive disorders of pregnancy (HDP) comprise a spectrum of syndromes that range in severity from gestational hypertension and pre-eclampsia (PE) to eclampsia, as well as chronic hypertension and chronic hypertension with superimposed PE. HDP occur in 2% to 10% of pregnant women worldwide, and impose a substantial burden on maternal and fetal/infant health. Cardiovascular disease (CVD) is the leading cause of death in women. The high prevalence of non-obstructive coronary artery disease and the lack of an efficient diagnostic workup make the identification of CVD in women challenging. Accumulating evidence suggests that a previous history of PE is consistently associated with future CVD risk. Moreover, PE as a maladaptation to pregnancy-induced hemodynamic and metabolic stress may also be regarded as a "precision" testing result that predicts future cardiovascular risk. Therefore, the development of PE provides a tremendous, early opportunity that may lead to changes in maternal and infant future well-being. However, the underlying pathogenesis of PE is not precise, which warrants precision medicine-based approaches to establish a more precise definition and reclassification. In this review, we proposed a stage-specific, PE-targeted algorithm, which may provide novel hypotheses that bridge the gap between Big Data-generating approaches and clinical translational research in terms of PE prediction and prevention, clinical treatment, and long-term CVD management.

    View details for PubMedID 27347303

  • Prospective stratification of patients at risk for emergency department revisit: resource utilization and population management strategy implications. BMC emergency medicine Jin, B., Zhao, Y., Hao, S., Shin, A. Y., Wang, Y., Zhu, C., Hu, Z., Fu, C., Ji, J., Wang, Y., Zhao, Y., Jiang, Y., Dai, D., Culver, D. S., Alfreds, S. T., Rogow, T., Stearns, F., Sylvester, K. G., Widen, E., Ling, X. B. 2016; 16 (1): 10-?

    Abstract

    Estimating patient risk of future emergency department (ED) revisits can guide the allocation of resources, e.g. local primary care and/or specialty, to better manage ED high utilization patient populations and thereby improve patient quality of life. We set out to develop and validate a method to estimate patient ED revisit risk in the subsequent 6 months from an ED discharge date. An ensemble decision-tree-based model with Electronic Medical Record (EMR) encounter data from HealthInfoNet (HIN), Maine's Health Information Exchange (HIE), was developed and validated, assessing patient risk for a subsequent 6 month return ED visit based on the ED encounter-associated demographic and EMR clinical history data. A retrospective cohort of 293,461 ED encounters that occurred between January 1, 2012 and December 31, 2012, was assembled with the associated patients' 1-year clinical histories before the ED discharge date, for model training and calibration purposes. To validate, a prospective cohort of 193,886 ED encounters that occurred between January 1, 2013 and June 30, 2013 was constructed. Statistical learning that was utilized to construct the prediction model identified 152 variables that included the following data domains: demographics groups (12), different encounter history (104), care facilities (12), primary and secondary diagnoses (10), primary and secondary procedures (2), chronic disease condition (1), laboratory test results (2), and outpatient prescription medications (9). The c-statistics for the retrospective and prospective cohorts were 0.742 and 0.730 respectively. Total medical expense and ED utilization by risk score 6 months after the discharge were analyzed. Cluster analysis identified discrete subpopulations of high-risk patients with distinctive resource utilization patterns, suggesting the need for diversified care management strategies. Integration of our method into the HIN secure statewide data system in real time prospectively validated its performance. It promises to provide increased opportunity for high ED utilization identification, and optimized resource and population management.

    View details for DOI 10.1186/s12873-016-0074-5

    View details for PubMedID 26842066

    View details for PubMedCentralID PMC4739399

  • NLP based congestive heart failure case finding: A prospective analysis on statewide electronic medical records. International journal of medical informatics Wang, Y., Luo, J., Hao, S., Xu, H., Shin, A. Y., Jin, B., Liu, R., Deng, X., Wang, L., Zheng, L., Zhao, Y., Zhu, C., Hu, Z., Fu, C., Hao, Y., Zhao, Y., Jiang, Y., Dai, D., Culver, D. S., Alfreds, S. T., Todd, R., Stearns, F., Sylvester, K. G., Widen, E., Ling, X. B. 2015; 84 (12): 1039-1047

    Abstract

    In order to proactively manage congestive heart failure (CHF) patients, an effective CHF case finding algorithm is required to process both structured and unstructured electronic medical records (EMR) to allow complementary and cost-efficient identification of CHF patients. We set out to identify CHF cases from both EMR codified and natural language processing (NLP) found cases. Using narrative clinical notes from all Maine Health Information Exchange (HIE) patients, the NLP case finding algorithm was retrospectively (July 1, 2012-June 30, 2013) developed with a random subset of HIE associated facilities, and blind-tested with the remaining facilities. The NLP based method was integrated into a live HIE population exploration system and validated prospectively (July 1, 2013-June 30, 2014). A total of 18,295 codified CHF patients were included in Maine HIE. Among the 253,803 subjects without CHF codings, our case finding algorithm prospectively identified 2411 uncodified CHF cases. The positive predictive value (PPV) is 0.914, and 70.1% of these 2411 cases were found to have CHF histories in the clinical notes. A CHF case finding algorithm was developed, tested and prospectively validated. The successful integration of the CHF case finding algorithm into the Maine HIE live system is expected to improve CHF care in Maine.

    View details for DOI 10.1016/j.ijmedinf.2015.06.007

    View details for PubMedID 26254876

  • Exploring Value in Congenital Heart Disease: An Evaluation of Inpatient Admissions. Congenital heart disease Shin, A. Y., Hu, Z., Jin, B., Lal, S., Rosenthal, D. N., Efron, B., Sharek, P. J., Sutherland, S. M., Cohen, H. J., McElhinney, D. B., Roth, S. J., Ling, X. B. 2015; 10 (6): E278-87

    Abstract

    Understanding value provides an important context for improvement. However, most health care models fail to measure value. Our objective was to categorize inpatient encounters within an academic congenital heart program based on clinical outcome and the cost to achieve the outcome (value). We aimed to describe clinical and nonclinical features associated with value. We defined hospital encounters based on outcome per resource utilized. We performed principal component and cluster analysis to classify encounters based on mortality, length of stay, hospital cost and revenue into six classes. We used nearest shrunken centroid to identify discriminant features associated with the cluster-derived classes. These features underwent hierarchical clustering and multivariate analysis to identify features associated with each class. We analyzed all patients admitted to an academic congenital heart program between September 1, 2009, and December 31, 2012. A total of 2658 encounters occurred during the study period. Six classes were categorized by value. Low-performing value classes were associated with greater institutional reward; however, encounters with higher-performing value were associated with a loss in profitability. Encounters that included insertion of a pediatric ventricular assist device (log OR 2.5 [95% CI, 1.78 to 3.43]) and acquisition of a hospital-acquired infection (log OR 1.42 [95% CI, 0.99 to 1.87]) were risk factors for inferior health care value. Among the patients in our study, institutional reward was not associated with value. We describe a framework to target quality improvement and resource management efforts that can benefit patients, institutions, and payers alike.

    View details for DOI 10.1111/chd.12290

    View details for PubMedID 26219731

  • Novel data-mining approach identifies biomarkers for diagnosis of Kawasaki disease PEDIATRIC RESEARCH Tremoulet, A. H., Dutkowski, J., Sato, Y., Kanegaye, J. T., Ling, X. B., Burns, J. C. 2015; 78 (5): 547-553

    Abstract

    Kawasaki disease (KD) shares many clinical features with other more common febrile illnesses, and misdiagnosis, which delays treatment, increases the risk of coronary artery damage; a diagnostic test for KD is therefore urgently needed. We sought to develop a panel of biomarkers that could distinguish between acute KD patients and febrile controls (FC) with sufficient accuracy to be clinically useful. Plasma samples were collected from three independent cohorts of FC and acute KD patients who met the American Heart Association definition for KD and presented within the first 10 d of fever. The levels of 88 biomarkers associated with inflammation were assessed by Luminex bead technology. Unsupervised clustering followed by supervised clustering using a Random Forest model was used to find a panel of candidate biomarkers. A panel of biomarkers commonly available in the hospital laboratory (absolute neutrophil count, erythrocyte sedimentation rate, alanine aminotransferase, γ-glutamyl transferase, concentrations of α-1-antitrypsin, C-reactive protein, and fibrinogen, and platelet count) accurately diagnosed 81-96% of KD patients in a series of three independent cohorts. After prospective validation, this eight-biomarker panel may improve the recognition of KD.

    View details for DOI 10.1038/pr.2015.137

    View details for Web of Science ID 000363601700011

    View details for PubMedID 26237629

    View details for PubMedCentralID PMC4628575

  • Exploring Value in Congenital Heart Disease: An Evaluation of Inpatient Admissions CONGENITAL HEART DISEASE Shin, A. Y., Hu, Z., Jin, B., Lal, S., Rosenthal, D. N., Efron, B., Sharek, P. J., Sutherland, S. M., Cohen, H. J., McElhinney, D. B., Roth, S. J., Ling, X. B. 2015; 10 (6): E278-E287

    View details for DOI 10.1111/chd.12290

    View details for Web of Science ID 000367379300004

  • Development, Validation and Deployment of a Real Time 30 Day Hospital Readmission Risk Assessment Tool in the Maine Healthcare Information Exchange PLOS ONE Hao, S., Wang, Y., Jin, B., Shin, A. Y., Zhu, C., Huang, M., Zheng, L., Luo, J., Hu, Z., Fu, C., Dai, D., Wang, Y., Culver, D. S., Alfreds, S. T., Rogow, T., Stearns, F., Sylvester, K. G., Widen, E., Ling, X. B. 2015; 10 (10)

    Abstract

    Identifying patients at risk of a 30-day readmission can help providers design interventions, and provide targeted care to improve clinical effectiveness. This study developed a risk model to predict a 30-day inpatient hospital readmission for patients in Maine, across all payers, all diseases and all demographic groups. Our objective was to develop a model to determine the risk for inpatient hospital readmission within 30 days post discharge. All patients within the Maine Health Information Exchange (HIE) system were included. The model was retrospectively developed on inpatient encounters between January 1, 2012 and December 31, 2012 from 24 randomly chosen hospitals, and then prospectively validated on inpatient encounters from January 1, 2013 to December 31, 2013 using all HIE patients. A risk assessment tool partitioned the entire HIE population into subgroups that corresponded to probability of hospital readmission as determined by a corresponding positive predictive value (PPV). An overall model c-statistic of 0.72 was achieved. The total 30-day readmission rates in low (score of 0-30), intermediate (score of 30-70) and high (score of 70-100) risk groupings were 8.67%, 24.10% and 74.10%, respectively. A time to event analysis revealed that the higher risk groups were readmitted to a hospital earlier than the lower risk groups. Six high-risk patient subgroup patterns were revealed through unsupervised clustering. Our model was successfully integrated into the statewide HIE to identify patient readmission risk upon admission and daily during hospitalization or for 30 days subsequently, providing daily risk score updates. The risk model was validated as an effective tool for predicting 30-day readmissions for patients across all payer, disease and demographic groups within the Maine HIE. Exposing the key clinical, demographic and utilization profiles driving each patient's risk of readmission score may be useful to providers in developing individualized post discharge care plans.

    View details for DOI 10.1371/journal.pone.0140271

    View details for Web of Science ID 000362511000113

    View details for PubMedID 26448562

  • Serological Targeted Analysis of an ITIH4 Peptide Isoform: A Preterm Birth Biomarker and Its Associated SNP Implications JOURNAL OF GENETICS AND GENOMICS Tan, Z., Hu, Z., Cai, E. Y., Alev, C., Yang, T., Li, Z., Sung, J., El-Sayed, Y. Y., Shaw, G. M., Stevenson, D. K., Butte, A. J., Sheng, G., Sylvester, K. G., Cohen, H. J., Ling, X. B. 2015; 42 (9): 507-510

    View details for DOI 10.1016/j.jgg.2015.06.001

    View details for PubMedID 26408095

  • Online Prediction of Health Care Utilization in the Next Six Months Based on Electronic Health Record Information: A Cohort and Validation Study JOURNAL OF MEDICAL INTERNET RESEARCH Hu, Z., Hao, S., Jin, B., Shin, A. Y., Zhu, C., Huang, M., Wang, Y., Zheng, L., Dai, D., Culver, D. S., Alfreds, S. T., Rogow, T., Stearns, F., Sylvester, K. G., Widen, E., Ling, X. 2015; 17 (9)

    Abstract

    The increasing rate of health care expenditures in the United States has placed a significant burden on the nation's economy. Predicting future health care utilization of patients can provide useful information to better understand and manage overall health care deliveries and clinical resource allocation. This study developed an electronic medical record (EMR)-based online risk model predictive of resource utilization for patients in Maine in the next 6 months across all payers, all diseases, and all demographic groups. In HealthInfoNet, Maine's health information exchange (HIE), a retrospective cohort of 1,273,114 patients was constructed with the preceding 12-month EMR. Each patient's next 6-month (between January 1, 2013 and June 30, 2013) health care resource utilization was retrospectively scored ranging from 0 to 100 and a decision tree-based predictive model was developed. Our model was later integrated in the Maine HIE population exploration system to allow a prospective validation analysis of 1,358,153 patients by forecasting their next 6-month risk of resource utilization between July 1, 2013 and December 31, 2013. Prospectively predicted risks, on either an individual level or a population (per 1000 patients) level, were consistent with the next 6-month resource utilization distributions and the clinical patterns at the population level. Results demonstrated a strong correlation between care resource utilization and our risk scores, supporting the effectiveness of our model. With the online population risk monitoring enterprise dashboards, the effectiveness of the predictive algorithm has been validated by clinicians and caregivers in the State of Maine. The model and associated online applications were designed for tracking the evolving nature of total population risk, in a longitudinal manner, for health care resource utilization. It will enable more effective care management strategies driving improved patient outcomes.

    View details for DOI 10.2196/jmir.4976

    View details for Web of Science ID 000361809800005

    View details for PubMedID 26395541

  • Cerebrospinal fluid protein dynamic driver network: At the crossroads of brain tumorigenesis METHODS Tan, Z., Liu, R., Zheng, L., Hao, S., Fu, C., Li, Z., Deng, X., Jang, T., Merchant, M., Whitin, J. C., Guo, M., Cohen, H. J., Recht, L., Ling, X. B. 2015; 83: 36-43

    Abstract

    To get a better understanding of the ongoing in situ environmental changes preceding brain tumorigenesis, we assessed cerebrospinal fluid (CSF) proteome profile changes in a glioma rat model in which brain tumor invariably developed after a single in utero exposure to the neurocarcinogen ethylnitrosourea (ENU). Computationally, the CSF proteome profile dynamics during the tumorigenesis can be modeled as non-smooth or even abrupt state changes. Such brain tumor environment transition analysis, correlating the CSF composition changes with the development of early cellular hyperplasia, can reveal the pathogenesis process at network level during a time before the image detection of the tumors. In our controlled rat model study, matched ENU- and saline-exposed rats' CSF proteomics changes were quantified at approximately 30, 60, 90, 120, and 150 days of age (P30, P60, P90, P120, P150). We applied our transition-based network entropy (TNE) method to compute the CSF proteome changes in the ENU rat model and test the hypothesis of the critical transition state prior to impending hyperplasia. Our analysis identified a dynamic driver network (DDN) of CSF proteins related to the emerging tumorigenesis progressing from the non-hyperplasia state. The DDN-associated leading network CSF proteins can allow the early detection of such dynamics before the catastrophic shift to the clear clinical landmarks in gliomas. Future characterization of the critical transition state (P60) during the brain tumor progression may reveal the underlying pathophysiology to devise novel therapeutics preventing tumor formation. More detailed method and information are accessible through our website at http://translationalmedicine.stanford.edu.

    View details for DOI 10.1016/j.ymeth.2015.05.004

    View details for Web of Science ID 000358755100005

  • Utility of Clinical Biomarkers to Predict Central Line-associated Bloodstream Infections After Congenital Heart Surgery. Pediatric infectious disease journal Shin, A. Y., Jin, B., Hao, S., Hu, Z., Sutherland, S., McCammond, A., Axelrod, D., Sharek, P., Roth, S. J., Ling, X. B. 2015; 34 (3): 251-254

    Abstract

    Central line-associated bloodstream infections (CLABSI) are an important contributor to morbidity and mortality in children recovering from congenital heart surgery. The reliability of commonly used biomarkers to differentiate these patients has not been specifically studied. This was a retrospective cohort study in a university-affiliated children's hospital examining all patients with congenital or acquired heart disease admitted to the cardiovascular intensive care unit following cardiac surgery who underwent evaluation for a catheter-associated bloodstream infection. Among 1260 cardiac surgeries performed, 451 encounters underwent an infection evaluation post-operatively. Twenty-five instances of CLABSI and 227 instances of a negative infection evaluation were the subject of analysis. Patients with CLABSI tended to be younger (1.34 vs 4.56 years, p = 0.011) and underwent more complex surgery (RACHS-1 score 3.79 vs 3.04, p = 0.039). The two groups were indistinguishable in WBC, PMNs and band count at the time of their presentation. On multivariate analysis, CLABSI was associated with fever (adjusted OR 4.78; 95% CI, 1.6 to 5.8) and elevated CRP (adjusted OR 1.28; 95% CI, 1.09 to 1.68) after adjusting for differences between the two groups. Receiver operating characteristic analysis demonstrated the discriminatory power of both fever and CRP (area under curve 0.7247, 95% CI, 0.42 to 0.74 and 0.58, 95% CI 0.4208 to 0.7408). We calculated multilevel likelihood ratios for a spectrum of temperature and CRP values. We found commonly used serum biomarkers such as fever and CRP not to be helpful discriminators in patients following congenital heart surgery.

    View details for DOI 10.1097/INF.0000000000000553

    View details for PubMedID 25232780

  • Real-time web-based assessment of total population risk of future emergency department utilization: statewide prospective active case finding study. Interactive journal of medical research Hu, Z., Jin, B., Shin, A. Y., Zhu, C., Zhao, Y., Hao, S., Zheng, L., Fu, C., Wen, Q., Ji, J., Li, Z., Wang, Y., Zheng, X., Dai, D., Culver, D. S., Alfreds, S. T., Rogow, T., Stearns, F., Sylvester, K. G., Widen, E., Ling, X. B. 2015; 4 (1)

    Abstract

    An easily accessible real-time Web-based utility to assess patient risks of future emergency department (ED) visits can help the health care provider guide the allocation of resources to better manage higher-risk patient populations and thereby reduce unnecessary use of EDs. Our main objective was to develop a Health Information Exchange-based, next 6-month ED risk surveillance system in the state of Maine. Data on electronic medical record (EMR) encounters integrated by HealthInfoNet (HIN), Maine's Health Information Exchange, were used to develop the Web-based surveillance system for a population ED future 6-month risk prediction. To model, a retrospective cohort of 829,641 patients with comprehensive clinical histories from January 1 to December 31, 2012 was used for training and then tested with a prospective cohort of 875,979 patients from July 1, 2012, to June 30, 2013. The multivariate statistical analysis identified 101 variables predictive of future defined 6-month risk of ED visit: 4 age groups, history of 8 different encounter types, history of 17 primary and 8 secondary diagnoses, 8 specific chronic diseases, 28 laboratory test results, history of 3 radiographic tests, and history of 25 outpatient prescription medications. The c-statistics for the retrospective and prospective cohorts were 0.739 and 0.732 respectively. Integration of our method into the HIN secure statewide data system in real time prospectively validated its performance. Cluster analysis in both the retrospective and prospective analyses revealed discrete subpopulations of high-risk patients, grouped around multiple "anchoring" demographics and chronic conditions. With the Web-based population risk-monitoring enterprise dashboards, the effectiveness of the active case finding algorithm has been validated by clinicians and caregivers in Maine. The active case finding model and associated real-time Web-based app were designed to track the evolving nature of total population risk, in a longitudinal manner, for ED visits across all payers, all diseases, and all age groups. Therefore, providers can implement targeted care management strategies to the patient subgroups with similar patterns of clinical histories, driving the delivery of more efficient and effective health care interventions. To the best of our knowledge, this prospectively validated EMR-based, Web-based tool is the first one to allow real-time total population risk assessment for statewide ED visits.

    View details for DOI 10.2196/ijmr.4022

    View details for PubMedID 25586600

    View details for PubMedCentralID PMC4319080

  • Virtual Pharmacist: A Platform for Pharmacogenomics. PloS one Cheng, R., Leung, R. K., Chen, Y., Pan, Y., Tong, Y., Li, Z., Ning, L., Ling, X. B., He, J. 2015; 10 (10): e0141105

    Abstract

    We present Virtual Pharmacist, a web-based platform that takes common types of high-throughput data, namely microarray SNP genotyping data, FASTQ and Variant Call Format (VCF) files as inputs, and reports potential drug responses in terms of efficacy, dosage and toxicity at one glance. Batch submission facilitates multivariate analysis or data mining of targeted groups. Individual analysis consists of a report that is readily comprehensible to patients and practitioners who have basic knowledge in pharmacology, a table that summarizes variants and potentially affected drug responses according to the US Food and Drug Administration pharmacogenomic biomarker labeled drug list and PharmGKB, and visualization of a gene-drug-target network. Group analysis provides the distribution of the variants and potentially affected drug responses of a target group, a sample-gene variant count table, and a sample-drug count table. Our analysis of genomes from the 1000 Genomes Project underlines the potentially differential drug responses among different human populations. Even within the same population, the findings from Watson's genome highlight the importance of personalized medicine. Virtual Pharmacist can be accessed freely at http://www.sustc-genome.org.cn/vp or installed as a local web server. The codes and documentation are available at the GitHub repository (https://github.com/VirtualPharmacist/vp). Administrators can download the source codes to customize access settings for further development.

    View details for DOI 10.1371/journal.pone.0141105

    View details for PubMedID 26496198

    View details for PubMedCentralID PMC4619711

  • Risk prediction for future 6-month healthcare resource utilization in Maine Hao, S., Sylvester, K. G., Ling, X. B., Ru, Z., Jin, B., Zhu, C., Dai, D., Stearns, F., Widen, E., Shin, A., Culver, D. S., Alfreds, S. T., Rogow, T., Huan, J., Miyano, S., Shehu, A., Hu, Ma, B., Rajasekaran, S., Gombar, V. K., Schapranow, I. M., Yoo, I. H., Zhou, J. Y., Chen, B., Pai, Pierce, B. IEEE. 2015: 863–66
  • Risk Prediction of Stroke: A Prospective Statewide Study on Patients in Maine Zheng, L., Wang, V., Hao, S., Sylvester, K. G., Ling, X. B., Jin, B., Zhu, C., Jin, H., Dai, D., Xu, H., Stearns, F., Widen, E., Shin, A., Culver, D. S., Alfreds, S. T., Rogow, T., Huan, J., Miyano, S., Shehu, A., Hu, Ma, B., Rajasekaran, S., Gombar, V. K., Schapranow, I. M., Yoo, I. H., Zhou, J. Y., Chen, B., Pai, Pierce, B. IEEE. 2015: 853–55
  • Pilot Application of Magnetic Nanoparticle-Based Biosensor for Necrotizing Enterocolitis. Journal of proteomics & bioinformatics Kim, D., Fu, C., Ling, X. B., Hu, Z., Tao, G., Zhao, Y., Kastenberg, Z. J., Sylvester, K. G., Wang, S. X. 2015

    Abstract

    Necrotizing Enterocolitis (NEC) is a major source of neonatal morbidity and mortality. There is an ongoing need for a sensitive diagnostic instrument to discriminate NEC from neonatal sepsis. We hypothesized that magnetic nanoparticle-based biosensor analysis of gut injury-associated biomarkers would provide such an instrument. We designed a magnetic multiplexed biosensor platform, allowing the parallel plasma analysis of C-reactive protein (CRP), matrix metalloproteinase-7 (MMp7), and epithelial cell adhesion molecule (EpCAM). Neonatal subjects with sepsis (n=5) or NEC (n=10) were compared to control (n=5) subjects to perform a proof of concept pilot study for the diagnosis of NEC using our ultra-sensitive biosensor platform. Our multiplexed NEC magnetic nanoparticle-based biosensor platform was robust, ultrasensitive (limit of detection, LOD: CRP 0.6 pg/ml; MMp7 20 pg/ml; and EpCAM 20 pg/ml), and displayed no cross-reactivity among analyte reporting reagents. To gauge the diagnostic performance, a bootstrapping procedure (500 runs) was applied: MMp7 and EpCAM collectively differentiated infants with NEC from control infants with ROC AUC of 0.96, and infants with NEC from those with sepsis with ROC AUC of 1.00. The 3-marker panel comprising EpCAM, MMp7 and CRP had corresponding ROC AUCs of 0.956 and 0.975, respectively. The exploration of the multiplexed nano-biosensor platform shows promise to deliver an ultrasensitive instrument for the diagnosis of NEC in the clinical setting.

    View details for PubMedID 26798207

  • Development, Validation and Deployment of a Real Time 30 Day Hospital Readmission Risk Assessment Tool in the Maine Healthcare Information Exchange. PloS one Hao, S., Wang, Y., Jin, B., Shin, A. Y., Zhu, C., Huang, M., Zheng, L., Luo, J., Hu, Z., Fu, C., Dai, D., Wang, Y., Culver, D. S., Alfreds, S. T., Rogow, T., Stearns, F., Sylvester, K. G., Widen, E., Ling, X. B. 2015; 10 (10)

    Abstract

    Identifying patients at risk of a 30-day readmission can help providers design interventions, and provide targeted care to improve clinical effectiveness. This study developed a risk model to predict a 30-day inpatient hospital readmission for patients in Maine, across all payers, all diseases and all demographic groups. Our objective was to develop a model to determine the risk for inpatient hospital readmission within 30 days post discharge. All patients within the Maine Health Information Exchange (HIE) system were included. The model was retrospectively developed on inpatient encounters between January 1, 2012 and December 31, 2012 from 24 randomly chosen hospitals, and then prospectively validated on inpatient encounters from January 1, 2013 to December 31, 2013 using all HIE patients. A risk assessment tool partitioned the entire HIE population into subgroups that corresponded to probability of hospital readmission as determined by a corresponding positive predictive value (PPV). An overall model c-statistic of 0.72 was achieved. The total 30-day readmission rates in low (score of 0-30), intermediate (score of 30-70) and high (score of 70-100) risk groupings were 8.67%, 24.10% and 74.10%, respectively. A time to event analysis revealed that the higher risk groups were readmitted to a hospital earlier than the lower risk groups. Six high-risk patient subgroup patterns were revealed through unsupervised clustering. Our model was successfully integrated into the statewide HIE to identify patient readmission risk upon admission and daily during hospitalization or for 30 days subsequently, providing daily risk score updates. The risk model was validated as an effective tool for predicting 30-day readmissions for patients across all payer, disease and demographic groups within the Maine HIE. Exposing the key clinical, demographic and utilization profiles driving each patient's risk of readmission score may be useful to providers in developing individualized post discharge care plans.

    View details for DOI 10.1371/journal.pone.0140271

    View details for PubMedID 26448562

  • A novel urine peptide biomarker-based algorithm for the prognosis of necrotising enterocolitis in human infants. Gut Sylvester, K. G., Ling, X. B., Liu, G. Y., Kastenberg, Z. J., Ji, J., Hu, Z., Peng, S., Lau, K., Abdullah, F., Brandt, M. L., Ehrenkranz, R. A., Harris, M. C., Lee, T. C., Simpson, J., Bowers, C., Moss, R. L. 2014; 63 (8): 1284-1292

    Abstract

    Necrotising enterocolitis (NEC) is a major source of neonatal morbidity and mortality. The management of infants with NEC is currently complicated by our inability to accurately identify those at risk for progression of disease prior to the development of irreversible intestinal necrosis. We hypothesised that integrated analysis of clinical parameters in combination with urine peptide biomarkers would lead to improved prognostic accuracy in the NEC population. Infants under suspicion of having NEC (n=550) were prospectively enrolled from a consortium consisting of eight university-based paediatric teaching hospitals. Twenty-seven clinical parameters were used to construct a multivariate predictor of NEC progression. Liquid chromatography/mass spectrometry was used to profile the urine peptidomes from a subset of this population (n=65) to discover novel biomarkers of NEC progression. An ensemble model for the prediction of disease progression was then created using clinical and biomarker data. The use of clinical parameters alone resulted in a receiver-operator characteristic curve with an area under the curve of 0.817 and left 40.1% of all patients in an 'indeterminate' risk group. Three validated urine peptide biomarkers (fibrinogen peptides: FGA1826, FGA1883 and FGA2659) produced a receiver-operator characteristic area under the curve of 0.856. The integration of clinical parameters with urine biomarkers in an ensemble model resulted in the correct prediction of NEC outcomes in all cases tested. Ensemble modelling combining clinical parameters with biomarker analysis dramatically improves our ability to identify the population at risk for developing progressive NEC.

    View details for DOI 10.1136/gutjnl-2013-305130

    View details for PubMedID 24048736

  • Investigation of maternal environmental exposures in association with self-reported preterm birth. Reproductive toxicology Patel, C. J., Yang, T., Hu, Z., Wen, Q., Sung, J., El-Sayed, Y. Y., Cohen, H., Gould, J., Stevenson, D. K., Shaw, G. M., Ling, X. B., Butte, A. J. 2014; 45: 1-7

    Abstract

    Identification of maternal environmental factors influencing preterm birth risks is important to understand the reasons for the increase in prematurity since 1990. Here, we utilized a health survey, the US National Health and Nutrition Examination Survey (NHANES), to search for personal environmental factors associated with preterm birth. 201 urine and blood markers of environmental factors, such as allergens, pollutants, and nutrients, were assayed in mothers (range of N: 49-724) who answered questions about any children born preterm (delivery <37 weeks). We screened each of the 201 factors for association with any child born preterm, adjusting for age, race/ethnicity, education, and household income. We attempted to verify the top finding, urinary bisphenol A, in an independent study of pregnant women attending Lucile Packard Children's Hospital. We conclude that the association between maternal urinary levels of bisphenol A and preterm birth should be evaluated in a larger epidemiological investigation.

    View details for DOI 10.1016/j.reprotox.2013.12.005

    View details for PubMedID 24373932

  • Urine protein biomarkers for the diagnosis and prognosis of necrotizing enterocolitis in infants. Journal of pediatrics Sylvester, K. G., Ling, X. B., Liu, G. Y., Kastenberg, Z. J., Ji, J., Hu, Z., Wu, S., Peng, S., Abdullah, F., Brandt, M. L., Ehrenkranz, R. A., Harris, M. C., Lee, T. C., Simpson, B. J., Bowers, C., Moss, R. L. 2014; 164 (3): 607-612 e7

    Abstract

    To test the hypothesis that an exploratory proteomics analysis of urine proteins with subsequent development of validated urine biomarker panels would produce molecular classifiers for both the diagnosis and prognosis of infants with necrotizing enterocolitis (NEC). Urine samples were collected from 119 premature infants (85 NEC, 17 sepsis, 17 control) at the time of initial clinical concern for disease. The urine from 59 infants was used for candidate biomarker discovery by liquid chromatography/mass spectrometry. The remaining 60 samples were subject to enzyme-linked immunosorbent assay for quantitative biomarker validation. A panel of 7 biomarkers (alpha-2-macroglobulin-like protein 1, cluster of differentiation protein 14, cystatin 3, fibrinogen alpha chain, pigment epithelium-derived factor, retinol binding protein 4, and vasolin) was identified by liquid chromatography/mass spectrometry and subsequently validated by enzyme-linked immunosorbent assay. These proteins were consistently found to be either up- or down-regulated depending on the presence, absence, or severity of disease. Biomarker panel validation resulted in a receiver-operator characteristic area under the curve of 98.2% for NEC vs sepsis and an area under the curve of 98.4% for medical NEC vs surgical NEC. We identified 7 urine proteins capable of providing highly accurate diagnostic and prognostic information for infants with suspected NEC. This work represents a novel approach to improving the efficiency with which we diagnose early NEC and identify those at risk for developing severe, or surgical, disease.

    View details for DOI 10.1016/j.jpeds.2013.10.091

    View details for PubMedID 24433829

  • Risk prediction of emergency department revisit 30 days post discharge: a prospective study. PloS one Hao, S., Jin, B., Shin, A. Y., Zhao, Y., Zhu, C., Li, Z., Hu, Z., Fu, C., Ji, J., Wang, Y., Zhao, Y., Dai, D., Culver, D. S., Alfreds, S. T., Rogow, T., Stearns, F., Sylvester, K. G., Widen, E., Ling, X. B. 2014; 9 (11): e112944

    Abstract

    Among patients who are discharged from the Emergency Department (ED), about 3% return within 30 days. Revisits can be related to the nature of the disease, medical errors, and/or inadequate diagnoses and treatment during their initial ED visit. Identification of the high-risk patient population can help devise new strategies for improved ED care with reduced ED utilization. A decision tree based model with discriminant Electronic Medical Record (EMR) features was developed and validated, estimating patient ED 30 day revisit risk. A retrospective cohort of 293,461 ED encounters from HealthInfoNet (HIN), Maine's Health Information Exchange (HIE), between January 1, 2012 and December 31, 2012, was assembled with the associated patients' demographic information and one-year clinical histories before the discharge date as the inputs. To validate, a prospective cohort of 193,886 encounters between January 1, 2013 and June 30, 2013 was constructed. The c-statistics for the retrospective and prospective predictions were 0.710 and 0.704 respectively. Clinical resource utilization, including ED use, was analyzed as a function of the ED risk score. Cluster analysis of high-risk patients identified discrete sub-populations with distinctive demographic, clinical and resource utilization patterns. Our ED 30-day revisit model was prospectively validated on the Maine State HIN secure statewide data system. Future integration of our ED predictive analytics into the ED care work flow may lead to increased opportunities for targeted care intervention to reduce ED resource burden and overall healthcare expense, and improve outcomes.

    View details for DOI 10.1371/journal.pone.0112944

    View details for PubMedID 25393305

    View details for PubMedCentralID PMC4231082

  • Pilot Application of Magnetic Nanoparticle-Based Biosensor for Necrotizing Enterocolitis Journal of Proteomics and Bioinformatics Kim, D., Fu, C., Ling, X. B., Hu, Z., Tao, G., Zhao, Y., Kastenberg, Z. J., Sylvester, K. G., Wang, S. X. 2014

    View details for DOI 10.4172/jpb.S5-002

  • CSF protein dynamic driver network: at the crossroads of brain tumorigenesis Fu, C., Tan, Z., Liu, R., Hao, S., Li, Z., Chen, P., Jang, T., Merchant, M., Whitin, J. C., Wang, O., Guo, M., Cohen, H. J., Recht, L., Ling, X. B., Zheng, H., Hu, Berrar, D., Wang, Y., Dubitzky, W., Hao, J. K., Cho, K. H., Gilbert, D. IEEE. 2014
  • A data-driven algorithm integrating clinical and laboratory features for the diagnosis and prognosis of necrotizing enterocolitis. PloS one Ji, J., Ling, X. B., Zhao, Y., Hu, Z., Zheng, X., Xu, Z., Wen, Q., Kastenberg, Z. J., Li, P., Abdullah, F., Brandt, M. L., Ehrenkranz, R. A., Harris, M. C., Lee, T. C., Simpson, B. J., Bowers, C., Moss, R. L., Sylvester, K. G. 2014; 9 (2)

    Abstract

    Necrotizing enterocolitis (NEC) is a major source of neonatal morbidity and mortality. Since there is no specific diagnostic test or risk of progression model available for NEC, the diagnosis and outcome prediction of NEC is made on clinical grounds. The objective in this study was to develop and validate new NEC scoring systems for automated staging and prognostic forecasting. A six-center consortium of university based pediatric teaching hospitals prospectively collected data on infants under suspicion of having NEC over a 7-year period. A database comprising 520 infants was utilized to develop the NEC diagnostic and prognostic models by dividing the entire dataset into training and testing cohorts of demographically matched subjects. Developed on the training cohort and validated on the blind testing cohort, our multivariate analyses led to NEC scoring metrics integrating clinical data. Machine learning using clinical and laboratory results at the time of clinical presentation led to two NEC models: (1) an automated diagnostic classification scheme; (2) a dynamic prognostic method for risk-stratifying patients into low, intermediate and high NEC scores to determine the risk for disease progression. We submit that dynamic risk stratification of infants with NEC will assist clinicians in determining the need for additional diagnostic testing and guide potential therapies in a dynamic manner. http://translationalmedicine.stanford.edu/cgi-bin/NEC/index.pl and smartphone application upon request.

    View details for DOI 10.1371/journal.pone.0089860

    View details for PubMedID 24587080

  • Integrating multiple 'omics' analyses identifies serological protein biomarkers for preeclampsia BMC MEDICINE Liu, L. Y., Yang, T., Ji, J., Wen, Q., Morgan, A. A., Jin, B., Chen, G., Lyell, D. J., Stevenson, D. K., Ling, X. B., Butte, A. J. 2013; 11

    Abstract

    Preeclampsia (PE) is a pregnancy-related vascular disorder which is the leading cause of maternal morbidity and mortality. We sought to identify novel serological protein markers to diagnose PE with a multi-'omics' based discovery approach. Seven previous placental expression studies were combined for a multiplex analysis, and in parallel, two-dimensional gel electrophoresis was performed to compare serum proteomes in PE and control subjects. The combined biomarker candidates were validated with available ELISA assays using gestational age-matched PE (n=32) and control (n=32) samples. With the validated biomarkers, a genetic algorithm was then used to construct and optimize biomarker panels in PE assessment. In addition to the previously identified biomarkers, the angiogenic and antiangiogenic factors (soluble fms-like tyrosine kinase (sFlt-1) and placental growth factor (PlGF)), we found 3 up-regulated and 6 down-regulated biomarkers in PE sera. Two optimal biomarker panels were developed for early and late onset PE assessment, respectively. Both early and late onset PE diagnostic panels, constructed with our PE biomarkers, were superior to the sFlt-1/PlGF ratio in PE discrimination. The functional significance of these PE biomarkers and their associated pathways was analyzed, which may provide new insights into the pathogenesis of PE.

    View details for DOI 10.1186/1741-7015-11-236

    View details for Web of Science ID 000329052900001

    View details for PubMedID 24195779

    View details for PubMedCentralID PMC4226208

  • AKI in Hospitalized Children: Epidemiology and Clinical Associations in a National Cohort. Clinical journal of the American Society of Nephrology Sutherland, S. M., Ji, J., Sheikhi, F. H., Widen, E., Tian, L., Alexander, S. R., Ling, X. B. 2013; 8 (10): 1661-1669

    Abstract

    Although AKI is common among hospitalized children, comprehensive epidemiologic data are lacking. This study characterizes pediatric AKI across the United States and identifies AKI risk factors using high-content/high-throughput analytic techniques. For the cross-sectional analysis of the 2009 Kids Inpatient Database, AKI events were identified using International Classification of Diseases, Ninth Revision, Clinical Modification codes. Demographics, incidence rates, and outcome data were analyzed and reported for the entire AKI cohort as well as AKI subsets. Statistical learning methods were applied to the highly imbalanced dataset to derive AKI-related risk factors. Of 2,644,263 children, 10,322 children developed AKI (3.9/1000 admissions). Although 19% of the AKI cohort was ≤1 month old, the highest incidence was seen in children 15-18 years old (6.6/1000 admissions); 49% of the AKI cohort was white, but AKI incidence was higher among African Americans (4.5 versus 3.8/1000 admissions). In-hospital mortality among patients with AKI was 15.3% but higher among children ≤1 month old (31.3% versus 10.1%, P<0.001) and children requiring critical care (32.8% versus 9.4%, P<0.001) or dialysis (27.1% versus 14.2%, P<0.001). Shock (odds ratio, 2.15; 95% confidence interval, 1.95 to 2.36), septicemia (odds ratio, 1.37; 95% confidence interval, 1.32 to 1.43), intubation/mechanical ventilation (odds ratio, 1.2; 95% confidence interval, 1.16 to 1.25), circulatory disease (odds ratio, 1.47; 95% confidence interval, 1.32 to 1.65), cardiac congenital anomalies (odds ratio, 1.2; 95% confidence interval, 1.13 to 1.23), and extracorporeal support (odds ratio, 2.58; 95% confidence interval, 2.04 to 3.26) were associated with AKI. AKI occurs in 3.9/1000 at-risk US pediatric hospitalizations. Mortality is highest among neonates and children requiring critical care or dialysis. Identified risk factors suggest that AKI occurs in association with systemic/multiorgan disease more commonly than primary renal disease.

    View details for DOI 10.2215/CJN.00270113

    View details for PubMedID 23833312

  • Identification of novel biomarkers for early detection of ovarian cancer Szabo, L. A., Khatri, P., Liu, X., Hu, Z., Ling, B., Butte, A. J. AMER ASSOC CANCER RESEARCH. 2013
  • Peptidomic Identification of Serum Peptides Diagnosing Preeclampsia. PloS one Wen, Q., Liu, L. Y., Yang, T., Alev, C., Wu, S., Stevenson, D. K., Sheng, G., Butte, A. J., Ling, X. B. 2013; 8 (6): e65571

    Abstract

    We sought to identify serological markers capable of diagnosing preeclampsia (PE). We performed serum peptide analysis (liquid chromatography mass spectrometry) of 62 unique samples from 31 PE patients and 31 healthy pregnant controls, with two-thirds used as a training set and the other third as a testing set. Differential serum peptide profiling identified 52 significant serum peptides, and a 19-peptide panel collectively discriminating PE in training sets (n = 21 PE, n = 21 control; specificity = 85.7% and sensitivity = 100%) and testing sets (n = 10 PE, n = 10 control; specificity = 80% and sensitivity = 100%). The panel peptides were derived from 6 different protein precursors: 13 from fibrinogen alpha (FGA), 1 from alpha-1-antitrypsin (A1AT), 1 from apolipoprotein L1 (APO-L1), 1 from inter-alpha-trypsin inhibitor heavy chain H4 (ITIH4), 2 from kininogen-1 (KNG1), and 1 from thymosin beta-4 (TMSB4). We concluded that serum peptides can accurately discriminate active PE. Measurement of a 19-peptide panel could be performed quickly and in a quantitative mass spectrometric platform available in clinical laboratories. This serum peptide panel quantification could provide clinical utility in predicting PE or differential diagnosis of PE from confounding chronic hypertension.

    View details for DOI 10.1371/journal.pone.0065571

    View details for PubMedID 23840341

    View details for PubMedCentralID PMC3686758

  • Transcriptomics and proteomics ensemble analyses reveal serological protein panel for preeclampsia diagnosis 33rd Annual Pregnancy Meeting of the Society-for-Maternal-Fetal-Medicine (SMFM) Liu, L., Cooper, M., Yang, T., Ji, J., Wen, Q., Chen, G., Morgan, A., Stevenson, D., Ling, X., Butte, A. MOSBY-ELSEVIER. 2013: S272–S272
  • Cloud-based solution to identify statistically significant MS peaks differentiating sample categories. BMC research notes Ji, J., Ling, J., Jiang, H., Wen, Q., Whitin, J. C., Tian, L., Cohen, H. J., Ling, X. B. 2013; 6: 109

    Abstract

    Mass spectrometry (MS) has evolved to become the primary high throughput tool for proteomics based biomarker discovery. Multiple challenges in protein MS data analysis remain: large-scale and complex data set management; MS peak identification and indexing; and high dimensional peak differential analysis with the concurrent statistical test-based false discovery rate (FDR). "Turnkey" solutions are needed for biomarker investigations to rapidly process MS data sets to identify statistically significant peaks for subsequent validation. Here we present an efficient and effective solution, which provides experimental biologists easy access to "cloud" computing capabilities to analyze MS data. The web portal can be accessed at http://transmed.stanford.edu/ssa/. The presented web application supports online uploading and analysis of large-scale MS data through a simple user interface. This bioinformatic tool will facilitate the discovery of potential protein biomarkers using MS.

    View details for DOI 10.1186/1756-0500-6-109

    View details for PubMedID 23522030

    View details for PubMedCentralID PMC3621609

  • Point-of-Care Differentiation of Kawasaki Disease from Other Febrile Illnesses JOURNAL OF PEDIATRICS Ling, X. B., Kanegaye, J. T., Ji, J., Peng, S., Sato, Y., Tremoulet, A., Burns, J. C., Cohen, H. J. 2013; 162 (1): 183-U219

    Abstract

    To test whether statistical learning on clinical and laboratory test patterns would lead to an algorithm for Kawasaki disease (KD) diagnosis that could aid clinicians. Demographic, clinical, and laboratory data were prospectively collected for subjects with KD and febrile controls (FCs) using a standardized data collection form. Our multivariate models were trained with a cohort of 276 patients with KD and 243 FCs (who shared some features of KD) and validated with a cohort of 136 patients with KD and 121 FCs using either clinical data, laboratory test results, or their combination. Our KD scoring method stratified the subjects into subgroups with low (FC diagnosis, negative predictive value >95%), intermediate, and high (KD diagnosis, positive predictive value >95%) scores. Combining both clinical and laboratory test results, the algorithm diagnosed 81.2% of all training and 74.3% of all testing patients with KD in the high score group and 67.5% of all training and 62.8% of all testing FCs in the low score group. Our KD scoring metric and the associated data system with online (http://translationalmedicine.stanford.edu/cgi-bin/KD/kd.pl) and smartphone applications are easily accessible, inexpensive tools to improve the differentiation of most children with KD from FCs with other pediatric illnesses.

    View details for DOI 10.1016/j.jpeds.2012.06.012

    View details for Web of Science ID 000312915900040

    View details for PubMedID 22819274

  • Correlation analyses of clinical and molecular findings identify candidate biological pathways in systemic juvenile idiopathic arthritis BMC MEDICINE Ling, X. B., Macaubas, C., Alexander, H. C., Wen, Q., Chen, E., Peng, S., Sun, Y., Deshpande, C., Pan, K., Lin, R., Lih, C., Chang, S. P., Lee, T., Sandborg, C., Begovich, A. B., Cohen, S. N., Mellins, E. D. 2012; 10

    Abstract

    Clinicians have long appreciated the distinct phenotype of systemic juvenile idiopathic arthritis (SJIA) compared to polyarticular juvenile idiopathic arthritis (POLY). We hypothesized that gene expression profiles of peripheral blood mononuclear cells (PBMC) from children with each disease would reveal distinct biological pathways when analyzed for significant associations with elevations in two markers of JIA activity, erythrocyte sedimentation rate (ESR) and number of affected joints (joint count, JC). PBMC RNA from SJIA and POLY patients was profiled by kinetic PCR to analyze expression of 181 genes, selected for relevance to immune response pathways. Pearson correlation and Student's t-test analyses were performed to identify transcripts significantly associated with clinical parameters (ESR and JC) in SJIA or POLY samples. These transcripts were used to find related biological pathways. Combining Pearson and t-test analyses, we found 91 ESR-related and 92 JC-related genes in SJIA. For POLY, 20 ESR-related and 0 JC-related genes were found. Using Ingenuity Systems Pathways Analysis, we identified SJIA ESR-related and JC-related pathways. The two sets of pathways are strongly correlated. In contrast, there is a weaker correlation between SJIA and POLY ESR-related pathways. Notably, distinct biological processes were found to correlate with JC in samples from the earlier systemic plus arthritic phase (SAF) of SJIA compared to samples from the later arthritis-predominant phase (AF). Within the SJIA SAF group, IL-10 expression was related to JC, whereas lack of IL-4 appeared to characterize the chronic arthritis (AF) subgroup. The strong correlation between pathways implicated in elevations of both ESR and JC in SJIA argues that the systemic and arthritic components of the disease are related mechanistically. Inflammatory pathways in SJIA are distinct from those in POLY course JIA, consistent with differences in clinically appreciated target organs. The limited number of ESR-related SJIA genes that also are associated with elevations of ESR in POLY implies that the SJIA associations are specific for SJIA, at least to some degree. The distinct pathways associated with arthritis in early and late SJIA raise the possibility that different immunobiology underlies arthritis over the course of SJIA.

    View details for DOI 10.1186/1741-7015-10-125

    View details for PubMedID 23092393

  • Proteomic studies in breast cancer (Review). Oncology letters Qin, X. J., Ling, B. X. 2012; 3 (4): 735-743

    Abstract

    Breast cancer is one of the most common types of invasive cancer in females worldwide. Despite major advances in early cancer detection and emerging therapeutic strategies, further improvement has to be achieved for precise diagnosis to reduce the chance of metastasis and relapses. Recent proteomic technologies have offered a promising opportunity for the identification of new breast cancer biomarkers. Matrix-assisted laser desorption/ionization, time-of-flight mass spectrometry (MALDI-TOF MS) and the derived surface-enhanced laser desorption/ionization mass spectrometry (SELDI-TOF MS) enable the development of high-throughput proteome analysis based on comprehensive reliable biomarkers. In this review, we examined proteomic technologies and their applications, and provided focus on the proteomics-based profiling analyses of tumor tissues/cells in order to identify and confirm novel biomarkers of breast cancer.

    View details for DOI 10.3892/ol.2012.573

    View details for PubMedID 22740985

    View details for PubMedCentralID PMC3362396

  • A diagnostic algorithm combining clinical and molecular data distinguishes Kawasaki disease from other febrile illnesses BMC MEDICINE Ling, X. B., Lau, K., Kanegaye, J. T., Pan, Z., Peng, S., Ji, J., Liu, G., Sato, Y., Yu, T. T., Whitin, J. C., Schilling, J., Burns, J. C., Cohen, H. J. 2011; 9

    Abstract

    Kawasaki disease is an acute vasculitis of infants and young children that is recognized through a constellation of clinical signs that can mimic other benign conditions of childhood. The etiology remains unknown and there is no specific laboratory-based test to identify patients with Kawasaki disease. Treatment to prevent the complication of coronary artery aneurysms is most effective if administered early in the course of the illness. We sought to develop a diagnostic algorithm to help clinicians distinguish Kawasaki disease patients from febrile controls to allow timely initiation of treatment. Urine peptidome profiling and whole blood cell type-specific gene expression analyses were integrated with clinical multivariate analysis to improve differentiation of Kawasaki disease subjects from febrile controls. Comparative analyses of multidimensional protein identification using 23 pooled Kawasaki disease and 23 pooled febrile control urine peptide samples revealed 139 candidate markers, of which 13 were confirmed (area under the receiver operating characteristic curve, ROC AUC, 0.919) in an independent cohort of 30 Kawasaki disease and 30 febrile control urine peptidomes. Cell type-specific analysis of microarrays (csSAM) on 26 Kawasaki disease and 13 febrile control whole blood samples revealed a 32-lymphocyte-specific-gene panel (ROC AUC 0.969). The integration of the urine/blood based biomarker panels and a multivariate analysis of 7 clinical parameters (ROC AUC 0.803) effectively stratified 441 Kawasaki disease and 342 febrile control subjects to diagnose Kawasaki disease. A hybrid approach using a multi-step diagnostic algorithm integrating both clinical and molecular findings was successful in differentiating children with acute Kawasaki disease from febrile controls.

    View details for DOI 10.1186/1741-7015-9-130

    View details for Web of Science ID 000298862200001

    View details for PubMedID 22145762

    View details for PubMedCentralID PMC3251532
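
    The ROC AUC figures quoted in the abstract above summarize how well a panel score separates cases from controls. As a minimal sketch (not the authors' pipeline), the AUC can be computed as the fraction of case/control pairs in which the case receives the higher score, with ties counted as half. The function name `roc_auc` and the toy scores below are hypothetical and chosen only for illustration.

```python
def roc_auc(scores, labels):
    """Rank-based ROC AUC: probability that a randomly chosen case (label 1)
    scores higher than a randomly chosen control (label 0); ties count 0.5.
    Hypothetical helper for illustration, not taken from the cited paper."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy data (made up): two cases and two controls.
example_auc = roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

    The pairwise form is quadratic in the number of samples, which is acceptable for cohorts of the size described here; a sort-based rank formulation would be the usual choice for much larger data sets.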

  • URINARY PEPTIDOMICS IN KIDNEY TRANSPLANTATION IDENTIFIES NOVEL PEPTIDE BIOMARKERS FOR ACUTE REJECTION Sigdel, T., Ling, B., Lau, K., Ying, L., Lau, I., Schilling, J., Sarwal, M. WILEY-BLACKWELL. 2011: 130–130
  • URINARY PROTEOMIC ANALYSIS TO IDENTIFY HOST RESPONSE PROTEINS IN CATHETER-ASSOCIATED URINARY TRACT INFECTION Mach, K., Ling, X., Liao, J. ELSEVIER SCIENCE INC. 2011: E474
  • Proteomics and Biomarkers in Neonatology NeoReviews Ling, X. B., Sylvester, K. G. 2011; 12: 585-91
  • Plasma profiles in active systemic juvenile idiopathic arthritis: Biomarkers and biological implications PROTEOMICS Ling, X. B., Park, J. L., Carroll, T., Nguyen, K. D., Lau, K., Macaubas, C., Chen, E., Lee, T., Sandborg, C., Milojevic, D., Kanegaye, J. T., Gao, S., Burns, J., Schilling, J., Mellins, E. D. 2010; 10 (24): 4415-4430

    Abstract

    Systemic juvenile idiopathic arthritis (SJIA) is a chronic arthritis of children characterized by a combination of arthritis and systemic inflammation. There is usually non-specific laboratory evidence of inflammation at diagnosis but no diagnostic test. Normalized volumes from 89/889 2-D protein spots representing 26 proteins revealed a plasma pattern that distinguishes SJIA flare from quiescence. Highly discriminating spots derived from 15 proteins constitute a robust SJIA flare signature and show specificity for SJIA flare in comparison to active polyarticular juvenile idiopathic arthritis or acute febrile illness. We used 7 available ELISA assays, including one to the complex of S100A8/S100A9, to measure levels of 8 of the 15 proteins. Validating our DIGE results, this ELISA panel correctly classified independent SJIA flare samples, and distinguished them from acute febrile illness. Notably, data using the panel suggest its ability to improve on erythrocyte sedimentation rate or C-reactive protein or S100A8/S100A9, either alone or in combination in SJIA F/Q discriminations. Our results also support the panel's potential clinical utility as a predictor of incipient flare (within 9 wk) in SJIA subjects with clinically inactive disease. Pathway analyses of the 15 proteins in the SJIA flare versus quiescence signature corroborate growing evidence for a key role for IL-1 at disease flare.

    View details for DOI 10.1002/pmic.201000298

    View details for PubMedID 21136595

  • Urine Peptidomic and Targeted Plasma Protein Analyses in the Diagnosis and Monitoring of Systemic Juvenile Idiopathic Arthritis. Clinical proteomics Ling, X. B., Lau, K., Deshpande, C., Park, J. L., Milojevic, D., Macaubas, C., Xiao, C., Lopez-Avila, V., Kanegaye, J., Burns, J. C., Cohen, H., Schilling, J., Mellins, E. D. 2010; 6 (4): 175-193

    Abstract

    PURPOSE: Systemic juvenile idiopathic arthritis is a chronic pediatric disease. The initial clinical presentation can mimic other pediatric inflammatory conditions, which often leads to significant delays in diagnosis and appropriate therapy. SJIA biomarker development is an unmet diagnostic/prognostic need to prevent disease complications. EXPERIMENTAL DESIGN: We profiled the urine peptidome to analyze a set of 102 urine samples, from patients with SJIA, Kawasaki disease (KD), febrile illnesses (FI), and healthy controls. A set of 91 plasma samples, from SJIA flare and quiescent patients, were profiled using a customized antibody array against 43 proteins known to be involved in inflammatory and protein catabolic processes. RESULTS: We identified a 17-urine-peptide biomarker panel that could effectively discriminate SJIA patients at active, quiescent, and remission disease states, and patients with active SJIA from confounding conditions including KD and FI. Targeted sequencing of these peptides revealed that they fall into several tight clusters from seven different proteins, suggesting disease-specific proteolytic activities. The antibody array plasma profiling identified an SJIA plasma flare signature consisting of tissue inhibitor of metalloproteinase-1 (TIMP1), interleukin (IL)-18, regulated upon activation, normal T cell expressed and secreted (RANTES), P-Selectin, MMP9, and L-Selectin. CONCLUSIONS AND CLINICAL RELEVANCE: The urine peptidomic and plasma protein analyses have the potential to improve SJIA care and suggest that SJIA urine peptide biomarkers may be an outcome of inflammation-driven effects on catabolic pathways operating at multiple sites.

    View details for DOI 10.1007/s12014-010-9058-8

    View details for PubMedID 21124648

    View details for PubMedCentralID PMC2970804

  • Integrative Urinary Peptidomics in Renal Transplantation Identifies Biomarkers for Acute Rejection JOURNAL OF THE AMERICAN SOCIETY OF NEPHROLOGY Ling, X. B., Sigdel, T. K., Lau, K., Ying, L., Lau, I., Schilling, J., Sarwal, M. M. 2010; 21 (4): 646-653

    Abstract

    Noninvasive methods to diagnose rejection of renal allografts are unavailable. Mass spectrometry followed by multiple-reaction monitoring provides a unique approach to identify disease-specific urine peptide biomarkers. Here, we performed urine peptidomic analysis of 70 unique samples from 50 renal transplant patients and 20 controls (n = 20), identifying a specific panel of 40 peptides for acute rejection (AR). Peptide sequencing revealed suggestive mechanisms of graft injury with roles for proteolytic degradation of uromodulin (UMOD) and several collagens, including COL1A2 and COL3A1. The 40-peptide panel discriminated AR in training (n = 46) and test (n = 24) sets (area under ROC curve >0.96). Integrative analysis of transcriptional signals from paired renal transplant biopsies, matched with the urine samples, revealed coordinated transcriptional changes for the corresponding genes in addition to dysregulation of extracellular matrix proteins in AR (MMP-7, SERPING1, and TIMP1). Quantitative PCR on an independent set of 34 transplant biopsies with and without AR validated coordinated changes in expression for the corresponding genes in rejection tissue. A six-gene biomarker panel (COL1A2, COL3A1, UMOD, MMP-7, SERPING1, TIMP1) classified AR with high specificity and sensitivity (area under ROC curve = 0.98). These data suggest that changes in collagen remodeling characterize AR and that detection of the corresponding proteolytic degradation products in urine provides a noninvasive diagnostic approach.

    View details for DOI 10.1681/ASN.2009080876

    View details for Web of Science ID 000276784800017

    View details for PubMedID 20150539

    View details for PubMedCentralID PMC2844301

  • Integrative Urinary Peptidomics in Renal Transplantation Identifies Novel Biomarkers for Acute Rejection 10th American Transplant Congress Sigdel, T., Ling, B., Lau, K., Ying, L., Lau, I., Schilling, J., Sarwal, M. WILEY-BLACKWELL. 2010: 60–60
  • URINE PEPTIDOMICS FOR CLINICAL BIOMARKER DISCOVERY ADVANCES IN CLINICAL CHEMISTRY, VOL 51 Ling, X. B., Mellins, E. D., Sylvester, K. G., Cohen, H. J. 2010; 51: 181-213

    Abstract

    Urine-based proteomic profiling is a novel approach that may result in the discovery of noninvasive biomarkers for diagnosing patients with different diseases, with the aim to ultimately improve clinical outcomes. Given new and emerging analytical technologies and data mining algorithms, the urine peptidome has become a rich resource to uncover naturally occurring peptide biomarkers for both systemic and renal diseases. However, significant analytical hurdles remain in sample collection and storage, experimental design, data analysis, and statistical inference. This study summarizes, focusing on our experiences and perspectives, the progress in addressing these challenges to enable high-throughput urine peptidomics-based biomarker discovery.

    View details for DOI 10.1016/S0065-2423(10)51007-2

    View details for Web of Science ID 000281865700007

    View details for PubMedID 20857622

  • Effects of moderate versus deep hypothermic circulatory arrest and selective cerebral perfusion on cerebrospinal fluid proteomic profiles in a piglet model of cardiopulmonary bypass JOURNAL OF THORACIC AND CARDIOVASCULAR SURGERY Allibhai, T., DiGeronimo, R., Whitin, J., Salazar, J., Yu, T. T., Ling, X. B., Cohen, H., Dixon, P., Madan, A. 2009; 138 (6): 1290-1296

    Abstract

    Our objective was to compare protein profiles of cerebrospinal fluid between control animals and those subjected to cardiopulmonary bypass after moderate versus deep hypothermic circulatory arrest with selective cerebral perfusion. Immature Yorkshire piglets were assigned to one of four study groups: (1) deep hypothermic circulatory arrest at 18 degrees C, (2) deep hypothermic circulatory arrest at 18 degrees C with selective cerebral perfusion, (3) moderate hypothermic circulatory arrest at 25 degrees C with selective cerebral perfusion, or (4) age-matched control animals without surgery. Animals undergoing cardiopulmonary bypass were cooled to their assigned group temperature and exposed to 1 hour of hypothermic circulatory arrest. After arrest, animals were rewarmed, weaned off bypass, and allowed to recover for 4 hours. Cerebrospinal fluid collected from surgical animals after the recovery period was compared with cerebrospinal fluid from controls by surface-enhanced laser desorption/ionization time-of-flight mass spectrometry. Protein spectra were analyzed for differences between groups by Mann-Whitney U test and false discovery rate analysis. Baseline and postbypass physiologic parameters were similar in all surgical groups. A total of 194 protein peaks were detected. Compared with controls, groups 1, 2, and 3 had 64, 100, and 13 peaks that were significantly different, respectively (P < .05). Three of these peaks were present in all three groups. Cerebrospinal fluid protein profiles in animals undergoing cardiopulmonary bypass with moderate hypothermic circulatory arrest (group 3) were more similar to controls than either of the groups subjected to deep hypothermia. The mass spectra of cerebrospinal fluid proteins are altered in piglets exposed to cardiopulmonary bypass and hypothermic circulatory arrest. Moderate hypothermic circulatory arrest (25 degrees C) with selective cerebral perfusion compared with deep hypothermic circulatory arrest (18 degrees C) is associated with fewer changes in cerebrospinal fluid proteins, when compared with nonbypass controls.

    View details for DOI 10.1016/j.jtcvs.2009.06.001

    View details for Web of Science ID 000272029800004

    View details for PubMedID 19660276

  • Plasma Biomarkers in a Mouse Model of Preterm Labor PEDIATRIC RESEARCH Yang, Q., Whitin, J. C., Ling, X. B., Nayak, N. R., Cohen, H. J., Jin, J., Schilling, J., Yu, T. T., Madan, A. 2009; 66 (1): 11-16

    Abstract

    Preterm labor (PTL) is frequently associated with inflammation. We hypothesized that biomarkers during pregnancy can identify pregnancies most at risk for development of PTL. An inflammation-induced mouse model of PTL was used. Surface-enhanced laser desorption/ionization time-of-flight mass spectrometry was used to analyze and compare the plasma protein (PP) profile between CD-1 mice injected intrauterine with either lipopolysaccharide (LPS) or PBS on d 14.5 of gestation. The median differences of normalized PP peaks between the two groups were determined using the Mann-Whitney U test and the false discovery rate. In a second series of experiments, both groups of mice were injected with a lower dose of LPS. A total of 1665 peaks were detected. Thirty peaks were highly differentially expressed (p < 0.0001) between the groups. Two 11 kDa protein peaks were identified by MALDI-TOF/TOF-MS and confirmed to be mouse serum amyloid A (SAA) 1 and 2. Plasma SAA2 levels were increased in LPS-treated animals compared with controls and in LPS-treated animals that delivered preterm vs. those that delivered at term. SAA2 has the potential to be a plasma biomarker that can identify pregnancies at risk for development of PTL.

    View details for Web of Science ID 000267249300004

    View details for PubMedID 19287348

  • FDR made easy in differential feature discovery and correlation analyses BIOINFORMATICS Ling, X. B., Cohen, H., Jin, J., Lau, I., Schilling, J. 2009; 25 (11): 1461-1462

    Abstract

    Rapid progress in technology, particularly in high-throughput biology, allows the analysis of thousands of genes or proteins simultaneously, where the multiple comparison problem occurs. Global false discovery rate (gFDR) analysis statistically controls this error, computing the ratio of the number of false positives over the total number of rejections. The local FDR (lFDR) method can associate a corrected significance measure with each hypothesis test for feature-by-feature interpretation. Given the large feature number and sample size in any genomics or proteomics analysis, FDR computation, albeit critical, is both beyond the regular biologists' specialty and computationally expensive, easily exceeding the capacity of desktop computers. To overcome this digital divide, a web portal has been developed that provides bench-side biologists easy access to server-side computing capabilities to analyze FDR, differentially expressed genes or proteins, and the correlation between molecular data and clinical measurements (http://translationalmedicine.stanford.edu/Mass-Conductor/FDR.html).

    View details for DOI 10.1093/bioinformatics/btp176

    View details for Web of Science ID 000266109500030

    View details for PubMedID 19376824
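
    To make the idea of global FDR control in the abstract above concrete, here is a minimal sketch of the Benjamini-Hochberg step-up adjustment, a standard way to turn per-feature p-values into FDR-adjusted values. This is an assumption for illustration only: the abstract does not state which estimator the web portal uses, and the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Illustrative sketch only; not the method confirmed for the cited portal.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for four features (e.g. four MS peaks).
example_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
# Features whose adjusted value falls below a chosen level are called significant.
significant = [q <= 0.05 for q in example_adjusted]
```

    Calling features with an adjusted value at or below 0.05 aims to keep the expected proportion of false positives among the called features at or below 5%, under the usual assumptions of the procedure (independent or positively dependent tests).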

  • Cancer Biomarker Discovery via Targeted Profiling of Multiclass Tumor Tissue-Derived Proteomes Clinical Proteomics. Zhou, L., Cai, M., Ling, X. B., Wang, Q., Lau, K., Zhao, J., Schilling, J., Chen, L. 2009; 5 (3-4): 163-9.
  • Urinary peptidomic analysis identifies potential biomarkers for acute rejection of renal transplantation. Clinical Proteomics. Sigdel, T. K., Ling, X. B., Lau, K., Schilling, J., Sarwal, M. M. 2009; 5: 103-13.
  • Informed decision-making and colorectal cancer screening - Is it occurring in primary care? 30th Annual Meeting of the Society-of-General-Internal-Medicine Ling, B. S., Trauth, J. M., Fine, M. J., Mor, M. K., Resnick, A., Braddock, C. H., Bereknyei, S., Weissfeld, J. L., Schoen, R. E., Ricci, E. M., Whittle, J. LIPPINCOTT WILLIAMS & WILKINS. 2008: S23–S29

    Abstract

    Current recommendations advise patients to participate in the decision-making for selecting a colorectal cancer (CRC) screening option. The degree to which providers communicate the information necessary to prepare patients for participation in this process is not known. To assess the level of informed decision-making occurring during actual patient-provider communications on CRC screening and test for the association between informed decision-making and screening behavior. Observational study of audiotaped clinic visits between patients and their providers in the primary care clinic at a Veterans Administration Medical Center. Male patients, age 50-74 years, presenting to a primary care visit at the study site. The Informed Decision-Making (IDM) Model was used to code the audiotapes for 9 elements of communication that should occur to prepare patients for participation in decision-making. The primary outcome is completion of CRC screening during the study period. The analytic cohort consisted of 91 patients due for CRC screening who had a test ordered at the visit. Six of the 9 IDM elements occurred in ≤20% of the visits with none addressed in ≥50%. CRC screening occurred less frequently for those discussing "pros and cons" (12% vs. 46%, P = 0.01) and "patient preferences" (6% vs. 47%, P = 0.001) compared with those who did not. We found that a lack of informed decision-making occurred during CRC screening discussions and that particular elements of the process were negatively associated with screening. Further research is needed to better understand the effects of informed decision-making on screening behavior.

    View details for Web of Science ID 000258945500005

    View details for PubMedID 18725829

  • Optimizing protein recovery for urinary proteomics, a tool to monitor renal transplantation CLINICAL TRANSPLANTATION Sigdel, T. K., Lau, K., Schilling, J., Sarwal, M. 2008; 22 (5): 617-623

    Abstract

    Despite the attractiveness of urine for biomarker discovery for systemic and renal diseases, the confounding effect of the high abundance plasma proteins in urine and a lack of optimization of urine protein recovery methods are bottlenecks for urine proteomics. Three methods were performed and compared for percentage protein yield, yield consistency, ease and cost of analysis: (i) organic solvent precipitation, (ii) dialysis/lyophilization, and (iii) centrifugal filtration. Urine samples were subjected to an immunoaffinity column to deplete high abundance proteins. Difference gel electrophoresis was performed to assess the use of the depletion strategy for detection of low abundance proteins. Urine from healthy volunteers (n = 10) and kidney transplant recipients with proteinuria (n = 11) was used. Centrifugal filtration performed best for analysis ease and yield consistency. The highest percentage yield was obtained from dialysis/lyophilization, but the method was laborious and residual salt interfered with subsequent gel electrophoresis. Organic solvent precipitation was inexpensive, but suffered from varying yield consistency. Increased spot intensity for some low abundance and previously undetected proteins was noted after depletion of high abundance proteins. In conclusion, we compare the pros and cons of different protein recovery methods and reveal an increase in the dynamic range of protein detection after the depletion strategy that could be critical for biomarker discovery, particularly with reference to processing human study samples from clinical trials.

    View details for DOI 10.1111/j.1399-0012.2008.00833.x

    View details for Web of Science ID 000259341800014

    View details for PubMedID 18459997

  • Novel urinary peptidomic analysis for acute rejection monitoring 8th American Transplant Congress Sigdel, T., Ling, X., Lau, K., Schilling, J., Sarwal, M. WILEY-BLACKWELL. 2008: 636–637
  • High throughput screening informatics COMBINATORIAL CHEMISTRY & HIGH THROUGHPUT SCREENING Ling, X. B. 2008; 11 (3): 249-257

    Abstract

    High throughput screening (HTS), an industrial effort to leverage developments in the areas of modern robotics, data analysis and control software, liquid handling devices, and sensitive detectors, has played a pivotal role in the drug discovery process, allowing researchers to efficiently screen millions of compounds to identify tractable small molecule modulators of a given biological process or disease state and advance them into high quality leads. As HTS throughput has significantly increased the volume, complexity, and information content of datasets, lead discovery research demands a clear corporate strategy for scientific computing and subsequent establishment of robust enterprise-wide (usually global) informatics platforms, which enable complicated HTS work flows, facilitate HTS data mining, and drive effective decision-making. The purpose of this review is, from the data analysis and handling perspective, to examine key elements in HTS operations and some essential data-related activities supporting or interfacing the screening process, and outline properties that various enabling software should have. Additionally, some general advice for corporate managers with system procurement responsibilities is offered.

    View details for Web of Science ID 000254653500008

    View details for PubMedID 18336217

  • Significance analysis and multiple pharmacophore models for differentiating P-glycoprotein substrates JOURNAL OF CHEMICAL INFORMATION AND MODELING Li, W., Li, L., Eksterowicz, J., Ling, X. B., Cardozo, M. 2007; 47 (6): 2429-2438

    Abstract

    P-glycoprotein (Pgp) mediated drug efflux affects the absorption, distribution, and clearance of a broad structural variety of drugs. Early assessment of the potential of compounds to interact with Pgp can aid in the selection and optimization of drug candidates. To differentiate nonsubstrates from substrates of Pgp, a robust predictive pharmacophore model was targeted in a supervised analysis of three-dimensional (3D) pharmacophores from 163 published compounds. A comprehensive set of pharmacophores has been generated from conformers of whole molecules of both substrates and nonsubstrates of P-glycoprotein. Four-point 3D pharmacophores were employed to increase the amount of shape information and resolution, including the ability to distinguish chirality. A novel algorithm of the pharmacophore-specific t-statistic was applied to the actual structure-activity data and 400 sets of artificial data (sampled by decorrelating the structure and Pgp efflux activity). The optimal size of the significant pharmacophore set was determined through this analysis. A simple classification tree using nine distinct pharmacophores was constructed to distinguish nonsubstrates from substrates of Pgp. An overall accuracy of 87.7% was achieved for the training set and 87.6% for the external independent test set. Furthermore, each of nine pharmacophores can be independently utilized as an accurate marker for potential Pgp substrates.

    View details for DOI 10.1021/ci700284p

    View details for Web of Science ID 000251216500041

    View details for PubMedID 17956085

  • A novel proteomic approach for identification of biomarkers for diagnosis and monitoring of acute rejection. 7th American Transplant Congress Sigdel, T. K., Lau, K., Ling, X. B., Schilling, J., Sarwal, M. WILEY-BLACKWELL. 2007: 149–150
  • GO-Diff: Mining functional differentiation between EST-based transcriptomes BMC BIOINFORMATICS Chen, Z. Z., Wang, W. L., Ling, X. F., Liu, J. J., Chen, L. B. 2006; 7

    Abstract

    Large-scale sequencing efforts produced millions of Expressed Sequence Tags (ESTs) collectively representing differentiated biochemical and functional states. Analysis of these EST libraries reveals differential gene expressions, and therefore EST data sets constitute valuable resources for comparative transcriptomics. To translate differentially expressed genes into a better understanding of the underlying biological phenomena, existing microarray analysis approaches usually involve the integration of gene expression with Gene Ontology (GO) databases to derive comparable functional profiles. However, methods are not available yet to process EST-derived transcription maps to enable GO-based global functional profiling for comparative transcriptomics in a high throughput manner. Here we present GO-Diff, a GO-based functional profiling approach towards high throughput EST-based gene expression analysis and comparative transcriptomics. Utilizing holistic gene expression information, the software converts EST frequencies into EST Coverage Ratios of GO Terms. The ratios are then tested for statistical significances to uncover differentially represented GO terms between the compared transcriptomes, and functional differences are thus inferred. We demonstrated the validity and the utility of this software by identifying differentially represented GO terms in three application cases: intra-species comparison; meta-analysis to test a specific hypothesis; inter-species comparison. GO-Diff findings were consistent with previous knowledge and provided new clues for further discoveries. A comprehensive test on the GO-Diff results using series of comparisons between EST libraries of human and mouse tissues showed acceptable levels of consistency: 61% for human-human; 69% for mouse-mouse; 47% for human-mouse. GO-Diff is the first software integrating EST profiles with GO knowledge databases to mine functional differentiation between biological systems, e.g. tissues of the same species or the same tissue cross species. With rapid accumulation of EST resources in the public domain and expanding sequencing effort in individual laboratories, GO-Diff is useful as a screening tool before undertaking serious expression studies.

    View details for DOI 10.1186/1471-2105-7-72

    View details for Web of Science ID 000235825600001

    View details for PubMedID 16480524

  • Multiclass cancer classification and biomarker discovery using GA-based algorithms BIOINFORMATICS Liu, J. J., Cutler, G., Li, W. X., Pan, Z., Peng, S. H., Hoey, T., Chen, L. B., Ling, X. F. 2005; 21 (11): 2691-2697

    Abstract

    The development of microarray-based high-throughput gene profiling has led to the hope that this technology could provide an efficient and accurate means of diagnosing and classifying tumors, as well as predicting prognoses and effective treatments. However, the large amount of data generated by microarrays requires effective reduction of discriminant gene features into reliable sets of tumor biomarkers for such multiclass tumor discrimination. The availability of reliable sets of biomarkers, especially serum biomarkers, should have a major impact on our understanding and treatment of cancer. We have combined genetic algorithm (GA) and all paired (AP) support vector machine (SVM) methods for multiclass cancer categorization. Predictive features can be automatically determined through iterative GA/SVM, leading to very compact sets of non-redundant cancer-relevant genes with the best classification performance reported to date. Interestingly, these different classifier sets harbor only modest overlapping gene features but have similar levels of accuracy in leave-one-out cross-validations (LOOCV). Further characterization of these optimal tumor discriminant features, including the use of nearest shrunken centroids (NSC), analysis of annotations and literature text mining, reveals previously unappreciated tumor subclasses and a series of genes that could be used as cancer biomarkers. With this approach, we believe that microarray-based multiclass molecular analysis can be an effective tool for cancer biomarker discovery and subsequent molecular cancer diagnosis.

    View details for DOI 10.1093/bioinformatics/bti419

    View details for Web of Science ID 000229441500017

    View details for PubMedID 15814557

  • A machine to make a future - Biotech chronicles. J Clin Invest. Ling, X. B. 2005; 115: 2303-4.
  • Genomic resources for cancer biology researchers. Oncogenomics Handbook Ling, X. B., Cutler, G., Hoey, T. Humana Press. 2004
  • A comparative analysis of HGSC and Celera human genome assemblies and gene sets BIOINFORMATICS Li, S. Y., Cutler, G., Liu, J. J., Hoey, T., Chen, L. B., Schultz, P. G., Liao, J. Y., Ling, X. F. 2003; 19 (13): 1597-1605

    Abstract

    Since the simultaneous publication of the human genome assembly by the International Human Genome Sequencing Consortium (HGSC) and Celera Genomics, several comparisons have been made of various aspects of these two assemblies. In this work, we set out to provide a more comprehensive comparative analysis of the two assemblies and their associated gene sets. The local sequence content for both draft genome assemblies has been similar since the early releases; however, it took a year for the quality of the Celera assembly to approach that of HGSC, suggesting an advantage of HGSC's hierarchical shotgun (HS) sequencing strategy over Celera's whole genome shotgun (WGS) approach. While similar numbers of ab initio predicted genes can be derived from both assemblies, Celera's Otto approach consistently generated larger, more varied gene sets than the Ensembl gene build system. The presence of a non-overlapping gene set has persisted with successive data releases from both groups. Since most of the unique genes from either genome assembly could be mapped back to the other assembly, we conclude that the gene set discrepancies do not reflect differences in local sequence content but rather in the assemblies and especially the different gene-prediction methodologies.

    View details for DOI 10.1093/bioinformatics/btg219

    View details for Web of Science ID 000185310600001

    View details for PubMedID 12967954

  • PRC17, a novel oncogene encoding a Rab GTPase-activating protein, is amplified in prostate cancer CANCER RESEARCH Pei, L., Peng, Y., Yang, Y., Ling, X. F., van Eyndhoven, W. G., Nguyen, K. C., Rubin, M., Hoey, T., Powers, S., Li, J. 2002; 62 (19): 5420-5424

    Abstract

    We used cDNA-based genomic microarrays to examine DNA copy number changes in a panel of prostate tumors and found a previously undescribed amplicon on chromosome 17 containing a novel overexpressed gene that we termed prostate cancer gene 17 (PRC17). When overexpressed in 3T3 mouse fibroblast cells, PRC17 induced growth in low serum, loss of contact inhibition, and tumor formation in nude mice. The PRC17 gene product contains a GTPase-activating protein (GAP) catalytic core motif found in various Rab/Ypt GAPs, including RN-Tre. Similar to RN-Tre, we found that PRC17 protein interacts directly with Rab5 and stimulates its GTP hydrolysis. Point mutations that alter conserved amino acid residues within the PRC17 GAP domain abolished its transforming abilities, suggesting that GAP activity is essential for its oncogenic function. Whereas PRC17 is amplified in 15% of prostate cancers, it is highly overexpressed in approximately one-half of metastatic prostate tumors. The potent oncogenic activity of PRC17 is likely to influence the tumorigenic phenotype of these prostate cancers.

    View details for Web of Science ID 000178378200008

    View details for PubMedID 12359748

  • Comparative analysis of human genome assemblies reveals genome-level differences GENOMICS Li, S. Y., Liao, J. Y., Cutler, G., Hoey, T., Hogenesch, J. B., Cooke, M. P., Schultz, P. G., Ling, X. F. 2002; 80 (2): 138-139

    Abstract

    Previous comparative analysis has revealed a significant disparity between the predicted gene sets produced by the International Human Genome Sequencing Consortium (HGSC) and Celera Genomics. To determine whether the source of this discrepancy was due to underlying differences in the genomic sequences or different gene prediction methodologies, we analyzed both genome assemblies in parallel. Using the GENSCAN gene prediction algorithm, we generated predicted transcriptomes that could be directly compared. BLAST-based comparisons revealed a 20-30% difference between the transcriptomes. Further differences between the two genomes were revealed with protein domain PFAM analyses. These results suggest that fundamental differences between the two genome assemblies are likely responsible for a significant portion of the discrepancy between the transcript sets predicted by the two groups.

    View details for DOI 10.1006/geno.2002.6824

    View details for Web of Science ID 000177393500005

    View details for PubMedID 12160725

  • DQ 65-79, a peptide derived from HLA class II, induces I kappa B expression JOURNAL OF IMMUNOLOGY Jiang, Y., Chen, D., Lyu, S. C., Ling, X. F., Krensky, A. M., Clayberger, C. 2002; 168 (7): 3323-3328

    Abstract

    A synthetic peptide corresponding to residues 65-79 of the alpha helix of the alpha-chain of the class II HLA molecule DQA03011 (DQ 65-79) inhibits the proliferation of human T lymphocytes in an allele nonrestricted manner. By using microarray technology, we found that expression of 29 genes was increased or decreased in a human CTL cell line after treatment with DQ 65-79. This study focuses on one of these genes, IkappaB-alpha, whose expression is increased by DQ 65-79. IkappaB proteins, including IkappaB-alpha and IkappaB-beta, are increased in T cells treated with DQ 65-79. Nuclear translocation of the NF-kappaB subunits p65 and p50 is decreased in T cells after treatment with DQ 65-79, while elevated levels of p65 and p50 are present in cytosol. DQ 65-79 inhibits the degradation of IkappaB-alpha mRNA and inhibits the activity of IkappaB kinase. These findings indicate that the DQ 65-79 peptide increases the level of IkappaB proteins, thereby preventing nuclear translocation of the transcription factor, NF-kappaB, and inhibiting T cell proliferation.

    View details for Web of Science ID 000174566400029

    View details for PubMedID 11907089

  • DIAN: A novel algorithm for genome ontological classification GENOME RESEARCH Pouliot, Y., Gao, J., Su, Q. J., Liu, G. Z., Ling, X. F. 2001; 11 (10): 1766-1779

    Abstract

    Faced with the determination of many completely sequenced genomes, computational biology is now faced with the challenge of interpreting the significance of these data sets. A multiplicity of data-related problems impedes this goal: Biological annotations associated with raw data are often not normalized, and the data themselves are often poorly interrelated and their interpretation unclear. All of these problems make interpretation of genomic databases increasingly difficult. With the current explosion of sequences now available from the human genome as well as from model organisms, the importance of sorting this vast amount of conceptually unstructured source data into a limited universe of genes, proteins, functions, structures, and pathways has become a bottleneck for the field. To address this problem, we have developed a method of interrelating data sources by applying a novel method of associating biological objects to ontologies. We have developed an intelligent knowledge-based algorithm, to support biological knowledge mapping, and, in particular, to facilitate the interpretation of genomic data. In this respect, the method makes it possible to inventory genomes by collapsing multiple types of annotations and normalizing them to various ontologies. By relying on a conceptual view of the genome, researchers can now easily navigate the human genome in a biologically intuitive, scientifically accurate manner.

    View details for Web of Science ID 000171456000019

    View details for PubMedID 11591654

  • An immunosuppressive and anti-inflammatory HLA class I-derived peptide binds vascular cell adhesion molecule-1 TRANSPLANTATION Ling, X. F., Tamaki, T., Xiao, Y., Kamangar, S., Clayberger, C., Lewis, D. B., Krensky, A. M. 2000; 70 (4): 662-667

    Abstract

    A synthetic peptide corresponding to residues 75-84 of HLA-B2702 modulates immune responses in rodents and humans both in vitro and in vivo. We used a yeast two-hybrid screening, an in vitro biochemical method, and an in vivo animal model. Two cellular receptors for this novel immunomodulatory peptide were identified using a yeast two-hybrid screen: immunoglobulin binding protein (BiP), a member of the heat shock protein 70 family, and vascular cell adhesion molecule (VCAM)-1. Identification of BiP as a ligand for this peptide confirms earlier biochemical findings, while the interaction with VCAM-1 suggests an alternative mechanism of action. Binding to the B2702 peptide but not to closely related variants was confirmed by ligand Western blot analysis and correlated with immunomodulatory activity of each peptide. In mice, an ovalbumin-induced allergic pulmonary response was blocked by in vivo administration of either the B2702 peptide or anti-VLA-4 antibody. We propose that the immunomodulatory effect of the B2702 peptide is caused, in part, by binding to VCAM-1, which then prevents the normal interaction of VCAM-1 with VLA-4.

    View details for Web of Science ID 000088986100021

    View details for PubMedID 10972226

  • Proliferating cell nuclear antigen as the cell cycle sensor for an HLA-derived peptide blocking T cell proliferation JOURNAL OF IMMUNOLOGY Ling, X. F., Kamangar, S., Boytim, M. L., Kelman, Z., Huie, P., Lyu, S. C., Sibley, R. K., Hurwitz, J., Clayberger, C., Krensky, A. M. 2000; 164 (12): 6188-6192

    Abstract

    Synthetic peptides corresponding to structural regions of HLA molecules are novel immunosuppressive agents. A peptide corresponding to residues 65-79 of the alpha-chain of HLA-DQA03011 (DQ65-79) blocks cell cycle progression from early G1 to the G1 restriction point, which inhibits cyclin-dependent kinase-2 activity and phosphorylation of the retinoblastoma protein. A yeast two-hybrid screen identified proliferating cell nuclear Ag (PCNA) as a cellular ligand for this peptide, whose interaction with PCNA was further confirmed by in vitro biochemistry. Electron microscopy demonstrates that the DQ65-79 peptide enters the cell and colocalizes with PCNA in the T cell nucleus in vivo. Binding of the DQ65-79 peptide to PCNA did not block polymerase delta (pol delta)-dependent DNA replication in vitro. These findings support a key role for PCNA as a sensor of cell cycle progression and reveal an unanticipated function for conserved regions of HLA molecules.

    View details for Web of Science ID 000087508500014

    View details for PubMedID 10843669

  • All four core histone N-termini contain sequences required for the repression of basal transcription in yeast EMBO JOURNAL Lenfant, F., Mann, R. K., Thomsen, B., Ling, X. F., Grunstein, M. 1996; 15 (15): 3974-3985

    Abstract

    Nucleosomes prevent the recognition of TATA promoter elements by the basal transcriptional machinery in the absence of induction. However, while Saccharomyces cerevisiae histones H3 and H4 contain N-terminal regions involved in the activation and repression of GAL1 and in the expression of heterochromatin-like regions, the sequences involved in repressing basal transcription have not yet been identified. Here, we describe the mapping of new N-terminal domains, in all four core histones (H2A, H2B, H3 and H4), required for the repression of basal, uninduced transcription. Basal transcription was monitored by the use of a GAL1 promoter-URA3 reporter construct whose uninduced activity can be detected through cellular sensitivity to the drug, 5-fluoroorotic acid. We have found for each histone that the N-terminal sequences repressing basal activity are in a short region adjacent to the structured alpha-helical core. Analysis of minichromosome DNA topology demonstrates that the basal domains are required for the proper folding of DNA around the chromosomal particle. Deletion of the basal domain at each histone significantly decreases plasmid superhelical density, which probably reflects a release of DNA from the constraints of the nucleosome into the linker region. This provides a means by which basal factors may recognize otherwise repressed regulatory elements.

    View details for Web of Science ID A1996VC66700022

    View details for PubMedID 8670902

  • Yeast histone H3 and H4 amino termini are important for nucleosome assembly in vivo and in vitro: Redundant and position-independent functions in assembly but not in gene regulation GENES & DEVELOPMENT Ling, X. F., Harkness, T. A., Schultz, M. C., Fisher-Adams, G., Grunstein, M. 1996; 10 (6): 686-699

    Abstract

    The hydrophilic amino-terminal sequences of histones H3 and H4 extend from the highly structured nucleosome core. Here we examine the importance of the amino termini and their position in the nucleosome with regard to both nucleosome assembly and gene regulation. Despite previous conclusions based on nonphysiological nucleosome reconstitution experiments, we find that the histone amino termini are important for nucleosome assembly in vivo and in vitro. Deletion of both tails, a lethal event, alters micrococcal nuclease-generated nucleosomal ladders, plasmid superhelicity in whole cells, and nucleosome assembly in cell extracts. The H3 and H4 amino-terminal tails have redundant functions in this regard because the presence of either tail allows assembly and cellular viability. Moreover, the tails need not be attached to their native carboxy-terminal core. Their exchange re-establishes both cellular viability and nucleosome assembly. In contrast, the regulation of GAL1 and the silent mating loci by the H3 and H4 tails is highly disrupted by exchange of the histone amino termini.

    View details for Web of Science ID A1996UB72900004

    View details for PubMedID 8598296

  • Histone H3 amino-terminus is required for telomeric and silent mating locus repression in yeast NATURE Thompson, J. S., Ling, X. F., Grunstein, M. 1994; 369 (6477): 245-247

    Abstract

    Heterochromatin is a cytologically visible form of condensed chromatin capable of repressing genes in eukaryotic cells. For the yeast Saccharomyces cerevisiae, despite the absence of observable heterochromatin, there are genetic and chromatin structure data that indicate the presence of heterochromatin-like repressive structures. Genes experience position effects at the silent mating loci and the telomeres, resulting in a repressed state that is inherited in an epigenetic manner. The histone H4 amino terminus is required for repression at these loci. Additional studies have indicated that the histone H3 N terminus is not important for silent mating locus repression, but redundancy of repressive elements at the silent mating loci may be responsible for masking its role. Here we report that histone H3 is required for full repression at yeast telomeres and at partially disabled silent mating loci, and that the acetylatable lysine residues of H3 play an important role in silencing.

    View details for Web of Science ID A1994NM06700060

    View details for PubMedID 8183346