All Publications


  • COVID-19 Vaccine Strategy Left Small Primary Care Practices On The Sidelines. Health affairs (Project Hope) Hao, S., Rehkopf, D. H., Velasquez, E., Vala, A., Bazemore, A. W., Phillips, R. L. 2023; 42 (8): 1147-1151

    Abstract

    We report on the experience of small primary care practices participating in a national clinical registry with COVID-19 vaccines and vaccination data. At the end of 2021, 11.2 percent of these practices' 3.9 million patients had records of COVID-19 vaccination; 43.1 percent of clinics had no record of patients' COVID-19 vaccinations, but 93.4 percent of clinics had provided or recorded other routine vaccinations.

    View details for DOI 10.1377/hlthaff.2023.00114

    View details for PubMedID 37549323

  • Single center blind testing of a US multi-center validated diagnostic algorithm for Kawasaki disease in Taiwan. Frontiers in immunology Kuo, H. C., Hao, S., Jin, B., Chou, C. J., Han, Z., Chang, L. S., Huang, Y. H., Hwa, K., Whitin, J. C., Sylvester, K. G., Reddy, C. D., Chubb, H., Ceresnak, S. R., Kanegaye, J. T., Tremoulet, A. H., Burns, J. C., McElhinney, D., Cohen, H. J., Ling, X. B. 2022; 13: 1031387

    Abstract

    Kawasaki disease (KD) is the leading cause of acquired heart disease in children. The major challenge in KD diagnosis is that it shares clinical signs with the other childhood febrile illnesses seen in febrile control (FC) subjects. We sought to determine if our algorithmic approach applied to a Taiwan cohort. A single-center (Chang Gung Memorial Hospital in Taiwan) cohort of patients with suspected acute KD was prospectively enrolled by local KD specialists for KD analysis. Our computer-based two-step algorithm, previously developed at a single center, was further tested in a five-center validation in the US. That first blinded multi-center trial validated our approach, with sufficient sensitivity and positive predictive value, to identify most patients with KD diagnosed at centers across the US. This study involved 418 KDs and 259 FCs from the Chang Gung Memorial Hospital in Taiwan. Our diagnostic algorithm retained sensitivity (379 of 418; 90.7%), specificity (223 of 259; 86.1%), PPV (379 of 409; 92.7%), and NPV (223 of 247; 90.3%) comparable to the previous US 2016 single-center and US 2020 five-center results. Only 4.7% (15 of 418) of KD and 2.3% (6 of 259) of FC patients were identified as indeterminate. The algorithm identified 18 of 50 (36%) KD patients who presented with 2 or 3 principal criteria. Of 418 KD patients, 157 were infants younger than one year and 89.2% (140 of 157) were classified correctly. Of the 44 patients with KD who had coronary artery abnormalities, our diagnostic algorithm correctly identified 43 (97.7%), including all patients with a dilated coronary artery but one, in whom it was found to resolve in 8 weeks. This work demonstrates the applicability of our algorithmic approach and diagnostic portability in Taiwan.

    View details for DOI 10.3389/fimmu.2022.1031387

    View details for PubMedID 36263040

    View details for PubMedCentralID PMC9575935
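
The four performance figures in the abstract above follow directly from the reported counts. The sketch below recomputes them. It assumes, as an inference from the denominators and not a statement from the paper, that indeterminate cases are counted in the sensitivity and specificity denominators but excluded from the PPV and NPV denominators. All constant and function names are illustrative.

```python
# Counts as reported in the abstract (Taiwan cohort).
TOTAL_KD, TOTAL_FC = 418, 259    # enrolled KD patients and febrile controls
TRUE_POS, TRUE_NEG = 379, 223    # correctly classified KD and FC patients
PRED_POS, PRED_NEG = 409, 247    # denominators reported for PPV and NPV


def pct(numerator: int, denominator: int) -> float:
    """Return a percentage rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)


def summarize() -> dict:
    """Recompute the reported metrics and the implied indeterminate counts."""
    # Assumption: every positive call that is not a true positive is a
    # misclassified febrile control, and likewise for negative calls.
    false_pos = PRED_POS - TRUE_POS
    false_neg = PRED_NEG - TRUE_NEG
    return {
        "sensitivity": pct(TRUE_POS, TOTAL_KD),
        "specificity": pct(TRUE_NEG, TOTAL_FC),
        "ppv": pct(TRUE_POS, PRED_POS),
        "npv": pct(TRUE_NEG, PRED_NEG),
        # Patients left over after correct and incorrect calls.
        "indeterminate_kd": TOTAL_KD - TRUE_POS - false_neg,
        "indeterminate_fc": TOTAL_FC - TRUE_NEG - false_pos,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}")
```

Under that assumption, the leftover counts match the indeterminate counts stated in the abstract (15 KD and 6 FC patients).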

  • A machine-learning algorithm for diagnosis of multisystem inflammatory syndrome in children and Kawasaki disease in the USA: a retrospective model development and validation study. The Lancet. Digital health Lam, J. Y., Shimizu, C., Tremoulet, A. H., Bainto, E., Roberts, S. C., Sivilay, N., Gardiner, M. A., Kanegaye, J. T., Hogan, A. H., Salazar, J. C., Mohandas, S., Szmuszkovicz, J. R., Mahanta, S., Dionne, A., Newburger, J. W., Ansusinha, E., DeBiasi, R. L., Hao, S., Ling, X. B., Cohen, H. J., Nemati, S., Burns, J. C., Pediatric Emergency Medicine Kawasaki Disease Research Group, CHARMS Study Group, Abe, N., Austin-Page, L. R., Bryl, A. W., Donofrio-Odmann, J. J., Ekpenyong, A., Gutglass, D. J., Nguyen, M. B., Schwartz, K., Ulrich, S., Vayngortin, T., Zimmerman, E., Anderson, M., Ang, J. Y., Ashouri, N., Bocchini, J., D'Addese, L., Dominguez, S., Gutierrez, M. P., Harahsheh, A. S., Hite, M., Jone, P., Kumar, M., Manaloor, J. J., Melish, M., Morgan, L., Natale, J. E., Rometo, A., Rosenkranz, M., Rowley, A. H., Samuy, N., Scalici, P., Sykes, M. 2022; 4 (10): e717-e726

    Abstract

    BACKGROUND: Multisystem inflammatory syndrome in children (MIS-C) is a novel disease that was identified during the COVID-19 pandemic and is characterised by systemic inflammation following SARS-CoV-2 infection. Early detection of MIS-C is a challenge given its clinical similarities to Kawasaki disease and other acute febrile childhood illnesses. We aimed to develop and validate an artificial intelligence algorithm that can distinguish among MIS-C, Kawasaki disease, and other similar febrile illnesses and aid in the diagnosis of patients in the emergency department and acute care setting. METHODS: In this retrospective model development and validation study, we developed a deep-learning algorithm called KIDMATCH (Kawasaki Disease vs Multisystem Inflammatory Syndrome in Children) using patient age, the five classic clinical Kawasaki disease signs, and 17 laboratory measurements. All features were prospectively collected at the time of initial evaluation from patients diagnosed with Kawasaki disease or other febrile illness between Jan 1, 2009, and Dec 31, 2019, at Rady Children's Hospital in San Diego (CA, USA). For patients with MIS-C, the same data were collected from patients between May 7, 2020, and July 20, 2021, at Rady Children's Hospital, Connecticut Children's Medical Center in Hartford (CT, USA), and Children's Hospital Los Angeles (CA, USA). We trained a two-stage model consisting of feedforward neural networks to distinguish between patients with MIS-C and those without and then those with Kawasaki disease and other febrile illnesses. After internally validating the algorithm using stratified tenfold cross-validation, we incorporated a conformal prediction framework to tag patients with erroneous data or distribution shifts.
We finally externally validated KIDMATCH on patients with MIS-C enrolled between April 22, 2020, and July 21, 2021, from Boston Children's Hospital (MA, USA), Children's National Hospital (Washington, DC, USA), and the CHARMS Study Group consortium of 14 US hospitals. FINDINGS: 1517 patients diagnosed at Rady Children's Hospital between Jan 1, 2009, and June 7, 2021, with MIS-C (n=69), Kawasaki disease (n=775), or other febrile illnesses (n=673) were identified for internal validation, with an additional 16 patients with MIS-C included from Connecticut Children's Medical Center and 50 from Children's Hospital Los Angeles between May 7, 2020, and July 20, 2021. KIDMATCH achieved a median area under the receiver operating characteristic curve during internal validation of 98·8% (IQR 98·0-99·3) in the first stage and 96·0% (95·6-97·2) in the second stage. We externally validated KIDMATCH on 175 patients with MIS-C from Boston Children's Hospital (n=50), Children's National Hospital (n=42), and the CHARMS Study Group consortium of 14 US hospitals (n=83). External validation of KIDMATCH on patients with MIS-C correctly classified 76 of 81 patients (94% accuracy, two rejected by conformal prediction) from 14 hospitals in the CHARMS Study Group consortium, 47 of 49 patients (96% accuracy, one rejected by conformal prediction) from Boston Children's Hospital, and 36 of 40 patients (90% accuracy, two rejected by conformal prediction) from Children's National Hospital. INTERPRETATION: KIDMATCH has the potential to aid front-line clinicians to distinguish between MIS-C, Kawasaki disease, and other similar febrile illnesses to allow prompt treatment and prevent severe complications. FUNDING: US Eunice Kennedy Shriver National Institute of Child Health and Human Development, US National Heart, Lung, and Blood Institute, US Patient-Centered Outcomes Research Institute, US National Library of Medicine, the McCance Foundation, and the Gordon and Marilyn Macklin Foundation.

    View details for DOI 10.1016/S2589-7500(22)00149-2

    View details for PubMedID 36150781

  • Machine Learning for Pediatric Echocardiographic Mitral Regurgitation Detection. Journal of the American Society of Echocardiography : official publication of the American Society of Echocardiography Edwards, L. A., Feng, F., Iqbal, M., Fu, Y., Sanyahumbi, A., Hao, S., McElhinney, D. B., Ling, X. B., Sable, C., Luo, J. 2022

    Abstract

    Echocardiography-based screening for valvular disease in at-risk asymptomatic children can result in early diagnosis. These screening programs, however, are resource intensive, and may not be feasible in many resource-limited settings. Automated echocardiographic diagnosis may enable more widespread echocardiographic screening, early diagnosis, and improved outcomes. In this feasibility study, we sought to build a machine learning model capable of identifying mitral regurgitation (MR) on echocardiogram. Echocardiograms were labeled by clip for view and by frame for the presence of MR. The labeled data were used to build two convolutional neural networks (CNNs) to perform the stepwise tasks of classifying the clips 1) by view and 2) by the presence of any MR, including physiologic, in parasternal long axis color Doppler views (PLAX-C). We developed the view classification model using 66,330 frames and evaluated model performance using a hold-out testing dataset with 45 echocardiograms (11,730 frames). We developed the MR detection model using 938 frames and evaluated model performance using a hold-out testing dataset with 42 echocardiograms (182 frames). Metrics to evaluate model performance included accuracy, precision, recall, F1 score (average of precision and recall, 0 to 1 with 1 suggesting perfect precision and recall), and receiver-operating characteristic analysis. For the PLAX-C view, the view classification CNN achieved an F1 score of 0.97. The MR detection CNN achieved a testing accuracy of 0.86 and an area under the receiver operating characteristic curve of 0.91. A machine learning model is capable of discerning MR on transthoracic echocardiography. This is an encouraging step toward machine learning-based diagnosis of valvular heart disease on pediatric echocardiograms.

    View details for DOI 10.1016/j.echo.2022.09.017

    View details for PubMedID 36191670
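
The abstract above describes the F1 score as an average of precision and recall. By the conventional definition it is their harmonic mean, which penalizes an imbalance between the two more than an arithmetic mean does. A minimal sketch follows. The function name and the example values are illustrative and are not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, in the range 0 to 1."""
    if not (0.0 <= precision <= 1.0 and 0.0 <= recall <= 1.0):
        raise ValueError("precision and recall must lie in [0, 1]")
    total = precision + recall
    if total == 0.0:
        # Convention: F1 is 0 when both precision and recall are 0.
        return 0.0
    return 2.0 * precision * recall / total


if __name__ == "__main__":
    # Hypothetical values chosen only to show the effect of imbalance.
    precision, recall = 0.5, 1.0
    print("arithmetic mean:", (precision + recall) / 2.0)
    print("F1 (harmonic mean):", f1_score(precision, recall))
```

With these hypothetical inputs the harmonic mean is lower than the arithmetic mean, so a high F1 score requires both precision and recall to be high.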

  • Deviation from the precisely timed age-associated patterns revealed by blood metabolomics to find CRC patients at risk of relapse at the CRC diagnosis Thyparambil, S. P., Zhu, X., Zhang, Y., Sun, H., Peng, J., Cai, S., Li, Y., Fu, C., Bao, P., Hao, S., Li, Z., Ding, Y., Yao, X., Liao, W., Heaton, R., Han, Z., Tian, L., Schilling, J., Sylvester, K. G., Ling, X. LIPPINCOTT WILLIAMS & WILKINS. 2022
  • Early-pregnancy prediction of risk for pre-eclampsia using maternal blood leptin/ceramide ratio: discovery and confirmation. BMJ open Huang, Q., Hao, S., You, J., Yao, X., Li, Z., Schilling, J., Thyparambil, S., Liao, W., Zhou, X., Mo, L., Ladella, S., Davies-Balch, S. R., Zhao, H., Fan, D., Whitin, J. C., Cohen, H. J., McElhinney, D. B., Wong, R. J., Shaw, G. M., Stevenson, D. K., Sylvester, K. G., Ling, X. B. 2021; 11 (11): e050963

    Abstract

    OBJECTIVE: This study aimed to develop a blood test for the prediction of pre-eclampsia (PE) early in gestation. We hypothesised that the longitudinal measurements of circulating adipokines and sphingolipids in maternal serum over the course of pregnancy could identify novel prognostic biomarkers that are predictive of impending event of PE early in gestation. STUDY DESIGN: Retrospective discovery and longitudinal confirmation. SETTING: Maternity units from two US hospitals. PARTICIPANTS: Six previously published studies of placental tissue (78 PE and 95 non-PE) were compiled for genomic discovery, maternal sera from 15 women (7 non-PE and 8 PE) enrolled at ProMedDx were used for sphingolipidomic discovery, and maternal sera from 40 women (20 non-PE and 20 PE) enrolled at Stanford University were used for longitudinal observation. OUTCOME MEASURES: Biomarker candidates from discovery were longitudinally confirmed and compared in parallel to the ratio of placental growth factor (PlGF) and soluble fms-like tyrosine kinase (sFlt-1) using the same cohort. The datasets were generated by enzyme-linked immunosorbent and liquid chromatography-tandem mass spectrometric assays. RESULTS: Our discovery integrating genomic and sphingolipidomic analysis identified leptin (Lep) and ceramide (Cer) (d18:1/25:0) as novel biomarkers for early gestational assessment of PE. Our longitudinal observation revealed a marked elevation of Lep/Cer (d18:1/25:0) ratio in maternal serum at a median of 23 weeks' gestation among women with impending PE as compared with women with uncomplicated pregnancy. 
The Lep/Cer (d18:1/25:0) ratio significantly outperformed the established sFlt-1/PlGF ratio in predicting impending event of PE with superior sensitivity (85% vs 20%) and area under curve (0.92 vs 0.52) from 5 to 25 weeks of gestation. CONCLUSIONS: Our study demonstrated the longitudinal measurement of maternal Lep/Cer (d18:1/25:0) ratio allows the non-invasive assessment of PE to identify pregnancy at high risk in early gestation, outperforming the established sFlt-1/PlGF ratio test.

    View details for DOI 10.1136/bmjopen-2021-050963

    View details for PubMedID 34824115

  • Multi-omics longitudinal analyses in stages I to III CRC patients: Surveillance liquid biopsy test to predict early recurrence and enable risk-stratified postoperative CRC management. Liu, X., Zhang, Y., Zhu, X., Thyparambil, S. P., Liao, W., Zheng, X., You, J., Masood, A., Li, Z., Yang, G., Yao, X., Hao, S., Heaton, R., Schilling, J., Sylvester, K. G., Liao, J., Gao, F., Lan, P., Ling, X., Wu, X. LIPPINCOTT WILLIAMS & WILKINS. 2021
  • Electronic Health Record-Based Prediction of 1-Year Risk of Incident Cardiac Dysrhythmia: Prospective Case-Finding Algorithm Development and Validation Study. JMIR medical informatics Zhang, Y., Han, Y., Gao, P., Mo, Y., Hao, S., Huang, J., Ye, F., Li, Z., Zheng, L., Yao, X., Li, Z., Li, X., Wang, X., Huang, C., Jin, B., Zhang, Y., Yang, G., Alfreds, S. T., Kanov, L., Sylvester, K. G., Widen, E., Li, L., Ling, X. 2021; 9 (2): e23606

    Abstract

    BACKGROUND: Cardiac dysrhythmia is currently an extremely common disease. Severe arrhythmias often cause a series of complications, including congestive heart failure, fainting or syncope, stroke, and sudden death. OBJECTIVE: The aim of this study was to predict incident arrhythmia prospectively within a 1-year period to provide early warning of impending arrhythmia. METHODS: Retrospective (1,033,856 individuals enrolled between October 1, 2016, and October 1, 2017) and prospective (1,040,767 individuals enrolled between October 1, 2017, and October 1, 2018) cohorts were constructed from integrated electronic health records in Maine, United States. An ensemble learning workflow was built through multiple machine learning algorithms. Differentiating features, including acute and chronic diseases, procedures, health status, laboratory tests, prescriptions, clinical utilization indicators, and socioeconomic determinants, were compiled for incident arrhythmia assessment. The predictive model was retrospectively trained and calibrated using an isotonic regression method and was prospectively validated. 
Model performance was evaluated using the area under the receiver operating characteristic curve (AUROC). RESULTS: The cardiac dysrhythmia case-finding algorithm (retrospective: AUROC 0.854; prospective: AUROC 0.827) stratified the population into 5 risk groups: 53.35% (555,233/1,040,767), 44.83% (466,594/1,040,767), 1.76% (18,290/1,040,767), 0.06% (623/1,040,767), and 0.003% (27/1,040,767) were in the very low-risk, low-risk, medium-risk, high-risk, and very high-risk groups, respectively; 51.85% (14/27) of patients in the very high-risk subgroup were confirmed to have incident cardiac dysrhythmia within the subsequent 1 year. CONCLUSIONS: Our case-finding algorithm is promising for prospectively predicting 1-year incident cardiac dysrhythmias in a general population, and we believe that our case-finding algorithm can serve as an early warning system to allow statewide population-level screening and surveillance to improve cardiac dysrhythmia care.

    View details for DOI 10.2196/23606

    View details for PubMedID 33595452

  • The Correlation Between SPP1 and Immune Escape of EGFR Mutant Lung Adenocarcinoma Was Explored by Bioinformatics Analysis. Frontiers in oncology Zheng, Y., Hao, S., Xiang, C., Han, Y., Shang, Y., Zhen, Q., Zhao, Y., Zhang, M., Zhang, Y. 2021; 11: 592854

    Abstract

    Background: Immune checkpoint inhibitors have achieved breakthrough efficacy in treating lung adenocarcinoma (LUAD) with wild-type epidermal growth factor receptor (EGFR), leading to the revision of the treatment guidelines. However, most patients with EGFR mutation are resistant to immunotherapy. It is particularly important to study the differences in tumor microenvironment (TME) between patients with and without EGFR mutation. However, relevant research has not been reported. Our previous study showed that secreted phosphoprotein 1 (SPP1) promotes macrophage M2 polarization and PD-L1 expression in LUAD, which may influence response to immunotherapy. Here, we assessed the role of SPP1 in different populations and its effects on the TME. Methods: We compared the expression of SPP1 in LUAD tumor and normal tissues, and in samples with wild-type and mutant EGFR. We also evaluated the influence of SPP1 on survival. The LUAD data sets were downloaded from TCGA and CPTAC databases. Clinicopathologic characteristics associated with overall survival in TCGA were assessed using Cox regression analysis. GSEA revealed that several fundamental signaling pathways were enriched in the high SPP1 expression group. We applied CIBERSORT and xCell to calculate the proportion and abundance of tumor-infiltrating immune cells (TICs) in LUAD, and compared the differences in patients with high or low SPP1 expression and wild-type or mutant EGFR. In addition, we explored the correlation between SPP1 and CD276 for different groups. Results: SPP1 expression was higher in LUAD tumor tissues and in people with EGFR mutation. High SPP1 expression was associated with poor prognosis. Univariate and multivariate Cox analysis revealed that up-regulated SPP1 expression was an independent indicator of poor prognosis. GSEA showed that the SPP1 high expression group was mainly enriched in immunosuppressed pathways. 
In the SPP1 high expression group, the infiltration of CD8+ T cells was lower and that of M2-type macrophages was higher. These results were also observed in patients with EGFR mutation. Furthermore, we found that the SPP1 expression was positively correlated with CD276, especially in patients with EGFR mutation. Conclusion: SPP1 levels might be a useful marker of immunosuppression in patients with EGFR mutation, and could offer insight for therapeutics.

    View details for DOI 10.3389/fonc.2021.592854

    View details for PubMedID 34178613

  • Identification of patients at risk of new onset heart failure: Utilizing a large statewide health information exchange to train and validate a risk prediction model. PloS one Duong, S. Q., Zheng, L., Xia, M., Jin, B., Liu, M., Li, Z., Hao, S., Alfreds, S. T., Sylvester, K. G., Widen, E., Teuteberg, J. J., McElhinney, D. B., Ling, X. B. 2021; 16 (12): e0260885

    Abstract

    BACKGROUND: New-onset heart failure (HF) is associated with poor prognosis and high healthcare utilization. Early identification of patients at increased risk of incident-HF may allow for focused allocation of preventative care resources. Health information exchange (HIE) data span the entire spectrum of clinical care, but there are no HIE-based clinical decision support tools for diagnosis of incident-HF. We applied machine-learning methods to model the one-year risk of incident-HF from the Maine statewide-HIE. METHODS AND RESULTS: We included subjects aged ≥ 40 years without prior HF ICD9/10 codes during a three-year period from 2015 to 2018, and incident-HF defined as assignment of two outpatient or one inpatient code in a year. A tree-boosting algorithm was used to model the probability of incident-HF in year two from data collected in year one, and then validated in year three. 5,668 of 521,347 patients (1.09%) developed incident-HF in the validation cohort. In the validation cohort, the model c-statistic was 0.824 and at a clinically predetermined risk threshold, 10% of patients identified by the model developed incident-HF and 29% of all incident-HF cases in the state of Maine were identified. CONCLUSIONS: Utilizing machine learning modeling techniques on passively collected clinical HIE data, we developed and validated an incident-HF prediction tool that performs on par with other models that require proactively collected clinical data. Our algorithm could be integrated into other HIEs to leverage the EMR resources to provide individuals, systems, and payors with a risk stratification tool to allow for targeted resource allocation to reduce incident-HF disease burden on individuals and health care systems.

    View details for DOI 10.1371/journal.pone.0260885

    View details for PubMedID 34890438

  • Maternal metabolic profiling to assess fetal gestational age and predict preterm delivery: a two-centre retrospective cohort study in the US. BMJ open Sylvester, K. G., Hao, S., You, J., Zheng, L., Tian, L., Yao, X., Mo, L., Ladella, S., Wong, R. J., Shaw, G. M., Stevenson, D. K., Cohen, H. J., Whitin, J. C., McElhinney, D. B., Ling, X. B. 2020; 10 (12): e040647

    Abstract

    OBJECTIVES: The aim of this study was to develop a single blood test that could determine gestational age and estimate the risk of preterm birth by measuring serum metabolites. We hypothesised that serial metabolic modelling of serum analytes throughout pregnancy could be used to describe fetal gestational age and project preterm birth with a high degree of precision. STUDY DESIGN: A retrospective cohort study. SETTING: Two medical centres from the USA. PARTICIPANTS: Thirty-six patients (20 full-term, 16 preterm) enrolled at Stanford University were used to develop gestational age and preterm birth risk algorithms, 22 patients (9 full-term, 13 preterm) enrolled at the University of Alabama were used to validate the algorithms. OUTCOME MEASURES: Maternal blood was collected serially throughout pregnancy. Metabolic datasets were generated using mass spectrometry. RESULTS: A model to determine gestational age was developed (R²=0.98) and validated (R²=0.81). 66.7% of the estimates fell within ±1 week of ultrasound results during model validation. Significant disruptions from full-term pregnancy metabolic patterns were observed in preterm pregnancies (R²=-0.68). A separate algorithm to predict preterm birth was developed using a set of 10 metabolic pathways that resulted in an area under the curve of 0.96 and 0.92, a sensitivity of 0.88 and 0.86, and a specificity of 0.96 and 0.92 during development and validation testing, respectively. CONCLUSIONS: In this study, metabolic profiling was used to develop and test a model for determining gestational age during full-term pregnancy progression, and to determine risk of preterm birth. With additional patient validation studies, these algorithms may be used to identify at-risk pregnancies prompting alterations in clinical care, and to gain biological insights into the pathophysiology of preterm birth. Metabolic pathway-based pregnancy modelling is a novel modality for investigation and clinical application development.

    View details for DOI 10.1136/bmjopen-2020-040647

    View details for PubMedID 33268420

  • Pertussis-like syndrome often not associated with Bordetella pertussis: 5-year study in a large children's hospital. Infectious diseases (London, England) Xiong, Q., Hao, S., Shen, L., Liu, J., Chen, T., Zhang, G., Huang, Y. 2020: 1–7

    Abstract

    Background: Recently, a resurgence of pertussis has been observed worldwide despite broad vaccination coverage. The purpose of this study was to identify the clinical characteristics and the aetiological agent of pertussis-like syndrome (PLS) in Eastern China. Methods: 1168 patients who were diagnosed with suspected Bordetella pertussis infection in Shanghai Children's Hospital from 2013 to 2017 were included in the study. Clinical features and aetiologies were analysed. Aetiological analyses in sub-cohorts of age, seasons and years were also investigated. Results: 96.0% (1121) of the patients were less than 12 months old. 59.0% (689) of the patients were male. The top 5 pathogens were respiratory syncytial virus (RSV; n=125; 10.7%), Streptococcus pneumoniae (SP; n=109; 9.3%), Haemophilus influenzae type b (HIB; n=86; 7.4%), Bordetella pertussis (B. pertussis; n=84; 7.2%), and Mycoplasma pneumoniae (MP; n=80; 6.9%), respectively. The percentage of SP in the age group of 0-3 months was significantly lower than that in other age groups. The percentage of B. pertussis in the age group of 3-6 months was significantly lower than that in the group of 6-12 months. The percentage of MP in the 0-3 months group was significantly lower than that in the >12 months group. RSV peaked in winter (n=52), while HIB peaked in spring (n=38). Conclusion: PLS occurred most often in infants. RSV, SP, HIB, B. pertussis, and MP were the most prevalent pathogens. Since patients with B. pertussis and other pathogens have similar clinical manifestations, diagnosis of pertussis should be based on both clinical symptoms and laboratory confirmation.

    View details for DOI 10.1080/23744235.2020.1784995

    View details for PubMedID 32589094

  • Towards personalized medicine in maternal and child health: integrating biologic and social determinants. Pediatric research Stevenson, D. K., Wong, R. J., Aghaeepour, N., Maric, I., Angst, M. S., Contrepois, K., Darmstadt, G. L., Druzin, M. L., Eisenberg, M. L., Gaudilliere, B., Gibbs, R. S., Gotlib, I. H., Gould, J. B., Lee, H. C., Ling, X. B., Mayo, J. A., Moufarrej, M. N., Quaintance, C. C., Quake, S. R., Relman, D. A., Sirota, M., Snyder, M. P., Sylvester, K. G., Hao, S., Wise, P. H., Shaw, G. M., Katz, M. 2020

    View details for DOI 10.1038/s41390-020-0981-8

    View details for PubMedID 32454518

  • Deviation from the precisely timed phenomic ageotypes can assist in early CRC screening and reveal underlying pathophysiology. Thyparambil, S. P., You, J., Liu, K., Sun, H., Peng, J., Cai, S., Li, Y., Fu, C., Bao, P., Li, Q., Hao, S., Zhang, Y., Li, Z., Yang, J., Yin, Z., Yao, X., Zhu, X., Schilling, J., Sylvester, K. G., Ling, X. B. LIPPINCOTT WILLIAMS & WILKINS. 2020
  • High-throughput quantitation of serological ceramides/dihydroceramides by LC/MS/MS: Pregnancy baseline biomarkers and potential metabolic messengers. Journal of pharmaceutical and biomedical analysis Huang, Q., Hao, S., Yao, X., You, J., Li, X., Lai, D., Han, C., Schilling, J., Hwa, K. Y., Thyparambil, S., Whitin, J., Cohen, H. J., Chubb, H., Ceresnak, S. R., McElhinney, D. B., Wong, R. J., Shaw, G. M., Stevenson, D. K., Sylvester, K. G., Ling, X. B. 2020; 192: 113639

    Abstract

    Ceramides and dihydroceramides are sphingolipids that are present in abundance at the cellular membrane of eukaryotes. Although their metabolic dysregulation has been implicated in many diseases, our knowledge about circulating ceramide changes during pregnancy remains limited. In this study, we present the development and validation of a high-throughput liquid chromatography-tandem mass spectrometric method for simultaneous quantification of 16 ceramides and 10 dihydroceramides in human serum within 5 min by using stable isotope-labeled ceramides as internal standards. This method employs a protein precipitation method for high-throughput sample preparation, reverse-phase isocratic elution for chromatographic separation, and Multiple Reaction Monitoring for mass spectrometric detection. To qualify for clinical applications, our assay has been validated against the FDA guidelines for Lower Limit of Quantitation (1 nM), linearity (R²>0.99), precision (imprecision <15%), accuracy (inaccuracy <15%), extraction recovery (>90%), stability (>85%), and carryover (<0.01%). With enhanced sensitivity and specificity from this method, we have, for the first time, determined the serological levels of ceramides and dihydroceramides to reveal unique temporal gestational patterns. Our approach could have value in providing insights into disorders of pregnancy.

    View details for DOI 10.1016/j.jpba.2020.113639

    View details for PubMedID 33017796

  • Multicentre validation of a computer-based tool for differentiation of acute Kawasaki disease from clinically similar febrile illnesses. Archives of disease in childhood Hao, S., Ling, X. B., Kanegaye, J. T., Bainto, E., Dominguez, S. R., Heizer, H., Jone, P. N., Anderson, M. S., Jaggi, P., Baker, A., Son, M. B., Newburger, J. W., Ashouri, N., McElhinney, D. B., Burns, J. C., Whitin, J. C., Cohen, H. J., Tremoulet, A. H. 2020

    Abstract

    The clinical features of Kawasaki disease (KD) overlap with those of other paediatric febrile illnesses. A missed or delayed diagnosis increases the risk of coronary artery damage. Our computer algorithm for KD and febrile illness differentiation had a sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of 94.8%, 70.8%, 93.7% and 98.3%, respectively, in a single-centre validation study. We sought to determine the performance of this algorithm with febrile children from multiple institutions across the USA. We used our previously published 18-variable panel that includes illness day, the five KD clinical criteria and readily available laboratory values. We applied this two-step algorithm using a linear discriminant analysis-based clinical model followed by a random forest-based algorithm to a cohort of 1059 acute KD and 282 febrile control patients from five children's hospitals across the USA. The algorithm correctly classified 970 of 1059 patients with KD and 163 of 282 febrile controls resulting in a sensitivity of 91.6%, specificity of 57.8% and PPV and NPV of 95.4% and 93.1%, respectively. The algorithm also correctly identified 218 of the 232 KD patients (94.0%) with abnormal echocardiograms. The expectation is that the predictive accuracy of the algorithm will be reduced in a real-world setting in which patients with KD are rare and febrile controls are common. However, the results of the current analysis suggest that this algorithm warrants a prospective, multicentre study to evaluate its potential utility as a physician support tool.

    View details for DOI 10.1136/archdischild-2019-317980

    View details for PubMedID 32139365
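
The abstract above expects predictive accuracy to fall where KD is rare. Bayes' rule shows why: for fixed sensitivity and specificity, positive predictive value drops as prevalence drops. The sketch below treats the algorithm as a plain binary test, which is a simplifying assumption. The published two-step algorithm also returns indeterminate results, so this will not reproduce the reported PPV of 95.4%. The prevalence values are hypothetical.

```python
def ppv_from_prevalence(sensitivity: float, specificity: float,
                        prevalence: float) -> float:
    """Positive predictive value of a binary test via Bayes' rule."""
    true_pos_rate = sensitivity * prevalence
    false_pos_rate = (1.0 - specificity) * (1.0 - prevalence)
    denominator = true_pos_rate + false_pos_rate
    if denominator == 0.0:
        raise ValueError("test never returns a positive result")
    return true_pos_rate / denominator


if __name__ == "__main__":
    # Sensitivity and specificity as reported in the abstract.
    SENSITIVITY, SPECIFICITY = 0.916, 0.578
    # Hypothetical prevalences; only the first reflects the study cohort
    # (1059 KD patients out of 1059 + 282 enrolled).
    for prevalence in (1059 / (1059 + 282), 0.5, 0.1, 0.01):
        ppv = ppv_from_prevalence(SENSITIVITY, SPECIFICITY, prevalence)
        print(f"prevalence {prevalence:.3f} -> PPV {ppv:.3f}")
```

The point of the sketch is the trend, not the absolute values: a positive result carries much less weight in a population where febrile controls greatly outnumber KD patients.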

  • Prediction for Intravenous Immunoglobulin Resistance Combining Genetic Risk Loci Identified From Next Generation Sequencing and Laboratory Data in Kawasaki Disease. Frontiers in pediatrics Chen, L., Song, S., Ning, Q., Zhu, D., Jia, J., Zhang, H., Zhao, J., Hao, S., Liu, F., Chu, C., Huang, M., Chen, S., Xie, L., Xiao, T., Huang, M. 2020; 8: 462367

    Abstract

    Background: Kawasaki disease (KD) is the most common cause of acquired heart disease. A proportion of patients are resistant to intravenous immunoglobulin (IVIG), the primary treatment of KD, and the mechanism of IVIG resistance remains unclear. The accuracy of current models predictive of IVIG resistance is insufficient and does not meet clinical expectations. Objectives: To develop a scoring model predicting IVIG resistance of patients with KD. Methods: We recruited 330 KD patients (50 IVIG non-responders, 280 IVIG responders) and 105 healthy children to explore the susceptibility loci of IVIG resistance in Kawasaki disease. A next generation sequencing technology that focused on 4 immune-related pathways and 472 single nucleotide polymorphisms (SNPs) was performed. An R package SNPassoc was used to identify the risk loci, and Student's t-test was used to identify risk factors associated with IVIG resistance. A random forest-based scoring model of IVIG resistance was built based on the identified specific SNP loci with the laboratory data. Results: A total of 544 significant risk loci were found associated with IVIG resistance, including 27 previously published SNPs. Laboratory test variables, including erythrocyte sedimentation rate (ESR), platelet (PLT), and C-reactive protein, were found significantly different between IVIG responders and non-responders. A scoring model was built using the top 9 SNPs and clinical features achieving an area under the ROC curve of 0.974. Conclusions: It is the first study that focused on immune system in KD using high-throughput sequencing technology. Our findings provided a prediction of the IVIG resistance by integrating the genotype and clinical variables. It also suggested a new perspective on the pathogenesis of IVIG resistance.

    View details for DOI 10.3389/fped.2020.462367

    View details for PubMedID 33344378

  • Kinetics of SARS-CoV-2 positivity of infected and recovered patients from a single center. Scientific reports Huang, J., Zheng, L., Li, Z., Hao, S., Ye, F., Chen, J., Gans, H. A., Yao, X., Liao, J., Wang, S., Zeng, M., Qiu, L., Li, C., Whitin, J. C., Tian, L., Chubb, H., Hwa, K. Y., Ceresnak, S. R., Zhang, W., Lu, Y., Maldonado, Y. A., McElhinney, D. B., Sylvester, K. G., Cohen, H. J., Liu, L., Ling, X. B. 2020; 10 (1): 18629

    Abstract

    Recurrence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) positive detection in infected but recovered individuals has been reported. Patients who have recovered from coronavirus disease 2019 (COVID-19) could profoundly impact the health care system. We sought to define the kinetics and relevance of PCR-positive recurrence during recovery from acute COVID-19 to better understand risks for prolonged infectivity and reinfection. The study included a series of 414 patients with confirmed SARS-CoV-2 infection at The Second Affiliated Hospital of Southern University of Science and Technology in Shenzhen, China, from January 11 to April 23, 2020. Statistical analyses were performed of the clinical, laboratory, radiologic image, medical treatment, and clinical course of admission/quarantine/readmission data, and a recurrence predictive algorithm was developed. Of the recovered patients, 16.7% had PCR positivity recur one to three times, despite being in strict quarantine. Younger patients with mild pulmonary respiratory syndrome had a higher risk of PCR positivity recurrence. The recurrence prediction model had an area under the ROC curve of 0.786. This case series provides characteristics of patients with recurrent SARS-CoV-2 positivity. Use of a prediction algorithm may identify patients at high risk of recurrent SARS-CoV-2 positivity and help to establish protocols for health policy.

    View details for DOI 10.1038/s41598-020-75629-x

    View details for PubMedID 33122706

  • Changes in pregnancy-related serum biomarkers early in gestation are associated with later development of preeclampsia. PloS one Hao, S., You, J., Chen, L., Zhao, H., Huang, Y., Zheng, L., Tian, L., Maric, I., Liu, X., Li, T., Bianco, Y. K., Winn, V. D., Aghaeepour, N., Gaudilliere, B., Angst, M. S., Zhou, X., Li, Y. M., Mo, L., Wong, R. J., Shaw, G. M., Stevenson, D. K., Cohen, H. J., McElhinney, D. B., Sylvester, K. G., Ling, X. B. 2020; 15 (3): e0230000

    Abstract

    Placental protein expression plays a crucial role during pregnancy. We hypothesized that: (1) circulating levels of pregnancy-associated, placenta-related proteins throughout gestation reflect the temporal progression of the uncomplicated, full-term pregnancy, and can effectively estimate gestational ages (GAs); and (2) preeclampsia (PE) is associated with disruptions in these protein levels early in gestation, and these disruptions can identify impending PE. We also compared gestational profiles of proteins in the human and mouse, using pregnant heme oxygenase-1 (HO-1) heterozygote (Het) mice, a mouse model reflecting PE-like symptoms. Serum levels of six placenta-related proteins, namely leptin (LEP), chorionic somatomammotropin hormone like 1 (CSHL1), elabela (ELA), activin A, soluble fms-like tyrosine kinase 1 (sFlt-1), and placental growth factor (PlGF), were quantified by ELISA in blood serially collected throughout human pregnancies (20 normal subjects with 66 samples, and 20 subjects who developed PE with 61 samples). Multivariate analysis was performed to estimate the GA in normal pregnancy. Mean-squared errors of GA estimations were used to identify impending PE. The human protein profiles were then compared with those in the pregnant HO-1 Het mice. An elastic net-based gestational dating model was developed (R² = 0.76) and validated (R² = 0.61) using serum levels of the 6 proteins measured at various GAs from women with normal uncomplicated pregnancies. In women who developed PE, the model was not associated with GA (R² = -0.17). Deviations from the model estimations were observed in women who developed PE (P = 0.01). The model developed with 5 proteins (ELA excluded) performed similarly in sera from normal human (R² = 0.68) and WT mouse (R² = 0.85) pregnancies. Disruptions of this model were observed in both human PE-associated (R² = 0.27) and mouse HO-1 Het (R² = 0.30) pregnancies. LEP outperformed sFlt-1 and PlGF in differentiating impending PE at early human and late mouse GAs. Serum placenta-related protein profiles are temporally regulated throughout normal pregnancies and significantly disrupted in women who develop PE. LEP changes earlier than the well-established biomarkers (sFlt-1 and PlGF). There may be evidence of a causative action of HO-1 deficiency in LEP upregulation in a PE-like murine model.

    View details for DOI 10.1371/journal.pone.0230000

    View details for PubMedID 32126118

  • Identification of elders at higher risk for fall with statewide electronic health records and a machine learning algorithm. International journal of medical informatics Ye, C., Li, J., Hao, S., Liu, M., Jin, H., Zheng, L., Xia, M., Jin, B., Zhu, C., Alfreds, S. T., Stearns, F., Kanov, L., Sylvester, K. G., Widen, E., McElhinney, D., Ling, X. B. 2020; 137: 104105

    Abstract

    Predicting the risk of falls in advance can benefit the quality of care and potentially reduce mortality and morbidity in the older population. The aim of this study was to construct and validate an electronic health record-based fall risk predictive tool to identify elders at a higher risk of falls. The one-year fall prediction model was developed using the machine-learning-based algorithm, XGBoost, and tested on an independent validation cohort. The data were collected from electronic health records (EHR) of Maine from 2016 to 2018, comprising 265,225 older patients (≥65 years of age). This model attained a validated C-statistic of 0.807, where 50% of the identified high-risk true positives were confirmed to fall during the first 94 days of the next year. The model also captured in advance 58.01% and 54.93% of falls that happened within the first 30 and 30-60 days of the next year. Patients identified as being at high risk of falls showed severe disease comorbidities, an enrichment of fall-increasing cardiovascular and mental medication prescriptions, and increased historical clinical utilization, revealing the complexity of the underlying fall etiology. The XGBoost algorithm captured 157 impactful predictors into the final predictive model, where cognitive disorders, abnormalities of gait and balance, Parkinson's disease, fall history, and osteoporosis were identified as the top-5 strongest predictors of the future fall event. By using the EHR data, this risk assessment tool attained an improved discriminative ability and can be immediately deployed in the health system to provide automatic early warnings to older adults with increased fall risk and identify their personalized risk factors to facilitate customized fall interventions.

    View details for DOI 10.1016/j.ijmedinf.2020.104105

    View details for PubMedID 32193089
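
Several abstracts in this list report a C-statistic (equivalently, the area under the ROC curve). As a rough illustration of what that number measures, the sketch below computes it from its pairwise definition: the fraction of (event, non-event) pairs in which the event case received the higher score, with ties counted as one half. The function name and the toy scores are hypothetical and are not taken from any of the studies.

```python
from itertools import product


def c_statistic(scores, labels):
    """Pairwise C-statistic: P(score of an event case > score of a non-event case).

    Ties between an event and a non-event score count as 0.5.
    `labels` uses 1 for an event (e.g. a fall) and 0 for no event.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for pos, neg in product(positives, negatives):
        if pos > neg:
            concordant += 1.0
        elif pos == neg:
            concordant += 0.5
    return concordant / (len(positives) * len(negatives))


# Toy data, invented for illustration only.
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_labels = [0, 0, 1, 1]
print(c_statistic(toy_scores, toy_labels))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The quadratic pairwise loop is fine for a toy example but would be replaced by a rank-based computation on cohorts of the size reported here.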

  • Development of an early-warning system for high-risk patients for suicide attempt using deep learning and electronic health records. Translational psychiatry Zheng, L., Wang, O., Hao, S., Ye, C., Liu, M., Xia, M., Sabo, A. N., Markovic, L., Stearns, F., Kanov, L., Sylvester, K. G., Widen, E., McElhinney, D. B., Zhang, W., Liao, J., Ling, X. B. 2020; 10 (1): 72

    Abstract

    Suicide is the tenth leading cause of death in the United States (US). An early-warning system (EWS) for suicide attempt could prove valuable for identifying those at risk of suicide attempts, and analyzing the contribution of repeated attempts to the risk of eventual death by suicide. In this study we sought to develop an EWS for high-risk suicide attempt patients through the development of a population-based risk stratification surveillance system. Advanced machine-learning algorithms and deep neural networks were utilized to build models with the data from electronic health records (EHRs). A final risk score was calculated for each individual and calibrated to indicate the probability of a suicide attempt in the following 1-year time period. Risk scores were subjected to individual-level analysis in order to aid in the interpretation of the results for health-care providers managing the at-risk cohorts. The 1-year suicide attempt risk model attained an area under the curve (AUC ROC) of 0.792 and 0.769 in the retrospective and prospective cohorts, respectively. The suicide attempt rate in the "very high risk" category was 60 times greater than the population baseline when tested in the prospective cohorts. Mental health disorders including depression, bipolar disorders and anxiety, along with substance abuse, impulse control disorders, clinical utilization indicators, and socioeconomic determinants were recognized as significant features associated with incident suicide attempt.

    View details for DOI 10.1038/s41398-020-0684-2

    View details for PubMedID 32080165

  • A Real-Time Early Warning System for Monitoring Inpatient Mortality Risk: Prospective Study Using Electronic Medical Record Data. Journal of medical Internet research Ye, C., Wang, O., Liu, M., Zheng, L., Xia, M., Hao, S., Jin, B., Jin, H., Zhu, C., Huang, C. J., Gao, P., Ellrodt, G., Brennan, D., Stearns, F., Sylvester, K. G., Widen, E., McElhinney, D. B., Ling, X. 2019; 21 (7): e13719

    Abstract

    BACKGROUND: The rapid deterioration observed in the condition of some hospitalized patients can be attributed to either disease progression or imperfect triage and level of care assignment after their admission. An early warning system (EWS) to identify patients at high risk of subsequent intrahospital death can be an effective tool for ensuring patient safety and quality of care and reducing avoidable harm and costs. OBJECTIVE: The aim of this study was to prospectively validate a real-time EWS designed to predict patients at high risk of inpatient mortality during their hospital episodes. METHODS: Data were collected from the system-wide electronic medical record (EMR) of two acute Berkshire Health System hospitals, comprising 54,246 inpatient admissions from January 1, 2015, to September 30, 2017, of which 2.30% (1248/54,246) resulted in intrahospital deaths. Multiple machine learning methods (linear and nonlinear) were explored and compared. The tree-based random forest method was selected to develop the predictive application for the intrahospital mortality assessment. After constructing the model, we prospectively validated the algorithms as a real-time inpatient EWS for mortality. RESULTS: The EWS algorithm scored patients' daily and long-term risk of inpatient mortality probability after admission and stratified them into distinct risk groups. In the prospective validation, the EWS prospectively attained a c-statistic of 0.884, where 99 encounters were captured in the highest risk group, 69% (68/99) of whom died during the episodes. It accurately predicted the possibility of death for the top 13.3% (34/255) of the patients at least 40.8 hours before death. Important clinical utilization features, together with coded diagnoses, vital signs, and laboratory test results, were recognized as impactful predictors in the final EWS. CONCLUSIONS: In this study, we prospectively demonstrated the capability of the newly-designed EWS to monitor and alert clinicians about patients at high risk of in-hospital death in real time, thereby providing opportunities for timely interventions. This real-time EWS is able to assist clinical decision making and enable more actionable and effective individualized care for patients' better health outcomes in target medical facilities.

    View details for DOI 10.2196/13719

    View details for PubMedID 31278734

  • Prediction of the 1-Year Risk of Incident Lung Cancer: Prospective Study Using Electronic Health Records from the State of Maine. Journal of medical Internet research Wang, X., Zhang, Y., Hao, S., Zheng, L., Liao, J., Ye, C., Xia, M., Wang, O., Liu, M., Weng, C., Duong, S. Q., Jin, B., Alfreds, S. T., Stearns, F., Kanov, L., Sylvester, K. G., Widen, E., McElhinney, D. B., Ling, X. B. 2019; 21 (5)

    View details for DOI 10.2196/13260

    View details for Web of Science ID 000468102900001

  • A proteomic clock for malignant gliomas: The role of the environment in tumorigenesis at the presymptomatic stage. PloS one Zheng, L., Zhang, Y., Hao, S., Chen, L., Sun, Z., Yan, C., Whitin, J. C., Jang, T., Merchant, M., McElhinney, D. B., Sylvester, K. G., Cohen, H. J., Recht, L., Yao, X., Ling, X. B. 2019; 14 (10): e0223558

    Abstract

    Malignant gliomas remain incurable, with a poor prognosis despite aggressive treatment. We have been studying the development of brain tumors in a glioma rat model, where rats develop brain tumors after prenatal exposure to ethylnitrosourea (ENU), and there is a sizable interval between when the first pathological changes are noted and when tumors become detectable with MRI. Our aim was to define a molecular timeline through proteomic profiling of the cerebrospinal fluid (CSF) such that brain tumor commitment can be revealed earlier than at the presymptomatic stage. A comparative proteomic approach was applied to profile CSF collected serially before, at, and after the time MRI becomes positive. Elastic net (EN) based models were developed to infer the timeline of normal or tumor development respectively, mirroring a chronology of precisely timed, "clocked", adaptations. These CSF changes were later quantified by longitudinal entropy analyses of the EN predictive metric. False discovery rates (FDR) were computed to control the expected proportion of the EN models that are due to multiple hypothesis testing. Our ENU rat brain tumor dating EN model indicated that protein content in CSF is programmed even before tumor MRI detection. The findings of precisely timed CSF tumor microenvironment changes at presymptomatic stages, deviating from the normal development timeline, may provide the groundwork for understanding the adaptation of the brain environment in tumorigenesis and for devising effective brain tumor management strategies.

    View details for DOI 10.1371/journal.pone.0223558

    View details for PubMedID 31600288

  • Improved detection of prostate cancer using a magneto-nanosensor assay for serum circulating autoantibodies. PloS one Xu, L., Lee, J., Hao, S., Ling, X. B., Brooks, J. D., Wang, S. X., Gambhir, S. S. 2019; 14 (8): e0221051

    Abstract

    PURPOSE: To develop a magneto-nanosensor (MNS) based multiplex assay to measure protein and autoantibody biomarkers from human serum for prostate cancer (CaP) diagnosis. MATERIALS AND METHODS: A 4-panel MNS autoantibody assay and an MNS protein assay were developed and optimized in our labs. Using these assays, serum concentrations of six biomarkers, including prostate-specific antigen (PSA) protein, free/total PSA ratio, and four autoantibodies against Parkinson disease 7 (PARK7), TAR DNA-binding protein 43 (TARDBP), Talin 1 (TLN1), and Caldesmon 1 (CALD1), were analyzed. Human serum samples from 99 patients (50 with non-cancer and 49 with clinically localized CaP) were evaluated. RESULTS: The MNS assay showed excellent performance characteristics and no cross-reactivity. All autoantibody assays showed a statistically significant difference between CaP and non-cancer samples except for PARK7. The most significant difference was the combination of the four autoantibodies as a panel in addition to the free/total PSA ratio. This combination had the highest area under the curve (AUC), 0.916, in ROC analysis. CONCLUSIONS: Our results suggest that this autoantibody panel along with PSA and free PSA has the potential to segregate patients without cancer from those with prostate cancer with higher sensitivity and specificity than PSA alone.

    View details for DOI 10.1371/journal.pone.0221051

    View details for PubMedID 31404106

  • Prediction of the 1-Year Risk of Incident Lung Cancer: Prospective Study Using Electronic Health Records from the State of Maine. Journal of medical Internet research Wang, X., Zhang, Y., Hao, S., Zheng, L., Liao, J., Ye, C., Xia, M., Wang, O., Liu, M., Weng, C. H., Duong, S. Q., Jin, B., Alfreds, S. T., Stearns, F., Kanov, L., Sylvester, K. G., Widen, E., McElhinney, D. B., Ling, X. B. 2019; 21 (5): e13260

    Abstract

    Lung cancer is the leading cause of cancer death worldwide. Early detection of individuals at risk of lung cancer is critical to reduce the mortality rate. The aim of this study was to develop and validate a prospective risk prediction model to identify patients at risk of new incident lung cancer within the next 1 year in the general population. Data from individual patient electronic health records (EHRs) were extracted from the Maine Health Information Exchange network. The study population consisted of patients with at least one EHR between April 1, 2016, and March 31, 2018, who had no history of lung cancer. A retrospective cohort (N=873,598) and a prospective cohort (N=836,659) were formed for model construction and validation. An Extreme Gradient Boosting (XGBoost) algorithm was adopted to build the model. It assigned a score to each individual to quantify the probability of a new incident lung cancer diagnosis from October 1, 2016, to September 31, 2017. The model was trained with the clinical profile in the retrospective cohort from the preceding 6 months and validated with the prospective cohort to predict the risk of incident lung cancer from April 1, 2017, to March 31, 2018. The model had an area under the curve (AUC) of 0.881 (95% CI 0.873-0.889) in the prospective cohort. Two thresholds of 0.0045 and 0.01 were applied to the predictive scores to stratify the population into low-, medium-, and high-risk categories. The incidence of lung cancer in the high-risk category (579/53,922, 1.07%) was 7.7 times higher than that in the overall cohort (1167/836,659, 0.14%). Age, a history of pulmonary diseases and other chronic diseases, medications for mental disorders, and social disparities were found to be associated with new incident lung cancer. We retrospectively developed and prospectively validated an accurate risk prediction model of new incident lung cancer occurring in the next 1 year. Through statistical learning from the statewide EHR data in the preceding 6 months, our model was able to identify statewide high-risk patients, which will benefit the population health through establishment of preventive interventions or more intensive surveillance.

    View details for PubMedID 31099339
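
The risk stratification and the "7.7 times" figure in the abstract above can be reproduced with a few lines of arithmetic. This is only a sketch: the two cutoffs (0.0045 and 0.01) and the case counts come from the abstract, while the function names and the handling of a score that falls exactly on a cutoff are assumptions, since the abstract does not say which side of the boundary is inclusive.

```python
def risk_category(score, low_cut=0.0045, high_cut=0.01):
    """Assign a predictive score to a low, medium, or high risk category.

    The cutoffs are the two thresholds quoted in the abstract. Treating a
    score equal to a cutoff as belonging to the higher category is an
    assumption made for this sketch.
    """
    if score < low_cut:
        return "low"
    if score < high_cut:
        return "medium"
    return "high"


def incidence(cases, population):
    """Proportion of a group that received a new diagnosis."""
    return cases / population


# Counts as reported in the abstract (prospective cohort).
high_risk_incidence = incidence(579, 53_922)
overall_incidence = incidence(1_167, 836_659)
relative_incidence = high_risk_incidence / overall_incidence

print(f"high-risk incidence: {high_risk_incidence:.2%}")
print(f"overall incidence:   {overall_incidence:.2%}")
print(f"ratio:               {relative_incidence:.1f}")
```

The ratio is computed from the unrounded proportions; dividing the rounded percentages (1.07 by 0.14) would give a slightly different value.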

  • Prediction of Incident Hypertension Within the Next Year: Prospective Study Using Statewide Electronic Health Records and Machine Learning. Journal of medical Internet research Ye, C., Fu, T., Hao, S., Zhang, Y., Wang, O., Jin, B., Xia, M., Liu, M., Zhou, X., Wu, Q., Guo, Y., Zhu, C., Li, Y. M., Culver, D. S., Alfreds, S. T., Stearns, F., Sylvester, K. G., Widen, E., McElhinney, D., Ling, X. 2018; 20 (1): e22

    Abstract

    As a high-prevalence health condition, hypertension is clinically costly, difficult to manage, and often leads to severe and life-threatening diseases such as cardiovascular disease (CVD) and stroke. The aim of this study was to develop and validate prospectively a risk prediction model of incident essential hypertension within the following year. Data from individual patient electronic health records (EHRs) were extracted from the Maine Health Information Exchange network. Retrospective (N=823,627, calendar year 2013) and prospective (N=680,810, calendar year 2014) cohorts were formed. A machine learning algorithm, XGBoost, was adopted in the process of feature selection and model building. It generated an ensemble of classification trees and assigned a final predictive risk score to each individual. The 1-year incident hypertension risk model attained areas under the curve (AUCs) of 0.917 and 0.870 in the retrospective and prospective cohorts, respectively. Risk scores were calculated and stratified into five risk categories, with 4526 out of 381,544 patients (1.19%) in the lowest risk category (score 0-0.05) and 21,050 out of 41,329 patients (50.93%) in the highest risk category (score 0.4-1) receiving a diagnosis of incident hypertension in the following 1 year. Type 2 diabetes, lipid disorders, CVDs, mental illness, clinical utilization indicators, and socioeconomic determinants were recognized as driving or associated features of incident essential hypertension. The very high risk population mainly comprised elderly (age>50 years) individuals with multiple chronic conditions, especially those receiving medications for mental disorders. Disparities were also found in social determinants, including some community-level factors associated with higher risk and others that were protective against hypertension. With statewide EHR datasets, our study prospectively validated an accurate 1-year risk prediction model for incident essential hypertension. Our real-time predictive analytic model has been deployed in the state of Maine, providing implications in interventions for hypertension and related diseases and hopefully enhancing hypertension care.

    View details for PubMedID 29382633

  • Gene expression network analysis in aneuploid human trophoblast progenitor cells (TBPC) reveals modular structures. Leon-Martinez, D., Ling, X., Hao, S., Sylvester, K., Bianco, K. MOSBY-ELSEVIER. 2018: S163–S164

  • Assessing Statewide All-Cause Future One-Year Mortality: Prospective Study With Implications for Quality of Life, Resource Utilization, and Medical Futility. Journal of medical Internet research Guo, Y., Zheng, G., Fu, T., Hao, S., Ye, C., Zheng, L., Liu, M., Xia, M., Jin, B., Zhu, C., Wang, O., Wu, Q., Culver, D. S., Alfreds, S. T., Stearns, F., Kanov, L., Bhatia, A., Sylvester, K. G., Widen, E., McElhinney, D. B., Ling, X. B. 2018; 20 (6): e10311

    Abstract

    For many elderly patients, a disproportionate amount of health care resources and expenditures is spent during the last year of life, despite the discomfort and reduced quality of life associated with many aggressive medical approaches. However, few prognostic tools have focused on predicting all-cause 1-year mortality among elderly patients at a statewide level, an issue that has implications for improving quality of life while distributing scarce resources fairly. Using data from a statewide elderly population (aged ≥65 years), we sought to prospectively validate an algorithm to identify patients at risk for dying in the next year for the purpose of minimizing decision uncertainty, improving quality of life, and reducing futile treatment. Analysis was performed using electronic medical records from the Health Information Exchange in the state of Maine, which covered records of nearly 95% of the statewide population. The model was developed from 125,896 patients aged at least 65 years who were discharged from any care facility in the Health Information Exchange network from September 5, 2013, to September 4, 2015. Validation was conducted using 153,199 patients with the same inclusion and exclusion criteria from September 5, 2014, to September 4, 2016. Patients were stratified into risk groups. The association between all-cause 1-year mortality and risk factors was screened by chi-squared test and manually reviewed by 2 clinicians. We calculated risk scores for individual patients using a gradient tree-based boost algorithm, which measured the probability of mortality within the next year based on the preceding 1-year clinical profile. The development sample included 125,896 patients (72,572 women, 57.64%; mean 74.2 [SD 7.7] years). The final validation cohort included 153,199 patients (88,177 women, 57.56%; mean 74.3 [SD 7.8] years). The c-statistic for discrimination was 0.96 (95% CI 0.93-0.98) in the development group and 0.91 (95% CI 0.90-0.94) in the validation cohort. The mortality was 0.99% in the low-risk group, 16.75% in the intermediate-risk group, and 72.12% in the high-risk group. A total of 99 independent risk factors for mortality were identified (reported as odds ratios; 95% CI). Age was at the top of the list (1.41; 1.06-1.48); congestive heart failure (20.90; 15.41-28.08) and different tumor sites were also recognized as driving risk factors, such as cancer of the ovaries (14.42; 2.24-53.04), colon (14.07; 10.08-19.08), and stomach (13.64; 3.26-86.57). Disparities were also found in patients' social determinants like respiratory hazard index (1.24; 0.92-1.40) and unemployment rate (1.18; 0.98-1.24). Among high-risk patients who expired in our dataset, cerebrovascular accident, amputation, and type 1 diabetes were the top 3 diseases in terms of average cost in the last year of life. Our study prospectively validated an accurate 1-year risk prediction model and stratification for the elderly population (≥65 years) at risk of mortality with statewide electronic medical record datasets. It should be a valuable adjunct for helping patients to make better quality-of-life choices and alerting care givers to target high-risk elderly for appropriate care and discussions, thus cutting back on futile treatment.

    View details for PubMedID 29866643

  • Fluid overload independent of acute kidney injury predicts poor outcomes in neonates following congenital heart surgery. Pediatric nephrology (Berlin, Germany) Mah, K. E., Hao, S., Sutherland, S. M., Kwiatkowski, D. M., Axelrod, D. M., Almond, C. S., Krawczeski, C. D., Shin, A. Y. 2017

    Abstract

    Fluid overload (FO) is common after neonatal congenital heart surgery and may contribute to mortality and morbidity. It is unclear if the effects of FO are independent of acute kidney injury (AKI). This was a retrospective cohort study which examined neonates (age < 30 days) who underwent cardiopulmonary bypass in a university-affiliated children's hospital between 20 October 2010 and 31 December 2012. Demographic information, risk adjustment for congenital heart surgery score, surgery type, cardiopulmonary bypass time, cross-clamp time, and vasoactive inotrope score were recorded. FO [(fluid in - fluid out)/pre-operative weight] and AKI defined by Kidney Disease Improving Global Outcomes serum creatinine criteria were calculated. Outcomes were all-cause, in-hospital mortality and median postoperative hospital and intensive care unit lengths of stay. Overall, 167 neonates underwent cardiac surgery using cardiopulmonary bypass in the study period, of whom 117 met the inclusion criteria. Of the 117 neonates included in the study, 76 (65%) patients developed significant FO (>10%), and 25 (21%) developed AKI ≥ Stage 2. When analyzed as FO cohorts (< 10%, 10-20%, > 20% FO), patients with greater FO were more likely to have AKI (9.8 vs. 18.2 vs. 52.4%, respectively, with AKI ≥ stage 2; p = 0.013) and a higher vasoactive-inotrope score, and to be premature. In the multivariable regression analyses of patients without AKI, FO was independently associated with hospital and intensive care unit lengths of stay [0.322 extra days (p = 0.029) and 0.468 extra days (p < 0.001), respectively, per 1% FO increase]. In all patients, FO was also associated with mortality [odds ratio 1.058 (5.8% greater odds of mortality per 1% FO increase); 95% confidence interval 1.008-1.125; p = 0.032]. Fluid overload is an important independent contributor to outcomes in neonates following congenital heart surgery. Careful fluid management after cardiac surgery in neonates with and without AKI is warranted.

    View details for PubMedID 29128923
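
The fluid overload measure defined in the abstract above, (fluid in - fluid out)/pre-operative weight, can be written out as a short calculation. The sketch below assumes fluid volumes in liters and weight in kilograms, multiplied by 100 to give a percentage; the abstract does not state the units, so that convention, the function names, and the treatment of a value exactly equal to 20% are assumptions. The odds scaling assumes the reported odds ratio of 1.058 per 1% FO applies multiplicatively, as in a standard logistic regression with FO as a linear term.

```python
def fluid_overload_percent(fluid_in_l, fluid_out_l, preop_weight_kg):
    """Percent fluid overload: (fluid in - fluid out) / pre-operative weight * 100.

    Liters and kilograms are assumed units for this sketch.
    """
    if preop_weight_kg <= 0:
        raise ValueError("pre-operative weight must be positive")
    return (fluid_in_l - fluid_out_l) / preop_weight_kg * 100.0


def fo_cohort(fo_percent):
    """Place a value in one of the three FO cohorts named in the abstract."""
    if fo_percent < 10:
        return "<10%"
    if fo_percent <= 20:  # assigning exactly 20% to this cohort is an assumption
        return "10-20%"
    return ">20%"


def odds_multiplier(fo_increase_percent, odds_ratio_per_percent=1.058):
    """Multiplicative change in mortality odds for a given rise in FO."""
    return odds_ratio_per_percent ** fo_increase_percent


# Hypothetical neonate, invented for illustration: 3.2 kg, 1.00 L in, 0.52 L out.
fo = fluid_overload_percent(1.00, 0.52, 3.2)
print(fo_cohort(fo))
```

For example, under these assumptions `odds_multiplier(5)` gives the factor by which the odds of mortality would change for a 5-point rise in FO.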

  • Defining and characterizing the critical transition state prior to the type 2 diabetes disease. PloS one Jin, B., Liu, R., Hao, S., Li, Z., Zhu, C., Zhou, X., Chen, P., Fu, T., Hu, Z., Wu, Q., Liu, W., Liu, D., Yu, Y., Zhang, Y., McElhinney, D. B., Li, Y. M., Culver, D. S., Alfreds, S. T., Stearns, F., Sylvester, K. G., Widen, E., Ling, X. B. 2017; 12 (7): e0180937

    Abstract

    Type 2 diabetes mellitus (T2DM), with increased risk of serious long-term complications, currently affects 8.3% of the adult population. We hypothesized that a critical transition state prior to new-onset T2DM can be revealed through longitudinal electronic medical record (EMR) analysis. We applied the transition-based network entropy methodology which previously identified a dynamic driver network (DDN) underlying the critical T2DM transition at the tissue molecular biological level. To profile pre-disease phenotypical changes that indicated a critical transition state, a cohort of 7,334 patients was assembled from the Maine State Health Information Exchange (HIE). These patients all had their first confirmative diagnosis of T2DM between January 1, 2013 and June 30, 2013. The cohort's EMRs from the 24 months preceding their date of first T2DM diagnosis were extracted. Analysis of these patients' pre-disease clinical history identified a dynamic driver network (DDN) and an associated critical transition state six months prior to their first confirmative T2DM state. This 6-month window before the disease state provides an early warning of the impending T2DM, offering an opportunity to apply proactive interventions to prevent or delay the new onset of T2DM.

    View details for PubMedID 28686739

  • Estimating One-Year Risk of Incident Chronic Kidney Disease: Retrospective Development and Validation Study Using Electronic Medical Record Data From the State of Maine. JMIR medical informatics Hao, S., Fu, T., Wu, Q., Jin, B., Zhu, C., Hu, Z., Guo, Y., Zhang, Y., Yu, Y., Fouts, T., Ng, P., Culver, D. S., Alfreds, S. T., Stearns, F., Sylvester, K. G., Widen, E., McElhinney, D. B., Ling, X. B. 2017; 5 (3): e21

    Abstract

    Chronic kidney disease (CKD) is a major public health concern in the United States with high prevalence, growing incidence, and serious adverse outcomes. We aimed to develop and validate a model to identify patients at risk of receiving a new diagnosis of CKD (incident CKD) during the next 1 year in a general population. The study population consisted of patients who had visited any care facility in the Maine Health Information Exchange network any time between January 1, 2013, and December 31, 2015, and had no history of CKD diagnosis. Two retrospective cohorts of electronic medical records (EMRs) were constructed for model derivation (N=1,310,363) and validation (N=1,430,772). The model was derived using a gradient tree-based boost algorithm to assign a score to each individual that measured the probability of receiving a new diagnosis of CKD from January 1, 2014, to December 31, 2014, based on the preceding 1-year clinical profile. A feature selection process was conducted to reduce the dimension of the data from 14,680 EMR features to 146 as predictors in the final model. Relative risk was calculated by the model to gauge the risk ratio of the individual to population mean of receiving a CKD diagnosis in the next 1 year. The model was tested on the validation cohort to predict risk of CKD diagnosis in the period from January 1, 2015, to December 31, 2015, using the preceding 1-year clinical profile. The final model had a c-statistic of 0.871 in the validation cohort. It stratified patients into low-risk (score 0-0.005), intermediate-risk (score 0.005-0.05), and high-risk (score ≥ 0.05) levels. The incidence of CKD in the high-risk patient group was 7.94%, 13.7 times higher than the incidence in the overall cohort (0.58%). Survival analysis showed that patients in the 3 risk categories had significantly different CKD outcomes as a function of time (P<.001), indicating an effective classification of patients by the model. We developed and validated a model that is able to identify patients at high risk of having CKD in the next 1 year by statistically learning from the EMR-based clinical history in the preceding 1 year. Identification of these patients indicates care opportunities such as monitoring and adopting intervention plans that may benefit the quality of care and outcomes in the long term.

    View details for PubMedID 28747298
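
The risk bands and the relative-risk figure quoted in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, and assigning a score that falls exactly on a boundary to the upper band is an assumption, since the abstract lists overlapping endpoints.

```python
def risk_level(score: float) -> str:
    """Map a model score to the bands quoted in the abstract.

    Assumption: a score exactly on a boundary (0.005 or 0.05)
    belongs to the higher band.
    """
    if score >= 0.05:
        return "high"
    if score >= 0.005:
        return "intermediate"
    return "low"


def relative_risk(group_incidence: float, overall_incidence: float) -> float:
    """Ratio of a group's incidence to the overall cohort incidence."""
    return group_incidence / overall_incidence


# Incidences (in percent) taken from the abstract.
high_risk_incidence = 7.94
overall_incidence = 0.58

print(risk_level(0.12))
print(round(relative_risk(high_risk_incidence, overall_incidence), 1))
```

The second printed value reproduces the "13.7 times higher" figure reported for the high-risk group.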

  • Unique Molecular Patterns Uncovered in Kawasaki Disease Patients with Elevated Serum Gamma Glutamyl Transferase Levels: Implications for Intravenous Immunoglobulin Responsiveness PLOS ONE Wang, Y., Li, Z., Hu, G., Hao, S., Deng, X., Huang, M., Ren, M., Jiang, X., Kanegaye, J. T., Ha, K., Lee, J., Li, X., Jiang, X., Yu, Y., Tremoulet, A. H., Burns, J. C., Whitin, J. C., Shin, A. Y., Sylvester, K. G., McElhinney, D. B., Cohen, H. J., Ling, X. B. 2016; 11 (12)

    Abstract

    Resistance to intravenous immunoglobulin (IVIG) occurs in 10-20% of patients with Kawasaki disease (KD). The risk of resistance is about two-fold higher in patients with elevated gamma glutamyl transferase (GGT) levels. We sought to understand the biological mechanisms underlying IVIG resistance in patients with elevated GGT levels. We explored the association between elevated GGT levels and IVIG-resistance with a cohort of 686 KD patients (Cohort I). Gene expression data from 130 children with acute KD (Cohort II) were analyzed using the R square statistic and false discovery analysis to identify genes that were differentially represented in patients with elevated GGT levels with regard to IVIG responsiveness. Two additional KD cohorts (Cohort III and IV) were used to test the hypothesis that sialylation and GGT may be involved in IVIG resistance through neutrophil apoptosis. Thirty-six genes were identified that significantly explained the variations of both GGT levels and IVIG responsiveness in KD patients. After Bonferroni correction, significant associations with IVIG resistance persisted for 12 out of 36 genes among patients with elevated GGT levels and none among patients with normal GGT levels. With the discovery of ST6GALNAC3, a sialyltransferase, as the most differentially expressed gene, we hypothesized that sialylation and GGT are involved in IVIG resistance through neutrophil apoptosis. We then confirmed that in Cohort III and IV there was significantly less reduction in neutrophil count in IVIG non-responders. Gene expression analyses combining molecular and clinical datasets support the hypotheses that: (1) neutrophil apoptosis induced by IVIG may be a mechanism of action of IVIG in KD; (2) changes in sialylation and GGT level in KD patients may contribute synergistically to IVIG resistance through blocking IVIG-induced neutrophil apoptosis. These findings have implications for understanding the mechanism of action in IVIG resistance, and possibly for development of novel therapeutics.

    View details for DOI 10.1371/journal.pone.0167434

    View details for Web of Science ID 000392853100008

    View details for PubMedID 28002448

    View details for PubMedCentralID PMC5176264

  • Web-based Real-Time Case Finding for the Population Health Management of Patients With Diabetes Mellitus: A Prospective Validation of the Natural Language Processing-Based Algorithm With Statewide Electronic Medical Records. JMIR medical informatics Zheng, L., Wang, Y., Hao, S., Shin, A. Y., Jin, B., Ngo, A. D., Jackson-Browne, M. S., Feller, D. J., Fu, T., Zhang, K., Zhou, X., Zhu, C., Dai, D., Yu, Y., Zheng, G., Li, Y., McElhinney, D. B., Culver, D. S., Alfreds, S. T., Stearns, F., Sylvester, K. G., Widen, E., Ling, X. B. 2016; 4 (4)

    Abstract

    Diabetes case finding based on structured medical records does not fully identify diabetic patients whose medical histories related to diabetes are available in the form of free text. Manual chart reviews have been used but involve high labor costs and long latency. This study developed and tested a Web-based diabetes case finding algorithm using both structured and unstructured electronic medical records (EMRs). This study was based on the health information exchange (HIE) EMR database that covers almost all health facilities in the state of Maine, United States. Using narrative clinical notes, a Web-based natural language processing (NLP) case finding algorithm was retrospectively (July 1, 2012, to June 30, 2013) developed with a random subset of HIE-associated facilities, which was then blind tested with the remaining facilities. The NLP-based algorithm was subsequently integrated into the HIE database and validated prospectively (July 1, 2013, to June 30, 2014). Of the 935,891 patients in the prospective cohort, 64,168 diabetes cases were identified using diagnosis codes alone. Our NLP-based case finding algorithm prospectively found an additional 5756 uncodified cases (5756/64,168, 8.97% increase) with a positive predictive value of .90. Of the 21,720 diabetic patients identified by both methods, 6616 patients (6616/21,720, 30.46%) were identified by the NLP-based algorithm before a diabetes diagnosis was noted in the structured EMR (mean time difference = 48 days). The online NLP algorithm was effective in identifying uncodified diabetes cases in real time, leading to a significant improvement in diabetes case finding. The successful integration of the NLP-based case finding algorithm into the Maine HIE database indicates a strong potential for application of this novel method to achieve a more complete ascertainment of diagnoses of diabetes mellitus.

    View details for PubMedID 27836816
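
The two percentages in the abstract above follow directly from the reported counts. The short check below is only an arithmetic sketch; the variable and function names are hypothetical and the counts are copied from the abstract.

```python
# Counts reported in the abstract (prospective cohort).
coded_cases = 64_168        # identified by diagnosis codes alone
nlp_only_cases = 5_756      # additional uncodified cases found by NLP
found_by_both = 21_720      # identified by both methods
nlp_found_first = 6_616     # NLP flagged before the structured EMR did


def percent(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, to two decimals."""
    return round(100 * part / whole, 2)


print(percent(nlp_only_cases, coded_cases))     # increase over coded cases
print(percent(nlp_found_first, found_by_both))  # share found earlier by NLP
```

These reproduce the 8.97% and 30.46% figures quoted in the abstract.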

  • A Classification Tool for Differentiation of Kawasaki Disease from Other Febrile Illnesses. journal of pediatrics Hao, S., Jin, B., Tan, Z., Li, Z., Ji, J., Hu, G., Wang, Y., Deng, X., Kanegaye, J. T., Tremoulet, A. H., Burns, J. C., Cohen, H. J., Ling, X. B. 2016; 176: 114-120.e8

    Abstract

    To develop and validate a novel decision tree-based clinical algorithm to differentiate Kawasaki disease (KD) from other pediatric febrile illnesses that share common clinical characteristics. Using clinical and laboratory data from 801 subjects with acute KD (533 for development, and 268 for validation) and 479 febrile control subjects (318 for development, and 161 for validation), we developed a stepwise KD diagnostic algorithm combining our previously developed linear discriminant analysis (LDA)-based model with a newly developed tree-based algorithm. The primary model (LDA) stratified the 1280 subjects into febrile controls (n = 276), indeterminate (n = 247), and KD (n = 757) subgroups. The subsequent model (decision trees) further classified the indeterminate group into febrile controls (n = 103) and KD (n = 58) subgroups, leaving only 29 of 801 KD (3.6%) and 57 of 479 febrile control (11.9%) subjects indeterminate. The 2-step algorithm had a sensitivity of 96.0% and a specificity of 78.5%, and correctly classified all subjects with KD who later developed coronary artery aneurysms. The addition of a decision tree step increased sensitivity and specificity in the classification of subjects with KD and febrile controls over our previously described LDA model. A multicenter trial is needed to prospectively determine its utility as a point of care diagnostic test for KD.

    View details for DOI 10.1016/j.jpeds.2016.05.060

    View details for PubMedID 27344221

    View details for PubMedCentralID PMC5003696
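
The subject counts reported for the two-step algorithm above can be cross-checked with a small bookkeeping sketch. It only tallies the numbers given in the abstract and does not implement the LDA or decision-tree models; all names are hypothetical.

```python
# Step 1 (LDA) assignments for all subjects, as reported in the abstract.
step1 = {"febrile_control": 276, "indeterminate": 247, "kd": 757}

# Step 2 (decision trees) assignments for step-1 indeterminate subjects.
step2 = {"febrile_control": 103, "kd": 58}

# Subjects reported as still indeterminate after both steps, by true group.
left_indeterminate = {"kd": 29, "febrile_control": 57}

total_kd, total_fc = 801, 479


def still_indeterminate(first: dict, second: dict) -> int:
    """Indeterminate subjects remaining after the second step."""
    return first["indeterminate"] - sum(second.values())


def percent(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, to one decimal."""
    return round(100 * part / whole, 1)


print(sum(step1.values()))                            # all subjects
print(still_indeterminate(step1, step2))              # left after step 2
print(percent(left_indeterminate["kd"], total_kd))
print(percent(left_indeterminate["febrile_control"], total_fc))
```

The tallies agree with the abstract: 1280 subjects in total, 86 left indeterminate (29 KD plus 57 febrile controls), corresponding to 3.6% and 11.9% of the respective groups.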

  • Exploring the Role of Polycythemia in Patients With Cyanosis After Palliative Congenital Heart Surgery. Pediatric critical care medicine Siehr, S. L., Shi, S., Hao, S., Hu, Z., Jin, B., Hanley, F., Reddy, V. M., McElhinney, D. B., Ling, X. B., Shin, A. Y. 2016; 17 (3): 216-222

    Abstract

    To understand the relationship between polycythemia and clinical outcome in patients with hypoplastic left heart syndrome following the Norwood operation. A retrospective, single-center cohort study. Pediatric cardiovascular ICU, university-affiliated children's hospital. Infants with hypoplastic left heart syndrome admitted to our medical center from September 2009 to December 2012 undergoing stage 1/Norwood operation. None. Baseline demographic and clinical information including first recorded postoperative hematocrit and subsequent mean, median, and nadir hematocrits during the first 72 hours postoperatively were recorded. The primary outcomes were in-hospital mortality and length of hospitalization. Thirty-two patients were included in the analysis. Patients did not differ by operative factors (cardiopulmonary bypass time and cross-clamp time) or traditional markers of severity of illness (vasoactive inotrope score, lactate, saturation, and PaO2/FIO2 ratio). Early polycythemia (hematocrit value > 49%) was associated with longer cardiovascular ICU stay (51.0 [± 38.6] vs 21.4 [± 16.2] d; p < 0.01) and total hospital length of stay (65.0 [± 46.5] vs 36.1 [± 20.0] d; p = 0.03). In a multivariable analysis, polycythemia remained independently associated with the length of hospitalization after controlling for the amount of RBC transfusion (weight, 4.36 [95% CI, 1.35-7.37]; p < 0.01). No difference in in-hospital mortality rates was detected between the two groups (17.6% vs 20%). Early polycythemia following the Norwood operation is associated with longer length of hospitalization even after controlling for blood cell transfusion practices. We hypothesize that polycythemia may be caused by hemoconcentration and used as an early marker of capillary leak syndrome.

    View details for DOI 10.1097/PCC.0000000000000654

    View details for PubMedID 26825044

  • Urinary Colorimetric Sensor Array and Algorithm to Distinguish Kawasaki Disease from Other Febrile Illnesses PLOS ONE Li, Z., Tan, Z., Hao, S., Jin, B., Deng, X., Hu, G., Liu, X., Zhang, J., Jin, H., Huang, M., Kanegaye, J. T., Tremoulet, A. H., Burns, J. C., Wu, J., Cohen, H. J., Ling, X. B. 2016; 11 (2)

    Abstract

    Kawasaki disease (KD) is an acute pediatric vasculitis of infants and young children with unknown etiology and no specific laboratory-based test to identify it. A specific molecular diagnostic test is urgently needed to support the clinical decision of proper medical intervention, preventing subsequent complications of coronary artery aneurysms. We used a simple and low-cost colorimetric sensor array to address the lack of a specific diagnostic test to differentiate KD from febrile control (FC) patients with similar rash/fever illnesses. Demographic and clinical data were prospectively collected for subjects with KD and FCs under standard protocol. After screening using a genetic algorithm, eleven compounds including metalloporphyrins, pH indicators, redox indicators and solvatochromic dye categories, were selected from our chromatic compound library (n = 190) to construct a colorimetric sensor array for diagnosing KD. Quantitative color difference analysis led to a decision-tree-based KD diagnostic algorithm. This KD sensing array allowed the identification of 94% of KD subjects (receiver operating characteristic [ROC] area under the curve [AUC] 0.981) in the training set (33 KD, 33 FC) and 94% of KD subjects (ROC AUC: 0.873) in the testing set (16 KD, 17 FC). Color difference maps reconstructed from the digital images of the sensing compounds demonstrated distinctive patterns differentiating KD from FC patients. The colorimetric sensor array, composed of commonly used chemical compounds, is an easily accessible, low-cost method to realize the discrimination of subjects with KD from other febrile illness.

    View details for DOI 10.1371/journal.pone.0146733

    View details for Web of Science ID 000370038400003

    View details for PubMedID 26859297

    View details for PubMedCentralID PMC4747548

  • Prospective stratification of patients at risk for emergency department revisit: resource utilization and population management strategy implications. BMC emergency medicine Jin, B., Zhao, Y., Hao, S., Shin, A. Y., Wang, Y., Zhu, C., Hu, Z., Fu, C., Ji, J., Wang, Y., Zhao, Y., Jiang, Y., Dai, D., Culver, D. S., Alfreds, S. T., Rogow, T., Stearns, F., Sylvester, K. G., Widen, E., Ling, X. B. 2016; 16 (1): 10

    Abstract

    Estimating patient risk of future emergency department (ED) revisits can guide the allocation of resources, e.g. local primary care and/or specialty, to better manage ED high utilization patient populations and thereby improve patient quality of life. We set out to develop and validate a method to estimate patient ED revisit risk in the subsequent 6 months from an ED discharge date. An ensemble decision-tree-based model with Electronic Medical Record (EMR) encounter data from HealthInfoNet (HIN), Maine's Health Information Exchange (HIE), was developed and validated, assessing patient risk for a subsequent 6 month return ED visit based on the ED encounter-associated demographic and EMR clinical history data. A retrospective cohort of 293,461 ED encounters that occurred between January 1, 2012 and December 31, 2012, was assembled with the associated patients' 1-year clinical histories before the ED discharge date, for model training and calibration purposes. To validate, a prospective cohort of 193,886 ED encounters that occurred between January 1, 2013 and June 30, 2013 was constructed. Statistical learning that was utilized to construct the prediction model identified 152 variables that included the following data domains: demographics groups (12), different encounter history (104), care facilities (12), primary and secondary diagnoses (10), primary and secondary procedures (2), chronic disease condition (1), laboratory test results (2), and outpatient prescription medications (9). The c-statistics for the retrospective and prospective cohorts were 0.742 and 0.730 respectively. Total medical expense and ED utilization by risk score 6 months after the discharge were analyzed. Cluster analysis identified discrete subpopulations of high-risk patients with distinctive resource utilization patterns, suggesting the need for diversified care management strategies. Integration of our method into the HIN secure statewide data system in real time prospectively validated its performance. It promises to provide increased opportunity for high ED utilization identification, and optimized resource and population management.

    View details for DOI 10.1186/s12873-016-0074-5

    View details for PubMedID 26842066

    View details for PubMedCentralID PMC4739399

  • NLP based congestive heart failure case finding: A prospective analysis on statewide electronic medical records. International journal of medical informatics Wang, Y., Luo, J., Hao, S., Xu, H., Shin, A. Y., Jin, B., Liu, R., Deng, X., Wang, L., Zheng, L., Zhao, Y., Zhu, C., Hu, Z., Fu, C., Hao, Y., Zhao, Y., Jiang, Y., Dai, D., Culver, D. S., Alfreds, S. T., Todd, R., Stearns, F., Sylvester, K. G., Widen, E., Ling, X. B. 2015; 84 (12): 1039-1047

    Abstract

    In order to proactively manage congestive heart failure (CHF) patients, an effective CHF case finding algorithm is required to process both structured and unstructured electronic medical records (EMR) to allow complementary and cost-efficient identification of CHF patients. We set out to identify CHF cases from both EMR codified and natural language processing (NLP) found cases. Using narrative clinical notes from all Maine Health Information Exchange (HIE) patients, the NLP case finding algorithm was retrospectively (July 1, 2012-June 30, 2013) developed with a random subset of HIE associated facilities, and blind-tested with the remaining facilities. The NLP based method was integrated into a live HIE population exploration system and validated prospectively (July 1, 2013-June 30, 2014). A total of 18,295 codified CHF patients were included in Maine HIE. Among the 253,803 subjects without CHF codings, our case finding algorithm prospectively identified 2411 uncodified CHF cases. The positive predictive value (PPV) is 0.914, and 70.1% of these 2411 cases were found to be with CHF histories in the clinical notes. A CHF case finding algorithm was developed, tested and prospectively validated. The successful integration of the CHF case finding algorithm into the Maine HIE live system is expected to improve the Maine CHF care.

    View details for DOI 10.1016/j.ijmedinf.2015.06.007

    View details for PubMedID 26254876

  • Development, Validation and Deployment of a Real Time 30 Day Hospital Readmission Risk Assessment Tool in the Maine Healthcare Information Exchange PLOS ONE Hao, S., Wang, Y., Jin, B., Shin, A. Y., Zhu, C., Huang, M., Zheng, L., Luo, J., Hu, Z., Fu, C., Dai, D., Wang, Y., Culver, D. S., Alfreds, S. T., Rogow, T., Stearns, F., Sylvester, K. G., Widen, E., Ling, X. B. 2015; 10 (10)

    Abstract

    Identifying patients at risk of a 30-day readmission can help providers design interventions, and provide targeted care to improve clinical effectiveness. This study developed a risk model to predict a 30-day inpatient hospital readmission for patients in Maine, across all payers, all diseases and all demographic groups. Our objective was to develop a model to determine the risk for inpatient hospital readmission within 30 days post discharge. All patients within the Maine Health Information Exchange (HIE) system were included. The model was retrospectively developed on inpatient encounters between January 1, 2012 and December 31, 2012 from 24 randomly chosen hospitals, and then prospectively validated on inpatient encounters from January 1, 2013 to December 31, 2013 using all HIE patients. A risk assessment tool partitioned the entire HIE population into subgroups that corresponded to probability of hospital readmission as determined by a corresponding positive predictive value (PPV). An overall model c-statistic of 0.72 was achieved. The total 30-day readmission rates in low (score of 0-30), intermediate (score of 30-70) and high (score of 70-100) risk groupings were 8.67%, 24.10% and 74.10%, respectively. A time to event analysis revealed that the higher risk groups were readmitted to a hospital earlier than the lower risk groups. Six high-risk patient subgroup patterns were revealed through unsupervised clustering. Our model was successfully integrated into the statewide HIE to identify patient readmission risk upon admission and daily during hospitalization or for 30 days subsequently, providing daily risk score updates. The risk model was validated as an effective tool for predicting 30-day readmissions for patients across all payer, disease and demographic groups within the Maine HIE. Exposing the key clinical, demographic and utilization profiles driving each patient's risk of readmission score may be useful to providers in developing individualized post discharge care plans.

    View details for DOI 10.1371/journal.pone.0140271

    View details for Web of Science ID 000362511000113

    View details for PubMedID 26448562

  • Online Prediction of Health Care Utilization in the Next Six Months Based on Electronic Health Record Information: A Cohort and Validation Study JOURNAL OF MEDICAL INTERNET RESEARCH Hu, Z., Hao, S., Jin, B., Shin, A. Y., Zhu, C., Huang, M., Wang, Y., Zheng, L., Dai, D., Culver, D. S., Alfreds, S. T., Rogow, T., Stearns, F., Sylvester, K. G., Widen, E., Ling, X. 2015; 17 (9)

    Abstract

    The increasing rate of health care expenditures in the United States has placed a significant burden on the nation's economy. Predicting future health care utilization of patients can provide useful information to better understand and manage overall health care deliveries and clinical resource allocation. This study developed an electronic medical record (EMR)-based online risk model predictive of resource utilization for patients in Maine in the next 6 months across all payers, all diseases, and all demographic groups. In the HealthInfoNet, Maine's health information exchange (HIE), a retrospective cohort of 1,273,114 patients was constructed with the preceding 12-month EMR. Each patient's next 6-month (between January 1, 2013 and June 30, 2013) health care resource utilization was retrospectively scored ranging from 0 to 100 and a decision tree-based predictive model was developed. Our model was later integrated in the Maine HIE population exploration system to allow a prospective validation analysis of 1,358,153 patients by forecasting their next 6-month risk of resource utilization between July 1, 2013 and December 31, 2013. Prospectively predicted risks, on either an individual level or a population (per 1000 patients) level, were consistent with the next 6-month resource utilization distributions and the clinical patterns at the population level. Results demonstrated the strong correlation between its care resource utilization and our risk scores, supporting the effectiveness of our model. With the online population risk monitoring enterprise dashboards, the effectiveness of the predictive algorithm has been validated by clinicians and caregivers in the State of Maine. The model and associated online applications were designed for tracking the evolving nature of total population risk, in a longitudinal manner, for health care resource utilization. It will enable more effective care management strategies driving improved patient outcomes.

    View details for DOI 10.2196/jmir.4976

    View details for Web of Science ID 000361809800005

    View details for PubMedID 26395541

  • Cerebrospinal fluid protein dynamic driver network: At the crossroads of brain tumorigenesis METHODS Tan, Z., Liu, R., Zheng, L., Hao, S., Fu, C., Li, Z., Deng, X., Jang, T., Merchant, M., Whitin, J. C., Guo, M., Cohen, H. J., Recht, L., Ling, X. B. 2015; 83: 36-43

    Abstract

    To get a better understanding of the ongoing in situ environmental changes preceding brain tumorigenesis, we assessed cerebrospinal fluid (CSF) proteome profile changes in a glioma rat model in which brain tumor invariably developed after a single in utero exposure to the neurocarcinogen ethylnitrosourea (ENU). Computationally, the CSF proteome profile dynamics during the tumorigenesis can be modeled as non-smooth or even abrupt state changes. Such brain tumor environment transition analysis, correlating the CSF composition changes with the development of early cellular hyperplasia, can reveal the pathogenesis process at network level during a time before the image detection of the tumors. In our controlled rat model study, matched ENU- and saline-exposed rats' CSF proteomics changes were quantified at approximately 30, 60, 90, 120, and 150 days of age (P30, P60, P90, P120, P150). We applied our transition-based network entropy (TNE) method to compute the CSF proteome changes in the ENU rat model and test the hypothesis of the critical transition state prior to impending hyperplasia. Our analysis identified a dynamic driver network (DDN) of CSF proteins related to the emerging tumorigenesis progressing from the non-hyperplasia state. The DDN-associated leading network CSF proteins can allow the early detection of such dynamics before the catastrophic shift to the clear clinical landmarks in gliomas. Future characterization of the critical transition state (P60) during the brain tumor progression may reveal the underlying pathophysiology to devise novel therapeutics preventing tumor formation. More detailed method and information are accessible through our website at http://translationalmedicine.stanford.edu.

    View details for DOI 10.1016/j.ymeth.2015.05.004

    View details for Web of Science ID 000358755100005

    View details for PubMedID 25982164

  • Utility of Clinical Biomarkers to Predict Central Line-associated Bloodstream Infections After Congenital Heart Surgery. Pediatric infectious disease journal Shin, A. Y., Jin, B., Hao, S., Hu, Z., Sutherland, S., McCammond, A., Axelrod, D., Sharek, P., Roth, S. J., Ling, X. B. 2015; 34 (3): 251-254

    Abstract

    Central line associated bloodstream infections (CLABSI) are an important contributor to morbidity and mortality in children recovering from congenital heart surgery. The reliability of commonly used biomarkers to differentiate these patients has not been specifically studied. This was a retrospective cohort study in a university-affiliated children's hospital examining all patients with congenital or acquired heart disease admitted to the cardiovascular intensive care unit following cardiac surgery who underwent evaluation for a catheter-associated bloodstream infection. Among 1260 cardiac surgeries performed, 451 encounters underwent an infection evaluation post-operatively. Twenty-five instances of CLABSI and 227 instances of a negative infection evaluation were the subject of analysis. Patients with CLABSI tended to be younger (1.34 vs 4.56 years, p = 0.011) and underwent more complex surgery (RACHS-1 score 3.79 vs 3.04, p = 0.039). The two groups were indistinguishable in WBC, PMNs and band count at the time of their presentation. On multivariate analysis, CLABSI was associated with fever (adjusted OR 4.78; 95% CI, 1.6 to 5.8) and elevated CRP (adjusted OR 1.28; 95% CI, 1.09 to 1.68) after adjusting for differences between the two groups. Receiver operating characteristic analysis demonstrated the discriminatory power of both fever and CRP (area under curve 0.7247, 95% CI, 0.42 to 0.74 and 0.58, 95% CI 0.4208 to 0.7408). We calculated multilevel likelihood ratios for a spectrum of temperature and CRP values. We found commonly used serum biomarkers such as fever and CRP not to be helpful discriminators in patients following congenital heart surgery.

    View details for DOI 10.1097/INF.0000000000000553

    View details for PubMedID 25232780

  • Risk prediction of emergency department revisit 30 days post discharge: a prospective study. PloS one Hao, S., Jin, B., Shin, A. Y., Zhao, Y., Zhu, C., Li, Z., Hu, Z., Fu, C., Ji, J., Wang, Y., Zhao, Y., Dai, D., Culver, D. S., Alfreds, S. T., Rogow, T., Stearns, F., Sylvester, K. G., Widen, E., Ling, X. B. 2014; 9 (11)

    Abstract

    Among patients who are discharged from the Emergency Department (ED), about 3% return within 30 days. Revisits can be related to the nature of the disease, medical errors, and/or inadequate diagnoses and treatment during their initial ED visit. Identification of high-risk patient population can help devise new strategies for improved ED care with reduced ED utilization. A decision tree based model with discriminant Electronic Medical Record (EMR) features was developed and validated, estimating patient ED 30 day revisit risk. A retrospective cohort of 293,461 ED encounters from HealthInfoNet (HIN), Maine's Health Information Exchange (HIE), between January 1, 2012 and December 31, 2012, was assembled with the associated patients' demographic information and one-year clinical histories before the discharge date as the inputs. To validate, a prospective cohort of 193,886 encounters between January 1, 2013 and June 30, 2013 was constructed. The c-statistics for the retrospective and prospective predictions were 0.710 and 0.704 respectively. Clinical resource utilization, including ED use, was analyzed as a function of the ED risk score. Cluster analysis of high-risk patients identified discrete sub-populations with distinctive demographic, clinical and resource utilization patterns. Our ED 30-day revisit model was prospectively validated on the Maine State HIN secure statewide data system. Future integration of our ED predictive analytics into the ED care work flow may lead to increased opportunities for targeted care intervention to reduce ED resource burden and overall healthcare expense, and improve outcomes.

    View details for DOI 10.1371/journal.pone.0112944

    View details for PubMedID 25393305

    View details for PubMedCentralID PMC4231082

  • Risk prediction of emergency department revisit 30 days post discharge: a prospective study. PloS one Hao, S., Jin, B., Shin, A. Y., Zhao, Y., Zhu, C., Li, Z., Hu, Z., Fu, C., Ji, J., Wang, Y., Zhao, Y., Dai, D., Culver, D. S., Alfreds, S. T., Rogow, T., Stearns, F., Sylvester, K. G., Widen, E., Ling, X. B. 2014; 9 (11): e112944

    Abstract

    Among patients who are discharged from the Emergency Department (ED), about 3% return within 30 days. Revisits can be related to the nature of the disease, medical errors, and/or inadequate diagnoses and treatment during their initial ED visit. Identification of high-risk patient populations can help devise new strategies for improved ED care with reduced ED utilization. A decision-tree-based model with discriminant Electronic Medical Record (EMR) features was developed and validated to estimate patients' 30-day ED revisit risk. A retrospective cohort of 293,461 ED encounters from HealthInfoNet (HIN), Maine's Health Information Exchange (HIE), between January 1, 2012 and December 31, 2012, was assembled with the associated patients' demographic information and one-year clinical histories before the discharge date as the inputs. To validate, a prospective cohort of 193,886 encounters between January 1, 2013 and June 30, 2013 was constructed. The c-statistics for the retrospective and prospective predictions were 0.710 and 0.704, respectively. Clinical resource utilization, including ED use, was analyzed as a function of the ED risk score. Cluster analysis of high-risk patients identified discrete sub-populations with distinctive demographic, clinical, and resource utilization patterns. Our ED 30-day revisit model was prospectively validated on the Maine State HIN secure statewide data system. Future integration of our ED predictive analytics into the ED care workflow may lead to increased opportunities for targeted care intervention to reduce ED resource burden and overall healthcare expense, and improve outcomes.

    View details for DOI 10.1371/journal.pone.0112944

    View details for PubMedID 25393305

    View details for PubMedCentralID PMC4231082