Chenxi Sun's Profile | Stanford Profiles

Contact

Academic
cxsun@stanford.edu

University - Scholar
Department: Adult Neurology
Position: Postdoctoral Scholar

Additional Info

Mail Code: 5235
ORCID: https://orcid.org/0000-0002-1762-0877

Current Research and Scholarly Interests

Artificial intelligence for time-series data; clinical EEG foundation models; machine learning for neurophysiology

All Publications

A Review of Deep Learning Methods for Irregularly Sampled Medical Time Series Data Health Data Science Sun, C., et al 2026

View details for DOI 10.34133/hds.0456
An Electrocardiogram Foundation Model Built on over 10 Million Recordings. NEJM AI Li, J., Aguirre, A. D., Moura, V., Jin, J., Liu, C., Zhong, L., Sun, C., Clifford, G., Westover, M. B., Hong, S. 2025; 2 (7)

Abstract

Artificial intelligence (AI) has demonstrated significant potential in electrocardiogram (ECG) analysis and cardiovascular disease assessment. Recently, foundation models have played a remarkable role in advancing medical AI, bringing benefits such as efficient disease diagnosis and cross-domain knowledge transfer. The development of an ECG foundation model holds the promise of elevating AI-ECG research to new heights. However, building such a model poses several challenges, including insufficient database sample sizes and inadequate generalization across multiple domains. In addition, there is a notable performance gap between single-lead and multilead ECG analysis. We propose a general-purpose ECG foundation model (ECGFounder), which leverages real-world ECG annotations from cardiologists to broaden the diagnostic capabilities of ECG analysis. ECGFounder was built on 10,771,552 ECGs from 1,818,247 unique subjects with 150 label categories from the Harvard-Emory ECG Database, enabling comprehensive cardiovascular disease diagnosis. The model is designed to be both an effective out-of-the-box solution and easily fine-tunable for downstream tasks, maximizing usability. Importantly, we extended its application to reduced-lead ECGs, particularly single-lead ECGs. ECGFounder is therefore applicable to various downstream tasks in mobile and remote monitoring scenarios. Experimental results demonstrate that ECGFounder achieves expert-level performance on internal validation sets, with area under the receiver operating characteristic curve (AUROC) exceeding 0.95 for 80 diagnoses. It also shows strong classification performance and generalization across various diagnoses on external validation sets. When fine-tuned, ECGFounder outperforms baseline models in demographic analysis, clinical event detection, and cross-modality cardiac rhythm diagnosis, surpassing baseline methods by 3 to 5 points in the AUROC. The ECG foundation model offers an effective solution, allowing it to generalize across a wide range of tasks. By enhancing existing cardiovascular diagnostics and facilitating integration with cloud-based systems, which analyze ECG data uploaded from wearable devices, it significantly contributes to the advancement of the cardiovascular AI community and enables management of cardiac conditions. (Funded by the National Science Foundation and others.)

View details for DOI 10.1056/aioa2401033

View details for PubMedID 40771651

View details for PubMedCentralID PMC12327759
Harvard Electroencephalography Database: A comprehensive clinical electroencephalographic resource from four Boston hospitals. Epilepsia Sun, C., Jing, J., Turley, N., Alcott, C., Kang, W. Y., Cole, A. J., Goldenholz, D. M., Lam, A., Amorim, E., Chu, C., Cash, S., Junior, V. M., Gupta, A., Ghanta, M., Nearing, B., Nascimento, F. A., Struck, A., Kim, J., Sartipi, S., Tauton, A. M., Fernandes, M., Sun, H., Bayas, G., Gallagher, K., Wagenaar, J. B., Sinha, N., Lee-Messer, C., Silvers, C. T., Gunapati, B., Rosand, J., Peters, J., Loddenkemper, T., Lee, J. W., Zafar, S., Westover, M. B. 2025

Abstract

This article presents the Harvard Electroencephalography Database (HEEDB), a large-scale, deidentified, and standardized electroencephalographic (EEG) resource supporting artificial intelligence-driven and reproducible research in epilepsy and broader clinical neuroscience. HEEDB aggregates more than 280 000 EEG recordings from more than 108 000 patients across four Harvard-affiliated hospitals. Data are harmonized using the Brain Imaging Data Structure and hosted on the Brain Data Science Platform. EEG data are linked with clinical notes, International Classification of Diseases, 10th Revision codes, medications, and EEG reports. Deidentification follows Health Insurance Portability and Accountability Act Safe Harbor standards. The database includes routine, epilepsy monitoring unit, and intensive care unit EEGs across all age groups, with 73% linked to deidentified clinical reports and 96% of those matched to recordings. Findings are extracted using expert curation, regular expressions, and medical natural language processing models. Auxiliary data include diagnoses, medications, and hospital course, supporting multimodal analysis. HEEDB fills a critical gap in EEG data availability for epilepsy research. By enabling large-scale, privacy-compliant, and clinically relevant analysis, it accelerates the development of diagnostic tools, improves training datasets for machine learning, and promotes data-sharing in alignment with FAIR (Findable, Accessible, Interoperable, Reusable) and National Institutes of Health data policies.

View details for DOI 10.1111/epi.18487

View details for PubMedID 40464151
Expert-Level Detection of Epilepsy Markers in EEG on Short and Long Timescales The New England Journal of Medicine AI Li, J., Goldenholz, D. M., Alkofer, M., Sun, C., et al 2025

View details for DOI 10.1056/AIoa2401221
A Ranking-Based Cross-Entropy Loss for Early Classification of Time Series. IEEE transactions on neural networks and learning systems Sun, C., Li, H., Song, M., Hong, S. 2024; 35 (8): 11194-11203

Abstract

Early classification tasks aim to classify time series before observing full data. It is critical in time-sensitive applications such as early sepsis diagnosis in the intensive care unit (ICU). Early diagnosis can provide more opportunities for doctors to rescue lives. However, there are two conflicting goals in the early classification task: accuracy and earliness. Most existing methods try to find a balance between them by weighing one goal against the other. But we argue that a powerful early classifier should always make highly accurate predictions at any moment. The main obstacle is that the key features suitable for classification are not obvious in the early stage, resulting in the excessive overlap of time series distributions in different time stages. The indistinguishable distributions make it difficult for classifiers to recognize. To solve this problem, this article proposes a novel ranking-based cross-entropy (RCE) loss to jointly learn the feature of classes and the order of earliness from time series data. In this way, RCE can help classifier to generate probability distributions of time series in different stages with more distinguishable boundary. Thus, the classification accuracy at each time step is finally improved. Besides, for the applicability of the method, we also accelerate the training process by focusing the learning process on high-ranking samples. Experiments on three real-world datasets show that our method can perform classification more accurately than all baselines at all moments.

View details for DOI 10.1109/TNNLS.2023.3250203

View details for PubMedID 37028352
Curriculum Design Helps Spiking Neural Networks to Classify Time Series arXiv Sun, C., et al 2024
Review of Data-centric Time Series Analysis from Sample, Feature, and Period arXiv Sun, C., et al 2024
TEST: Text prototype aligned embedding to activate LLM's ability for time series The Twelfth International Conference on Learning Representations (ICLR 2024) Sun, C., et al 2024: 28
Time pattern reconstruction for classification of irregularly sampled time series PATTERN RECOGNITION Sun, C., Li, H., Song, M., Cai, D., Zhang, B., Hong, S. 2024; 147

View details for DOI 10.1016/j.patcog.2023.110075

View details for Web of Science ID 001105411400001
A multi-model architecture based on deep learning for aircraft load prediction COMMUNICATIONS ENGINEERING Sun, C., Li, H., Dui, H., Hong, S., Sun, Y., Song, M., Cai, D., Zhang, B., Wang, Q., Wang, Y., Liu, B. 2023; 2 (1)

View details for DOI 10.1038/s44172-023-00100-4

View details for Web of Science ID 001478269300001
Estimating causal effects of physical disability and number of comorbid chronic diseases on risk of depressive symptoms in an elderly Chinese population: a machine learning analysis of cross-sectional baseline data from the China longitudinal ageing social survey. BMJ open Wang, Z., Yang, H., Sun, C., Hong, S. 2023; 13 (7): e069298

Abstract

This study aimed to explore the causal effects of physical disability and number of comorbid chronic diseases on depressive symptoms in an elderly Chinese population. Cross-sectional, baseline data were obtained from the China Longitudinal Ageing Social Survey, a stratified, multistage, probabilistic sampling survey conducted in 2014 that covers 28 of 31 provincial areas in China. The causal effects of physical disability and number of comorbid chronic diseases on depressive symptoms were analysed using the conditional average treatment effect method of machine learning. The causal effects model's adjustment was made for age, gender, residence, marital status, educational level, ethnicity, wealth quantile and other factors. Assessment of the causal effects of physical disability and number of comorbid chronic diseases on depressive symptoms. 7496 subjects who were 60 years of age or older and who answered the questions on depressive symptoms and other independent variables of interest in a survey conducted in 2014 were included in this study. Physical disability and number of comorbid chronic diseases had causal effects on depressive symptoms. Among the subjects who had one or more functional limitations, the probability of depressive symptoms increased by 22% (95% CI 19% to 24%). For the subjects who had one chronic disease and those who had two or more chronic diseases, the possibility of depressive symptoms increased by 13% (95% CI 10% to 15%) and 20% (95% CI 18% to 22%), respectively. This study provides evidence that the presence of one or more functional limitations affects the occurrence of depressive symptoms among elderly people. The findings of our study are of value in developing programmes that are designed to identify elderly individuals who have physical disabilities or comorbid chronic diseases to provide early intervention.

View details for DOI 10.1136/bmjopen-2022-069298

View details for PubMedID 37407052

View details for PubMedCentralID PMC10335586
SPL-LDP: a label distribution propagation method for semi-supervised partial label learning APPLIED INTELLIGENCE Song, M., Sun, C., Cai, D., Hong, S., Li, H. 2023; 53 (18): 20785-20796

View details for DOI 10.1007/s10489-023-04548-x

View details for Web of Science ID 000971606000001
Adaptive model training strategy for continuous classification of time series. Applied intelligence (Dordrecht, Netherlands) Sun, C., Li, H., Song, M., Cai, D., Zhang, B., Hong, S. 2023: 1-19

Abstract

The classification of time series is essential in many real-world applications like healthcare. The class of a time series is usually labeled at the final time, but more and more time-sensitive applications require classifying time series continuously. For example, the outcome of a critical patient is only determined at the end, but he should be diagnosed at all times for timely treatment. For this demand, we propose a new concept, Continuous Classification of Time Series (CCTS). Different from the existing single-shot classification, the key of CCTS is to model multiple distributions simultaneously due to the dynamic evolution of time series. But the deep learning model will encounter intertwined problems of catastrophic forgetting and over-fitting when learning multi-distribution. In this work, we found that the well-designed distribution division and replay strategies in the model training process can help to solve the problems. We propose a novel Adaptive model training strategy for CCTS (ACCTS). Its adaptability represents two aspects: (1) Adaptive multi-distribution extraction policy. Instead of the fixed rules and the prior knowledge, ACCTS extracts data distributions adaptive to the time series evolution and the model change; (2) Adaptive importance-based replay policy. Instead of reviewing all old distributions, ACCTS only replays important samples adaptive to their contribution to the model. Experiments on four real-world datasets show that our method outperforms all baselines.

View details for DOI 10.1007/s10489-022-04433-z

View details for PubMedID 36819946

View details for PubMedCentralID PMC9922045
Continuous diagnosis and prognosis by controlling the update process of deep neural networks. Patterns (New York, N.Y.) Sun, C., Li, H., Song, M., Cai, D., Zhang, B., Hong, S. 2023; 4 (2): 100687

Abstract

Continuous diagnosis and prognosis are essential for critical patients. They can provide more opportunities for timely treatment and rational allocation. Although deep-learning techniques have demonstrated superiority in many medical tasks, they frequently forget, overfit, and produce results too late when performing continuous diagnosis and prognosis. In this work, we summarize the four requirements; propose a concept, continuous classification of time series (CCTS); and design a training method for deep learning, restricted update strategy (RU). The RU outperforms all baselines and achieves average accuracies of 90%, 97%, and 85% on continuous sepsis prognosis, COVID-19 mortality prediction, and eight disease classifications, respectively. The RU can also endow deep learning with interpretability, exploring disease mechanisms through staging and biomarker discovery. We find four sepsis stages, three COVID-19 stages, and their respective biomarkers. Further, our approach is data and model agnostic. It can be applied to other diseases and even in other fields.

View details for DOI 10.1016/j.patter.2023.100687

View details for PubMedID 36873902

View details for PubMedCentralID PMC9982300
Curricular and Cyclical Loss for Time Series Learning Strategy arXiv Sun, C., et al 2023
A systematic review of deep learning methods for modeling electrocardiograms during sleep. Physiological measurement Sun, C., Hong, S., Wang, J., Dong, X., Han, F., Li, H. 2022; 43 (8)

Abstract

Sleep is one of the most important human physiological activities, and plays an essential role in human health. Polysomnography (PSG) is the gold standard for measuring sleep quality and disorders, but it is time-consuming, labor-intensive, and prone to errors. Current research has confirmed the correlations between sleep and the respiratory/circulatory system. Electrocardiography (ECG) is convenient to perform, and ECG data are rich in breathing information. Therefore, sleep research based on ECG data has become popular. Currently, deep learning (DL) methods have achieved promising results on predictive health care tasks using ECG signals. Therefore, in this review, we systematically identify recent research studies and analyze them from the perspectives of data, model, and task. We discuss the shortcomings, summarize the findings, and highlight the potential opportunities. For sleep-related tasks, many ECG-based DL methods produce more accurate results than traditional approaches by combining multiple signal features and model structures. Methods that are more interpretable, scalable, and transferable will become ubiquitous in the daily practice of medicine and ambient-assisted-living applications. This paper is the first systematic review of ECG-based DL methods for sleep tasks.

View details for DOI 10.1088/1361-6579/ac826e

View details for PubMedID 35853448
DLSA: Semi-supervised partial label learning via dependence-maximized label set assignment INFORMATION SCIENCES Song, M., Li, H., Sun, C., Cai, D., Hong, S. 2022; 609: 1169-1180

View details for DOI 10.1016/j.ins.2022.07.114

View details for Web of Science ID 000848146300013
Hypergraph Contrastive Learning for Electronic Health Records Cai, D., Sun, C., Song, M., Zhang, B., Hong, S., Li, H. edited by Banerjee, A., Zhou, Z. H., Papalexakis, E. E., Riondato, M. SIAM. 2022: 127-135

View details for Web of Science ID 001281343300007
Deep Ordinal Neural Network for Length of Stay Estimation in the Intensive Care Units Cai, D., Song, M., Sun, C., Zhang, B., Hong, S., Li, H., ACM ASSOC COMPUTING MACHINERY. 2022: 3843-3847

View details for DOI 10.1145/3511808.3557578

View details for Web of Science ID 001074639603084
Confidence-Guided Learning Process for Continuous Classification of Time Series Sun, C., Song, M., Cai, D., Zhang, B., Hong, S., Li, H., ACM ASSOC COMPUTING MACHINERY. 2022: 4525-4529

View details for DOI 10.1145/3511808.3557565

View details for Web of Science ID 001074639604111
GRP-FED: Addressing Client Imbalance in Federated Learning via Global-Regularized Personalization Proceedings of the 2022 SIAM International Conference on Data Mining (SDM 2022) Chou, Y., Hong, S., Sun, C. 2022

View details for DOI 10.1137/1.9781611977172.51
Hypergraph Contrastive Learning for Electronic Health Records Proceedings of the 2022 SIAM International Conference on Data Mining (SDM 2022) Cai, D., Sun, C., et al 2022

View details for DOI 10.1137/1.9781611977172.15
Hypergraph Structure Learning for Hypergraph Neural Networks Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI 2022) Cai, D., Song, M., Sun, C. 2022

View details for DOI 10.24963/ijcai.2022/267
A Systematic Review of Echo State Networks From Design to Application IEEE Transactions on Artificial Intelligence Sun, C., et al 2022

View details for DOI 10.1109/TAI.2022.3225780
Hypergraph Structure Learning for Hypergraph Neural Networks Cai, D., Song, M., Sun, C., Zhang, B., Hong, S., Li, H. edited by DeRaedt, L. IJCAI-INT JOINT CONF ARTIF INTELL. 2022: 1923-1929

View details for Web of Science ID 001202342302008
GRP-FED: Addressing Client Imbalance in Federated Learning via Global-Regularized Personalization Chou, Y., Hong, S., Sun, C., Cai, D., Song, M., Li, H. edited by Banerjee, A., Zhou, Z. H., Papalexakis, E. E., Riondato, M. SIAM. 2022: 451-458

View details for Web of Science ID 001281343300047
Classifying vaguely labeled data based on evidential fusion INFORMATION SCIENCES Song, M., Sun, C., Cai, D., Hong, S., Li, H. 2022; 583: 159-173

View details for DOI 10.1016/j.ins.2021.11.005

View details for Web of Science ID 000727727800007
Interpretable time-aware and co-occurrence-aware network for medical prediction. BMC medical informatics and decision making Sun, C., Dui, H., Li, H. 2021; 21 (1): 305

Abstract

Disease prediction based on electronic health records (EHRs) is essential for personalized healthcare. But it's hard due to the special data structure and the interpretability requirement of methods. The structure of EHR is hierarchical: each patient has a sequence of admissions, and each admission has some co-occurrence diagnoses. However, the existing methods only partially model these characteristics and lack the interpretation for non-specialists. This work proposes a time-aware and co-occurrence-aware deep learning network (TCoN), which is not only suitable for EHR data structure but also interpretable: the co-occurrence-aware self-attention (CS-attention) mechanism and time-aware gated recurrent unit (T-GRU) can model multilevel relations; the interpretation path and the diagnosis graph can make the result interpretable. The method is tested on a real-world dataset for mortality prediction, readmission prediction, disease prediction, and next diagnoses prediction. Experimental results show that TCoN is better than baselines with 2.01% higher accuracy. Meanwhile, the method can give the interpretation of causal relationships and the diagnosis graph of each patient. This work proposes a novel model, TCoN. It is an interpretable and effective deep learning method, that can model the hierarchical medical structure and predict medical events. The experiments show that it outperforms all state-of-the-art methods. Future work can apply the graph embedding technology based on more knowledge data such as doctor notes.

View details for DOI 10.1186/s12911-021-01662-z

View details for PubMedID 34727940

View details for PubMedCentralID PMC8561378
Personalized vital signs control based on continuous action-space reinforcement learning with supervised experience BIOMEDICAL SIGNAL PROCESSING AND CONTROL Sun, C., Hong, S., Song, M., Shang, J., Li, H. 2021; 69

View details for DOI 10.1016/j.bspc.2021.102847

View details for Web of Science ID 000685910600005
Predicting COVID-19 disease progression and patient outcomes based on temporal deep learning. BMC medical informatics and decision making Sun, C., Hong, S., Song, M., Li, H., Wang, Z. 2021; 21 (1): 45

Abstract

The coronavirus disease 2019 (COVID-19) pandemic has caused health concerns worldwide since December 2019. From the beginning of infection, patients will progress through different symptom stages, such as fever, dyspnea or even death. Identifying disease progression and predicting patient outcome at an early stage helps target treatment and resource allocation. However, there is no clear COVID-19 stage definition, and few studies have addressed characterizing COVID-19 progression, making the need for this study evident. We proposed a temporal deep learning method, based on a time-aware long short-term memory (T-LSTM) neural network and used an online open dataset, including blood samples of 485 patients from Wuhan, China, to train the model. Our method can grasp the dynamic relations in irregularly sampled time series, which is ignored by existing works. Specifically, our method predicted the outcome of COVID-19 patients by considering both the biomarkers and the irregular time intervals. Then, we used the patient representations, extracted from T-LSTM units, to subtype the patient stages and describe the disease progression of COVID-19. Using our method, the accuracy of the outcome of prediction results was more than 90% at 12 days and 98, 95 and 93% at 3, 6, and 9 days, respectively. Most importantly, we found 4 stages of COVID-19 progression with different patient statuses and mortality risks. We ranked 40 biomarkers related to disease and gave the reference values of them for each stage. Top 5 is Lymph, LDH, hs-CRP, Indirect Bilirubin, Creatinine. Besides, we have found 3 complications - myocardial injury, liver function injury and renal function injury. Predicting which of the 4 stages the patient is currently in can help doctors better assess and cure the patient. To combat the COVID-19 epidemic, this paper aims to help clinicians better assess and treat infected patients, provide relevant researchers with potential disease progression patterns, and enable more effective use of medical resources. Our method predicted patient outcomes with high accuracy and identified a four-stage disease progression. We hope that the obtained results and patterns will aid in fighting the disease.

View details for DOI 10.1186/s12911-020-01359-9

View details for PubMedID 33557818

View details for PubMedCentralID PMC7869774
Practical Lessons on 12-Lead ECG Classification: Meta-Analysis of Methods From PhysioNet/Computing in Cardiology Challenge 2020. Frontiers in physiology Hong, S., Zhang, W., Sun, C., Zhou, Y., Li, H. 2021; 12: 811661

Abstract

Cardiovascular diseases (CVDs) are one of the most fatal disease groups worldwide. Electrocardiogram (ECG) is a widely used tool for automatically detecting cardiac abnormalities, thereby helping to control and manage CVDs. To encourage more multidisciplinary researches, PhysioNet/Computing in Cardiology Challenge 2020 (Challenge 2020) provided a public platform involving multi-center databases and automatic evaluations for ECG classification tasks. As a result, 41 teams successfully submitted their solutions and were qualified for rankings. Although Challenge 2020 was a success, there has been no in-depth methodological meta-analysis of these solutions, making it difficult for researchers to benefit from the solutions and results. In this study, we aim to systematically review the 41 solutions in terms of data processing, feature engineering, model architecture, and training strategy. For each perspective, we visualize and statistically analyze the effectiveness of the common techniques, and discuss the methodological advantages and disadvantages. Finally, we summarize five practical lessons based on the aforementioned analysis: (1) Data augmentation should be employed and adapted to specific scenarios; (2) Combining different features can improve performance; (3) A hybrid design of different types of deep neural networks (DNNs) is better than using a single type; (4) The use of end-to-end architectures should depend on the task being solved; (5) Multiple models are better than one. We expect that our meta-analysis will help accelerate the research related to ECG classification based on machine-learning models.

View details for DOI 10.3389/fphys.2021.811661

View details for PubMedID 35095568

View details for PubMedCentralID PMC8795785
TE-ESN: Time Encoding Echo State Network for Prediction Based on Irregularly Sampled Time Series Data Sun, C., Hong, S., Song, M., Chou, Y., Sun, Y., Cai, D., Li, H. edited by Zhou, Z. H. IJCAI-INT JOINT CONF ARTIF INTELL. 2021: 3010-3016

View details for Web of Science ID 001202335503012
TE-ESN: Time Encoding Echo State Network for Prediction Based on Irregularly Sampled Time Series Data Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI 2021) Sun, C., et al 2021

View details for DOI 10.24963/ijcai.2021/414

Chenxi Sun

Postdoctoral Scholar, Neurology and Neurological Sciences
