Hejie Cui
Postdoctoral Scholar, Biomedical Informatics
Bio
Dr. Hejie Cui is a postdoctoral researcher at the Stanford Center for Biomedical Informatics Research, Stanford University. Her research focuses on the intersection of machine learning, data mining, and biomedical informatics; at Stanford, she works on large language model (LLM) evaluation and post-training for healthcare. She has authored and co-authored publications in top computer science and interdisciplinary venues, including NeurIPS, KDD, AAAI, CIKM, TMI, and MICCAI. Her work advances the application of artificial intelligence in healthcare and improves the understanding of complex biomedical data. Dr. Cui was selected as a Rising Star in EECS in 2023 and has received numerous other awards, including a CRA-WP Grad Cohort for Women award (2021), a student travel grant for MICCAI'22, an NSF travel grant for CIKM'22, and the AI4Science travel award at NeurIPS'22. She holds a Ph.D. in Computer Science from Emory University (2024) and a B.Eng. in Computer Science and Engineering from Tongji University (2019). During her graduate studies, she gained industry experience through internships at Microsoft Research and Amazon Science.
Honors & Awards
- Rising Star in EECS, EECS Rising Stars Committee (11/2023)
- Laney-EDGE Graduate School Diverse Scholars in the Sciences, Laney Graduate School, Emory University (08/2023)
- Laney Graduate Student Council Research Grant, Emory University (11/2022)
- Award for CRA-WP Grad Cohort for Women, Computing Research Association (04/2021)
- Mitacs Globalink Research Award (GRA), Mitacs Canada (05/2018)
Professional Education
- PhD, Emory University, Computer Science (2024)
- BEng, Tongji University, Computer Science and Engineering (2019)
All Publications
- BrainGB: A Benchmark for Brain Network Analysis With Graph Neural Networks
IEEE Transactions on Medical Imaging
2023; 42 (2): 493-506
Abstract
Mapping the connectome of the human brain using structural or functional connectivity has become one of the most pervasive paradigms for neuroimaging analysis. Recently, Graph Neural Networks (GNNs) motivated by geometric deep learning have attracted broad interest due to their established power for modeling complex networked data. Despite their superior performance in many fields, there has not yet been a systematic study of how to design effective GNNs for brain network analysis. To bridge this gap, we present BrainGB, a benchmark for brain network analysis with GNNs. BrainGB standardizes the process by (1) summarizing brain network construction pipelines for both functional and structural neuroimaging modalities and (2) modularizing the implementation of GNN designs. We conduct extensive experiments on datasets across cohorts and modalities and recommend a set of general recipes for effective GNN designs on brain networks. To support open and reproducible research on GNN-based brain network analysis, we host the BrainGB website at https://braingb.us with models, tutorials, examples, as well as an out-of-the-box Python package. We hope that this work will provide useful empirical evidence and offer insights for future research in this novel and promising direction.
View details for DOI 10.1109/TMI.2022.3218745
View details for Web of Science ID 000934156000015
View details for PubMedID 36318557
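The BrainGB pipeline summarized in the abstract above starts from a per-subject connectivity matrix and feeds it to a modular GNN. The following is a minimal sketch of that preprocessing step only, assuming a correlation-matrix input, using each ROI's connection profile as its node features and a top-k thresholding rule; both choices and the `top_k` value are illustrative, not BrainGB's only options.

```python
import numpy as np

def connectivity_to_graph(corr, top_k=10):
    """Turn an ROI-by-ROI correlation matrix into a sparse graph.

    Node features are each ROI's full connection profile (its row of the
    matrix); edges keep only the top_k strongest absolute correlations
    per node. Illustrative defaults, not a definitive recipe.
    """
    n_roi = corr.shape[0]
    node_features = corr.copy()                     # (n_roi, n_roi) connection profiles
    edges = []
    for i in range(n_roi):
        strength = np.abs(corr[i])
        strength[i] = 0.0                           # ignore self-loops
        neighbors = np.argsort(strength)[-top_k:]   # strongest partners of ROI i
        edges.extend((i, j) for j in neighbors)
    edge_index = np.array(edges).T                  # (2, n_edges), PyG-style layout
    return node_features, edge_index

# Toy usage with a random "fMRI" time series.
rng = np.random.default_rng(0)
ts = rng.standard_normal((200, 100))                # 200 time points, 100 ROIs
corr = np.corrcoef(ts.T)
x, edge_index = connectivity_to_graph(corr, top_k=8)
print(x.shape, edge_index.shape)
```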
- Neighborhood-Regularized Self-Training for Learning with Few Labels
ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. 2023: 10611-10619
Abstract
Training deep neural networks (DNNs) with limited supervision has been a popular research topic as it can significantly alleviate the annotation burden. Self-training has been successfully applied in semi-supervised learning tasks, but one drawback of self-training is that it is vulnerable to label noise from incorrect pseudo labels. Inspired by the fact that samples with similar labels tend to share similar representations, we develop a neighborhood-based sample selection approach to tackle the issue of noisy pseudo labels. We further stabilize self-training by aggregating the predictions from different rounds during sample selection. Experiments on eight tasks show that our proposed method outperforms the strongest self-training baseline with average performance gains of 1.83% and 2.51% on text and graph datasets, respectively. Our further analysis demonstrates that our proposed data selection strategy reduces the noise of pseudo labels by 36.8% and saves 57.3% of the time when compared with the best baseline. Our code and appendices will be uploaded to https://github.com/ritaranx/NeST.
View details for Web of Science ID 001243747800035
View details for PubMedID 38333625
View details for PubMedCentralID PMC10851329
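A rough sketch of the neighborhood-based selection idea from the abstract above: keep a pseudo-labeled sample only if most of its nearest neighbors in embedding space carry the same pseudo label. The scoring rule, `k`, and the `agreement` threshold are assumptions for illustration, not the paper's exact criterion.

```python
import numpy as np

def select_pseudo_labels(embeddings, pseudo_labels, k=10, agreement=0.7):
    """Keep pseudo-labeled samples whose k nearest neighbors (in embedding
    space) mostly share the same pseudo label; a simple proxy for the
    neighborhood-regularized selection idea."""
    n = embeddings.shape[0]
    # Pairwise Euclidean distances (fine for small n; use an ANN index at scale).
    d = np.linalg.norm(embeddings[:, None, :] - embeddings[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    keep = []
    for i in range(n):
        nn = np.argsort(d[i])[:k]
        agree = np.mean(pseudo_labels[nn] == pseudo_labels[i])
        if agree >= agreement:
            keep.append(i)
    return np.array(keep)

rng = np.random.default_rng(1)
emb = rng.standard_normal((500, 32))
labels = rng.integers(0, 3, size=500)
kept = select_pseudo_labels(emb, labels, k=10, agreement=0.5)
print(f"kept {len(kept)} / 500 pseudo-labeled samples")
```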
- R-Mixup: Riemannian Mixup for Biological Networks
ASSOC COMPUTING MACHINERY. 2023: 1073-1085
Abstract
Biological networks are commonly used in biomedical and healthcare domains to effectively model the structure of complex biological systems with interactions linking biological entities. However, due to their characteristics of high dimensionality and low sample size, directly applying deep learning models to biological networks usually faces severe overfitting. In this work, we propose R-Mixup, a Mixup-based data augmentation technique that suits the symmetric positive definite (SPD) property of adjacency matrices from biological networks while maintaining training efficiency. The interpolation process in R-Mixup leverages the log-Euclidean distance metric from the Riemannian manifold, effectively addressing the swelling effect and arbitrarily incorrect label issues of vanilla Mixup. We demonstrate the effectiveness of R-Mixup with five real-world biological network datasets on both regression and classification tasks. In addition, we derive a commonly ignored necessary condition for identifying the SPD matrices of biological networks and empirically study its influence on model performance. The code implementation can be found in Appendix E.
View details for DOI 10.1145/3580305.3599483
View details for Web of Science ID 001118896301013
View details for PubMedID 38343707
View details for PubMedCentralID PMC10853987
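A minimal sketch of the log-Euclidean interpolation the R-Mixup abstract describes: two SPD matrices are mixed as exp(lam * log(X1) + (1 - lam) * log(X2)), with labels mixed linearly and lam drawn from a Beta distribution as in vanilla Mixup. The Beta parameter and label-mixing convention are assumptions carried over from standard Mixup, not details taken from the paper.

```python
import numpy as np
from scipy.linalg import logm, expm

def log_euclidean_mixup(x1, x2, y1, y2, alpha=0.2, rng=None):
    """Interpolate two SPD matrices on the log-Euclidean manifold:
    exp(lam * log(X1) + (1 - lam) * log(X2)), with labels mixed linearly.
    A sketch of the interpolation step only; training details are omitted."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    x_mix = expm(lam * logm(x1) + (1.0 - lam) * logm(x2))
    y_mix = lam * y1 + (1.0 - lam) * y2
    return x_mix.real, y_mix

# Toy SPD inputs built as A @ A.T + eps * I.
rng = np.random.default_rng(2)
def random_spd(n):
    a = rng.standard_normal((n, n))
    return a @ a.T + 1e-3 * np.eye(n)

x_mix, y_mix = log_euclidean_mixup(random_spd(8), random_spd(8), 0.0, 1.0, rng=rng)
print(x_mix.shape, y_mix)
```

Interpolating in log-space keeps the mixed matrix SPD and avoids the determinant "swelling" that linear averaging of SPD matrices produces.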
- Open Visual Knowledge Extraction via Relation-Oriented Multimodality Model Prompting
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
View details for Web of Science ID 001227224005020
- Interpretable Graph Neural Networks for Connectome-Based Brain Disorder Analysis
SPRINGER INTERNATIONAL PUBLISHING AG. 2022: 375-385
View details for DOI 10.1007/978-3-031-16452-1_36
View details for Web of Science ID 000867418200036
- How Can Graph Neural Networks Help Document Retrieval: A Case Study on CORD19 with Concept Map Generation
SPRINGER INTERNATIONAL PUBLISHING AG. 2022: 75-83
View details for DOI 10.1007/978-3-030-99739-7_9
View details for Web of Science ID 000787788000009
- On Positional and Structural Node Features for Graph Neural Networks on Non-attributed Graphs
ASSOC COMPUTING MACHINERY. 2022: 3898-3902
View details for DOI 10.1145/3511808.3557661
View details for Web of Science ID 001074639603094
- Pulmonary Vessel Segmentation Based on Orthogonal Fused U-Net++ of Chest CT Images
SPRINGER INTERNATIONAL PUBLISHING AG. 2019: 293-300
View details for DOI 10.1007/978-3-030-32226-7_33
View details for Web of Science ID 000548737100033
- BrainSTEAM: A Practical Pipeline for Connectome-based fMRI Analysis towards Subject Classification
WORLD SCIENTIFIC PUBL CO PTE LTD. 2024: 53-64
Abstract
Functional brain networks represent dynamic and complex interactions among anatomical regions of interest (ROIs), providing crucial clinical insights for neural pattern discovery and disorder diagnosis. In recent years, graph neural networks (GNNs) have shown great success and effectiveness in analyzing structured network data. However, the high complexity of data acquisition limits the amount of neuroimaging data available for training, and GNNs, like all deep learning models, suffer from overfitting. Moreover, their capability to capture useful neural patterns for downstream prediction is also adversely affected. To address this challenge, this study proposes BrainSTEAM, an integrated framework featuring a spatio-temporal module that consists of an EdgeConv GNN model, an autoencoder network, and a Mixup strategy. In particular, the spatio-temporal module dynamically segments the time-series signals of the ROI features for each subject into chunked sequences. We leverage each sequence to construct correlation networks, thereby increasing the training data. Additionally, we employ the EdgeConv GNN to capture ROI connectivity structures, an autoencoder for data denoising, and Mixup for enhancing model training through linear data augmentation. We evaluate our framework on two real-world neuroimaging datasets, ABIDE for autism prediction and HCP for gender prediction. Extensive experiments demonstrate the superiority and robustness of BrainSTEAM when compared to a variety of existing models, showcasing the strong potential of our proposed mechanisms in generalizing to other studies for connectome-based fMRI analysis.
View details for Web of Science ID 001258333100005
View details for PubMedID 38160269
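A minimal sketch of the chunking-for-augmentation idea in the BrainSTEAM abstract: split each subject's ROI time series into windows and build one correlation network per window. The `window` and `stride` values are placeholders, not the paper's settings.

```python
import numpy as np

def chunked_correlation_networks(timeseries, window=64, stride=32):
    """Split an (n_timepoints, n_roi) signal into overlapping windows and
    build one ROI-ROI correlation network per window, so a single subject
    yields several training graphs."""
    t, _ = timeseries.shape
    networks = []
    for start in range(0, t - window + 1, stride):
        chunk = timeseries[start:start + window]        # (window, n_roi)
        networks.append(np.corrcoef(chunk.T))           # (n_roi, n_roi)
    return np.stack(networks)

rng = np.random.default_rng(3)
ts = rng.standard_normal((256, 100))                    # one subject: 256 TRs, 100 ROIs
nets = chunked_correlation_networks(ts, window=64, stride=32)
print(nets.shape)                                       # (n_windows, 100, 100)
```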
- Federated Learning for Cross-Institution Brain Network Analysis
SPIE-INT SOC OPTICAL ENGINEERING. 2024
View details for DOI 10.1117/12.3005883
View details for Web of Science ID 001208134600016
- FedBrain: Federated Training of Graph Neural Networks for Connectome-based Brain Imaging Analysis
WORLD SCIENTIFIC PUBL CO PTE LTD. 2024: 214-225
Abstract
Recent advancements in neuroimaging techniques have sparked a growing interest in understanding the complex interactions between anatomical regions of interest (ROIs), which form brain networks that play a crucial role in various clinical tasks, such as neural pattern discovery and disorder diagnosis. In recent years, graph neural networks (GNNs) have emerged as powerful tools for analyzing network data. However, due to the complexity of data acquisition and regulatory restrictions, brain network studies remain limited in scale and are often confined to local institutions. These limitations make it difficult for GNN models to capture useful neural circuitry patterns and deliver robust downstream performance. As a distributed machine learning paradigm, federated learning (FL) offers a promising solution to resource limitations and privacy concerns by enabling collaborative learning across local institutions (i.e., clients) without data sharing. While data heterogeneity issues have been extensively studied in recent FL literature, cross-institutional brain network analysis presents unique data heterogeneity challenges, namely inconsistent ROI parcellation systems and varying predictive neural circuitry patterns across local neuroimaging studies. To this end, we propose FedBrain, a GNN-based personalized FL framework that takes into account the unique properties of brain network data. Specifically, we present a federated atlas mapping mechanism to overcome the feature and structure heterogeneity of brain networks arising from different ROI atlas systems, and a clustering approach guided by clinical prior knowledge to address varying predictive neural circuitry patterns across different patient groups, neuroimaging modalities, and clinical outcomes. Compared to existing FL strategies, our approach demonstrates superior and more consistent performance, showcasing its strong potential and generalizability in cross-institutional connectome-based brain imaging analysis. The implementation is available here.
View details for Web of Science ID 001258333100016
View details for PubMedID 38160281
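For readers unfamiliar with the FL paradigm the FedBrain abstract builds on, here is a sketch of plain FedAvg-style weight averaging across clients. This shows only the generic federated-aggregation pattern; it is not FedBrain's atlas-mapping or clinically guided clustering mechanism, and the parameter names are illustrative.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Weighted average of client model parameters (plain FedAvg):
    each client contributes in proportion to its local dataset size,
    and no raw data leaves the client."""
    total = sum(client_sizes)
    keys = client_weights[0].keys()
    return {
        k: sum(w[k] * (n / total) for w, n in zip(client_weights, client_sizes))
        for k in keys
    }

# Three toy clients, each holding a dict of numpy parameter arrays.
rng = np.random.default_rng(4)
clients = [{"linear.weight": rng.standard_normal((4, 4)),
            "linear.bias": rng.standard_normal(4)} for _ in range(3)]
global_weights = fedavg(clients, client_sizes=[120, 80, 200])
print({k: v.shape for k, v in global_weights.items()})
```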
- Open Visual Knowledge Extraction via Relation-Oriented Multimodality Model Prompting
Advances in Neural Information Processing Systems
2023; 36: 23499-23519
Abstract
Images contain rich relational knowledge that can help machines understand the world. Existing methods for visual knowledge extraction often rely on a pre-defined format (e.g., sub-verb-obj tuples) or vocabulary (e.g., relation types), restricting the expressiveness of the extracted knowledge. In this work, we present a first exploration of a new paradigm of open visual knowledge extraction. To achieve this, we present OpenVik, which consists of an open relational region detector to detect regions potentially containing relational knowledge and a visual knowledge generator that generates format-free knowledge by prompting a large multimodality model with the detected region of interest. We also explore two data enhancement techniques for diversifying the generated format-free visual knowledge. Extensive knowledge quality evaluations highlight the correctness and uniqueness of the extracted open visual knowledge by OpenVik. Moreover, integrating our extracted knowledge across various visual reasoning applications shows consistent improvements, indicating the real-world applicability of OpenVik.
View details for PubMedID 39130613
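The OpenVik abstract describes pairing detected relational regions with open-ended prompts to a multimodal model. The sketch below only shows that region-cropping and prompt-pairing setup; OpenVik's detector and generator are not public APIs that can be called here, so the boxes and prompt text are hypothetical and the generation step is omitted.

```python
from PIL import Image

def build_region_prompts(image, boxes,
                         prompt="Describe the relation between the objects in this region:"):
    """Crop candidate relational regions and pair each crop with an
    open-ended, relation-oriented text prompt for a multimodal model.
    The actual generation call is intentionally left out."""
    crops = [image.crop(box) for box in boxes]          # box = (left, top, right, bottom)
    return [(crop, prompt) for crop in crops]

# Toy usage with a blank image and two hypothetical region proposals.
img = Image.new("RGB", (640, 480))
pairs = build_region_prompts(img, boxes=[(10, 10, 200, 200), (300, 100, 600, 400)])
print(len(pairs), pairs[0][0].size)
```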
- Transformer-Based Hierarchical Clustering for Brain Network Analysis
IEEE. 2023
View details for DOI 10.1109/ISBI53787.2023.10230606
View details for Web of Science ID 001062050500283
- PTGB: Pre-Train Graph Neural Networks for Brain Network Analysis
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2023: 526-544
View details for Web of Science ID 001221739300030
- Dynamic Brain Transformer with Multi-Level Attention for Functional Brain Network Analysis
IEEE. 2023
View details for DOI 10.1109/BHI58575.2023.10313480
View details for Web of Science ID 001107519300049
- Deep DAG Learning of Effective Brain Connectivity for fMRI Analysis
IEEE. 2023
Abstract
Functional magnetic resonance imaging (fMRI) has become one of the most common imaging modalities for brain function analysis. Recently, graph neural networks (GNNs) have been adopted for fMRI analysis with superior performance. Unfortunately, traditional functional brain networks are mainly constructed based on similarities among regions of interest (ROIs), which are noisy and can lead to inferior results for GNN models. To better adapt GNNs for fMRI analysis, we propose DABNet, a Deep DAG learning framework based on Brain Networks for fMRI analysis. DABNet adopts a brain network generator module, which harnesses the DAG learning approach to transform raw time series into effective brain connectivities. Experiments on two fMRI datasets demonstrate the efficacy of DABNet. The generated brain networks also highlight the prediction-related brain regions and thus provide interpretations for predictions.
View details for DOI 10.1109/ISBI53787.2023.10230429
View details for Web of Science ID 001062050500107
View details for PubMedID 38868456
View details for PubMedCentralID PMC11168307
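Continuous DAG-learning approaches of the kind the DABNet abstract refers to typically rely on a differentiable acyclicity constraint. Below is the standard NOTEARS-style penalty h(A) = tr(exp(A * A)) - d as a worked example; it illustrates the generic constraint family, not necessarily DABNet's exact objective.

```python
import numpy as np
from scipy.linalg import expm

def acyclicity_penalty(adj):
    """NOTEARS-style differentiable acyclicity measure
    h(A) = tr(exp(A * A)) - d, which equals zero iff A encodes a DAG.
    DAG learners optimize edge weights subject to h(A) = 0."""
    d = adj.shape[0]
    return np.trace(expm(adj * adj)) - d

dag = np.array([[0.0, 0.8, 0.0],
                [0.0, 0.0, 0.5],
                [0.0, 0.0, 0.0]])        # upper-triangular => acyclic
cyclic = dag + dag.T                     # adds reverse edges => cycles
print(acyclicity_penalty(dag))           # ~0.0
print(acyclicity_penalty(cyclic))        # > 0
```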
- Joint Embedding of Structural and Functional Brain Networks with Graph Neural Networks for Mental Illness Diagnosis
Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)
2022; 2022: 272-276
Abstract
Multimodal brain networks characterize complex connectivities among different brain regions from both structural and functional aspects and provide a new means for mental disease analysis. Recently, Graph Neural Networks (GNNs) have become the de facto model for analyzing graph-structured data. However, how to employ GNNs to extract effective representations from brain networks in multiple modalities remains rarely explored. Moreover, as brain networks provide no initial node features, how to design informative node attributes and leverage edge weights for GNN learning remains unsolved. To this end, we develop a novel multiview GNN for multimodal brain networks. In particular, we treat each modality as a view of the brain networks and employ contrastive learning for multimodal fusion. Then, we propose a GNN model that takes advantage of the message passing scheme by propagating messages based on degree statistics and brain region connectivities. Extensive experiments on two real-world disease datasets (HIV and Bipolar) demonstrate the effectiveness of our proposed method over state-of-the-art baselines.
View details for DOI 10.1109/EMBC48229.2022.9871118
View details for PubMedID 36085703
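The contrastive multimodal fusion step in the abstract above can be illustrated with a generic InfoNCE-style objective between structural-view and functional-view embeddings of the same subjects. This is a common cross-view loss in the spirit of that step, not the paper's exact formulation; the temperature and embedding sizes are assumptions.

```python
import torch
import torch.nn.functional as F

def cross_view_contrastive_loss(z_struct, z_func, temperature=0.1):
    """InfoNCE-style loss that pulls together the structural and functional
    embeddings of the same subject and pushes apart those of different
    subjects (matching pairs sit on the diagonal of the similarity matrix)."""
    z1 = F.normalize(z_struct, dim=-1)
    z2 = F.normalize(z_func, dim=-1)
    logits = z1 @ z2.t() / temperature          # (batch, batch) similarities
    targets = torch.arange(z1.size(0))
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

z_s = torch.randn(16, 64)                       # structural-view embeddings
z_f = torch.randn(16, 64)                       # functional-view embeddings
print(cross_view_contrastive_loss(z_s, z_f).item())
```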
- Brain Network Transformer
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2022
View details for Web of Science ID 001213927501046
- Data-Efficient Brain Connectome Analysis via Multi-Task Meta-Learning
ASSOC COMPUTING MACHINERY. 2022: 4743-4751
View details for DOI 10.1145/3534678.3542680
View details for Web of Science ID 001119000304076
- FBNetGen: Task-aware GNN-based fMRI Analysis via Functional Brain Network Generation
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2022: 618-637
Abstract
Functional magnetic resonance imaging (fMRI) is one of the most common imaging modalities used to investigate brain functions. Recent studies in neuroscience stress the great potential of functional brain networks constructed from fMRI data for clinical predictions. Traditional functional brain networks, however, are noisy, unaware of downstream prediction tasks, and incompatible with deep graph neural network (GNN) models. In order to fully unleash the power of GNNs in network-based fMRI analysis, we develop FBNETGEN, a task-aware and interpretable fMRI analysis framework via deep brain network generation. In particular, we formulate (1) prominent region of interest (ROI) feature extraction, (2) brain network generation, and (3) clinical prediction with GNNs in an end-to-end trainable model under the guidance of particular prediction tasks. Within this process, the key novel component is the graph generator, which learns to transform raw time-series features into task-oriented brain networks. Our learnable graphs also provide unique interpretations by highlighting prediction-related brain regions. Comprehensive experiments on two datasets, i.e., the recently released and currently largest publicly available fMRI dataset Adolescent Brain Cognitive Development (ABCD) and the widely used fMRI dataset PNC, prove the superior effectiveness and interpretability of FBNETGEN. The implementation is available at https://github.com/Wayfear/FBNETGEN.
View details for Web of Science ID 001227587200039
View details for PubMedID 37377881
View details for PubMedCentralID PMC10296778
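A rough sketch of the "graph generator" role described in the FBNetGen abstract: encode each ROI's raw time series into an embedding and form a learnable brain network from pairwise similarities, which can then feed a GNN and be trained end to end. The encoder architecture, hidden size, and softmax normalization are assumptions for illustration; the released model (see the GitHub link above) is more involved.

```python
import torch
import torch.nn as nn

class GraphGenerator(nn.Module):
    """Map raw ROI time series to node embeddings and build a learnable
    brain network from their pairwise similarities."""

    def __init__(self, n_timepoints, hidden=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_timepoints, hidden), nn.ReLU(), nn.Linear(hidden, hidden)
        )

    def forward(self, timeseries):
        # timeseries: (batch, n_roi, n_timepoints)
        h = self.encoder(timeseries)                         # (batch, n_roi, hidden)
        adj = torch.softmax(h @ h.transpose(1, 2), dim=-1)   # row-normalized network
        return adj

gen = GraphGenerator(n_timepoints=200)
adj = gen(torch.randn(4, 100, 200))                          # 4 subjects, 100 ROIs
print(adj.shape)                                             # (4, 100, 100)
```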
- Zero-Shot Scene Graph Relation Prediction Through Commonsense Knowledge Integration
SPRINGER INTERNATIONAL PUBLISHING AG. 2021: 466-482
View details for DOI 10.1007/978-3-030-86520-7_29
View details for Web of Science ID 000713032300029