Hejie Cui
Postdoctoral Scholar, Biomedical Informatics
Bio
Dr. Hejie Cui is a postdoctoral researcher at the Stanford Center for Biomedical Informatics Research at Stanford University. Her research focuses on the intersection of machine learning, data mining, and biomedical informatics. At Stanford, Dr. Cui works on large language model (LLM) evaluation and post-training for healthcare. Dr. Cui has authored and co-authored several publications in top computer science and interdisciplinary venues, including NeurIPS, KDD, AAAI, CIKM, TMI, and MICCAI. Her work contributes to advancing the application of artificial intelligence in healthcare and improving the understanding of complex biomedical data. Dr. Cui was selected as a Rising Star in EECS in 2023. She has also received numerous awards, including the Fellowship of 2021 CRA-WP Grad Cohort for Women, Student Travel Grant Award for MICCAI'22, NSF Travel Grant for CIKM'22, and NeurIPS AI4Science Travel Award for NeurIPS'22. Dr. Cui holds a Ph.D. in Computer Science from Emory University (2024) and a B.Eng. in Computer Science and Engineering from Tongji University (2019). During her graduate studies, she gained industry experience through internships at Microsoft Research and Amazon Science.
Honors & Awards
- Rising Star in EECS, EECS Rising Stars Committee (11/2023)
- Laney-EDGE Graduate School Diverse Scholars in the Sciences, Laney Graduate School, Emory University (08/2023)
- Laney Graduate Student Council Research Grant, Emory University (11/2022)
- Award for CRA-WP Grad Cohort for Women, Computing Research Association (04/2021)
- Mitacs Globalink Research Award (GRA), Canada Mitacs (05/2018)
Professional Education
- PhD, Emory University, Computer Science (2024)
- BEng, Tongji University, Computer Science and Engineering (2019)
All Publications
- BrainGB: A Benchmark for Brain Network Analysis With Graph Neural Networks
IEEE TRANSACTIONS ON MEDICAL IMAGING
2023; 42 (2): 493-506
Abstract
Mapping the connectome of the human brain using structural or functional connectivity has become one of the most pervasive paradigms for neuroimaging analysis. Recently, Graph Neural Networks (GNNs) motivated from geometric deep learning have attracted broad interest due to their established power for modeling complex networked data. Despite their superior performance in many fields, there has not yet been a systematic study of how to design effective GNNs for brain network analysis. To bridge this gap, we present BrainGB, a benchmark for brain network analysis with GNNs. BrainGB standardizes the process by (1) summarizing brain network construction pipelines for both functional and structural neuroimaging modalities and (2) modularizing the implementation of GNN designs. We conduct extensive experiments on datasets across cohorts and modalities and recommend a set of general recipes for effective GNN designs on brain networks. To support open and reproducible research on GNN-based brain network analysis, we host the BrainGB website at https://braingb.us with models, tutorials, examples, as well as an out-of-box Python package. We hope that this work will provide useful empirical evidence and offer insights for future research in this novel and promising direction.
View details for DOI 10.1109/TMI.2022.3218745
View details for Web of Science ID 000934156000015
View details for PubMedID 36318557
- Neighborhood-Regularized Self-Training for Learning with Few Labels
ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. 2023: 10611-10619
Abstract
Training deep neural networks (DNNs) with limited supervision has been a popular research topic as it can significantly alleviate the annotation burden. Self-training has been successfully applied in semi-supervised learning tasks, but one drawback of self-training is that it is vulnerable to the label noise from incorrect pseudo labels. Inspired by the fact that samples with similar labels tend to share similar representations, we develop a neighborhood-based sample selection approach to tackle the issue of noisy pseudo labels. We further stabilize self-training via aggregating the predictions from different rounds during sample selection. Experiments on eight tasks show that our proposed method outperforms the strongest self-training baseline with 1.83% and 2.51% performance gain for text and graph datasets on average. Our further analysis demonstrates that our proposed data selection strategy reduces the noise of pseudo labels by 36.8% and saves 57.3% of the time when compared with the best baseline. Our code and appendices will be uploaded to https://github.com/ritaranx/NeST.
View details for Web of Science ID 001243747800035
View details for PubMedID 38333625
View details for PubMedCentralID PMC10851329
- R-Mixup: Riemannian Mixup for Biological Networks
ASSOC COMPUTING MACHINERY. 2023: 1073-1085
Abstract
Biological networks are commonly used in biomedical and healthcare domains to effectively model the structure of complex biological systems with interactions linking biological entities. However, due to their characteristics of high dimensionality and low sample size, directly applying deep learning models on biological networks usually faces severe overfitting. In this work, we propose R-Mixup, a Mixup-based data augmentation technique that suits the symmetric positive definite (SPD) property of adjacency matrices from biological networks with optimized training efficiency. The interpolation process in R-Mixup leverages the log-Euclidean distance metrics from the Riemannian manifold, effectively addressing the swelling effect and arbitrarily incorrect label issues of vanilla Mixup. We demonstrate the effectiveness of R-Mixup with five real-world biological network datasets on both regression and classification tasks. Besides, we derive a commonly ignored necessary condition for identifying the SPD matrices of biological networks and empirically study its influence on the model performance. The code implementation can be found in Appendix E.
View details for DOI 10.1145/3580305.3599483
View details for Web of Science ID 001118896301013
View details for PubMedID 38343707
View details for PubMedCentralID PMC10853987
- Open Visual Knowledge Extraction via Relation-Oriented Multimodality Model Prompting
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
View details for Web of Science ID 001227224005020
- Interpretable Graph Neural Networks for Connectome-Based Brain Disorder Analysis
SPRINGER INTERNATIONAL PUBLISHING AG. 2022: 375-385
View details for DOI 10.1007/978-3-031-16452-1_36
View details for Web of Science ID 000867418200036
- How Can Graph Neural Networks Help Document Retrieval: A Case Study on CORD19 with Concept Map Generation
SPRINGER INTERNATIONAL PUBLISHING AG. 2022: 75-83
View details for DOI 10.1007/978-3-030-99739-7_9
View details for Web of Science ID 000787788000009
- On Positional and Structural Node Features for Graph Neural Networks on Non-attributed Graphs
ASSOC COMPUTING MACHINERY. 2022: 3898-3902
View details for DOI 10.1145/3511808.3557661
View details for Web of Science ID 001074639603094
- Pulmonary Vessel Segmentation Based on Orthogonal Fused U-Net++ of Chest CT Images
SPRINGER INTERNATIONAL PUBLISHING AG. 2019: 293-300
View details for DOI 10.1007/978-3-030-32226-7_33
View details for Web of Science ID 000548737100033
- Transformer-Based Hierarchical Clustering for Brain Network Analysis
IEEE. 2023
View details for DOI 10.1109/ISBI53787.2023.10230606
View details for Web of Science ID 001062050500283
- PTGB: Pre-Train Graph Neural Networks for Brain Network Analysis
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2023: 526-544
View details for Web of Science ID 001221739300030
- Joint Embedding of Structural and Functional Brain Networks with Graph Neural Networks for Mental Illness Diagnosis
Annual International Conference of the IEEE Engineering in Medicine and Biology Society
2022; 2022: 272-276
Abstract
Multimodal brain networks characterize complex connectivities among different brain regions from both structural and functional aspects and provide a new means for mental disease analysis. Recently, Graph Neural Networks (GNNs) have become a de facto model for analyzing graph-structured data. However, how to employ GNNs to extract effective representations from brain networks in multiple modalities remains rarely explored. Moreover, as brain networks provide no initial node features, how to design informative node attributes and leverage edge weights for GNNs to learn is left unsolved. To this end, we develop a novel multiview GNN for multimodal brain networks. In particular, we treat each modality as a view for brain networks and employ contrastive learning for multimodal fusion. Then, we propose a GNN model which takes advantage of the message passing scheme by propagating messages based on degree statistics and brain region connectivities. Extensive experiments on two real-world disease datasets (HIV and Bipolar) demonstrate the effectiveness of our proposed method over state-of-the-art baselines.
View details for DOI 10.1109/EMBC48229.2022.9871118
View details for PubMedID 36085703
- Brain Network Transformer
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2022
View details for Web of Science ID 001213927501046
- Data-Efficient Brain Connectome Analysis via Multi-Task Meta-Learning
ASSOC COMPUTING MACHINERY. 2022: 4743-4751
View details for DOI 10.1145/3534678.3542680
View details for Web of Science ID 001119000304076
- FBNetGen: Task-aware GNN-based fMRI Analysis via Functional Brain Network Generation
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2022: 618-637
Abstract
Functional magnetic resonance imaging (fMRI) is one of the most common imaging modalities to investigate brain functions. Recent studies in neuroscience stress the great potential of functional brain networks constructed from fMRI data for clinical predictions. Traditional functional brain networks, however, are noisy and unaware of downstream prediction tasks, while also incompatible with the deep graph neural network (GNN) models. In order to fully unleash the power of GNNs in network-based fMRI analysis, we develop FBNETGEN, a task-aware and interpretable fMRI analysis framework via deep brain network generation. In particular, we formulate (1) prominent region of interest (ROI) features extraction, (2) brain networks generation, and (3) clinical predictions with GNNs, in an end-to-end trainable model under the guidance of particular prediction tasks. Along with the process, the key novel component is the graph generator which learns to transform raw time-series features into task-oriented brain networks. Our learnable graphs also provide unique interpretations by highlighting prediction-related brain regions. Comprehensive experiments on two datasets, i.e., the recently released and currently largest publicly available fMRI dataset Adolescent Brain Cognitive Development (ABCD), and the widely-used fMRI dataset PNC, prove the superior effectiveness and interpretability of FBNETGEN. The implementation is available at https://github.com/Wayfear/FBNETGEN.
View details for Web of Science ID 001227587200039
View details for PubMedID 37377881
View details for PubMedCentralID PMC10296778
- Zero-Shot Scene Graph Relation Prediction Through Commonsense Knowledge Integration
SPRINGER INTERNATIONAL PUBLISHING AG. 2021: 466-482
View details for DOI 10.1007/978-3-030-86520-7_29
View details for Web of Science ID 000713032300029