Rui Yan

Ph.D. Student in Computational and Mathematical Engineering, admitted Spring 2021

Education & Certifications

B.S., UCLA, Applied Mathematics & Computer Science (2019)

Contact

Academic
ruiyan@stanford.edu

University - Student
Department: ICME Operations
Position: Graduate

Additional Info

Mail Code: 4042

All Publications

Deep representation learning of protein-protein interaction networks for enhanced pattern discovery. Science advances. Yan, R., Islam, M. T., Xing, L. 2024; 10 (51): eadq4324

Abstract

Protein-protein interaction (PPI) networks, where nodes represent proteins and edges depict myriad interactions among them, are fundamental to understanding the dynamics within biological systems. Despite their pivotal role in modern biology, reliably discerning patterns from these intertwined networks remains a substantial challenge. The essence of the challenge lies in holistically characterizing the relationships of each node with others in the network and effectively using this information for accurate pattern discovery. In this work, we introduce a self-supervised network embedding framework termed discriminative network embedding (DNE). Unlike conventional methods that primarily focus on direct or limited-order node proximity, DNE characterizes a node both locally and globally by harnessing the contrast between representations from neighboring and distant nodes. Our experimental results demonstrate DNE's superior performance over existing techniques across various critical network analyses, including PPI inference and the identification of protein functional modules. DNE emerges as a robust strategy for node representation in PPI networks, offering promising avenues for diverse biomedical applications.

View details for DOI 10.1126/sciadv.adq4324

View details for PubMedID 39693438

View details for PubMedCentralID PMC11654695

Interpretable discovery of patterns in tabular data via spatially semantic topographic maps. Nature biomedical engineering. Yan, R., Islam, M. T., Xing, L. 2024

Abstract

Tabular data (rows of samples and columns of sample features) are ubiquitously used across disciplines. Yet the tabular representation makes it difficult to discover underlying associations in the data and thus hinders their analysis and the discovery of useful patterns. Here we report a broadly applicable strategy for unravelling intertwined relationships in tabular data by reconfiguring each data sample into a spatially semantic 2D topographic map, which we refer to as TabMap. A TabMap preserves the original feature values as pixel intensities, with the relationships among the features spatially encoded in the map (the strength of two inter-related features correlates with their distance on the map). TabMap makes it possible to apply 2D convolutional neural networks to extract association patterns in the data to aid data analysis, and offers interpretability by ranking features according to importance. We show the superior predictive performance of TabMap by applying it to 12 datasets across a wide range of biomedical applications, including disease diagnosis, human activity recognition, microbial identification and the analysis of quantitative structure-activity relationships.

View details for DOI 10.1038/s41551-024-01268-6

View details for PubMedID 39407015

View details for PubMedCentralID 6443823

Label-Efficient Self-Supervised Federated Learning for Tackling Data Heterogeneity in Medical Imaging. IEEE transactions on medical imaging. Yan, R., Qu, L., Wei, Q., Huang, S. C., Shen, L., Rubin, D. L., Xing, L., Zhou, Y. 2023; 42 (7): 1932-1943

Abstract

The collection and curation of large-scale medical datasets from multiple institutions is essential for training accurate deep learning models, but privacy concerns often hinder data sharing. Federated learning (FL) is a promising solution that enables privacy-preserving collaborative learning among different institutions, but it generally suffers from performance deterioration due to heterogeneous data distributions and a lack of quality labeled data. In this paper, we present a robust and label-efficient self-supervised FL framework for medical image analysis. Our method introduces a novel Transformer-based self-supervised pre-training paradigm that pre-trains models directly on decentralized target task datasets using masked image modeling, to facilitate more robust representation learning on heterogeneous data and effective knowledge transfer to downstream models. Extensive empirical results on simulated and real-world medical imaging non-IID federated datasets show that masked image modeling with Transformers significantly improves the robustness of models against various degrees of data heterogeneity. Notably, under severe data heterogeneity, our method, without relying on any additional pre-training data, achieves an improvement of 5.06%, 1.53% and 4.58% in test accuracy on retinal, dermatology and chest X-ray classification compared to the supervised baseline with ImageNet pre-training. In addition, we show that our federated self-supervised pre-training methods yield models that generalize better to out-of-distribution data and perform more effectively when fine-tuning with limited labeled data, compared to existing FL algorithms. The code is available at https://github.com/rui-yan/SSL-FL.

View details for DOI 10.1109/TMI.2022.3233574

View details for PubMedID 37018314

Correlative image learning of chemo-mechanics in phase-transforming solids. Nature materials. Deng, H. D., Zhao, H., Jin, N., Hughes, L., Savitzky, B. H., Ophus, C., Fraggedakis, D., Borbely, A., Yu, Y., Lomeli, E. G., Yan, R., Liu, J., Shapiro, D. A., Cai, W., Bazant, M. Z., Minor, A. M., Chueh, W. C. 2022

Abstract

Constitutive laws underlie most physical processes in nature. However, learning such equations in heterogeneous solids (for example, due to phase separation) is challenging. One such relationship is between composition and eigenstrain, which governs the chemo-mechanical expansion in solids. Here we developed a generalizable, physically constrained image-learning framework to algorithmically learn the chemo-mechanical constitutive law at the nanoscale from correlative four-dimensional scanning transmission electron microscopy and X-ray spectro-ptychography images. We demonstrated this approach on LiXFePO4, a technologically relevant battery positive electrode material. We uncovered the functional form of the composition-eigenstrain relation in this two-phase binary solid across the entire composition range (0≤X≤1), including inside the thermodynamically unstable miscibility gap. The learned relation directly validates Vegard's law of linear response at the nanoscale. Our physics-constrained data-driven approach directly visualizes the residual strain field (by removing the compositional and coherency strain), which is otherwise impossible to quantify. Heterogeneities in the residual strain arise from misfit dislocations and were independently verified by X-ray diffraction line profile analysis. Our work provides the means to simultaneously quantify chemical expansion, coherency strain and dislocations in battery electrodes, which has implications on rate capabilities and lifetime. Broadly, this work also highlights the potential of integrating correlative microscopy and image learning for extracting material properties and physics.

View details for DOI 10.1038/s41563-021-01191-0

View details for PubMedID 35177785

Measurement and models accounting for cell death capture hidden variation in compound response. Cell death & disease. Bae, S. Y., Guan, N., Yan, R., Warner, K., Taylor, S. D., Meyer, A. S. 2020; 11 (4): 255

Abstract

Cancer cell sensitivity or resistance is almost universally quantified through a direct or surrogate measure of cell number. However, compound responses can occur through many distinct phenotypic outcomes, including changes in cell growth, apoptosis, and non-apoptotic cell death. These outcomes have divergent effects on the tumor microenvironment, immune response, and resistance mechanisms. Here, we show that quantifying cell viability alone is insufficient to distinguish between these compound responses. Using an alternative assay and drug-response analysis amenable to high-throughput measurement, we find that compounds with identical viability outcomes can have very different effects on cell growth and death. Moreover, additive compound pairs with distinct growth/death effects can appear synergistic when only assessed by viability. Overall, these results demonstrate an approach to incorporating measurements of cell death when characterizing a pharmacologic response.

View details for DOI 10.1038/s41419-020-2462-8

View details for PubMedID 32312951
