Rui Yan
Ph.D. Student in Computational and Mathematical Engineering, admitted Spring 2021
Education & Certifications
-
B.S., UCLA, Applied Mathematics & Computer Science (2019)
All Publications
-
Interpretable discovery of patterns in tabular data via spatially semantic topographic maps.
Nature biomedical engineering
2024
Abstract
Tabular data (rows of samples and columns of sample features) are ubiquitously used across disciplines. Yet the tabular representation makes it difficult to discover underlying associations in the data and thus hinders their analysis and the discovery of useful patterns. Here we report a broadly applicable strategy for unravelling intertwined relationships in tabular data by reconfiguring each data sample into a spatially semantic 2D topographic map, which we refer to as TabMap. A TabMap preserves the original feature values as pixel intensities, with the relationships among the features spatially encoded in the map (the strength of two inter-related features correlates with their distance on the map). TabMap makes it possible to apply 2D convolutional neural networks to extract association patterns in the data to aid data analysis, and offers interpretability by ranking features according to importance. We show the superior predictive performance of TabMap by applying it to 12 datasets across a wide range of biomedical applications, including disease diagnosis, human activity recognition, microbial identification and the analysis of quantitative structure-activity relationships.
View details for DOI 10.1038/s41551-024-01268-6
View details for PubMedID 39407015
View details for PubMedCentralID 6443823
-
Label-Efficient Self-Supervised Federated Learning for Tackling Data Heterogeneity in Medical Imaging.
IEEE transactions on medical imaging
2023; 42 (7): 1932-1943
Abstract
The collection and curation of large-scale medical datasets from multiple institutions is essential for training accurate deep learning models, but privacy concerns often hinder data sharing. Federated learning (FL) is a promising solution that enables privacy-preserving collaborative learning among different institutions, but it generally suffers from performance deterioration due to heterogeneous data distributions and a lack of quality labeled data. In this paper, we present a robust and label-efficient self-supervised FL framework for medical image analysis. Our method introduces a novel Transformer-based self-supervised pre-training paradigm that pre-trains models directly on decentralized target task datasets using masked image modeling, to facilitate more robust representation learning on heterogeneous data and effective knowledge transfer to downstream models. Extensive empirical results on simulated and real-world medical imaging non-IID federated datasets show that masked image modeling with Transformers significantly improves the robustness of models against various degrees of data heterogeneity. Notably, under severe data heterogeneity, our method, without relying on any additional pre-training data, achieves an improvement of 5.06%, 1.53% and 4.58% in test accuracy on retinal, dermatology and chest X-ray classification compared to the supervised baseline with ImageNet pre-training. In addition, we show that our federated self-supervised pre-training methods yield models that generalize better to out-of-distribution data and perform more effectively when fine-tuning with limited labeled data, compared to existing FL algorithms. The code is available at https://github.com/rui-yan/SSL-FL.
View details for DOI 10.1109/TMI.2022.3233574
View details for PubMedID 37018314
-
Correlative image learning of chemo-mechanics in phase-transforming solids.
Nature materials
2022
Abstract
Constitutive laws underlie most physical processes in nature. However, learning such equations in heterogeneous solids (for example, due to phase separation) is challenging. One such relationship is between composition and eigenstrain, which governs the chemo-mechanical expansion in solids. Here we developed a generalizable, physically constrained image-learning framework to algorithmically learn the chemo-mechanical constitutive law at the nanoscale from correlative four-dimensional scanning transmission electron microscopy and X-ray spectro-ptychography images. We demonstrated this approach on LiXFePO4, a technologically relevant battery positive electrode material. We uncovered the functional form of the composition-eigenstrain relation in this two-phase binary solid across the entire composition range (0≤X≤1), including inside the thermodynamically unstable miscibility gap. The learned relation directly validates Vegard's law of linear response at the nanoscale. Our physics-constrained data-driven approach directly visualizes the residual strain field (by removing the compositional and coherency strain), which is otherwise impossible to quantify. Heterogeneities in the residual strain arise from misfit dislocations and were independently verified by X-ray diffraction line profile analysis. Our work provides the means to simultaneously quantify chemical expansion, coherency strain and dislocations in battery electrodes, which has implications on rate capabilities and lifetime. Broadly, this work also highlights the potential of integrating correlative microscopy and image learning for extracting material properties and physics.
View details for DOI 10.1038/s41563-021-01191-0
View details for PubMedID 35177785
-
Measurement and models accounting for cell death capture hidden variation in compound response.
Cell death & disease
2020; 11 (4): 255
Abstract
Cancer cell sensitivity or resistance is almost universally quantified through a direct or surrogate measure of cell number. However, compound responses can occur through many distinct phenotypic outcomes, including changes in cell growth, apoptosis, and non-apoptotic cell death. These outcomes have divergent effects on the tumor microenvironment, immune response, and resistance mechanisms. Here, we show that quantifying cell viability alone is insufficient to distinguish between these compound responses. Using an alternative assay and drug-response analysis amenable to high-throughput measurement, we find that compounds with identical viability outcomes can have very different effects on cell growth and death. Moreover, additive compound pairs with distinct growth/death effects can appear synergistic when only assessed by viability. Overall, these results demonstrate an approach to incorporating measurements of cell death when characterizing a pharmacologic response.
View details for DOI 10.1038/s41419-020-2462-8
View details for PubMedID 32312951