Sheng Liu
Postdoctoral Scholar, Biomedical Data Sciences
Bio
Sheng Liu is a postdoctoral fellow at Stanford University. In May 2023, he received a Ph.D. in Data Science and Machine Learning from New York University. His background is in robust and trustworthy machine learning and machine learning for healthcare.
All Publications
-
Long-term, ambulatory 12-lead ECG from a single non-standard lead using perceptual reconstruction.
medRxiv: the preprint server for health sciences
2025
Abstract
Despite its broadening indications, the implantable cardiac monitor (ICM) records a narrow, nonstandard electrocardiogram (ECG) signal, which precludes morphological and functional assessments or the application of 12-lead ECG models. We hypothesize that deep learning can be used to reconstruct 12-lead ECG from a single ICM lead for continuously assessing clinical endpoints beyond rhythm detection alone. Our objective is to reconstruct 12-lead ECG from a single ICM lead to detect conduction, repolarization, rhythm, and cardiac functional changes in a large, diverse patient population. We annotated 75,450 echocardiogram-ECG pairs with five disease labels: (a) right bundle branch block, (b) left bundle branch block, (c) atrial fibrillation, (d) QT prolongation, and (e) low left ventricular ejection fraction (LVEF), using regex-based parsing of clinician interpretations. We used perceptual loss to train a deep U-Net (ECG12-PerceptNet) to reconstruct 12-lead ECG from a simulated ICM signal. We compared the classification performance of the reconstructed 12-lead ECG against the original 12-lead and single-lead ECG in internal and external test sets. Furthermore, we trained a regression model to predict absolute LVEF using original and reconstructed 12-lead ECGs. The reconstructed ECG approached the original 12-lead ECG in classification performance across all endpoints while significantly outperforming the single-lead ECG. We show two case studies in which sequential LVEF measurements were tracked using LVEF predicted from the original and reconstructed 12-lead ECG. In this paper, we report ECG12-PerceptNet, which reconstructs 12-lead ECG from a simulated ICM signal. This can enable continuous in-home or ambulatory monitoring of cardiac functional changes, potentially reducing hospitalizations and out-of-hospital cardiac arrest.
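The perceptual-loss training described above compares signals in a learned feature space rather than sample by sample. A minimal sketch of the idea, using a fixed random projection as a hypothetical stand-in for a pretrained feature extractor (the actual ECG12-PerceptNet feature network is not described here):

```python
import numpy as np

def feature_extractor(signal, weights):
    """Toy stand-in for a pretrained network's intermediate activations."""
    return np.tanh(signal @ weights)

def perceptual_loss(reconstructed, target, weights):
    """Compare signals in feature space rather than sample space."""
    f_rec = feature_extractor(reconstructed, weights)
    f_tgt = feature_extractor(target, weights)
    return float(np.mean((f_rec - f_tgt) ** 2))

rng = np.random.default_rng(0)
weights = rng.normal(size=(500, 64))    # fixed random "feature" projection
target = rng.normal(size=(12, 500))     # 12 leads x 500 samples (toy data)
close = target + 0.1 * rng.normal(size=target.shape)
far = rng.normal(size=target.shape)

# A closer reconstruction yields a lower perceptual loss.
print(perceptual_loss(close, target, weights) < perceptual_loss(far, target, weights))
```

In practice the feature extractor is a trained network, so the loss emphasizes clinically salient waveform structure instead of pointwise amplitude error.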
View details for DOI 10.64898/2025.12.17.25342224
View details for PubMedID 41445642
View details for PubMedCentralID PMC12724160
-
Revealing neurocognitive and behavioral patterns through unsupervised manifold learning of dynamic brain data.
Nature computational science
2025
Abstract
Dynamic brain data are becoming increasingly accessible, providing a gateway to understanding the inner workings of the brain in living participants. However, the size and complexity of the data pose a challenge in extracting meaningful information across various data sources. Here we introduce a generalizable unsupervised deep manifold learning method for exploring neurocognitive and behavioral patterns. Unlike existing methods that extract patterns directly from the input data, the proposed brain-dynamic convolutional-network-based embedding (BCNE) captures brain-state trajectories by analyzing temporospatial correlations within the data and applying manifold learning. The results demonstrate that BCNE effectively delineates scene transitions, underscores the involvement of different brain regions in memory and narrative processing, distinguishes dynamic learning processes and identifies differences between active and passive behaviors. BCNE provides an effective tool for exploring general neuroscience inquiries or individual-specific patterns.
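The abstract describes capturing brain-state trajectories from temporospatial correlations followed by manifold learning. A toy sketch of that pipeline, with sliding-window correlation features and PCA as a linear stand-in for BCNE's embedding step (window size and data are illustrative assumptions):

```python
import numpy as np

def windowed_correlations(data, win):
    """data: (regions, time). Return one correlation feature vector per window."""
    r, t = data.shape
    iu = np.triu_indices(r, k=1)       # unique region pairs
    feats = []
    for start in range(0, t - win + 1, win):
        c = np.corrcoef(data[:, start:start + win])
        feats.append(c[iu])
    return np.array(feats)

def pca_embed(x, dim=2):
    """Linear stand-in for the manifold-learning step."""
    xc = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(xc, full_matrices=False)
    return xc @ vt[:dim].T

rng = np.random.default_rng(1)
series = rng.normal(size=(10, 400))    # 10 regions, 400 time points (toy data)
traj = pca_embed(windowed_correlations(series, win=50))
print(traj.shape)                      # one 2-D point per time window
```

Plotting successive rows of `traj` would trace a low-dimensional brain-state trajectory, which is the kind of object BCNE analyzes for scene transitions and learning dynamics.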
View details for DOI 10.1038/s43588-025-00911-9
View details for PubMedID 41345469
-
Quantifying large language model usage in scientific papers.
Nature human behaviour
2025
Abstract
Scientific publishing is the primary means of disseminating research findings. There has been speculation about how extensively large language models (LLMs) are being used in academic writing. Here we conduct a systematic analysis across 1,121,912 preprints and published papers from January 2020 to September 2024 on arXiv, bioRxiv and Nature portfolio journals, using a population-level framework based on word frequency shifts to estimate the prevalence of LLM-modified content over time. Our findings suggest a steady increase in LLM usage, with the largest and fastest growth estimated for computer science papers (up to 22%). By comparison, mathematics papers and the Nature portfolio showed lower evidence of LLM modification (up to 9%). LLM modification estimates were higher among papers from first authors who post preprints more frequently, papers in more crowded research areas and papers of shorter lengths. Our findings suggest that LLMs are being broadly used in scientific writing.
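The population-level framework estimates the prevalence of LLM-modified text from word-frequency shifts. One common way to operationalize this is a two-component mixture whose weight is fit by maximum likelihood; the vocabulary probabilities below are toy assumptions, not the paper's fitted distributions:

```python
import numpy as np

def estimate_llm_fraction(counts, p_human, p_llm):
    """MLE of alpha in the mixture (1-a)*p_human + a*p_llm from observed
    word counts, via a simple grid search over the likelihood."""
    best_a, best_ll = 0.0, -np.inf
    for a in np.linspace(0.0, 1.0, 1001):
        mix = (1 - a) * p_human + a * p_llm
        ll = float(np.sum(counts * np.log(mix)))
        if ll > best_ll:
            best_a, best_ll = a, ll
    return best_a

# Toy 3-word vocabulary: the last word is much more frequent under the LLM.
p_human = np.array([0.70, 0.25, 0.05])
p_llm = np.array([0.40, 0.25, 0.35])
true_a = 0.2
rng = np.random.default_rng(2)
counts = rng.multinomial(100_000, (1 - true_a) * p_human + true_a * p_llm)
print(estimate_llm_fraction(counts, p_human, p_llm))
```

With enough words, the recovered mixture weight closes in on the true fraction of LLM-modified content, which is what makes population-level (rather than per-document) estimation feasible.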
View details for DOI 10.1038/s41562-025-02273-8
View details for PubMedID 40760036
View details for PubMedCentralID 5199034
-
Automated radiotherapy treatment planning guided by GPT-4Vision.
Physics in medicine and biology
2025
Abstract
Radiotherapy treatment planning is a time-consuming and potentially subjective process that requires the iterative adjustment of model parameters to balance multiple conflicting objectives. Recent advancements in frontier Artificial Intelligence (AI) models offer promising avenues for addressing the challenges in planning and clinical decision-making. This study introduces GPT-RadPlan, an automated treatment planning framework that integrates radiation oncology knowledge with the reasoning capabilities of large multi-modal models, such as GPT-4Vision (GPT-4V) from OpenAI.
Approach: Via in-context learning, we incorporate clinical requirements and a few (3 in our experiments) approved clinical plans with their optimization settings, enabling GPT-4V to acquire treatment planning domain knowledge. The resulting GPT-RadPlan system is integrated into our in-house inverse treatment planning system through an application programming interface (API). For a given patient, GPT-RadPlan acts as both plan evaluator and planner, first assessing dose distributions and dose-volume histograms (DVHs), and then providing "textual feedback" on how to improve the plan to match the physician's requirements. In this manner, GPT-RadPlan iteratively refines the plan by adjusting planning parameters, such as weights and dose objectives, based on its suggestions.
Main results: The efficacy of the automated planning system is showcased across 17 prostate cancer and 13 head & neck cancer VMAT plans with prescribed doses of 70.2 Gy and 72 Gy, respectively, where we compared GPT-RadPlan results to clinical plans produced by human experts. In all cases, GPT-RadPlan either outperformed or matched the clinical plans, demonstrating superior target coverage and reducing organ-at-risk doses by 5 Gy on average.
Significance: Consistently satisfying the dose-volume objectives in the clinical protocol, GPT-RadPlan represents the first multimodal large language model agent that mimics the behaviors of human planners in radiation oncology clinics, achieving promising results in automating the treatment planning process without the need for additional training.
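The evaluate-critique-adjust loop GPT-RadPlan runs can be sketched abstractly. Every function below is a hypothetical stub: the critic stands in for GPT-4V's textual feedback on dose distributions and DVHs, and the scalar score for a real plan-quality evaluation:

```python
# Hypothetical sketch of the GPT-RadPlan loop; none of these stubs are the
# real system's API. The "plan" is reduced to a single tunable weight.

def evaluate_plan(weights):
    """Stand-in dose metric: distance of the OAR weight from a 'clinical' optimum."""
    return abs(weights["oar"] - 3.0)

def critic(score, weights):
    """Textual feedback a multimodal LLM might return after reviewing DVHs."""
    if score < 0.1:
        return "plan acceptable"
    if weights["oar"] < 3.0:
        return "increase the organ-at-risk weight"
    return "decrease the organ-at-risk weight"

def apply_feedback(weights, feedback, step=0.5):
    """Planner step: adjust optimization parameters per the textual feedback."""
    if "increase" in feedback:
        weights["oar"] += step
    elif "decrease" in feedback:
        weights["oar"] -= step
    return weights

weights = {"oar": 1.0}
for _ in range(10):
    feedback = critic(evaluate_plan(weights), weights)
    if feedback == "plan acceptable":
        break
    weights = apply_feedback(weights, feedback)
print(weights["oar"], feedback)
```

The real system closes the same loop through a treatment planning system API, with GPT-4V reading rendered dose images and DVH plots rather than a scalar score.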
View details for DOI 10.1088/1361-6560/adf02c
View details for PubMedID 40664228
-
Inference-specific learning for improved medical image segmentation.
Medical physics
2025
Abstract
Deep learning networks map input data to output predictions by fitting network parameters using training data. However, applying a trained network to new, unseen inference data resembles an interpolation process, which may lead to inaccurate predictions if the training and inference data distributions differ significantly. This study aims to improve the prediction accuracy of deep learning networks on the inference case by bridging the gap between training and inference data. We propose an inference-specific learning strategy to enhance the network learning process without modifying the network structure. By aligning training data to closely match the specific inference data, we generate an inference-specific training dataset, enhancing the network optimization around the inference data point for more accurate predictions. Taking medical image auto-segmentation as an example, we develop an inference-specific auto-segmentation framework consisting of initial segmentation learning, inference-specific training data deformation, and inference-specific segmentation refinement. The framework is evaluated on public abdominal, head-neck, and pancreas CT datasets comprising 30, 42, and 210 cases, respectively. Experimental results show that our method improves the organ-averaged mean Dice by 6.2% (p-value = 0.001), 1.5% (p-value = 0.003), and 3.7% (p-value < 0.001) on the three datasets, respectively, with a more notable increase for difficult-to-segment organs (such as a 21.7% increase for the gallbladder [p-value = 0.004]). By incorporating organ mask-based weak supervision into the training data alignment learning, the inference-specific auto-segmentation accuracy is generally improved compared with image intensity-based alignment. In addition, a moving-averaged calculation of the inference organ mask during the learning process strengthens both the robustness and accuracy of the final inference segmentation. By leveraging inference data during training, the proposed inference-specific learning strategy consistently improves auto-segmentation accuracy and holds the potential to be broadly applied for enhanced deep learning decision-making.
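The moving-averaged inference organ mask mentioned above can be sketched as an exponential moving average over successive soft-mask predictions; the decay factor, mask size, and noise model are toy assumptions:

```python
import numpy as np

def update_mask(ema, new_pred, beta=0.9):
    """Exponential moving average of soft mask predictions across
    refinement iterations; a sketch, not the paper's exact update."""
    return new_pred if ema is None else beta * ema + (1 - beta) * new_pred

rng = np.random.default_rng(3)
true_mask = (rng.random((8, 8)) > 0.5).astype(float)   # toy binary organ mask
ema = None
for _ in range(50):
    # Each iteration produces a noisy soft prediction of the same mask.
    noisy_pred = true_mask + 0.3 * rng.normal(size=true_mask.shape)
    ema = update_mask(ema, noisy_pred)

# The averaged mask is closer to the clean mask than a single noisy prediction.
print(np.abs(ema - true_mask).mean() < np.abs(noisy_pred - true_mask).mean())
```

Averaging across iterations damps the per-iteration fluctuation of the predicted mask, which is the robustness benefit the abstract attributes to the moving-averaged calculation.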
View details for DOI 10.1002/mp.17883
View details for PubMedID 40356014
-
GPT-RadPlan: a plugin for automated treatment planning in Eclipse TPS based on large language models
ELSEVIER IRELAND LTD. 2025: S2845-S2846
View details for Web of Science ID 001519901500017
-
Optimizing generative AI by backpropagating language model feedback.
Nature
2025; 639 (8055): 609-616
Abstract
Recent breakthroughs in artificial intelligence (AI) are increasingly driven by systems orchestrating multiple large language models (LLMs) and other specialized tools, such as search engines and simulators. So far, these systems are primarily handcrafted by domain experts and tweaked through heuristics rather than being automatically optimized, presenting a substantial challenge to accelerating progress. The development of artificial neural networks faced a similar challenge until backpropagation and automatic differentiation transformed the field by making optimization turnkey. Analogously, here we introduce TextGrad, a versatile framework that performs optimization by backpropagating LLM-generated feedback to improve AI systems. By leveraging natural language feedback to critique and suggest improvements to any part of a system, from prompts to outputs such as molecules or treatment plans, TextGrad enables the automatic optimization of generative AI systems across diverse tasks. We demonstrate TextGrad's generality and effectiveness through studies in solving PhD-level science problems, optimizing plans for radiotherapy treatments, designing molecules with specific properties, coding, and optimizing agentic systems. TextGrad empowers scientists and engineers to easily develop impactful generative AI systems.
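The idea of backpropagating natural-language feedback can be illustrated with a toy loop. The functions below are hypothetical stand-ins, not the TextGrad API: a stubbed critic returns textual "gradients" and an update step rewrites the prompt accordingly:

```python
# Conceptual toy of textual backpropagation: an LLM-style critic emits
# natural-language feedback, and an "optimizer" step applies it to the
# variable being tuned (here, a prompt). All stubs are hypothetical.

def critic(prompt):
    """Return textual feedback on the current prompt (stubbed heuristics)."""
    feedback = []
    if "step by step" not in prompt:
        feedback.append("ask the model to reason step by step")
    if "units" not in prompt:
        feedback.append("ask for answers with units")
    return feedback

def apply_textual_gradient(prompt, feedback):
    """'Optimizer' step: rewrite the prompt according to each critique."""
    for f in feedback:
        if "step by step" in f:
            prompt += " Think step by step."
        if "units" in f:
            prompt += " State units."
    return prompt

prompt = "Solve the physics problem."
for _ in range(3):                      # optimization loop
    grads = critic(prompt)
    if not grads:                       # converged: no remaining feedback
        break
    prompt = apply_textual_gradient(prompt, grads)
print(prompt)
```

In the real framework both the critic and the update are LLM calls, and the same loop optimizes any node of a compound system (prompts, molecules, treatment-plan parameters) rather than a single string.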
View details for DOI 10.1038/s41586-025-08661-4
View details for PubMedID 40108317
View details for PubMedCentralID 10794143