Ehsan Adeli

Assistant Professor (Research) of Psychiatry and Behavioral Sciences (Public Mental Health and Population Sciences) and, by courtesy, of Computer Science and of Biomedical Data Science

Bio

With a Ph.D. in artificial intelligence and computer vision and postgraduate training in biomedical imaging & computational neuroscience, I solve critical problems in healthcare and neuroscience.

My research group focuses on developing Translational Artificial Intelligence (AI) algorithms in medicine and mental health. My work involves the automatic analysis of human activities and behaviors from videos, connecting how humans perform actions to the brain by analyzing magnetic resonance images (MRIs). By exploring explainable machine learning algorithms, I aim to uncover the underlying factors of neurodegenerative and neuropsychiatric diseases and their impact on everyday life.

My research concentrates on and connects two main areas: digital humans and human neuroscience. I analyze 3D motion, actions, and behaviors using various human sensing technologies, such as video and sensory data. Additionally, I employ clinical and cognitive tests, as well as neuroimaging modalities like MRIs, to explore brain function and neural processes. By integrating these technologies, my group develops world models for neuroscience that enhance clinical applications and provide deeper insights into the complexities of human behavior and brain function.

Academic Appointments

Assistant Professor (Research), Psychiatry and Behavioral Sciences
Assistant Professor (Research) (By courtesy), Computer Science
Assistant Professor (Research) (By courtesy), Department of Biomedical Data Science
Member, Bio-X
Faculty Affiliate, Institute for Human-Centered Artificial Intelligence (HAI)
Member, Wu Tsai Human Performance Alliance
Member, Wu Tsai Neurosciences Institute

Administrative Appointments

Director of AI/Innovation in Precision Mental Health, Department of Psychiatry and Behavioral Sciences (2024 - Present)
Partnership in AI-Assisted Care Team Science Leader, Division of Primary Care and Population Health in the Clinical Excellence Research Center (CERC) (2023 - Present)
Associate Editor, Medical Image Analysis, An official journal of the MICCAI Society (2025 - Present)
Associate Editor, IEEE Transactions on Medical Imaging (2025 - Present)
Associate Editor, International Journal of Computer Vision (IJCV) (2023 - Present)
Associate Editor, IEEE Journal of Biomedical and Health Informatics (2020 - 2024)
Associate Editor, Journal of Ambient Intelligence and Smart Environments (2019 - 2024)

Honors & Awards

2023 Chairman’s Award for Educational Excellence, Stanford School of Medicine, Department of Psychiatry and Behavioral Sciences (2023)
Jaswa Innovator Award, Stanford School of Medicine (2022-2024)
Faculty Professional & Leadership Award, Department of Psychiatry and Behavioral Sciences, Stanford School of Medicine (2022)
Senior Member, IEEE (2021-Present)
REC Fellow, Stanford University Alzheimer's Disease Research Center (ADRC) (2020-2022)
Innovator Award 2021, Stanford University School of Medicine Department of Psychiatry & Behavioral Sciences (2020-2021)
Young Investigator Travel Award, Medical Image Computing and Computer Assisted Interventions (MICCAI) (2018)
NIH F32 Fellowship Award, NIAAA (2018-2019)

Professional Education

Postdoctoral Research Associate, University of North Carolina at Chapel Hill, Machine Learning and Medical Imaging (2017)
Graduate Research Scholar, Carnegie Mellon University, Computer Vision (2012)

Current Research and Scholarly Interests

My research group focuses on developing Translational Artificial Intelligence (AI) algorithms in medicine and mental health, leveraging recent advancements in AI, computer vision, ambient intelligence, and computational neuroscience. My work involves the automatic analysis of human activities and behaviors from videos, connecting how humans perform actions to the brain by analyzing magnetic resonance images (MRIs). By exploring explainable machine learning algorithms, I aim to uncover the underlying factors of neurodegenerative and neuropsychiatric diseases and their impact on everyday life.


Clinical Trials

Assessing Cognitive Decline at Home (Recruiting)

Neuropsychiatric symptoms (NPS) refer to a range of mental and emotional issues that can be observed through how patients move, perform daily tasks, and express feelings on their faces. In this study, the investigators want to find ways to accurately and unobtrusively track these symptoms in people's homes over time. The goals are to note when these symptoms happen, predict potential problems, and gather clear data to help doctors make accurate diagnoses. To do this, the investigators will first collect information from participants who have in-home sensors. The investigators will then use special computer programs that can recognize everyday activities and identify features that connect to scores from the Mild Behavioral Impairment Checklist (MBI-C). These scores will be compared to a questionnaire (NPIQ) filled out by caregivers or family members, along with any relevant information from doctors about the patients' symptoms. The investigators aim to see how these features can help differentiate between types of NPS, such as mood changes and agitation. Finally, the investigators will create a dashboard for doctors that summarizes the patterns of these symptoms in patients, making it easier to monitor and manage their mental health.

An Objective Assessment Tool for Evaluating Functioning in Older Adults (Not Recruiting)

The investigators are seeking your consent to participate in research on the development of a mobile application that enhances physical and cognitive assessments. This is in response to the growing significance of the Short Physical Performance Battery (SPPB) in Alzheimer's Disease and Related Dementias (AD/ADRD) research. The SPPB has proven its value in evaluating lower extremity function and mobility in older adults, providing predictive insights into declines in daily living activities, falls, hospitalization, disability, and mortality. Recognizing the need for accessible and automated assessment tools, this project endeavors to design a mobile app with multi-fold functionality. The final version will guide users through SPPB tests, offer real-time performance scoring, and facilitate frequent, objective, and accurate physical and cognitive assessments. This is particularly critical for monitoring the progression of ADRD, identifying subtle physical changes indicative of cognitive decline, and enabling timely interventions tailored to patients' evolving needs. The goal is to collect video data from 20 to 30 healthy participants, aged 18 or older and with no severe mobility issues, as they perform the SPPB. The video data will be used to develop a prototype of the SPPB application and validate testing in the lab. The video recordings will be automatically encrypted and securely uploaded to privacy-protected Stanford computer servers to test and refine the application's results.

Stanford is currently not accepting patients for this trial.


2025-26 Courses

AI-Assisted Care
BIOE 277, CS 337, MED 277, PSYC 278 (Aut)
Deep Learning for Computer Vision
CS 231N (Spr)
Independent Studies (19)
- Advanced Reading and Research
  CS 499 (Aut, Win, Spr, Sum)
- Advanced Reading and Research
  CS 499P (Aut, Win, Spr, Sum)
- Curricular Practical Training
  CS 390A (Sum)
- Curricular Practical Training
  CS 390B (Sum)
- Curricular Practical Training
  CS 390C (Sum)
- Directed Reading
  BMDS 299 (Aut)
- Directed Reading in Psychiatry
  PSYC 299 (Aut, Win, Spr, Sum)
- Graduate Research
  PSYC 399 (Aut, Win, Spr, Sum)
- Graduate Research on Biomedical Data Science
  BMDS 399 (Aut, Win, Spr)
- Independent Project
  CS 399 (Aut, Win, Spr, Sum)
- Independent Project
  CS 399P (Win)
- Independent Work
  CS 199 (Aut, Win, Spr, Sum)
- Master's Research
  CME 291 (Aut, Win, Spr, Sum)
- Master's Research
  MATSCI 200 (Aut, Win, Spr)
- Part-time Curricular Practical Training
  CS 390D (Win, Spr)
- Senior Project
  CS 191 (Aut, Win)
- Supervised Undergraduate Research
  CS 195 (Aut, Win, Spr, Sum)
- Undergraduate Research, Independent Study, or Directed Reading
  PSYC 199 (Aut, Win, Spr, Sum)
- Writing Intensive Senior Research Project
  CS 191W (Aut, Spr)
Prior Year Courses
2024-25 Courses
- AI-Assisted Care
  CS 337, MED 277 (Aut)
- Artificial Intelligence in Medicine and Healthcare Ventures
  MED 180, PSYC 180 (Win)
- Deep Learning for Computer Vision
  CS 231N (Spr)
- Machine Learning for Neuroimaging
  BIODS 227, PSYC 121, PSYC 221 (Aut)
2023-24 Courses
- AI-Assisted Care
  CS 337, MED 277 (Aut)
- Deep Learning for Computer Vision
  CS 231N (Spr)
- Machine Learning for Neuroimaging
  BIODS 227, PSYC 121, PSYC 221 (Aut)
2022-23 Courses
- AI-Assisted Care
  MED 277 (Aut)
- Artificial Intelligence in Medicine and Healthcare Ventures
  MED 180 (Spr)
- Artificial Intelligence in Medicine and Healthcare Ventures
  PSYC 180 (Spr)
- Current Topics in Machine Learning for Neuroimaging
  PSYC 121 (Aut)
- Current Topics in Machine Learning for Neuroimaging
  PSYC 221 (Aut)

Stanford Advisees

Med Scholar Project Advisor
Anson Zhou
Doctoral Dissertation Reader (AC)
Ashwin Kumar, Shubo Yang
Postdoctoral Faculty Sponsor
Gustavo Chau Loo Kung, Yang Liu, Akshay Paruchuri, Narayan Schuetz, Jiyao Wang, Juze Zhang, Yue Zhao
Orals Evaluator
Cristobal Eyzaguirre
Doctoral Dissertation Advisor (AC)
Favour Nerrise, Karan Singh
Master's Program Advisor
Peter Alisky, Nicholas Allen, Kunal Arora, Kuo-Han Hung, Dev Jayram, Abhinav Sattiraju, Yalcin Tur, Merritt Vassallo, Katherine Xu, Patrick Zakrzewski
Doctoral Dissertation Co-Advisor (AC)
Zane Durante, Ridvan Yesiloglu
Doctoral (Program)
Jessica Brown, Fangrui Huang, Bailey Trang Nguyen, Chaitanya Patel, Adam Sun, Heng Yu
Postdoctoral Research Mentor
Xinliang Zhou

All Publications

Cycle Diffusion Model for Counterfactual Image Generation. Predictive Intelligence in Medicine. PRIME (Workshop) Huang, F., Wang, A., Li, B., Trang, B., Yesiloglu, R., Hua, T., Peng, W., Adeli, E. 2026; 16164: 173-185

Abstract

Deep generative models have demonstrated remarkable success in medical image synthesis. However, ensuring conditioning faithfulness and high-quality synthetic images for direct or counterfactual generation remains a challenge. In this work, we introduce a cycle training framework to fine-tune diffusion models for improved conditioning adherence and enhanced synthetic image realism. Our approach, Cycle Diffusion Model (CDM), enforces consistency between generated and original images by incorporating cycle constraints, enabling more reliable direct and counterfactual generation. Experiments on a combined 3D brain MRI dataset (from ABCD, HCP aging & young adults, ADNI, and PPMI) show that our method improves conditioning accuracy and enhances image quality as measured by FID and SSIM. The results suggest that the cycle strategy used in CDM can be an effective method for refining diffusion-based medical image generation, with applications in data augmentation, counterfactual generation, and disease progression modeling.

View details for DOI 10.1007/978-3-032-07904-6_16

View details for PubMedID 42028560

View details for PubMedCentralID PMC13102316
Neural Autoregressive Modeling of Brain Aging. Deep generative models : 5th MICCAI workshop, DGM4MICCAI 2025, held in conjunction with MICCAI 2025, Daejeon, South Korea, September 23, 2025, Proceedings. DGM4MICCAI (Workshop) (5th : 2025 : Taejon-si, Korea) Yesiloglu, R., Peng, W., Islam, M. T., Adeli, E. 2026; 16128: 341-350

Abstract

Brain aging synthesis is a critical task with broad applications in clinical and computational neuroscience. The ability to predict the future structural evolution of a subject's brain from an earlier MRI scan provides valuable insights into aging trajectories. Yet, the high-dimensionality of data, subtle changes of structure across ages, and subject-specific patterns constitute challenges in the synthesis of the aging brain. To overcome these challenges, we propose NeuroAR, a novel brain aging simulation model based on generative autoregressive transformers. NeuroAR synthesizes the aging brain by autoregressively estimating the discrete token maps of a future scan from a convenient space of concatenated token embeddings of a previous and future scan. To guide the generation, it concatenates into each scale the subject's previous scan, and uses its acquisition age and the target age at each block via cross-attention. We evaluate our approach on both the elderly population and adolescent subjects, demonstrating superior performance over state-of-the-art generative models, including latent diffusion models (LDM) and generative adversarial networks, in terms of image fidelity. Furthermore, we employ a pre-trained age predictor to further validate the consistency and realism of the synthesized images with respect to expected aging patterns. NeuroAR significantly outperforms key models, including LDM, demonstrating its ability to model subject-specific brain aging trajectories with high fidelity.

View details for DOI 10.1007/978-3-032-05472-2_33

View details for PubMedID 41919271

View details for PubMedCentralID PMC13035358
Latent Causal Modeling for 3D Brain MRI Counterfactuals. Deep generative models : 5th MICCAI workshop, DGM4MICCAI 2025, held in conjunction with MICCAI 2025, Daejeon, South Korea, September 23, 2025, Proceedings. DGM4MICCAI (Workshop) (5th : 2025 : Taejon-si, Korea) Peng, W., Xia, T., De Sousa Ribeiro, F., Bosschieter, T., Adeli, E., Zhao, Q., Glocker, B., Pohl, K. M. 2026; 16128: 192-201

Abstract

The number of samples in structural brain MRI studies is often too small to properly train deep learning models. Generative models show promise in addressing this issue by effectively learning the data distribution and generating high-fidelity MRI. However, they struggle to produce diverse, high-quality data outside the distribution defined by the training data. One way to address the issue is using causal models developed for 3D volume counterfactuals. However, accurately modeling causality in high-dimensional spaces is challenging, so these models generally generate 3D brain MRIs of lower quality. To address these challenges, we propose a two-stage method that constructs a Structural Causal Model (SCM) within the latent space. In the first stage, we employ a VQ-VAE to learn a compact embedding of the MRI volume. Subsequently, we integrate our causal model into this latent space and execute a three-step counterfactual procedure using a closed-form Generalized Linear Model (GLM). Our experiments conducted on real-world high-resolution MRI data (1mm) provided by the Alzheimer's Disease Neuroimaging Initiative (ADNI) and the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA) demonstrate that our method can generate high-quality 3D MRI counterfactuals.

View details for DOI 10.1007/978-3-032-05472-2_19

View details for PubMedID 41841031

View details for PubMedCentralID PMC12988853
WASABI: A Metric for Evaluating Morphometric Plausibility of Synthetic Brain MRIs. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention Jafrasteh, B., Peng, W., Wan, C., Luo, Y., Adeli, E., Zhao, Q. 2026; 15961: 684-694

Abstract

Generative models enhance neuroimaging through data augmentation, quality improvement, and rare condition studies. Despite advances in realistic synthetic MRIs, evaluations focus on texture and perception, lacking sensitivity to crucial morphometric fidelity. This study proposes a new metric, called WASABI (Wasserstein-Based Anatomical Brain Index), to assess the morphometric plausibility of synthetic brain MRIs. WASABI leverages SynthSeg, a deep learning-based brain parcellation tool, to derive volumetric measures of brain regions in each MRI and uses the multivariate Wasserstein distance to compare distributions between real and synthetic anatomies. Based on controlled experiments on two real datasets and synthetic MRIs from five generative models, WASABI demonstrates higher sensitivity in quantifying morphometric discrepancies compared to traditional image-level metrics, even when synthetic images achieve near-perfect visual quality. Our findings advocate for shifting the evaluation paradigm beyond visual inspection and conventional metrics, emphasizing morphometric fidelity as a crucial benchmark for clinically meaningful brain MRI synthesis. Our code is available at https://github.com/BahramJafrasteh/wasabi-mri.

View details for DOI 10.1007/978-3-032-04937-7_65

View details for PubMedID 42028468

View details for PubMedCentralID PMC13102318
Generating Novel Brain Morphology by Deforming Learned Templates. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention Wang, A. Q., Huang, F., Trang, B., Peng, W., Abbasi, M., Pohl, K. M., Sabuncu, M. R., Adeli, E. 2026; 15961: 207-217

Abstract

Designing generative models for 3D structural brain MRI that synthesize morphologically-plausible and attribute-specific (e.g., age, sex, disease state) samples is an active area of research. Existing approaches based on frameworks like GANs or diffusion models synthesize the image directly, which may limit their ability to capture intricate morphological details. In this work, we propose a 3D brain MRI generation method based on state-of-the-art latent diffusion models (LDMs), called MorphLDM, that generates novel images by applying synthesized deformation fields to a learned template. Instead of using a reconstruction-based autoencoder (as in a typical LDM), our encoder outputs a latent embedding derived from both an image and a learned template that is itself the output of a template decoder; this latent is passed to a deformation field decoder, whose output is applied to the learned template. A registration loss is minimized between the original image and the deformed template with respect to the encoder and both decoders. Empirically, our approach outperforms generative baselines on metrics spanning image diversity, adherence with respect to input conditions, and voxel-based morphometry. Our code is available at https://github.com/alanqrwang/morphldm.

View details for DOI 10.1007/978-3-032-04937-7_20

View details for PubMedID 41767463

View details for PubMedCentralID PMC12945308
A framework of digital biomarkers for neurodegenerative diseases. Nature reviews bioengineering Nerrise, F., Schütz, N., Zhao, Q., Gould, C., Milstein, A., Schulman, K., Henderson, V. W., Landay, J., Fei-Fei, L., Lin, F. V., Adeli, E. 2026

Abstract

Digital biomarkers (DBMs) are a new class of health indicators derived from digital technologies - including smartphones, wearable devices and ambient sensors - that enable continuous, real-time monitoring of signals in everyday settings. By providing richer and more dynamic data than conventional, point-in-time measurements, DBMs offer fresh opportunities for remote patient assessment, personalized care and large-scale biomedical research. Importantly, DBMs function as powerful complementary tools to traditional biomarkers that can screen candidates for more invasive tests and provide contextual data between clinical visits. This Review provides a standardized classification of DBMs focused on neurodegenerative diseases, including Alzheimer disease, Parkinson disease, mild cognitive impairment, Huntington disease, multiple sclerosis, frontotemporal dementia, spinocerebellar ataxia and dementia with Lewy bodies, centred around three questions: what is being measured (the concept of interest), how it is measured (the sensing technologies) and why it is measured (the application areas). By examining these dimensions, we highlight the potential of DBMs to transform clinical monitoring, early detection and therapeutic interventions in these disorders.

View details for DOI 10.1038/s44222-026-00433-7

View details for PubMedID 42239941

View details for PubMedCentralID PMC13229411
A framework of digital biomarkers for neurodegenerative diseases NATURE REVIEWS BIOENGINEERING Nerrise, F., Schutz, N., Zhao, Q., Gould, C., Milstein, A., Schulman, K., Henderson, V. W., Landay, J., Fei-Fei, L., Lin, F., Adeli, E. 2026

View details for DOI 10.1038/s44222-026-00433-7

View details for Web of Science ID 001746915400001
Resting-state fMRI foundation models enable robust and generalizable latent neural target discovery in cognitive aging interventions. bioRxiv : the preprint server for biology Zhou, X., Ai, M., Adeli, E., Zhang, Y., Liu, Y. M., Vankee-Lin, F. 2026

Abstract

The benefits of interventions targeting cognitive aging vary substantially across individuals, largely owing to heterogeneity in aging-related comorbidities. It is necessary to robustly identify neural patterns underlying intervention response and test their generalizability across heterogeneous cohorts. Resting-state functional MRI (rsfMRI) offers a potential pathway, but relying on predefined summary features with conventional methods has limited capacity to capture both within-individual longitudinal variation and between-individual differences, particularly in small and heterogeneous studies. Recent rsfMRI foundation models pretrained on large observational cohorts present a promising alternative by learning transferable spatiotemporal representations from time-series signals. Yet their validity and generalizability in local intervention settings remain unclear. Here, we systematically evaluated rsfMRI foundation models using data from two independent randomized controlled trials of older adults with mild cognitive impairment, testing whether these models can robustly extract longitudinal brain representations that predict post-intervention changes in episodic memory across trials. Foundation models outperformed conventional machine learning and deep learning approaches across both trials. Clinically informed adaptation using an external Alzheimer's disease cohort further improved performance and robustness to confounders (i.e., head motion, site, and intervention arm), with accuracy up to 82%. Multivariate decomposition of foundation model embeddings identified latent neural patterns associated with episodic memory change with cross-study consistency at baseline that became more spatially distributed at post-intervention. 
These findings show that rsfMRI foundation models can enable robust and generalizable identification of latent neural patterns linking longitudinal brain dynamics to individual intervention response, laying the foundation for precision-driven neural target discovery in cognitive aging research.

View details for DOI 10.64898/2025.12.30.697042

View details for PubMedID 42039513

View details for PubMedCentralID PMC13105019
Divergent Pathways Taken in Adolescence Predict Embracing or Resisting Moderate to Heavy Drinking in Young Adulthood. Biological psychiatry. Cognitive neuroscience and neuroimaging González, C., Nguyen-Louie, T. T., Sullivan, E. V., Pfefferbaum, A., Klo, J., Dehoney, J., Zhao, Q., Adeli, E., Tapert, S. F., Pohl, K. M. 2026

Abstract

Heavy alcohol drinking during adolescence is a major public health concern and a primary risk factor for developing alcohol use disorder (AUD) in adulthood. Identifying robust, modifiable predictors that distinguish adolescents who initiate heavy drinking from those who maintain low consumption can guide targeted prevention strategies. We analyzed longitudinal data from 285 participants (144 male, 141 female) of the National Consortium on Alcohol and NeuroDevelopment in Adolescence (NCANDA) study. All participants were no-to-low drinkers at age 15 years, remained in the study within a year of turning 21, and were followed annually, making this neuroimaging study the largest of its kind. Each visit consisted of 240 measurements (spanning cognitive, environmental, neuroimaging, and psychosocial domains), which were used to train a deep learning model to predict drinking levels and learn an embedding space capturing key risk factors. Individual developmental profiles were then clustered into distinct pathways across ages 15 to 21. Six unique drinking pathways were identified. Two pathways leading to heavy drinking were characterized by early exposure to peer drinking, positive alcohol expectancies, and higher sensation-seeking, predominantly among males. Three moderate-drinking pathways showed gradual increases in consumption while maintaining lower peer drinking exposure. One pathway maintained no-to-low drinking, marked by negative expectancies toward alcohol and low social reinforcement. Distinct developmental pathways highlight socially driven motivators as key modifiable risk factors underlying adolescent drinking. Targeting peer influence and alcohol-related expectancies could help divert youth from trajectories leading to heavy drinking and reduce future burden of AUD.

View details for DOI 10.1016/j.bpsc.2026.04.004

View details for PubMedID 41985674
EchoAtlas: A Conversational, Multi-View Vision-Language Foundation Model for Echocardiography Interpretation and Clinical Reasoning. medRxiv : the preprint server for health sciences Chao, C. J., Asadi, M., Li, L., Ramasamy, G., Pecco, N., Wang, Y. C., Poterucha, T., Arsanjani, R., Kane, G., Oh, J. K., Banerjee, I., Langlotz, C., Fei-Fei, L., Adeli, E., Erickson, B. J. 2026

Abstract

Echocardiography is the most widely used cardiac imaging modality, yet artificial intelligence-enabled interpretation remains limited by the inability of existing models to integrate visual assessment, quantitative measurement, and clinical reasoning within a unified framework. Here we present EchoAtlas, the first autoregressive vision-language model developed for echocardiographic interpretation. Trained on over 12.9 million question-answer pairs derived from approximately 2 million echocardiogram videos, EchoAtlas achieves 0.966 accuracy on multiple-choice questions in our internal test set and establishes a new state-of-the-art on the public MIMIC-EchoQA benchmark (0.699 vs. 0.508 previously). EchoAtlas also provides accurate quantitative measurements, segment-level regional wall motion assessment, longitudinal comparison, and diagnostic reasoning across diverse question formats - capabilities not previously demonstrated in this domain. These results highlight the potential of autoregressive vision-language models as a foundation for interactive echocardiographic interpretation, representing an early step toward scalable, auditable artificial intelligence systems in cardiology practice.

View details for DOI 10.64898/2026.03.14.26348388

View details for PubMedID 41891021

View details for PubMedCentralID PMC13015684
SocialGen: Modeling Multi-Human Social Interaction with Language Models. Proceedings. International Conference on 3D Vision Yu, H., Zhang, J., Chen, C., Xiang, T., Fang, Y., Niebles, J. C., Adeli, E. 2026; 2026: 1844-1860

Abstract

Human interactions in everyday life are inherently social, involving engagements with diverse individuals across various contexts. Modeling these social interactions is fundamental to a wide range of real-world applications. In this paper, we introduce SocialGen, the first unified motion-language model capable of modeling interaction behaviors among varying numbers of individuals, to address this crucial yet challenging problem. Unlike prior methods that are limited to two-person interactions, we propose a novel social motion representation that supports tokenizing the motions of an arbitrary number of individuals and aligning them with the language space. This alignment enables the model to leverage rich, pretrained linguistic knowledge to better understand and reason about human social behaviors. To tackle the challenges of data scarcity, we curate a comprehensive multi-human interaction dataset, SocialX, enriched with textual annotations. Leveraging this dataset, we establish the first comprehensive benchmark for multi-human interaction tasks. Our method achieves state-of-the-art performance across motion-language tasks, setting a new standard for multi-human interaction modeling. Project page: socialgenx.github.io.

View details for DOI 10.1109/3dv69130.2026.00175

View details for PubMedID 42221196

View details for PubMedCentralID PMC13218778
Designing AI-Enabled Video Monitoring Clinician Dashboard for Neuropsychiatric Symptoms: A Survey of User Needs. The American journal of geriatric psychiatry. Open science, education, and practice Gould, C. E., Davis, C. H., Schüz, N., Lin, F. V., Samus, Q. M., Terada, T., Daniel, M., Tee, S., Adeli, E. 2026; 9: 46-51

Abstract

This study aimed to gather input from clinicians who assess and treat neuropsychiatric symptoms (NPS) to inform the development of a clinician dashboard to accompany an AI-enabled video-based monitoring system. The clinician survey inquired about the importance of tracking different NPS and about additional information or features desired for the dashboard. Responses (n = 28) were grouped into prescribing and nonprescribing clinicians for sensitivity analyses. The most important NPS to be detected were agitation/aggression, nighttime behaviors, depression, and anxiety. Multiple environmental factors were endorsed as being very important, including behavior frequency, intensity, and time of day. Findings demonstrate that the desired features of the dashboard were consistent across both prescribing and nonprescribing clinicians. Notably, some of the important symptoms and features that clinicians desired in a dashboard could not be extracted from existing sensor-based systems, but would be possible with an AI-enabled video monitoring system.

View details for DOI 10.1016/j.osep.2026.01.001

View details for PubMedID 41987876

View details for PubMedCentralID PMC13076236
Cycle Diffusion Model for Counterfactual Image Generation Huang, F., Wang, A., Li, B., Trang, B., Yesiloglu, R., Hua, T., Peng, W., Adeli, E. edited by Rekik, Adeli, E., Park, S. H., Cintas, C. SPRINGER INTERNATIONAL PUBLISHING AG. 2026: 173-185

View details for DOI 10.1007/978-3-032-07904-6_16

View details for Web of Science ID 001720954600016
WASABI: A Metric for Evaluating Morphometric Plausibility of Synthetic Brain MRIs Jafrasteh, B., Peng, W., Wan, C., Luo, Y., Adeli, E., Zhao, Q. edited by Gee, J. C., Alexander, D. C., Hong, J., Iglesias, J. E., Sudre, C. H., Venkataraman, A., Golland, P., Kim, J. H., Park, J. SPRINGER INTERNATIONAL PUBLISHING AG. 2026: 684-694

View details for DOI 10.1007/978-3-032-04937-7_65

View details for Web of Science ID 001596376900065
Generating Novel Brain Morphology by Deforming Learned Templates Wang, A. Q., Huang, F., Trang, B., Peng, W., Abbasi, M., Pohl, K., Sabuncu, M. R., Adeli, E. edited by Gee, J. C., Alexander, D. C., Hong, J., Iglesias, J. E., Sudre, C. H., Venkataraman, A., Golland, P., Kim, J. H., Park, J. SPRINGER INTERNATIONAL PUBLISHING AG. 2026: 207-217

View details for DOI 10.1007/978-3-032-04937-7_20

View details for Web of Science ID 001596376900020
Neural Autoregressive Modeling of Brain Aging Yesiloglu, R., Peng, W., Islam, M., Adeli, E. edited by Mukhopadhyay, A., Oksuz, Engelhardt, S., Mehrof, D., Yuan, Y. SPRINGER INTERNATIONAL PUBLISHING AG. 2026: 341-350

View details for DOI 10.1007/978-3-032-05472-2_33

View details for Web of Science ID 001679945500033
Person-Centered Noninvasive Brain Stimulation for Aging-Related Neurological and Mental Disorders: A Multi-Dimensional Framework for Designing Protocols. Neuroscience and biobehavioral reviews Zhou, S., Keller, C. J., Chen, N. F., Adeli, E., Lin, F. V. 2025: 106528

Abstract

Noninvasive brain stimulation (NIBS) shows significant promise for treating aging-related neurological and mental disorders (ANMDs). However, its clinical benefits vary widely across individuals due to the highly heterogeneous nature of patient traits and states. Person-centered NIBS - defined as stimulation protocols tailored to the inter- and intra-individual variability in brain structure, function, and behavioral context - may help address this heterogeneity and enhance the clinical benefits of NIBS. While current clinical guidelines provide evidence-based standards for managing complex clinical realities and guiding practitioners, existing personalization strategies are isolated and lack the systematic framework needed to design truly person-centered NIBS. We propose a multi-dimensional framework for person-centered NIBS protocol designs, systematically targeting inter-individual differences and intra-individual variations in patients, to enhance the availability and engagement of neural resources supporting the improvement of cognition, mood, and motor functions in ANMDs. We provide a systematic review of current personalization strategies and the efficacy of personalized NIBS in ANMDs. We then present a multi-dimensional conceptual framework of person-centered NIBS intended to guide future protocol development for systematically targeting inter-individual differences and intra-individual variations in patients.
Complementing this, the operational framework provides a perspective-based, structured approach to personalize NIBS target and dosage (stimulation parameters and session scheduling), encompassing (1) structural and functional neuroimaging for target localization and parameter initialization, (2) closed-loop NIBS with multimodal neurobehavioral monitoring for real-time parameter adjustment, (3) dose-response modeling for between-session parameter adaption, and (4) complementary pharmacological and behavioral interventions. We envision that the proposed multi-dimensional framework could guide person-centered NIBS with enhanced effectiveness of NIBS in ANMDs.

View details for DOI 10.1016/j.neubiorev.2025.106528

View details for PubMedID 41423000
EchoGraph system for automated quality assessment of echocardiography reports. NPJ digital medicine Chao, C. J., Delbrouck, J. B., Asadi, M., Banerjee, I., Farina, J. M., Galasso, F., Mahmoud, A. K., Abbas, M. T., Wang, Y. C., Arsanjani, R., Kane, G. C., Oh, J. K., Erickson, B. J., Fei-Fei, L., Adeli, E., Langlotz, C. 2025

Abstract

Generative AI needs automatic clinical text accuracy metrics, but none exist for echocardiography. To address this, we developed EchoGraph, a BERT-based model trained on 600 densely annotated echocardiography reports from the Mayo Clinic (2017), split 7:2:1 for training, validation, and testing, using a tailored schema with 48,256 entities and 29,731 relations annotated. Sixty random MIMIC-EchoNote reports were annotated (3672 entities and 2360 relations) for external validation. EchoGraph demonstrated strong performance predicting entities (micro F1 0.85) and relations (micro F1 0.70), maintaining performance on external validation (entity micro F1 0.80, relation micro F1 0.52). EchoGraph F1 score showed superior error sensitivity versus RadGraph F1, with 2.8-fold higher slope magnitude (-0.817 vs -0.291) and better variance explained (R2 = 0.803 vs 0.578). EchoGraph offers an effective solution for evaluating language model-based echocardiography applications, supporting more accurate AI-generated reports.

View details for DOI 10.1038/s41746-025-02140-w

View details for PubMedID 41372462
Brain-Cognition Fingerprinting via Graph-GCCA with Contrastive Learning. Machine learning in clinical neuroimaging : 7th international workshop, MLCN 2024, held in conjunction with MICCAI 2024, Marrakesh, Morocco, October 10, 2024, proceedings. MLCN (Workshop) (7th : 2024 : Marrakesh, Morocco) Wang, Y., Peng, W., Zhang, Y., Adeli, E., Zhao, Q., Pohl, K. M. 2025; 15266: 24-34

Abstract

Many longitudinal neuroimaging studies aim to improve the understanding of brain aging and diseases by studying the dynamic interactions between brain function and cognition. Doing so requires accurate encoding of their multidimensional relationship while accounting for individual variability over time. For this purpose, we propose an unsupervised learning model (called Contrastive Learning-based Graph Generalized Canonical Correlation Analysis (CoGraCa)) that encodes their relationship via Graph Attention Networks and generalized Canonical Correlational Analysis. To create brain-cognition fingerprints reflecting the unique neural and cognitive phenotype of each person, the model also relies on individualized and multimodal contrastive learning. We apply CoGraCa to a longitudinal dataset of healthy individuals consisting of resting-state functional MRI and cognitive measures acquired at multiple visits for each participant. The generated fingerprints effectively capture significant individual differences and outperform current single-modal and CCA-based multimodal models in identifying sex and age. More importantly, our encoding provides interpretable interactions between those two modalities.

View details for DOI 10.1007/978-3-031-78761-4_3

View details for PubMedID 39872150

View details for PubMedCentralID PMC11772010
Discovering Latent Graphs with GFlowNets for Diverse Conditional Image Generation. Advances in neural information processing systems Trang, B., Saremi, P., Wang, A. Q., Huang, F., TehraniNasab, Z., Kumar, A., Arbel, T., Fei-Fei, L., Adeli, E. 2025; 38: 104553-104599

Abstract

Capturing diversity is crucial in conditional and prompt-based image generation, particularly when conditions contain uncertainty that can lead to multiple plausible outputs. To generate diverse images reflecting this diversity, traditional methods often modify random seeds, making it difficult to discern meaningful differences between samples, or diversify the input prompt, which is limited in verbally interpretable diversity. We propose Rainbow, a novel conditional image generation framework, applicable to any pretrained conditional generative model, that addresses inherent condition/prompt uncertainty and generates diverse plausible images. Rainbow is based on a simple yet effective idea: decomposing the input condition into diverse latent representations, each capturing an aspect of the uncertainty and generating a distinct image. First, we integrate a latent graph, parameterized by Generative Flow Networks (GFlowNets), into the prompt representation computation. Second, leveraging GFlowNets' advanced graph sampling capabilities to capture uncertainty and output diverse trajectories over the graph, we produce multiple trajectories that collectively represent the input condition, leading to diverse condition representations and corresponding output images. Evaluations on natural image and medical image datasets demonstrate Rainbow's improvement in both diversity and fidelity across image synthesis, image generation, and counterfactual generation tasks.

View details for PubMedID 42088644
Guest Editorial: Applications of Intelligent Environments to Health IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS Hornos, M. J., Adeli, E., Zamudio, V. M. 2025; 29 (12): 9195-9197

View details for DOI 10.1109/JBHI.2025.3633965

View details for Web of Science ID 001640358300024
A multi-dimensional transfer learning framework for studying reward-guided behaviors across species NATURE MENTAL HEALTH Liu, Y., Turnbull, A., Adeli, E., Zhao, G., Wang, K., Vankee-Lin, F. 2025

View details for DOI 10.1038/s44220-025-00547-8

View details for Web of Science ID 001626268600001
Neurocognitive Latent Space Regularization for Multi-Label Diagnosis from MRI. Predictive Intelligence in Medicine. PRIME (Workshop) Manasseh-Lewis, J., Godoy, F., Peng, W., Paul, R., Adeli, E., Pohl, K. 2025; 15155: 185-195

Abstract

Interpretability is essential to MRI brain studies relying on deep learning models for neuroscientific discovery. One way to facilitate the interpretability of a deep learning model is to ensure the samples are arranged in the model's latent space with respect to clinically meaningful variables. To achieve this in the context of cross-sectional brain MRI studies, we regularize the latent space of a multi-label classifier via pairwise disentanglement, so that the difference between the representation of two brain MRIs along the disentangled direction in the latent space is similar to the difference in their neuropsychological test scores. We apply our technique to classify brain MRIs of 156 controls, 165 cases diagnosed with mild cognitive impairment (MCI), 166 diagnosed with human immunodeficiency virus (HIV)-associated cognitive disorder (HAND), and 32 individuals diagnosed with HIV without HAND. The latent space is disentangled with respect to the neuropsychological z-score (NPZ), which is negatively correlated with the severity of cognitive impairment (i.e., low scores for those diagnosed with MCI or HAND). Based on cross-validation, the proposed model achieves statistically significantly higher balanced accuracy than the same model without disentanglement. Furthermore, the difference between representations along the disentangled direction significantly correlates with the difference in NPZ. Finally, the brain regions guiding the classification process aligned with the neuroscientific literature.

View details for DOI 10.1007/978-3-031-74561-4_16

View details for PubMedID 40365134

View details for PubMedCentralID PMC12068855
Spectral Graph Sample Weighting for Interpretable Sub-cohort Analysis in Predictive Models for Neuroimaging. PRedictive Intelligence in MEdicine. PRIME (Workshop) Paschali, M., Jiang, Y. H., Siegel, S., Gonzalez, C., Pohl, K. M., Chaudhari, A., Zhao, Q. 2025; 15155: 24-34

Abstract

Recent advancements in medicine have confirmed that brain disorders often comprise multiple subtypes of mechanisms, developmental trajectories, or severity levels. Such heterogeneity is often associated with demographic aspects (e.g., sex) or disease-related contributors (e.g., genetics). Thus, the predictive power of machine learning models used for symptom prediction varies across subjects based on such factors. To model this heterogeneity, one can assign each training sample a factor-dependent weight, which modulates the subject's contribution to the overall objective loss function. To this end, we propose to model the subject weights as a linear combination of the eigenbases of a spectral population graph that captures the similarity of factors across subjects. In doing so, the learned weights smoothly vary across the graph, highlighting sub-cohorts with high and low predictability. Our proposed sample weighting scheme is evaluated on two tasks. First, we predict initiation of heavy alcohol drinking in young adulthood from imaging and neuropsychological measures from the National Consortium on Alcohol and NeuroDevelopment in Adolescence (NCANDA). Next, we detect Dementia vs. Mild Cognitive Impairment (MCI) using imaging and demographic measurements in subjects from the Alzheimer's Disease Neuroimaging Initiative (ADNI). Compared to existing sample weighting schemes, our sample weights improve interpretability and highlight sub-cohorts with distinct characteristics and varying model accuracy.

View details for DOI 10.1007/978-3-031-74561-4_3

View details for PubMedID 39525051

View details for PubMedCentralID PMC11549025
Regularized CCA identifies sex-specific brain-behavior associations in adolescent psychopathology. Translational psychiatry Milecki, L., Gonzalez, C., Adeli, E., Nooner, K. B., Sabuncu, M. R., Kuceyeski, A., Zhao, Q. 2025; 15 (1): 405

Abstract

Adolescence is a critical period of neural development and a sensitive window for the emergence of psychiatric symptoms. Resting-state functional MRI (rs-fMRI) provides a unique opportunity to investigate brain-behavior associations. However, the role of sex-specific differences in these associations remains underexplored, despite their potential to reveal heterogeneous neurobiological mechanisms and guide personalized interventions. In this study, we analyzed data from the Adolescent Brain Cognitive Development (ABCD) Study, comprising 7,892 adolescents (9-10 years old, 3,896 females). Using Canonical Correlation Analysis (CCA) and a rigorous cross-validation framework, we identified associations between cortical-to-cortical (Cor-Cor) and cortical-to-subcortical (Cor-Sub) functional connectivity and eight symptom domains from the Child Behavior Checklist (CBCL). Unlike previous approaches, we directly examined sex differences within the brain-behavior mappings by applying separate CCA models in boys and girls to uncover differential connectivity-behavior relationships. Our analysis uncovered two reproducible components for both Cor-Cor and Cor-Sub mappings on the whole cohort (r1 = 0.130, p < 0.001, r2 = 0.122, p < 0.01 for Cor-Cor; r1 = 0.157, p < 0.001, r2 = 0.115, p < 0.01 for Cor-Sub). Importantly, sex-stratified analyses revealed distinct patterns of brain-behavior associations. Among females, high loadings on attention and thought problems were linked to high loadings on default mode network, whereas in males, attention and thought problems were linked to sensorimotor networks. Compared to females, males also had higher loadings on internalizing symptoms, such as anxious/depressed and withdrawn/depressed symptoms, coupled with lower loadings on putamen and hippocampus functional connectivity. 
These findings suggest there may be fundamentally different brain-behavior mappings across the sexes in adolescence, in addition to previously reported sex differences in functional connectivity and behavioral profiles. By revealing sex-specific neural correlates of psychiatric symptoms in early adolescence, this study paves the way for sex-informed strategies in clinical risk assessment and personalized treatment design.

View details for DOI 10.1038/s41398-025-03678-9

View details for PubMedID 41107246

View details for PubMedCentralID 2762785
Repurposing 2D Diffusion Models with Gaussian Atlas for 3D Generation. ... IEEE International Conference on Computer Vision workshops. IEEE International Conference on Computer Vision Xiang, T., Li, K., Long, C., Hane, C., Guo, P., Delp, S., Adeli, E., Fei-Fei, L. 2025; 2025: 16492-16502

Abstract

Recent advances in text-to-image diffusion models have been driven by the increasing availability of paired 2D data. However, the development of 3D diffusion models has been hindered by the scarcity of high-quality 3D data, resulting in less competitive performance compared to their 2D counterparts. To address this challenge, we propose repurposing pre-trained 2D diffusion models for 3D object generation. We introduce Gaussian Atlas, a novel representation that utilizes dense 2D grids, enabling the fine-tuning of 2D diffusion models to generate 3D Gaussians. Our approach demonstrates successful transfer learning from a pre-trained 2D diffusion model to a 2D manifold flattened from 3D structures. To support model training, we compile GaussianVerse, a large-scale dataset comprising 205K high-quality 3D Gaussian fittings of various 3D objects. Our experimental results show that text-to-image diffusion models can be effectively adapted for 3D content generation, bridging the gap between 2D and 3D modeling.

View details for PubMedID 41938265
Personalized cognitive enhancement for older adults: An aging-friendly closed-loop human-machine interface framework. Ageing research reviews Zhou, S., Liu, Y., Turnbull, A., Tapparello, C., Adeli, E., Lin, F. V. 2025: 102877

Abstract

Emerging digitally delivered non-pharmacological interventions (dNPIs) offer scalable, low-risk solutions for enhancing cognitive function in older adults, yet their effectiveness remains inconsistent due to a lack of personalization and precise mechanisms of action. Generic, population-based designs often fail to predict individual gains, underscoring the need for more tailored approaches. To address this, we propose a closed-loop human-machine interface (HMI) framework for personalizing dNPIs by optimizing the engagement of neurocognitive resources for cognitive enhancement. Our framework tackles three major challenges: (1) comprehensive and effective neurobehavioral representations for cognitive decoding, (2) tailoring interventions for domain-specific cognitive processes, and (3) ensuring aging-friendly design on usability, validity, and reliability for long-term adherence. We provide reviews and perspectives to guide the development of closed-loop HMIs by outlining the operational details of three key components (sensor, controller, and external actuator) that monitor, analyze, and modulate neurobehavioral activities through real-time adaptive interventions. Centering on neurobehavioral characteristics of older adults, we propose to advance closed-loop HMIs toward (1) deploying a multimodal sensor network that captures activities from both central and peripheral nervous systems, (2) artificial intelligence (AI)-powered cognitive decoding and modulation that integrates multi-modal easy-to-acquire neurobehavioral signals and predicts the cross-modal harder-to-acquire signals, and (3) targeting neurobehavioral processes via internal and/or external regulation. We envision that the proposed closed-loop HMI framework could provide personalized dNPI with enhanced effectiveness and scalability for cognitive enhancement in older adults, promoting brain resilience and healthy longevity in the aging population.

View details for DOI 10.1016/j.arr.2025.102877

View details for PubMedID 40850344
Statistical variability in comparing accuracy of neuroimaging based classification models via cross validation. Scientific reports Jafrasteh, B., Adeli, E., Pohl, K. M., Kuceyeski, A., Sabuncu, M. R., Zhao, Q. 2025; 15 (1): 28745

Abstract

Machine learning (ML) has significantly transformed biomedical research, leading to a growing interest in model development to advance classification accuracy in various clinical applications. However, this progress raises essential questions regarding how to rigorously compare the accuracy of different ML models. In this study, we highlight the practical challenges in quantifying the statistical significance of accuracy differences between two neuroimaging-based classification models when cross-validation (CV) is performed. Specifically, we propose an unbiased framework to assess the impact of CV setups (e.g., the number of folds) on the statistical significance. We apply this framework to three publicly available neuroimaging datasets to re-emphasize known flaws in current computation of p-values for comparing model accuracies. We further demonstrate that the likelihood of detecting significant differences among models varies substantially with the intrinsic properties of the data, testing procedures, and CV configurations of choice. Given that many of the above factors do not typically fall into the evaluation criteria of ML-based biomedical studies, we argue that such variability can potentially lead to p-hacking and inconsistent conclusions on model improvement. The obtained results from this study underscore that more rigorous practices in model comparison are urgently needed in order to mitigate the reproducibility crisis in biomedical ML research.

View details for DOI 10.1038/s41598-025-12026-2

View details for PubMedID 40769991
Efficient one-shot federated learning on medical data using knowledge distillation with image synthesis and client model adaptation. Medical image analysis Kang, M., Chikontwe, P., Kim, S., Jin, K. H., Adeli, E., Pohl, K. M., Park, S. H. 2025; 105: 103714

Abstract

One-shot federated learning (FL) has emerged as a promising solution in scenarios where multiple communication rounds are not practical. Though previous methods using knowledge distillation (KD) with synthetic images have shown promising results in transferring clients' knowledge to the global model on one-shot FL, overfitting and extensive computations still persist. To tackle these issues, we propose a novel one-shot FL framework that generates pseudo intermediate samples using mixup, which incorporates synthesized images with diverse types of structure noise. This approach (i) enhances the diversity of training samples, preventing overfitting and providing informative visual clues for effective training and (ii) allows for the reuse of synthesized images, reducing computational resources and improving overall training efficiency. To mitigate domain disparity introduced by noise, we design noise-adapted client models by updating batch normalization statistics on noise to enhance KD. With these in place, the training process involves iteratively updating the global model through KD with both the original and noise-adapted client models using pseudo-generated images. Extensive evaluations on five small-sized and three regular-sized medical image classification datasets demonstrate the superiority of our approach over previous methods.

View details for DOI 10.1016/j.media.2025.103714

View details for PubMedID 40674892
Confounder-Free Continual Learning via Recursive Feature Normalization. Proceedings of machine learning research Shah, Y., Gonzalez, C., Abbasi, M. H., Zhao, Q., Pohl, K. M., Adeli, E. 2025; 267: 54112-54142

Abstract

Confounders are extraneous variables that affect both the input and the target, resulting in spurious correlations and biased predictions. There are recent advances in dealing with or removing confounders in traditional models, such as metadata normalization (MDN), where the distribution of the learned features is adjusted based on the study confounders. However, in the context of continual learning, where a model learns continuously from new data over time without forgetting, learning feature representations that are invariant to confounders remains a significant challenge. To remove their influence from intermediate feature representations, we introduce the Recursive MDN (R-MDN) layer, which can be integrated into any deep learning architecture, including vision transformers, and at any model stage. R-MDN performs statistical regression via the recursive least squares algorithm to maintain and continually update an internal model state with respect to changing distributions of data and confounding variables. Our experiments demonstrate that R-MDN promotes equitable predictions across population groups, both within static learning and across different stages of continual learning, by reducing catastrophic forgetting caused by confounder effects changing over time.

View details for DOI 10.1609/aaai.v32i1.11792

View details for PubMedID 41574232

View details for PubMedCentralID PMC12823023
Artificial Intelligence in Obsessive-Compulsive Disorder: A Systematic Review. Current treatment options in psychiatry Kim, J., Pacheco, J. P., Golden, A., Aboujaoude, E., van Roessel, P., Gandhi, A., Mukunda, P., Avanesyan, T., Xue, H., Adeli, E., Kim, J. P., Saggar, M., Stirman, S. W., Kuhn, E., Supekar, K., Pohl, K. M., Rodriguez, C. I. 2025; 12 (1): 23

Abstract

Obsessive-compulsive disorder (OCD) is a chronic and disabling condition, often leading to significant functional impairments. Despite its early onset, there is an average delay of 17 years from symptom onset to diagnosis and treatment, resulting in poorer outcomes. This systematic review aims to synthesize current findings on the application of AI in OCD, highlighting opportunities for early symptom detection, scalable therapy training, clinical decision support, novel therapeutics, computer vision-based approaches, and multimodal biomarker discovery. While previous reviews focused on biomarker-based OCD detection and treatment using machine learning (ML), the findings of the current review add information about novel applications of deep learning technology, specifically generative artificial intelligence (GenAI) and natural language processing (NLP). Among the included 13 articles, most studies (84.6%) utilized secondary data analyses, primarily through GenAI/NLP. Nearly 77% of these studies were published in the past two years, with high quality of evidence. The primary focus areas were enhancing treatment and management, and timely OCD detection (both 38.5%); followed by AI tool development for broader mental health applications. AI technologies offer transformative potential for improvements related to OCD if diagnosis occurs earlier after onset; thereby lessening the consequential economic burden. Prioritizing investment in ethically sound AI research could significantly improve OCD outcomes in mental health care. The online version contains supplementary material available at 10.1007/s40501-025-00359-8.

View details for DOI 10.1007/s40501-025-00359-8

View details for PubMedID 40524733

View details for PubMedCentralID PMC12167270
Foundation versus domain-specific models for left ventricular segmentation on cardiac ultrasound. NPJ digital medicine Chao, C. J., Gu, Y. R., Kumar, W., Xiang, T., Appari, L., Wu, J., Farina, J. M., Wraith, R., Jeong, J., Arsanjani, R., Kane, G. C., Oh, J. K., Langlotz, C. P., Banerjee, I., Fei-Fei, L., Adeli, E. 2025; 8 (1): 341

Abstract

The Segment Anything Model (SAM) was fine-tuned on the EchoNet-Dynamic dataset and evaluated on external transthoracic echocardiography (TTE) and Point-of-Care Ultrasound (POCUS) datasets from CAMUS (University Hospital of St Etienne) and Mayo Clinic (99 patients: 58 TTE, 41 POCUS). Fine-tuned SAM was superior or comparable to MedSAM. The fine-tuned SAM also outperformed EchoNet and U-Net models, demonstrating strong generalization, especially on apical 2-chamber (A2C) images (fine-tuned SAM vs. EchoNet: CAMUS-A2C: DSC 0.891 ± 0.040 vs. 0.752 ± 0.196, p < 0.0001) and POCUS (DSC 0.857 ± 0.047 vs. 0.667 ± 0.279, p < 0.0001). Additionally, SAM-enhanced workflow reduced annotation time by 50% (11.6 ± 4.5 sec vs. 5.7 ± 1.7 sec, p < 0.0001) while maintaining segmentation quality. We demonstrated an effective strategy for fine-tuning a vision foundation model for enhancing clinical workflow efficiency and supporting human-AI collaboration.

View details for DOI 10.1038/s41746-025-01730-y

View details for PubMedID 40481190

View details for PubMedCentralID PMC12144204
Identifying unique pathways towards heavy alcohol drinking during adolescence Gonzalez, C., Nguyen-Louie, T., Sullivan, E., Pfefferbaum, A., Zhao, Q., Adeli, E., Tapert, S., Pohl, K. WILEY. 2025

View details for Web of Science ID 001635196501165
Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis. Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition Yeganeh, Y., Farshad, A., Charisiadis, I., Hasny, M., Hartenberger, M., Ommer, B., Navab, N., Adeli, E. 2025; 2025: 7685-7695

Abstract

Scaling by training on large datasets has been shown to enhance the quality and fidelity of image generation and manipulation with diffusion models; however, such large datasets are not always accessible in medical imaging due to cost and privacy issues, which contradicts one of the main applications of such models to produce synthetic samples where real data is scarce. Also, fine-tuning on pre-trained general models has been a challenge due to the distribution shift between the medical domain and the pre-trained models. Here, we propose Latent Drift (LD) for diffusion models that can be adopted for any fine-tuning method to mitigate the issues faced by the distribution shift or employed in inference time as a condition. Latent Drifting enables diffusion models to be conditioned for medical images fitted for the complex task of counterfactual image generation, which is crucial to investigate how parameters such as gender, age, and adding or removing diseases in a patient would alter the medical images. We evaluate our method on three public longitudinal benchmark datasets of brain MRI and chest X-rays for counterfactual image generation. Our results demonstrate significant performance gains in various scenarios when combined with different fine-tuning schemes.

View details for DOI 10.1109/CVPR52734.2025.00720

View details for PubMedID 42112522

View details for PubMedCentralID PMC13155541
The Language of Motion: Unifying Verbal and Non-verbal Language of 3D Human Motion. Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition Chen, C., Zhang, J., Lakshmikanth, S. K., Fang, Y., Shao, R., Wetzstein, G., Fei-Fei, L., Adeli, E. 2025; 2025: 6200-6211

Abstract

Human communication is inherently multimodal, involving a combination of verbal and non-verbal cues such as speech, facial expressions, and body gestures. Modeling these behaviors is essential for understanding human interaction and for creating virtual characters that can communicate naturally in applications like games, films, and virtual reality. However, existing motion generation models are typically limited to specific input modalities-either speech, text, or motion data-and cannot fully leverage the diversity of available data. In this paper, we propose a novel framework that unifies verbal and non-verbal language using multimodal language models for human motion understanding and generation. This model is flexible in taking text, speech, and motion or any combination of them as input. Coupled with our novel pre-training strategy, our model not only achieves state-of-the-art performance on co-speech gesture generation but also requires much less data for training. Our model also unlocks an array of novel tasks such as editable gesture generation and emotion prediction from motion. We believe unifying the verbal and non-verbal language of human motion is essential for real-world applications, and language models offer a powerful approach to achieving this goal. Project page: languageofmotion.github.io.

View details for DOI 10.1109/CVPR52734.2025.00581

View details for PubMedID 42112521

View details for PubMedCentralID PMC13155610
AdaVid: Adaptive Video-Language Pretraining. Conference on Computer Vision and Pattern Recognition Workshops. IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Workshops Patel, C., Niebles, J. C., Adeli, E. 2025; 2025: 6379-6388

Abstract

Contrastive video-language pretraining has demonstrated great success in learning rich and robust video representations. However, deploying such video encoders on compute-constrained edge devices remains challenging due to their high computational demands. Additionally, existing models are typically trained to process only short video clips, often limited to 4 to 64 frames. In this paper, we introduce AdaVid, a flexible architectural framework designed to learn efficient video encoders that can dynamically adapt their computational footprint based on available resources. At the heart of AdaVid is an adaptive transformer block, inspired by Matryoshka Representation Learning, which allows the model to adjust its hidden embedding dimension at inference time. We show that AdaVid-EgoVLP, trained on video-narration pairs from the large-scale Ego4D dataset, matches the performance of the standard EgoVLP on short video-language benchmarks using only half the compute, and even outperforms EgoVLP when given equal computational resources. We further explore the trade-off between frame count and compute on the challenging Diving48 classification benchmark, showing that AdaVid enables the use of more frames without exceeding computational limits. To handle longer videos, we also propose a lightweight hierarchical network that aggregates short clip features, achieving a strong balance between compute efficiency and accuracy across several long video benchmarks.
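
The idea of adjusting the hidden embedding dimension at inference time can be illustrated with a small sketch. This is not AdaVid's adaptive transformer block; it is a hypothetical nested linear layer in the spirit of Matryoshka Representation Learning, assuming a square weight matrix whose leading sub-block is usable on its own. The names `nested_linear` and `multiply_adds` are introduced here for illustration only.

```python
def nested_linear(weights, bias, x, width):
    """Apply only the leading width x width sub-block of a square linear layer.

    Assumption: the layer was trained so that every leading sub-block is a
    usable smaller layer (the nested, Matryoshka-style property).
    """
    return [
        sum(weights[i][j] * x[j] for j in range(width)) + bias[i]
        for i in range(width)
    ]


def multiply_adds(width):
    """Multiply-accumulate count of one width x width linear layer."""
    return width * width


# Toy 4 x 4 layer; the values are arbitrary and only serve the illustration.
W = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
b = [0, 0, 0, 0]
x = [1, 1, 1, 1]

full = nested_linear(W, b, x, width=4)   # uses the whole layer
small = nested_linear(W, b, x, width=2)  # uses only the top-left 2 x 2 block
```

In this toy layer, halving the width cuts the multiply-adds of the layer to a quarter, which is the kind of compute and accuracy trade-off the abstract describes.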

View details for DOI 10.1109/cvprw67362.2025.00635

View details for PubMedID 41937865
Evaluating large language models in echocardiography reporting: opportunities and challenges. European heart journal. Digital health Chao, C. J., Banerjee, I., Arsanjani, R., Ayoub, C., Tseng, A., Delbrouck, J. B., Kane, G. C., Lopez-Jimenez, F., Attia, Z., Oh, J. K., Erickson, B., Fei-Fei, L., Adeli, E., Langlotz, C. 2025; 6 (3): 326-339

Abstract

The increasing need for diagnostic echocardiography tests presents challenges in preserving the quality and promptness of reports. While Large Language Models (LLMs) have proven effective in summarizing clinical texts, their application in echo remains underexplored. Adult echocardiography studies, conducted at the Mayo Clinic from 1 January 2017 to 31 December 2017, were categorized into two groups: development (all Mayo locations except Arizona) and Arizona validation sets. We adapted open-source LLMs (Llama-2, MedAlpaca, Zephyr, and Flan-T5) using In-Context Learning and Quantized Low-Rank Adaptation fine-tuning (FT) for echo report summarization from 'Findings' to 'Impressions.' Against cardiologist-generated Impressions, the models' performance was assessed both quantitatively with automatic metrics and qualitatively by cardiologists. The development dataset included 97 506 reports from 71 717 unique patients, predominantly male (55.4%), with an average age of 64.3 ± 15.8 years. EchoGPT, a fine-tuned Llama-2 model, outperformed other models with win rates ranging from 87% to 99% in various automatic metrics, and produced reports comparable to cardiologists in qualitative review (significantly preferred in conciseness (P < 0.001), with no significant preference in completeness, correctness, and clinical utility). Correlations between automatic and human metrics were fair to modest, with the best being RadGraph F1 scores vs. clinical utility (r = 0.42), and automatic metrics showed insensitivity (0-5% drop) to changes in measurement numbers. EchoGPT can generate draft reports for human review and approval, helping to streamline the workflow. However, scalable evaluation approaches dedicated to echo reports remain necessary.

View details for DOI 10.1093/ehjdh/ztae086

View details for PubMedID 40395412

View details for PubMedCentralID PMC12088711
Non-parametric prediction of brain MRI microstructure using transfer learning. Imaging neuroscience (Cambridge, Mass.) Chau Loo Kung, G., Weber, E. M., Batra, A., Ni, L., Zeineh, M., Chaudhari, A., Adeli, E., Knowles, J. K., McNab, J. A. 2025; 3

Abstract

Magnetic resonance imaging (MRI) can be sensitive to tissue microstructural features and infer parameterized features by performing a voxel-wise fit of the signal to a biophysical model. However, biophysical models rely on simplified representations of brain tissue. Machine learning (ML) techniques may serve as a data-driven approach to optimize for microstructural feature extraction. Unfortunately, training an ML model for these applications requires a large database of paired specimen MRI and histology datasets, which is costly, cumbersome, and challenging to acquire. In this work, we present a novel approach allowing a reliable estimation of brain tissue microstructure using MRI as inputs, with a minimal amount of paired MRI-histology data. Our method involves pretraining a conditional normalizing flow model to predict the distribution of microstructural features. The model is trained on synthetic MRI data generated from unpaired histology and MRI physics, reducing the data requirement in future steps. The synthetic MRI generation data combines segmentation of a publicly available EM slice, feature extraction and MRI simulators. Subsequently, the model is fine-tuned using experimental MRI/Electron Microscopy (EM) data of nine excised mouse brains through transfer learning. This approach enables the prediction of non-parameterized joint distributions of g-ratio and axon diameters for a given voxel based on MRI input. Results show a close agreement between the distributions predicted by the network and the EM ground-truth histograms (mean Jensen-Shannon Distances of 0.24 and 0.23 on the test set, for axon diameter and g-ratios respectively, compared to distances of 0.18 and 0.18 of a direct fitting of a Gamma distribution to the ground truth). 
The approach also shows up to 4% decreased mean percent errors of the distributions compared to biophysical model fitting and increased prediction capabilities that are consistent with electron microscopy validation and previous biological studies. For example, g-ratio values predicted along the corpus callosum anterior-posterior axis show a significant difference for mice after myelin remodeling seizures are well established (p < 0.001) but not before seizure onset (p = 0.562). The results suggest that pretraining on synthetic MRI and then using transfer learning is an effective approach for addressing the lack of paired MRI/histology data when training ML models for microstructure prediction. This approach is a step toward developing a versatile and widely used foundation model for predicting microstructural features using MRI.
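
The Jensen-Shannon distance used above to compare predicted distributions with EM histograms can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it assumes both distributions are given as histograms over the same bins, and it assumes a base-2 logarithm (so the distance lies between 0 and 1), which the abstract does not specify.

```python
import math


def js_distance(p, q):
    """Jensen-Shannon distance between two histograms on the same bins.

    Assumptions: non-negative bin counts with a positive total, and a
    base-2 logarithm, so that the result lies in [0, 1].
    """
    total_p, total_q = sum(p), sum(q)
    p = [x / total_p for x in p]
    q = [x / total_q for x in q]
    m = [(a + b) / 2 for a, b in zip(p, q)]  # mixture distribution

    def kl(a, b):
        # Kullback-Leibler divergence; empty bins of `a` contribute zero.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(0.0, divergence))


same = js_distance([1, 2, 3], [1, 2, 3])  # identical histograms
disjoint = js_distance([1, 0], [0, 1])    # no overlapping mass
```

Identical histograms give a distance of 0 and histograms with no overlapping mass give 1, so under these assumptions the reported values of roughly 0.2 sit toward the similar end of the scale.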

View details for DOI 10.1162/imag_a_00548

View details for PubMedID 40800932

View details for PubMedCentralID PMC12320010
Non-parametric prediction of brain MRI microstructure using transfer learning IMAGING NEUROSCIENCE Kung, G., Weber, E. M. M., Batra, A., Ni, L., Zeineh, M., Chaudhari, A., Adeli, E., Knowles, J. K., McNab, J. A. 2025; 3

View details for DOI 10.1162/imag_a_00548

View details for Web of Science ID 001521331500001
UniEgoMotion: A Unified Model for Egocentric Motion Reconstruction, Forecasting, and Generation. Proceedings. IEEE International Conference on Computer Vision Patel, C., Nakamura, H., Kyuragi, Y., Kozuka, K., Niebles, J. C., Adeli, E. 2025; 2025: 10318-10329

Abstract

Egocentric human motion generation and forecasting with scene-context is crucial for enhancing AR/VR experiences, improving human-robot interaction, advancing assistive technologies, and enabling adaptive healthcare solutions by accurately predicting and simulating movement from a first-person perspective. However, existing methods primarily focus on third-person motion synthesis with structured 3D scene contexts, limiting their effectiveness in real-world egocentric settings where limited field of view, frequent occlusions, and dynamic cameras hinder scene perception. To bridge this gap, we introduce Egocentric Motion Generation and Egocentric Motion Forecasting, two novel tasks that utilize first-person images for scene-aware motion synthesis without relying on an explicit 3D scene. We propose UniEgoMotion, a unified conditional motion diffusion model with a novel head-centric motion representation tailored for egocentric devices. UniEgoMotion's simple yet effective design supports egocentric motion reconstruction, forecasting, and generation from first-person visual inputs in a unified framework. Unlike previous works that overlook scene semantics, our model effectively extracts image-based scene context to infer plausible 3D motion. To facilitate training, we introduce EE4D-Motion, a large-scale dataset derived from EgoExo4D, augmented with pseudo-ground-truth 3D motion annotations. UniEgoMotion achieves state-of-the-art performance in egocentric motion reconstruction and is the first to generate motion from a single egocentric image. Extensive evaluations demonstrate the effectiveness of our unified framework, setting a new benchmark for egocentric motion modeling and unlocking new possibilities for egocentric applications.

View details for DOI 10.1109/ICCV51701.2025.00961

View details for PubMedID 42079508

View details for PubMedCentralID PMC13132493
Evaluating large language models in echocardiography reporting: opportunities and challenges EUROPEAN HEART JOURNAL - DIGITAL HEALTH Chao, C., Banerjee, I., Arsanjani, R., Ayoub, C., Tseng, A., Delbrouck, J., Kane, G. C., Lopez-Jimenez, F., Attia, Z., Oh, J. K., Erickson, B., Fei-Fei, L., Adeli, E., Langlotz, C. 2025

View details for DOI 10.1093/ehjdh/ztae086

View details for Web of Science ID 001456248300001
Establishing Clinically Operational Domains of Multidimensional Frailty: A Consensus Approach to Improve Multidimensional Frailty Diagnosis at Point of Care. The Gerontologist Shapiro, L. M., Arya, S., Adeli, E., Fredericson, M., Kaplan, R. M., Eppler, S. L., Lorenz, K., Lorig, K., Marwell, J., Schmiesing, C., Schroeder, R., Schulman, K., Trivedi, R., Kamal, R. 2025

Abstract

Frailty is common amongst older patients; however, there is a lack of agreement on methods to diagnose and monitor frailty at point of care. The purpose of this study was to establish consensus on important, feasible, and usable domains for point of care frailty assessment within all conceptual models of frailty. We reviewed instruments that assess frailty and extracted the domains measured by each tool. We developed 3 use cases for frailty assessment which provided context for voters: (1) longitudinal tracking of frailty in the aging patient (>50 years), (2) preoperative evaluation of frailty before surgery in adults (>50 years), and (3) discharge disposition after hospital admission in adults (>50 years). We conducted a modified RAND/UCLA Delphi with a panel of 11 experts. Panelists rated each domain for each use case on a scale from 1 to 9, where 1 is definitely not important/feasible/usable and 9 is definitely important/feasible/usable. Panelists achieved agreement on the following domains for the respective clinical use cases: Physical Strength 1, 2, and 3; Balance 1 and 3; Cognition 1, 2, and 3; Nutrition 1; Physical Activity 1, 2, and 3; Depression 1; Disease 1, 2, and 3; and Social Environment 1 and 3. The remaining items were indeterminate. We established consensus on eight domains of frailty across three use cases. These results can inform the measurement of domains to diagnose, monitor, and inform the management of frailty within the defined use cases.

View details for DOI 10.1093/geront/gnae183

View details for PubMedID 40119454
Generating Novel Brain Morphology by Deforming Learned Templates. ArXiv Wang, A. Q., Huang, F., Trang, B., Peng, W., Abbasi, M., Pohl, K., Sabuncu, M., Adeli, E. 2025

Abstract

Designing generative models for 3D structural brain MRI that synthesize morphologically-plausible and attribute-specific (e.g., age, sex, disease state) samples is an active area of research. Existing approaches based on frameworks like GANs or diffusion models synthesize the image directly, which may limit their ability to capture intricate morphological details. In this work, we propose a 3D brain MRI generation method based on state-of-the-art latent diffusion models (LDMs), called MorphLDM, that generates novel images by applying synthesized deformation fields to a learned template. Instead of using a reconstruction-based autoencoder (as in a typical LDM), our encoder outputs a latent embedding derived from both an image and a learned template that is itself the output of a template decoder; this latent is passed to a deformation field decoder, whose output is applied to the learned template. A registration loss is minimized between the original image and the deformed template with respect to the encoder and both decoders. Empirically, our approach outperforms generative baselines on metrics spanning image diversity, adherence with respect to input conditions, and voxel-based morphometry. Our code is available at https://github.com/alanqrwang/morphldm.

View details for DOI 10.1109/ACCESS.2021.3075608

View details for PubMedID 40093358

View details for PubMedCentralID PMC11908372
Developing ICU Clinical Behavioral Atlas Using Ambient Intelligence and Computer Vision. NEJM AI Dai, W., Adeli, E., Luo, Z., Dash, D., Lakshmikanth, S., Durante, Z., Tang, P., Kaushal, A., Milstein, A., Fei-Fei, L., Schulman, K. 2025; 2 (2)

Abstract

While computer vision has gained traction in medical applications, models specifically engineered for intensive care unit (ICU) activities are limited. We present Clinical Behavioral Atlas (CBA), a computer vision system that can identify 40 clinically relevant activity categories and 55 object categories solely through RGB video data. The system was developed using a dataset comprising over 140,000 hours of continuous video and over 350,000 densely annotated frames, collected from 16 sensors in 8 ICU rooms at an academic medical center. The model demonstrated strong performance in entity and activity detection, with sensitivities of 0.75 to 0.81 and average precisions of 0.64 to 0.73, respectively. Permutation tests yielded P values of less than 0.05 for most activity categories. We observed a positive correlation between the performance and both the number and size of entities. The model excelled at identifying common and large objects, even with limited samples, but struggled with small items like oral swabs. Activity detection performance correlated linearly with video duration. The model showed robust performance (>0.85 average precision) for most clinical activities, but activities of daily living exhibited greater variation and lower average precision (0.23-0.95), indicating potential for further refinement due to their complexity and relative scarcity in the dataset. Experiments against other popular activity recognition models reveal that our method substantially outperforms all baselines, with improvements of 0.30 and 0.45 in average precision over the next best method. CBA expands automated identification of clinically important bedside clinical actions such as ICU preventive bundle elements. While we have demonstrated the feasibility of computer vision as a tool to assist in clinical care in high-intensity settings such as the ICU, the development of a full clinical-level performance CBA model will require larger datasets, ideally from multiple locations.
(Funded by Schmidt Futures and others.)

View details for DOI 10.1056/aioa2400590

View details for PubMedID 41867261

View details for PubMedCentralID PMC13004004
Communication Efficient Federated Learning for Multi-Organ Segmentation via Knowledge Distillation with Image Synthesis. IEEE transactions on medical imaging Kim, S., Park, H., Chikontwe, P., Kang, M., Jin, K. H., Adeli, E., Pohl, K. M., Park, S. H. 2025; PP

Abstract

Federated learning (FL) methods for multi-organ segmentation in CT scans are gaining popularity, but generally require numerous rounds of parameter exchange between a central server and clients. This repetitive sharing of parameters between server and clients may not be practical due to the varying network infrastructures of clients and the large transmission of data. Data heterogeneity among clients further increases the repetitive sharing, i.e., clients may differ with respect to the type of data they share. For example, they might provide label maps of different organs (i.e., partial labels), as segmentations of all organs shown in the CT are not part of their clinical protocol. To this end, we propose an efficient communication approach for FL with partial labels. Specifically, parameters of local models are transmitted once to a central server and the global model is trained via knowledge distillation (KD) of the local models. While one can make use of unlabeled public data as inputs for KD, the model accuracy is often limited due to distribution shifts between local and public datasets. Herein, we propose to generate synthetic images from clients' models as additional inputs to mitigate data shifts between public and local data. In addition, our proposed method offers flexibility for additional finetuning through several rounds of communication using existing FL algorithms, leading to enhanced performance. Extensive evaluation on public datasets in a few-communication FL scenario reveals that our approach substantially improves over state-of-the-art methods.
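
Knowledge distillation, which the abstract uses to train the global model from the local models, can be sketched in its generic temperature-scaled form. The abstract does not give the loss actually used, so the function below is an assumption for illustration only: a softened KL divergence between one teacher (local model) and one student (global model) prediction, with hypothetical names `softmax` and `distillation_loss`.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Temperature-scaled KL(teacher || student) for one prediction.

    Assumption: the generic distillation form with the usual T^2 factor;
    it is not taken from the paper.
    """
    p = softmax(teacher_logits, temperature)  # softened teacher targets
    q = softmax(student_logits, temperature)  # softened student output
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


matched = distillation_loss([2.0, 1.0, 0.0], [2.0, 1.0, 0.0])
mismatched = distillation_loss([2.0, 1.0, 0.0], [0.0, 1.0, 2.0])
```

The loss is zero when the student reproduces the teacher's logits and grows as the two predictions diverge. In a segmentation setting it would be averaged over voxels, and with partial labels presumably only over the organs a given local model was trained on.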

View details for DOI 10.1109/TMI.2025.3525581

View details for PubMedID 40030865
The Transition From Homogeneous to Heterogeneous Machine Learning in Neuropsychiatric Research BIOLOGICAL PSYCHIATRY: GLOBAL OPEN SCIENCE Zhao, Q., Nooner, K. B., Tapert, S. F., Adeli, E., Pohl, K. M., Kuceyeski, A., Sabuncu, M. R. 2025; 5 (1)

View details for DOI 10.1016/j.bpsgos.2024.100397

View details for Web of Science ID 001344771600001
SOE: SO(3)-Equivariant 3D MRI Encoding He, S., Paschali, M., Ouyang, J., Masood, A., Chaudhari, A., Adeli, E. edited by Bathula, D. R., Nirmala, A. B., Dvornek, N. C., Govindarajan, S. T., Habes, M., Kumar, Nebli, A., Wolfers, T., Xiao, Y. SPRINGER INTERNATIONAL PUBLISHING AG. 2025: 68-77

View details for DOI 10.1007/978-3-031-78761-4_7

View details for Web of Science ID 001532201500007
SpaRG: Sparsely Reconstructed Graphs for Generalizable fMRI Analysis. Machine learning in clinical neuroimaging : 7th international workshop, MLCN 2024, held in conjunction with MICCAI 2024, Marrakesh, Morocco, October 10, 2024, proceedings. MLCN (Workshop) (7th : 2024 : Marrakesh, Morocco) Gonzalez, C., Miraoui, Y., Fan, Y., Adeli, E., Pohl, K. M. 2025; 15266: 46-56

Abstract

Deep learning can help uncover patterns in resting-state functional Magnetic Resonance Imaging (rs-fMRI) associated with psychiatric disorders and personal traits. Yet the problem of interpreting deep learning findings is rarely more evident than in fMRI analyses, as the data is sensitive to scanning effects and inherently difficult to visualize. We propose a simple approach to mitigate these challenges grounded on sparsification and self-supervision. Instead of extracting post-hoc feature attributions to uncover functional connections that are important to the target task, we identify a small subset of highly informative connections during training and occlude the rest. To this end, we jointly train a (1) sparse input mask, (2) variational autoencoder (VAE), and (3) downstream classifier in an end-to-end fashion. While we need a portion of labeled samples to train the classifier, we optimize the sparse mask and VAE with unlabeled data from additional acquisition sites, retaining only the input features that generalize well. We evaluate our method - Sparsely Reconstructed Graphs (SpaRG) - on the public ABIDE dataset for the task of sex classification, training with labeled cases from 18 sites and adapting the model to two additional out-of-distribution sites with a portion of unlabeled samples. For a relatively coarse parcellation (64 regions), SpaRG utilizes only 1% of the original connections while improving the classification accuracy across domains. Our code can be found at www.github.com/yanismiraoui/SpaRG.

View details for DOI 10.1007/978-3-031-78761-4_5

View details for PubMedID 39758707
The Transition From Homogeneous to Heterogeneous Machine Learning in Neuropsychiatric Research. Biological psychiatry global open science Zhao, Q., Nooner, K. B., Tapert, S. F., Adeli, E., Pohl, K. M., Kuceyeski, A., Sabuncu, M. R. 2025; 5 (1): 100397

Abstract

Despite the advantage of neuroimaging-based machine learning (ML) models as pivotal tools for investigating brain-behavior relationships in neuropsychiatric studies, these data-driven predictive approaches have yet to yield substantial, clinically actionable insights for mental health care. A notable impediment lies in the inadequate accommodation of most ML research to the natural heterogeneity within large samples. Although commonly thought of as individual-level analyses, many ML algorithms are unimodal and homogeneous and thus incapable of capturing the potentially heterogeneous relationships between biology and psychopathology. We review the current landscape of computational research targeting population heterogeneity and argue that there is a need to expand from brain subtyping and behavioral phenotyping to analyses that focus on heterogeneity at the relational level. To this end, we review and suggest several existing ML models with the capacity to discern how external environmental and sociodemographic factors moderate the brain-behavior mapping function in a data-driven fashion. These heterogeneous ML models hold promise for enhancing the discovery of individualized brain-behavior associations and advancing precision psychiatry.

View details for DOI 10.1016/j.bpsgos.2024.100397

View details for PubMedID 39526023

View details for PubMedCentralID PMC11546160
Re-thinking Temporal Search for Long-Form Video Understanding Ye, J., Wang, Z., Sun, H., Chandrasegaran, K., Durante, Z., Eyzaguirre, C., Bisk, Y., Niebles, J., Adeli, E., Fei-Fei, L., Wu, J., Li, M. IEEE COMPUTER SOC. 2025: 8579-8591

View details for DOI 10.1109/CVPR52734.2025.00802

View details for Web of Science ID 001601106700220
Artist-Created Mesh Generation from Raw Observation He, Y., Kwon, Y., Cai, W., Adeli, E. IEEE COMPUTER SOC. 2025: 2663-2668

View details for DOI 10.1109/ICCVW69036.2025.00277

View details for Web of Science ID 001740020100274
LOMM: Latest Object Memory Management for Temporally Consistent Video Instance Segmentation. ... IEEE International Conference on Computer Vision workshops. IEEE International Conference on Computer Vision Lee, S., Seo, J., Choi, M., Han, K., Jeong, J., Durante, Z., Adeli, E., Park, S. H., Im, S. 2025; 2025: 13719-13729

Abstract

In this paper, we introduce Latest Object Memory (LOM), a system for robustly tracking and continuously updating the latest states of objects by explicitly modeling their presence across video frames. LOM enables consistent tracking and accurate identity management across frames, enhancing both performance and reliability through the video segmentation process. Building upon LOM, we present Latest Object Memory Management (LOMM) for temporally consistent video instance segmentation, significantly improving long-term instance tracking. Moreover, we introduce Decoupled Object Association (DOA), a strategy that separately handles newly appearing and already existing objects. By leveraging our memory system, DOA accurately assigns object indices, improving matching accuracy and ensuring stable identity consistency, even in dynamic scenes where objects frequently appear and disappear. Extensive experiments and ablation studies demonstrate the superiority of our method over traditional approaches, setting a new state-of-the-art in video instance segmentation. Notably, our LOMM achieves an AP score of 54.0 on YouTube-VIS 2022, a dataset known for its challenging long videos.

View details for DOI 10.1109/iccv51701.2025.01273

View details for PubMedID 42088149
GAMMA-PD: Graph-based Analysis of Multi-Modal Motor Impairment Assessments in Parkinson's Disease. Graphs in biomedical image analysis : 6th International Workshop, GRAIL 2024, held in conjunction with MICCAI 2024, Marrakesh, Morocco, October 6, 2024, Proceeding. GRAIL (Workshop) (6th : 2024 : Marrakesh, Morocco) Nerrise, F., Heiman, A. L., Adeli, E. 2025; 15182: 57-68

Abstract

The rapid advancement of medical technology has led to an exponential increase in multi-modal medical data, including imaging, genomics, and electronic health records (EHRs). Graph neural networks (GNNs) have been widely used to represent this data due to their prominent performance in capturing pairwise relationships. However, the heterogeneity and complexity of multi-modal medical data still pose significant challenges for standard GNNs, which struggle with learning higher-order, non-pairwise relationships. This paper proposes GAMMA-PD (Graph-based Analysis of Multi-modal Motor Impairment Assessments in Parkinson's Disease), a novel heterogeneous hypergraph fusion framework for multi-modal clinical data analysis. GAMMA-PD integrates imaging and non-imaging data into a "hypernetwork" (patient population graph) by preserving higher-order information and similarity between patient profiles and symptom subtypes. We also design a feature-based attention-weighted mechanism to interpret feature-level contributions towards downstream decision tasks. We evaluate our approach with clinical data from the Parkinson's Progression Markers Initiative (PPMI) and a private dataset. We demonstrate gains in predicting motor impairment symptoms in Parkinson's disease. Our end-to-end framework also learns associations between subsets of patient characteristics to generate clinically relevant explanations for disease and symptom profiles. The source code is available at https://github.com/favour-nerrise/GAMMA-PD.

View details for DOI 10.1007/978-3-031-83243-7_6

View details for PubMedID 40709078
Segmentation of Brain Metastases in MRI: A Two-Stage Deep Learning Approach with Modality Impact Study Sadegheih, Y., Merhof, D. edited by Rekik, Adeli, E., Park, S. H., Cintas, C. SPRINGER INTERNATIONAL PUBLISHING AG. 2025: 196-206

View details for DOI 10.1007/978-3-031-74561-4_17

View details for Web of Science ID 001449852700017
Physics-Guided Multi-view Graph Neural Network for Schizophrenia Classification via Structural-Functional Coupling Mazumder, B., Kanyal, A., Wu, L., Calhoun, V. D., Ye, D. edited by Rekik, Adeli, E., Park, S. H., Cintas, C. SPRINGER INTERNATIONAL PUBLISHING AG. 2025: 61-73

View details for DOI 10.1007/978-3-031-74561-4_6

View details for Web of Science ID 001449852700006
Neurocognitive Latent Space Regularization for Multi-label Diagnosis from MRI Manasseh-Lewis, J., Godoy, F., Peng, W., Paul, R., Adeli, E., Pohl, K. edited by Rekik, Adeli, E., Park, S. H., Cintas, C. SPRINGER INTERNATIONAL PUBLISHING AG. 2025: 185-195

View details for DOI 10.1007/978-3-031-74561-4_16

View details for Web of Science ID 001449852700016
PRISM: Progressive Restoration for Scene Graph-Based Image Manipulation Jahoda, P., Yeganeh, Y., Adeli, E., Navab, N., Farshad, A. edited by DelBue, A., Canton, C., Pont-Tuset, J., Tommasi, T. SPRINGER INTERNATIONAL PUBLISHING AG. 2025: 142-160

View details for DOI 10.1007/978-3-031-91838-4_9

View details for Web of Science ID 001544984100009
Gene-to-Image: Decoding Brain Images from Genetics via Latent Diffusion Models Jeon, S., Song, Y., Kim, W. edited by Rekik, Adeli, E., Park, S. H., Cintas, C. SPRINGER INTERNATIONAL PUBLISHING AG. 2025: 48-60

View details for DOI 10.1007/978-3-031-74561-4_5

View details for Web of Science ID 001449852700005
Medical Image Segmentation Review: The Success of U-Net. IEEE transactions on pattern analysis and machine intelligence Azad, R., Aghdam, E. K., Rauland, A., Jia, Y., Avval, A. H., Bozorgpour, A., Karimijafarbigloo, S., Cohen, J. P., Adeli, E., Merhof, D. 2024; 46 (12): 10076-10095

Abstract

Automatic medical image segmentation is a crucial topic in the medical domain and, in turn, a critical component of the computer-aided diagnosis paradigm. U-Net is the most widespread image segmentation architecture due to its flexibility, optimized modular design, and success in all medical image modalities. Over the years, the U-Net model has received tremendous attention from academic and industrial researchers who have extended it to address the scale and complexity created by medical tasks. These extensions commonly enhance the U-Net's backbone, bottleneck, or skip connections, include representation learning, combine it with a Transformer architecture, or address probabilistic prediction of the segmentation map. Having a compendium of previously proposed U-Net variants makes it easier for machine learning researchers to identify relevant research questions and understand the challenges that biological tasks pose to the model. In this work, we discuss the practical aspects of the U-Net model and organize each variant model into a taxonomy. Moreover, to measure the performance of these strategies in a clinical application, we propose fair evaluations of some unique and famous designs on well-known datasets. Furthermore, we provide a comprehensive implementation library with trained models. In addition, for ease of future studies, we created an online list of U-Net papers with their possible official implementation.

View details for DOI 10.1109/TPAMI.2024.3435571

View details for PubMedID 39167505
Clinical Manifestations. Alzheimer's & dementia : the journal of the Alzheimer's Association Adeli, E. 2024; 20 Suppl 3: e086276

Abstract

Historically, screening for incidence of AD-related MCI or conversion from MCI to AD dementia has relied on cognitive, activities of daily living, and brain imaging measures. Limitations of this diagnostic approach include dependency on education and language, time-consuming and costly measures, and long-term monitoring. Emerging studies suggest that non-tremor motor dysfunction in dementias is highly associated with AD biomarkers, with signs of cognitive decline visible in gait and hand movement at various stages of the illness. With the evidence that gait and physical disturbances are early predictors of cognitive impairment and that their trajectories could readily be tracked, we utilize recent advances in computer vision (CV) to quantify mobility in a data-driven fashion from the video-recorded 5-minute Short Physical Performance Battery (SPPB) tests. We use the data collected at Stanford AD Research Center and show that our CV methods can automatically reduce videos to body markers (human skeleton tracked through time), extract several features (such as gait speed, mean torso inclination angle, double support time, and gait-summary score), and finally turn those into clinical SPPB test scores. In our initial data, we observed a significant difference between healthy controls (HC) and the two MCI and AD groups for the repeated chair stand test score. Similarly, an inverse correlation between the MoCA cognitive test score and the gait speed is observed. At the end of the talk, I will also discuss how this CV method for mobility can be used for detecting behavioral changes in animal AD models and implications for future human AD research.

View details for DOI 10.1002/alz.086276

View details for PubMedID 39750704
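
Two of the features named in the abstract above, gait speed and torso inclination angle, can be sketched from tracked skeleton keypoints as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: keypoints are assumed to be 2D coordinates in metres with the y axis pointing up, and the function names `gait_speed` and `torso_inclination_deg` are hypothetical.

```python
import math


def gait_speed(hip_positions, fps):
    """Mean walking speed: path length of the hip centre divided by elapsed time.

    `hip_positions` is a list of (x, y) points in metres, one per video frame.
    """
    path = sum(math.dist(a, b) for a, b in zip(hip_positions, hip_positions[1:]))
    duration = (len(hip_positions) - 1) / fps
    return path / duration


def torso_inclination_deg(hip, neck):
    """Angle between the hip-to-neck vector and the vertical (y) axis, in degrees."""
    dx, dy = neck[0] - hip[0], neck[1] - hip[1]
    return math.degrees(math.atan2(abs(dx), dy))


# Three frames at 2 frames per second: the hip moves 1 m in 1 s.
hips = [(0.0, 1.0), (0.5, 1.0), (1.0, 1.0)]
print(gait_speed(hips, fps=2))
print(torso_inclination_deg(hip=(0.0, 0.0), neck=(0.0, 1.0)))
```

In practice, pixel coordinates would first need to be calibrated to physical units before such features are comparable across recordings.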
A health-equity framework for tailoring digital non-pharmacological interventions in aging. Nature. Mental health Turnbull, A., Odden, M. C., Gould, C. E., Adeli, E., Kaplan, R. M., Lin, F. V. 2024; 2 (11): 1277-1284

Abstract

If designed with health equity in mind, digital non-pharmacological interventions (NPIs) represent a cost-effective, scalable means of reducing health disparities associated with age-related mental health disorders in older adults in the USA. However, disparities in technological access, literacy and effectiveness can limit the impact of these interventions in older adults from disadvantaged groups. We present a health-equity-promoting framework for the development of digital NPIs for age-related mental health disorders and provide an example from the literature that highlights how interventions can be targeted at specific groups to increase technological access, literacy and effectiveness to ensure that these interventions can meet their potential of reducing health disparities.

View details for DOI 10.1038/s44220-024-00347-6

View details for PubMedID 39867489

View details for PubMedCentralID PMC11756576
A health-equity framework for tailoring digital non-pharmacological interventions in aging NATURE MENTAL HEALTH Turnbull, A., Odden, M. C., Gould, C. E., Adeli, E., Kaplan, R. M., Lin, F. 2024; 2 (11): 1277-1284

View details for DOI 10.1038/s44220-024-00347-6

View details for Web of Science ID 001390110200015
Profiles of brain topology for dual-functional stability in old age. GeroScience Zhou, S., Anthony, M., Adeli, E., Lin, F. V. 2024

Abstract

Dual-functional stability (DFS) in cognitive and physical abilities is important for successful aging. This study examines the brain topology profiles that underpin high DFS in older adults by testing two hypotheses: (1) older adults with high DFS would exhibit a unique brain organization that preserves their physical and cognitive functions across various tasks, and (2) any individuals with this distinct brain topology would consistently show high DFS. We analyzed two cohorts of cognitively and physically healthy older adults from the UK (Cam-CAN, n = 79) and the US (CF, n = 48) using neuroimaging data and a combination of cognitive and physical tasks. Variability in DFS was characterized using k-means clustering for intra-individual variability (IIV) in cognitive and physical tasks. Graph theory analyses of diffusion tensor imaging connectomes were used to assess brain network segregation and integration through clustering coefficients (CCs) and shortest path lengths (PLs). Using support vector machine and regression, brain topology features, derived from PLs + CCs, differentiated the high DFS subgroup from low and mixed DFS subgroups with accuracies of 65.82% and 84.78% in Cam-CAN and CF samples, respectively, and predicted cross-task DFS scores in CF samples at 58.06% and 70.53% for cognitive and physical stability, respectively. Results showed distinctive neural correlates associated with high DFS, notably varying regional brain segregation and integration within critical areas such as the insula, frontal pole, and temporal pole. The identified brain topology profiles suggest a distinctive neural basis for DFS, a trait indicative of successful aging. These insights offer a foundation for future research to explore targeted interventions that could enhance cognitive and physical resilience in older adults, promoting a healthier and more functional lifespan.

View details for DOI 10.1007/s11357-024-01396-6

View details for PubMedID 39432149

View details for PubMedCentralID 7058488
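
The two graph measures mentioned in the abstract above, clustering coefficients (segregation) and shortest path lengths (integration), can be illustrated on a toy network. The sketch assumes an unweighted, undirected, connected graph stored as an adjacency dictionary; real diffusion connectomes are typically weighted, and the study's exact definitions may differ. All function names are chosen for this example.

```python
from collections import deque
from itertools import combinations


def clustering_coefficient(adj, node):
    """Fraction of a node's neighbour pairs that are themselves connected."""
    nbrs = adj[node]
    if len(nbrs) < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (len(nbrs) * (len(nbrs) - 1))


def shortest_path_lengths(adj, source):
    """Breadth-first search distances (in hops) from `source` to reachable nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def characteristic_path_length(adj):
    """Mean shortest path length over all ordered pairs of distinct nodes."""
    total, pairs = 0, 0
    for s in adj:
        for t, d in shortest_path_lengths(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs


# Toy network: a triangle (0, 1, 2) with one extra node (3) attached to node 2.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(clustering_coefficient(toy, 0), characteristic_path_length(toy))
```

High clustering with short paths is the usual signature of a network that is both locally segregated and globally integrated.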
SOM2LM: Self-Organized Multi-Modal Longitudinal Maps. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention Ouyang, J., Zhao, Q., Adeli, E., Zaharchuk, G., Pohl, K. M. 2024; 15002: 400-410

Abstract

Neuroimage modalities acquired by longitudinal studies often provide complementary information regarding disease progression. For example, amyloid PET visualizes the build-up of amyloid plaques that appear in earlier stages of Alzheimer's disease (AD), while structural MRIs depict brain atrophy appearing in the later stages of the disease. To accurately model multi-modal longitudinal data, we propose an interpretable self-supervised model called Self-Organized Multi-Modal Longitudinal Maps (SOM2LM). SOM2LM encodes each modality as a 2D self-organizing map (SOM) so that one dimension of each modality-specific SOM corresponds to disease abnormality. The model also regularizes across modalities to depict their temporal order of capturing abnormality. When applied to longitudinal T1w MRIs and amyloid PET of the Alzheimer's Disease Neuroimaging Initiative (ADNI, N=741), SOM2LM generates interpretable latent spaces that characterize disease abnormality. When compared to state-of-the-art models, it achieves higher accuracy for the downstream tasks of cross-modality prediction of amyloid status from T1w-MRI and joint-modality prediction of individuals with mild cognitive impairment converting to AD using both MRI and amyloid PET. The code is available at https://github.com/ouyangjiahong/longitudinal-som-multi-modality.

View details for DOI 10.1007/978-3-031-72069-7_38

View details for PubMedID 40655076

View details for PubMedCentralID PMC12254005
Data-Driven Discovery of Movement-Linked Heterogeneity in Neurodegenerative Diseases. Nature machine intelligence Endo, M., Nerrise, F., Zhao, Q., Sullivan, E. V., Fei-Fei, L., Henderson, V. W., Pohl, K. M., Poston, K. L., Adeli, E. 2024; 6 (9): 1034-1045

Abstract

Neurodegenerative diseases manifest different motor and cognitive signs and symptoms that are highly heterogeneous. Parsing these heterogeneities may lead to an improved understanding of underlying disease mechanisms; however, current methods depend on clinical assessments and a somewhat arbitrary choice of behavioral tests. Herein, we present a data-driven subtyping approach using video-captured human motion and brain functional connectivity (FC) from resting-state (rs)-fMRI. We applied our framework to a cohort of individuals at different stages of Parkinson's disease (PD). The process mapped the data to low-dimensional measures by projecting them onto a canonical correlation space that identified three PD subtypes: Subtype I was characterized by motor difficulties and poor visuospatial abilities; Subtype II exhibited difficulties in non-motor components of activities of daily living and motor complications (dyskinesias and motor fluctuations); and Subtype III was characterized by predominant tremor symptoms. We conducted a convergent validity analysis by comparing our approach to existing and widely used approaches. The compared approaches yielded subtypes that were adequately well-clustered in the motion-brain representation space we created to delineate subtypes. Our data-driven approach, contrary to other forms of subtyping, derived biomarkers predictive of motion impairment and subtype memberships that were captured objectively by digital videos.

View details for DOI 10.1038/s42256-024-00882-y

View details for PubMedID 40357335

View details for PubMedCentralID PMC12068835
Metadata-conditioned generative models to synthesize anatomically-plausible 3D brain MRIs. Medical image analysis Peng, W., Bosschieter, T., Ouyang, J., Paul, R., Sullivan, E. V., Pfefferbaum, A., Adeli, E., Zhao, Q., Pohl, K. M. 2024; 98: 103325

Abstract

Recent advances in generative models have paved the way for enhanced generation of natural and medical images, including synthetic brain MRIs. However, the mainstay of current AI research focuses on optimizing synthetic MRIs with respect to visual quality (such as signal-to-noise ratio) while lacking insights into their relevance to neuroscience. To generate high-quality T1-weighted MRIs relevant for neuroscience discovery, we present a two-stage Diffusion Probabilistic Model (called BrainSynth) to synthesize high-resolution MRIs conditioned on metadata (such as age and sex). We then propose a novel procedure to assess the quality of BrainSynth according to how well its synthetic MRIs capture macrostructural properties of brain regions and how accurately they encode the effects of age and sex. Results indicate that more than half of the brain regions in our synthetic MRIs are anatomically plausible, i.e., the effect size between real and synthetic MRIs is small relative to biological factors such as age and sex. Moreover, the anatomical plausibility varies across cortical regions according to their geometric complexity. As is, the MRIs generated by BrainSynth significantly improve the training of a predictive model to identify accelerated aging effects in an independent study. These results indicate that our model accurately captures the brain's anatomical information and thus could enrich the data of underrepresented samples in a study. The code of BrainSynth will be released as part of the MONAI project at https://github.com/Project-MONAI/GenerativeModels.

View details for DOI 10.1016/j.media.2024.103325

View details for PubMedID 39208560
Data-driven discovery of movement-linked heterogeneity in neurodegenerative diseases NATURE MACHINE INTELLIGENCE Endo, M., Nerrise, F., Zhao, Q., Sullivan, E. V., Fei-Fei, L., Henderson, V. W., Pohl, K. M., Poston, K. L., Adeli, E. 2024

View details for DOI 10.1038/s42256-024-00882-y

View details for Web of Science ID 001287474600001
TransUNet: Rethinking the U-Net architecture design for medical image segmentation through the lens of transformers. Medical image analysis Chen, J., Mei, J., Li, X., Lu, Y., Yu, Q., Wei, Q., Luo, X., Xie, Y., Adeli, E., Wang, Y., Lungren, M. P., Zhang, S., Xing, L., Lu, L., Yuille, A., Zhou, Y. 2024; 97: 103280

Abstract

Medical image segmentation is crucial for healthcare, yet convolution-based methods like U-Net face limitations in modeling long-range dependencies. To address this, Transformers designed for sequence-to-sequence predictions have been integrated into medical image segmentation. However, a comprehensive understanding of Transformers' self-attention in U-Net components is lacking. TransUNet, first introduced in 2021, is widely recognized as one of the first models to integrate Transformers into medical image analysis. In this study, we present the versatile framework of TransUNet that encapsulates Transformers' self-attention into two key modules: (1) a Transformer encoder tokenizing image patches from a convolutional neural network (CNN) feature map, facilitating global context extraction, and (2) a Transformer decoder refining candidate regions through cross-attention between proposals and U-Net features. These modules can be flexibly inserted into the U-Net backbone, resulting in three configurations: Encoder-only, Decoder-only, and Encoder+Decoder. TransUNet provides a library encompassing both 2D and 3D implementations, enabling users to easily tailor the chosen architecture. Our findings highlight the encoder's efficacy in modeling interactions among multiple abdominal organs and the decoder's strength in handling small targets like tumors. It excels in diverse medical applications, such as multi-organ segmentation, pancreatic tumor segmentation, and hepatic vessel segmentation. Notably, our TransUNet achieves a significant average Dice improvement of 1.06% and 4.30% for multi-organ segmentation and pancreatic tumor segmentation, respectively, when compared to the highly competitive nnU-Net, and surpasses the top-1 solution in the BraTS2021 challenge. 2D/3D Code and models are available at https://github.com/Beckschen/TransUNet and https://github.com/Beckschen/TransUNet-3D, respectively.

View details for DOI 10.1016/j.media.2024.103280

View details for PubMedID 39096845
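
The abstract above reports improvements in the Dice score. For readers unfamiliar with the metric, the sketch below computes it for flattened binary masks as 2|P ∩ T| / (|P| + |T|). It is a generic illustration, not code from the TransUNet repositories; the function name `dice_score` and the convention of returning 1.0 for two empty masks are assumptions made here.

```python
def dice_score(pred, target):
    """Dice overlap between two flat binary masks (lists of 0/1 values)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


pred = [1, 1, 0, 0]
target = [1, 0, 1, 0]
print(dice_score(pred, target))  # one shared voxel out of two per mask: 0.5
```

Because the score is normalised by the combined mask size, a missed voxel costs more on small targets such as tumors than on large organs.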
Vision-based estimation of fatigue and engagement in cognitive training sessions. Artificial intelligence in medicine Wang, Y., Turnbull, A., Xu, Y., Heffner, K., Lin, F. V., Adeli, E. 2024; 154: 102923

Abstract

Computerized cognitive training (CCT) is a scalable, well-tolerated intervention that has promise for slowing cognitive decline. The effectiveness of CCT is often affected by a lack of effective engagement. Mental fatigue is a primary factor compromising effective engagement in CCT, particularly in older adults at risk for dementia. There is a need for scalable, automated measures that can constantly monitor and reliably detect mental fatigue during CCT. Here, we develop and validate a novel Recurrent Video Transformer (RVT) method for monitoring real-time mental fatigue in older adults with mild cognitive impairment using their video-recorded facial gestures during CCT. The RVT model achieved the highest balanced accuracy (79.58%) and precision (0.82) compared to the prior models for binary and multi-class classification of mental fatigue. We also validated our model by showing that its output was significantly related to reaction time across CCT tasks (Wald χ² = 5.16, p = 0.023). By leveraging dynamic temporal information, the RVT model demonstrates the potential to accurately measure real-time mental fatigue, laying the foundation for future CCT research aiming to enhance effective engagement by timely prevention of mental fatigue.

View details for DOI 10.1016/j.artmed.2024.102923

View details for PubMedID 38970987
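
The abstract above reports balanced accuracy, which averages per-class recall so that a majority class cannot dominate the score. The sketch below shows the standard definition on made-up labels; it is not the study's evaluation code, and the labels are purely illustrative.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in `y_true`."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Illustrative labels only: 0 = not fatigued, 1 = fatigued.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
print(balanced_accuracy(y_true, y_pred))  # (3/4 + 1/2) / 2 = 0.625
```

Plain accuracy on the same labels would be 4/6, which overstates performance on the rarer class.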
Federated learning with knowledge distillation for multi-organ segmentation with partially labeled datasets. Medical image analysis Kim, S., Park, H., Kang, M., Jin, K. H., Adeli, E., Pohl, K. M., Park, S. H. 2024; 95: 103156

Abstract

The state-of-the-art multi-organ CT segmentation relies on deep learning models, which only generalize when trained on large samples of carefully curated data. However, it is challenging to train a single model that can segment all organs and types of tumors since most large datasets are partially labeled or are acquired across multiple institutes that may differ in their acquisitions. A possible solution is federated learning, which is often used to train models on multi-institutional datasets where the data is not shared across sites. However, predictions of federated learning can be unreliable after the model is locally updated at sites due to 'catastrophic forgetting'. Here, we address this issue by using knowledge distillation (KD) so that the local training is regularized with the knowledge of a global model and pre-trained organ-specific segmentation models. We implement the models in a multi-head U-Net architecture that learns a shared embedding space for different organ segmentation, thereby obtaining multi-organ predictions without repeated processes. We evaluate the proposed method using 8 publicly available abdominal CT datasets of 7 different organs. Of those datasets, 889 CTs were used for training, 233 for internal testing, and 30 volumes for external testing. Experimental results verified that our proposed method substantially outperforms other state-of-the-art methods in terms of accuracy, inference time, and the number of parameters.

View details for DOI 10.1016/j.media.2024.103156

View details for PubMedID 38603844
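
The abstract above regularizes local training with knowledge distillation (KD). A common form of KD, shown below as a minimal sketch, penalizes the KL divergence between temperature-softened teacher and student predictions. The abstract does not specify the paper's exact loss, so this temperature-scaled formulation and the names `softmax` and `distillation_loss` are assumptions for illustration only.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of `logits` divided by `temperature`."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    s = sum(exps)
    return [e / s for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """T^2 * KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


teacher = [1.0, 2.0, 3.0]  # e.g. logits from a global or organ-specific model
student = [3.0, 2.0, 1.0]  # logits from the locally updated model
print(distillation_loss(student, teacher))
```

The loss is zero when the student reproduces the teacher's distribution and grows as local updates drift away from it, which is how KD can counteract forgetting.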
FedNN: Federated learning on concept drift data using weight and adaptive group normalizations PATTERN RECOGNITION Kang, M., Kim, S., Jin, K., Adeli, E., Pohl, K. M., Park, S. 2024; 149

View details for DOI 10.1016/j.patcog.2023.110230

View details for Web of Science ID 001154871600001
SCOPE: Structural Continuity Preservation for Retinal Vessel Segmentation Yeganeh, Y., Guevercin, G., Xiao, R., Abuzer, A., Adeli, E., Farshad, A., Navab, N. edited by Ahmadi, S. A., Pereira, S. SPRINGER INTERNATIONAL PUBLISHING AG. 2024: 3-13

View details for DOI 10.1007/978-3-031-55088-1_1

View details for Web of Science ID 001212367700001
Enforcing Conditional Independence for Fair Representation Learning and Causal Image Generation Hwa, J., Zhao, Q., Lahiri, A., Masood, A., Salimi, B., Adeli, E., IEEE IEEE COMPUTER SOC. 2024: 103-112

View details for DOI 10.1109/CVPRW63382.2024.00015

View details for Web of Science ID 001327781700011
OccFusion: Rendering Occluded Humans with Generative Diffusion Priors. Advances in neural information processing systems Sun, A., Xiang, T., Delp, S., Fei-Fei, L., Adeli, E. 2024; 37: 92184-92209

Abstract

Most existing human rendering methods require every part of the human to be fully visible throughout the input video. However, this assumption does not hold in real-life settings where obstructions are common, resulting in only partial visibility of the human. Considering this, we present OccFusion, an approach that utilizes efficient 3D Gaussian splatting supervised by pretrained 2D diffusion models for efficient and high-fidelity human rendering. We propose a pipeline consisting of three stages. In the Initialization stage, complete human masks are generated from partial visibility masks. In the Optimization stage, human 3D Gaussians are optimized with additional supervision by Score-Distillation Sampling (SDS) to create a complete geometry of the human. Finally, in the Refinement stage, in-context inpainting is designed to further improve rendering quality on the less observed human body parts. We evaluate OccFusion on ZJU-MoCap and challenging OcMotion sequences and find that it achieves state-of-the-art performance in the rendering of occluded humans.

View details for PubMedID 40575631

View details for PubMedCentralID PMC12199745
H-ViT: A Hierarchical Vision Transformer for Deformable Image Registration Ghahremani, M., Khateri, M., Jian, B., Wiestler, B., Adeli, E., Wachinger, C., IEEE IEEE COMPUTER SOC. 2024: 11513-11523

View details for DOI 10.1109/CVPR52733.2024.01094

View details for Web of Science ID 001342442402082
Towards Robust 3D Pose Transfer with Adversarial Learning Chen, H., Tang, H., Adeli, E., Zhao, G. IEEE COMPUTER SOC. 2024: 2295-2304

View details for DOI 10.1109/CVPR52733.2024.00223

View details for Web of Science ID 001322555902062
Few Shot Part Segmentation Reveals Compositional Logic for Industrial Anomaly Detection Kim, S., An, S., Chikontwe, P., Kang, M., Adeli, E., Pohl, K. M., Park, S. edited by Wooldridge, M., Dy, J., Natarajan, S. ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. 2024: 8591-8599

View details for Web of Science ID 001239938200076
SOM2LM: Self-Organized Multi-Modal Longitudinal Maps Ouyang, J., Zhao, Q., Adeli, E., Zaharchuk, G., Pohl, K. M. edited by Linguraru, M. G., Dou, Q., Feragen, A., Giannarou, S., Glocker, B., Lekadir, K., Schnabel, J. A. SPRINGER INTERNATIONAL PUBLISHING AG. 2024: 400-410

View details for DOI 10.1007/978-3-031-72069-7_38

View details for Web of Science ID 001342225800038
One-shot Federated Learning on Medical Data using Knowledge Distillation with Image Synthesis and Client Model Adaptation. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention Kang, M., Chikontwe, P., Kim, S., Jin, K. H., Adeli, E., Pohl, K. M., Park, S. H. 2023; 14221: 521-531

Abstract

One-shot federated learning (FL) has emerged as a promising solution in scenarios where multiple communication rounds are not practical. Notably, as feature distributions in medical data are less discriminative than those of natural images, robust global model training with FL is non-trivial and can lead to overfitting. To address this issue, we propose a novel one-shot FL framework leveraging Image Synthesis and Client model Adaptation (FedISCA) with knowledge distillation (KD). To prevent overfitting, we generate diverse synthetic images ranging from random noise to realistic images. This approach (i) alleviates data privacy concerns and (ii) facilitates robust global model training using KD with decentralized client models. To mitigate domain disparity in the early stages of synthesis, we design noise-adapted client models where batch normalization statistics on random noise (synthetic images) are updated to enhance KD. Lastly, the global model is trained with both the original and noise-adapted client models via KD and synthetic images. This process is repeated till global model convergence. Extensive evaluation of this design on five small- and three large-scale medical image classification datasets reveals superior accuracy over prior methods. Code is available at https://github.com/myeongkyunkang/FedISCA.

View details for DOI 10.1007/978-3-031-43895-0_49

View details for PubMedID 38204983

View details for PubMedCentralID PMC10781197
An Explainable Geometric-Weighted Graph Attention Network for Identifying Functional Networks Associated with Gait Impairment. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention Nerrise, F., Zhao, Q., Poston, K. L., Pohl, K. M., Adeli, E. 2023; 14221: 723-733

Abstract

One of the hallmark symptoms of Parkinson's Disease (PD) is the progressive loss of postural reflexes, which eventually leads to gait difficulties and balance problems. Identifying disruptions in brain function associated with gait impairment could be crucial in better understanding PD motor progression, thus advancing the development of more effective and personalized therapeutics. In this work, we present an explainable, geometric, weighted-graph attention neural network (xGW-GAT) to identify functional networks predictive of the progression of gait difficulties in individuals with PD. xGW-GAT predicts the multi-class gait impairment on the MDS-Unified PD Rating Scale (MDS-UPDRS). Our computational- and data-efficient model represents functional connectomes as symmetric positive definite (SPD) matrices on a Riemannian manifold to explicitly encode pairwise interactions of entire connectomes, based on which we learn an attention mask yielding individual- and group-level explainability. Applied to our resting-state functional MRI (rs-fMRI) dataset of individuals with PD, xGW-GAT identifies functional connectivity patterns associated with gait impairment in PD and offers interpretable explanations of functional subnetworks associated with motor impairment. Our model successfully outperforms several existing methods while simultaneously revealing clinically-relevant connectivity patterns. The source code is available at https://github.com/favour-nerrise/xGW-GAT.

View details for DOI 10.1007/978-3-031-43895-0_68

View details for PubMedID 37982132

View details for PubMedCentralID PMC10657737
Generating Realistic Brain MRIs via a Conditional Diffusion Probabilistic Model. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention Peng, W., Adeli, E., Bosschieter, T., Hyun Park, S., Zhao, Q., Pohl, K. M. 2023; 14227: 14-24

Abstract

As acquiring MRIs is expensive, neuroscience studies struggle to attain a sufficient number of them for properly training deep learning models. This challenge could be reduced by MRI synthesis, for which Generative Adversarial Networks (GANs) are popular. GANs, however, are commonly unstable and struggle with creating diverse and high-quality data. A more stable alternative is Diffusion Probabilistic Models (DPMs) with a fine-grained training strategy. To overcome their need for extensive computational resources, we propose a conditional DPM (cDPM) with a memory-efficient process that generates realistic-looking brain MRIs. To this end, we train a 2D cDPM to generate an MRI subvolume conditioned on another subset of slices from the same MRI. By generating slices using arbitrary combinations between condition and target slices, the model only requires limited computational resources to learn interdependencies between slices even if they are spatially far apart. After having learned these dependencies via an attention network, a new anatomy-consistent 3D brain MRI is generated by repeatedly applying the cDPM. Our experiments demonstrate that our method can generate high-quality 3D MRIs that share a similar distribution to real MRIs while still diversifying the training set. The code is available at https://github.com/xiaoiker/mask3DMRI_diffusion and also will be released as part of MONAI, at https://github.com/Project-MONAI/GenerativeModels.

View details for DOI 10.1007/978-3-031-43993-3_2

View details for PubMedID 38169668

View details for PubMedCentralID PMC10758344
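
Diffusion probabilistic models such as the one in the abstract above are trained to reverse a gradual noising process. The sketch below shows only the standard forward (noising) step, x_t = sqrt(ᾱ_t) x_0 + sqrt(1 − ᾱ_t) ε, on a short list of intensities. It does not reproduce the paper's slice-conditioning scheme or attention network; the noise schedule values and function names are assumptions for this example.

```python
import math
import random


def alpha_bar(betas, t):
    """Cumulative product of (1 - beta) up to and including step t."""
    prod = 1.0
    for beta in betas[: t + 1]:
        prod *= 1.0 - beta
    return prod


def q_sample(x0, betas, t, rng):
    """Draw a noised sample x_t from clean values x0 at diffusion step t."""
    ab = alpha_bar(betas, t)
    return [
        math.sqrt(ab) * x + math.sqrt(1.0 - ab) * rng.gauss(0.0, 1.0)
        for x in x0
    ]


betas = [0.1, 0.2, 0.3]  # hypothetical noise schedule
x0 = [0.5, -0.25, 1.0]   # stand-in for a few voxel intensities
rng = random.Random(0)
print(alpha_bar(betas, 1))  # 0.9 * 0.8, approximately 0.72
print(q_sample(x0, betas, 2, rng))
```

As t increases, ᾱ_t shrinks and the sample approaches pure Gaussian noise; generation runs a learned denoiser in the opposite direction.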
LSOR: Longitudinally-Consistent Self-Organized Representation Learning. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention Ouyang, J., Zhao, Q., Adeli, E., Peng, W., Zaharchuk, G., Pohl, K. M. 2023; 14220: 279-289

Abstract

Interpretability is a key issue when applying deep learning models to longitudinal brain MRIs. One way to address this issue is by visualizing the high-dimensional latent spaces generated by deep learning via self-organizing maps (SOM). SOM separates the latent space into clusters and then maps the cluster centers to a discrete (typically 2D) grid preserving the high-dimensional relationship between clusters. However, learning SOM in a high-dimensional latent space tends to be unstable, especially in a self-supervision setting. Furthermore, the learned SOM grid does not necessarily capture clinically interesting information, such as brain age. To resolve these issues, we propose the first self-supervised SOM approach that derives a high-dimensional, interpretable representation stratified by brain age solely based on longitudinal brain MRIs (i.e., without demographic or cognitive information). Called Longitudinally-consistent Self-Organized Representation learning (LSOR), the method is stable during training as it relies on soft clustering (vs. the hard cluster assignments used by existing SOM). Furthermore, our approach generates a latent space stratified according to brain age by aligning trajectories inferred from longitudinal MRIs to the reference vector associated with the corresponding SOM cluster. When applied to longitudinal MRIs of the Alzheimer's Disease Neuroimaging Initiative (ADNI, N=632), LSOR generates an interpretable latent space and achieves comparable or higher accuracy than the state-of-the-art representations with respect to the downstream tasks of classification (static vs. progressive mild cognitive impairment) and regression (determining ADAS-Cog score of all subjects). The code is available at https://github.com/ouyangjiahong/longitudinal-som-single-modality.

View details for DOI 10.1007/978-3-031-43907-0_27

View details for PubMedID 37961067
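
The abstract above contrasts soft clustering with the hard cluster assignments of conventional SOMs. The sketch below illustrates that difference for a single latent vector: a hard assignment picks the nearest cluster centre, while a soft assignment spreads weight over all centres via a softmax of negative squared distances. The softmax form, the temperature parameter, and the function names are assumptions for illustration and are not taken from the LSOR code.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hard_assignment(z, centers):
    """Index of the nearest cluster centre (ties go to the lowest index)."""
    return min(range(len(centers)), key=lambda k: squared_distance(z, centers[k]))


def soft_assignment(z, centers, temperature=1.0):
    """Softmax over negative squared distances: one weight per centre."""
    logits = [-squared_distance(z, c) / temperature for c in centers]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]


centers = [(0.0, 0.0), (2.0, 0.0)]  # toy 2D cluster centres
z = (0.5, 0.0)                      # toy latent representation
print(hard_assignment(z, centers), soft_assignment(z, centers))
```

Soft weights change smoothly as the latent vector moves, so gradients reach every cluster centre, which is one plausible reason soft clustering trains more stably than a discrete argmin.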
An Explainable Geometric-Weighted Graph Attention Network for Identifying Functional Networks Associated with Gait Impairment. ArXiv Nerrise, F., Zhao, Q., Poston, K. L., Pohl, K. M., Adeli, E. 2023

Abstract

One of the hallmark symptoms of Parkinson's Disease (PD) is the progressive loss of postural reflexes, which eventually leads to gait difficulties and balance problems. Identifying disruptions in brain function associated with gait impairment could be crucial in better understanding PD motor progression, thus advancing the development of more effective and personalized therapeutics. In this work, we present an explainable, geometric, weighted-graph attention neural network (xGW-GAT) to identify functional networks predictive of the progression of gait difficulties in individuals with PD. xGW-GAT predicts the multi-class gait impairment on the MDS-Unified PD Rating Scale (MDS-UPDRS). Our computational- and data-efficient model represents functional connectomes as symmetric positive definite (SPD) matrices on a Riemannian manifold to explicitly encode pairwise interactions of entire connectomes, based on which we learn an attention mask yielding individual- and group-level explainability. Applied to our resting-state functional MRI (rs-fMRI) dataset of individuals with PD, xGW-GAT identifies functional connectivity patterns associated with gait impairment in PD and offers interpretable explanations of functional subnetworks associated with motor impairment. Our model outperforms several existing methods while simultaneously revealing clinically relevant connectivity patterns. The source code is available at https://github.com/favour-nerrise/xGW-GAT.

View details for PubMedID 37547656

View details for PubMedCentralID PMC10402187
CCA identifies a neurophysiological marker of adaptation capacity that is reliably linked to internal locus of control of cognition in amnestic MCI. GeroScience Peralta-Malváez, L., Turnbull, A., Anthony, M., Adeli, E., Lin, F. V. 2023

Abstract

Locus of control (LOC) describes whether an individual thinks that they themselves (internal LOC) or external factors (external LOC) have more influence on their lives. LOC varies by domain, and a person's LOC for their intellectual capacities (LOC-Cognition) may be a marker of resilience in older adults at risk for dementia, with internal LOC-Cognition relating to better outcomes and improved treatment adherence. Vagal control, a key component of parasympathetic autonomic nervous system (ANS) regulation, may reflect a neurophysiological biomarker of internal LOC-Cognition. We used canonical correlation analysis (CCA) to identify a shared neurophysiological marker of ANS regulation from electrocardiogram (during auditory working memory) and functional connectivity (FC) data. A canonical variable from root mean square of successive differences (RMSSD) time series and between-network FC was significantly related to internal LOC-Cognition (β = 0.266, SE = 0.971, CI = [0.190, 4.073], p = 0.031) in 65 participants (mean age = 74.7, 32 female) with amnestic mild cognitive impairment (aMCI). Follow-up data from 55 of these individuals (mean age = 73.6, 22 females) was used to show reliability of this relationship (β = 0.271, SE = 0.971, CI = [0.033, 2.630], p = 0.047), and a second sample (40 participants with aMCI/healthy cognition, mean age = 72.7, 24 females) showed that the canonical vector biomarker generalized to visual working memory (β = 0.36, SE = 0.136, CI = [0.023, 0.574], p = 0.037), but not inhibition task RMSSD data (β = 0.08, SE = 1.486, CI = [- 0.354, 0.657], p = 0.685). This canonical vector may represent a biomarker of autonomic regulation that explains how some older adults maintain internal LOC-Cognition as dementia progresses. Future work should further test the causality of this relationship and the modifiability of this biomarker.

View details for DOI 10.1007/s11357-023-00730-8

View details for PubMedID 36697886
Rendering Humans from Object-Occluded Monocular Videos Xiang, T., Sun, A., Wu, J., Adeli, E., Fei-Fei, L., IEEE IEEE COMPUTER SOC. 2023: 3216-3227

View details for DOI 10.1109/ICCV51070.2023.00300

View details for Web of Science ID 001159644303043
Transformers Pay Attention to Convolutions Leveraging Emerging Properties of ViTs by Dual Attention-Image Network Yeganeh, Y., Farshad, A., Weinberger, P., Ahmadi, S., Adeli, E., Navab, N., IEEE IEEE COMPUTER SOC. 2023: 2296-2307

View details for DOI 10.1109/ICCVW60793.2023.00244

View details for Web of Science ID 001156680302038
One-Shot Federated Learning on Medical Data Using Knowledge Distillation with Image Synthesis and Client Model Adaptation Kang, M., Chikontwe, P., Kim, S., Jin, K., Adeli, E., Pohl, K. M., Park, S. edited by Greenspan, H., Madabhushi, A., Mousavi, P., Salcudean, S., Duncan, J., Syeda-Mahmood, T., Taylor, R. SPRINGER INTERNATIONAL PUBLISHING AG. 2023: 521-531

View details for DOI 10.1007/978-3-031-43895-0_49

View details for Web of Science ID 001109624900049
Disentangling Normal Aging From Severity of Disease via Weak Supervision on Longitudinal MRI IEEE TRANSACTIONS ON MEDICAL IMAGING Ouyang, J., Zhao, Q., Adeli, E., Zaharchuk, G., Pohl, K. M. 2022; 41 (10): 2558-2569

Abstract

The continuous progression of neurological diseases is often categorized into conditions according to their severity. To relate the severity to changes in brain morphometry, there is a growing interest in replacing these categories with a continuous severity scale that longitudinal MRIs are mapped onto via deep learning algorithms. However, existing methods based on supervised learning require large numbers of samples and those that do not, such as self-supervised models, fail to clearly separate the disease effect from normal aging. Here, we propose to explicitly disentangle those two factors via weak supervision. In other words, training is based on longitudinal MRIs being labelled either normal or diseased so that the training data can be augmented with samples from disease categories that are not of primary interest to the analysis. We do so by encouraging trajectories of controls to be fully encoded by the direction associated with brain aging. Furthermore, an orthogonal direction linked to disease severity captures the residual component from normal aging in the diseased cohort. Hence, the proposed method quantifies disease severity and its progression speed in individuals without knowing their condition. We apply the proposed method on data from the Alzheimer's Disease Neuroimaging Initiative (ADNI, N=632). We then show that the model properly disentangled normal aging from the severity of cognitive impairment by plotting the resulting disentangled factors of each subject and generating simulated MRIs for a given chronological age and condition. Moreover, our representation obtains higher balanced accuracy when used for two downstream classification tasks compared to other pre-training approaches. The code for our weakly supervised approach is available at https://github.com/ouyangjiahong/longitudinal-direction-disentangle.

View details for DOI 10.1109/TMI.2022.3166131

View details for Web of Science ID 000862400100002

View details for PubMedID 35404811

View details for PubMedCentralID PMC9578549
Semantic instance segmentation with discriminative deep supervision for medical images. Medical image analysis Zhou, S., Nie, D., Adeli, E., Wei, Q., Ren, X., Liu, X., Zhu, E., Yin, J., Wang, Q., Shen, D. 2022; 82: 102626

Abstract

Semantic instance segmentation is crucial for many medical image analysis applications, including computational pathology and automated radiation therapy. Existing methods for this task can be roughly classified into two categories: (1) proposal-based methods and (2) proposal-free methods. However, in medical images, the irregular shape-variations and crowding instances (e.g., nuclei and cells) make it hard for the proposal-based methods to achieve robust instance localization. On the other hand, ambiguous boundaries caused by the low-contrast nature of medical images (e.g., CT images) challenge the accuracy of the proposal-free methods. To tackle these issues, we propose a proposal-free segmentation network with discriminative deep supervision (DDS), which at the same time allows us to gain the power of the proposal-based method. The DDS module is interleaved with a carefully designed proposal-free segmentation backbone in our network. Consequently, the features learned by the backbone network become more sensitive to instance localization. Also, with the proposed DDS module, robust pixel-wise instance-level cues (especially structural information) are introduced for semantic segmentation. Extensive experiments on three datasets, i.e., a nuclei dataset, a pelvic CT image dataset, and a synthetic dataset, demonstrate the superior performance of the proposed algorithm compared to the previous works.

View details for DOI 10.1016/j.media.2022.102626

View details for PubMedID 36208573
Multiple Instance Neuroimage Transformer. PRedictive Intelligence in MEdicine. PRIME (Workshop) Singla, A., Zhao, Q., Do, D. K., Zhou, Y., Pohl, K. M., Adeli, E. 2022; 13564: 36-48

Abstract

For the first time, we propose using a multiple instance learning based convolution-free transformer model, called Multiple Instance Neuroimage Transformer (MINiT), for the classification of T1-weighted (T1w) MRIs. We first present several variants of transformer models adopted for neuroimages. These models extract non-overlapping 3D blocks from the input volume and perform multi-headed self-attention on a sequence of their linear projections. MINiT, on the other hand, treats each of the non-overlapping 3D blocks of the input MRI as its own instance, splitting it further into non-overlapping 3D patches, on which multi-headed self-attention is computed. As a proof-of-concept, we evaluate the efficacy of our model by training it to identify sex from T1w-MRIs of two public datasets: Adolescent Brain Cognitive Development (ABCD) and the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA). The learned attention maps highlight voxels contributing to identifying sex differences in brain morphometry. The code is available at https://github.com/singlaayush/MINIT.
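
The two-level partitioning described in this abstract (non-overlapping 3D blocks, each split further into non-overlapping 3D patches) can be sketched by computing only the start coordinates of each cube. The volume shape and the block and patch sizes below are hypothetical, and the function names are illustrative; this is not the MINiT code.

```python
from itertools import product

def cube_origins(shape, size):
    """Start coordinates of non-overlapping cubes of edge `size` tiling a 3D shape."""
    if any(dim % size for dim in shape):
        raise ValueError("each dimension must be divisible by the cube size")
    return list(product(*(range(0, dim, size) for dim in shape)))

def blocks_and_patches(shape, block_size, patch_size):
    """Map each block origin to the absolute origins of the patches inside it."""
    if block_size % patch_size:
        raise ValueError("block size must be divisible by patch size")
    # Patch origins relative to the corner of a single block.
    local = cube_origins((block_size,) * 3, patch_size)
    layout = {}
    for block in cube_origins(shape, block_size):
        layout[block] = [tuple(b + p for b, p in zip(block, offset)) for offset in local]
    return layout

# Hypothetical sizes: a 64^3 volume, 32^3 blocks (instances), 16^3 patches (tokens).
layout = blocks_and_patches((64, 64, 64), block_size=32, patch_size=16)
print(len(layout), "blocks,", len(next(iter(layout.values()))), "patches per block")
```

Under the multiple-instance view in the abstract, each block would be treated as one instance and the patches inside it as the sequence on which self-attention is computed.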

View details for DOI 10.1007/978-3-031-16919-9_4

View details for PubMedID 36331280

View details for PubMedCentralID PMC9629332
Bridging the Gap between Deep Learning and Hypothesis-Driven Analysis via Permutation Testing. PRedictive Intelligence in MEdicine. PRIME (Workshop) Paschali, M., Zhao, Q., Adeli, E., Pohl, K. M. 2022; 13564: 13-23

Abstract

A fundamental approach in neuroscience research is to test hypotheses based on neuropsychological and behavioral measures, i.e., whether certain factors (e.g., related to life events) are associated with an outcome (e.g., depression). In recent years, deep learning has become a potential alternative approach for conducting such analyses by predicting an outcome from a collection of factors and identifying the most "informative" ones driving the prediction. However, this approach has had limited impact as its findings are not linked to statistical significance of factors supporting hypotheses. In this article, we propose a flexible and scalable approach based on the concept of permutation testing that integrates hypothesis testing into the data-driven deep learning analysis. We apply our approach to the yearly self-reported assessments of 621 adolescent participants of the National Consortium of Alcohol and Neurodevelopment in Adolescence (NCANDA) to predict negative valence, a symptom of major depressive disorder according to the NIMH Research Domain Criteria (RDoC). Our method successfully identifies categories of risk factors that further explain the symptom.
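
The general idea of permutation testing that this abstract builds on can be sketched as follows. This is a generic two-sample test that uses the difference in group means as a stand-in statistic; the statistic, the function name, and the sample values are assumptions for illustration and do not reproduce the authors' deep learning procedure.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the absolute difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        # Shuffling breaks any real association between values and group labels.
        rng.shuffle(pooled)
        permuted = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if permuted >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)

# Hypothetical scores for two groups of participants.
exposed = [10.0 + i for i in range(10)]
unexposed = [0.0 + i for i in range(10)]
print(permutation_p_value(exposed, unexposed))
```

The p-value is the fraction of label shufflings that produce a statistic at least as extreme as the observed one, which is what ties a data-driven score to a significance statement.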

View details for DOI 10.1007/978-3-031-16919-9_2

View details for PubMedID 36342897

View details for PubMedCentralID PMC9632755
GaitForeMer: Self-Supervised Pre-Training of Transformers via Human Motion Forecasting for Few-Shot Gait Impairment Severity Estimation. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention Endo, M., Poston, K. L., Sullivan, E. V., Fei-Fei, L., Pohl, K. M., Adeli, E. 2022; 13438: 130-139

Abstract

Parkinson's disease (PD) is a neurological disorder that has a variety of observable motor-related symptoms such as slow movement, tremor, muscular rigidity, and impaired posture. PD is typically diagnosed by evaluating the severity of motor impairments according to scoring systems such as the Movement Disorder Society Unified Parkinson's Disease Rating Scale (MDS-UPDRS). Automated severity prediction using video recordings of individuals provides a promising route for non-intrusive monitoring of motor impairments. However, the limited size of PD gait data hinders model ability and clinical potential. Because of this clinical data scarcity and inspired by the recent advances in self-supervised large-scale language models like GPT-3, we use human motion forecasting as an effective self-supervised pre-training task for the estimation of motor impairment severity. We introduce GaitForeMer, Gait Forecasting and impairment estimation transforMer, which is first pre-trained on public datasets to forecast gait movements and then applied to clinical data to predict MDS-UPDRS gait impairment severity. Our method outperforms previous approaches that rely solely on clinical data by a large margin, achieving an F1 score of 0.76, precision of 0.79, and recall of 0.75. Using GaitForeMer, we show how public human movement data repositories can assist clinical use cases through learning universal motion representations. The code is available at https://github.com/markendo/GaitForeMer.

View details for DOI 10.1007/978-3-031-16452-1_13

View details for PubMedID 36342887

View details for PubMedCentralID PMC9635991
A Penalty Approach for Normalizing Feature Distributions to Build Confounder-Free Models. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention Vento, A., Zhao, Q., Paul, R., Pohl, K., Adeli, E. 2022; 13433: 387-397

Abstract

Translating the use of modern machine learning algorithms into clinical applications requires settling challenges related to explainability and management of nuanced confounding factors. To suitably interpret the results, removing or explaining the effect of confounding variables (or metadata) is essential. Confounding variables affect the relationship between input training data and target outputs. Accordingly, when we train a model on such data, confounding variables will bias the distribution of the learned features. A recent promising solution, Meta-Data Normalization (MDN), estimates the linear relationship between the metadata and each feature based on a non-trainable closed-form solution. However, this estimation is confined by the sample size of a mini-batch and thereby may result in an oscillating performance. In this paper, we extend the MDN method by applying a Penalty approach (referred to as PMDN). We cast the problem into a bi-level nested optimization problem. We then approximate that objective using a penalty method so that the linear parameters within the MDN layer are trainable and learned on all samples. This enables PMDN to be plugged into any architectures, even those unfit to run batch-level operations such as transformers and recurrent models. We show improvement in model accuracy and independence from the confounders using PMDN over MDN in a synthetic experiment and a multi-label, multi-site classification of magnetic resonance images.
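
The closed-form step that this abstract attributes to MDN, estimating the linear relationship between metadata and a feature and then removing it, can be illustrated with ordinary least squares for a single confounder. The function name and the sample values are hypothetical, and the sketch covers neither the mini-batch layer nor the penalty-based training of PMDN.

```python
from statistics import mean

def residualize(feature, confounder):
    """Remove the least-squares linear effect of one confounder from one feature.

    Assumes the confounder is not constant (otherwise its variance is zero).
    """
    c_bar, f_bar = mean(confounder), mean(feature)
    var_c = sum((c - c_bar) ** 2 for c in confounder)
    cov_cf = sum((c - c_bar) * (f - f_bar) for c, f in zip(confounder, feature))
    slope = cov_cf / var_c  # closed-form ordinary least squares slope
    # Subtract only the confounder-related component; the feature mean is kept.
    return [f - slope * (c - c_bar) for f, c in zip(feature, confounder)]

# Hypothetical confounder (age) and a feature that partly tracks it.
ages = [20, 30, 40, 50, 60]
feature = [3.5, 3.5, 5.0, 6.5, 6.5]
print(residualize(feature, ages))
```

After this step the residual feature has zero sample covariance with the confounder, which is the sense in which the linear confounding effect has been normalized out.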

View details for DOI 10.1007/978-3-031-16437-8_37

View details for PubMedID 36331278

View details for PubMedCentralID PMC9629333
Self-supervised learning of neighborhood embedding for longitudinal MRI. Medical image analysis Ouyang, J., Zhao, Q., Adeli, E., Zaharchuk, G., Pohl, K. M. 2022; 82: 102571

Abstract

In recent years, several deep learning models recommend first to represent Magnetic Resonance Imaging (MRI) as latent features before performing a downstream task of interest (such as classification or regression). The performance of the downstream task generally improves when these latent representations are explicitly associated with factors of interest. For example, we derived such a representation for capturing brain aging by applying self-supervised learning to longitudinal MRIs and then used the resulting encoding to automatically identify diseases accelerating the aging of the brain. We now propose a refinement of this representation by replacing the linear modeling of brain aging with one that is consistent in local neighborhoods in the latent space. Called Longitudinal Neighborhood Embedding (LNE), we derive an encoding so that neighborhoods are age-consistent (i.e., brain MRIs of different subjects with similar brain ages are in close proximity of each other) and progression-consistent, i.e., the latent space is defined by a smooth trajectory field where each trajectory captures changes in brain ages between a pair of MRIs extracted from a longitudinal sequence. To make the problem computationally tractable, we further propose a strategy for mini-batch sampling so that the resulting local neighborhoods accurately approximate the ones that would be defined based on the whole cohort. 
We evaluate LNE on three different downstream tasks: (1) to predict chronological age from T1-w MRI of 274 healthy subjects participating in a study at SRI International; (2) to distinguish Normal Control (NC) from Alzheimer's Disease (AD) and stable Mild Cognitive Impairment (sMCI) from progressive Mild Cognitive Impairment (pMCI) based on T1-w MRI of 632 participants of the Alzheimer's Disease Neuroimaging Initiative (ADNI); and (3) to distinguish no-to-low from moderate-to-heavy alcohol drinkers based on fractional anisotropy derived from diffusion tensor MRIs of 764 adolescents recruited by the National Consortium on Alcohol and NeuroDevelopment in Adolescence (NCANDA). Across the three data sets, the visualization of the smooth trajectory vector fields and superior accuracy on downstream tasks demonstrate the strength of the proposed method over existing self-supervised methods in extracting information related to brain aging, which could help study the impact of substance use and neurodegenerative disorders. The code is available at https://github.com/ouyangjiahong/longitudinal-neighbourhood-embedding.

View details for DOI 10.1016/j.media.2022.102571

View details for PubMedID 36115098
A Novel Explainability Approach for Technology-Driven Translational Research on Brain Aging. Journal of Alzheimer's disease : JAD Turnbull, A., Kaplan, R., Adeli, E., Lin, F. V. 2022

Abstract

Brain aging leads to difficulties in functional independence. Mitigating these difficulties can benefit from technology that predicts, monitors, and modifies brain aging. Translational research prioritizes solutions that can be causally linked to specific pathophysiologies at the same time as demonstrating improvements in impactful real-world outcome measures. This poses a challenge for brain aging technology that needs to address the tension between mechanism-driven precision and clinical relevance. In the current opinion, by synthesizing emerging mechanistic, translational, and clinical research-related frameworks, and our own development of technology-driven brain aging research, we suggest incorporating the appreciation of four desiderata (causality, informativeness, transferability, and fairness) of explainability into early-stage research that designs and tests brain aging technology. We apply a series of work on electrocardiography-based "peripheral" neuroplasticity markers from our work as an illustration of our proposed approach. We believe this novel approach will promote the development and adoption of brain aging technology that links and addresses brain pathophysiology and functional independence in the field of translational research.

View details for DOI 10.3233/JAD-220441

View details for PubMedID 35754280
Detecting negative valence symptoms in adolescents based on longitudinal self-reports and behavioral assessments. Journal of affective disorders Paschali, M., Kiss, O., Zhao, Q., Adeli, E., Podhajsky, S., Muller-Oehring, E. M., Gotlib, I. H., Pohl, K. M., Baker, F. C. 2022

Abstract

BACKGROUND: Given the high prevalence of depressive symptoms reported by adolescents and associated risk of experiencing psychiatric disorders as adults, differentiating the trajectories of the symptoms related to negative valence at an individual level could be crucial in gaining a better understanding of their effects later in life. METHODS: A longitudinal deep learning framework is presented, identifying self-reported and behavioral measurements that detect the depressive symptoms associated with the Negative Valence System domain of the NIMH Research Domain Criteria (RDoC). RESULTS: Applied to the annual records of 621 participants (age range: 12 to 17 years) of the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA), the deep learning framework identifies predictors of negative valence symptoms, which include lower extraversion, poorer sleep quality, impaired executive control function and factors related to substance use. LIMITATIONS: The results rely mainly on self-reported measures and do not provide information about the underlying neural correlates. Also, a larger sample is required to understand the role of sex and other demographics related to the risk of experiencing symptoms of negative valence. CONCLUSIONS: These results provide new information about predictors of negative valence symptoms in individuals during adolescence that could be critical in understanding the development of depression and identifying targets for intervention. Importantly, findings can inform preventive and treatment approaches for depression in adolescents, focusing on a unique predictor set of modifiable modulators to include factors such as sleep hygiene training, cognitive-emotional therapy enhancing coping and controllability experience and/or substance use interventions.

View details for DOI 10.1016/j.jad.2022.06.002

View details for PubMedID 35688394
Rethinking Architecture Design for Tackling Data Heterogeneity in Federated Learning. Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition Qu, L., Zhou, Y., Liang, P. P., Xia, Y., Wang, F., Adeli, E., Fei-Fei, L., Rubin, D. 2022; 2022: 10051-10061

Abstract

Federated learning is an emerging research paradigm enabling collaborative training of machine learning models among different organizations while keeping data private at each institution. Despite recent progress, there remain fundamental challenges such as the lack of convergence and the potential for catastrophic forgetting across real-world heterogeneous devices. In this paper, we demonstrate that self-attention-based architectures (e.g., Transformers) are more robust to distribution shifts and hence improve federated learning over heterogeneous data. Concretely, we conduct the first rigorous empirical investigation of different neural architectures across a range of federated algorithms, real-world benchmarks, and heterogeneous data splits. Our experiments show that simply replacing convolutional networks with Transformers can greatly reduce catastrophic forgetting of previous devices, accelerate convergence, and reach a better global model, especially when dealing with heterogeneous data. We release our code and pretrained models to encourage future exploration in robust architectures as an alternative to current research efforts on the optimization front.

View details for DOI 10.1109/cvpr52688.2022.00982

View details for PubMedID 36624800

View details for PubMedCentralID PMC9826695
Generative adversarial U-Net for domain-free few-shot medical diagnosis PATTERN RECOGNITION LETTERS Chen, X., Li, Y., Yao, L., Adeli, E., Zhang, Y., Wang, X. 2022; 157: 112-118

View details for DOI 10.1016/j.patrec.2022.03.022

View details for Web of Science ID 000807476100002
GaitForeMer: Self-supervised Pre-training of Transformers via Human Motion Forecasting for Few-Shot Gait Impairment Severity Estimation Endo, M., Poston, K. L., Sullivan, E. V., Fei-Fei, L., Pohl, K. M., Adeli, E. edited by Wang, L., Dou, Q., Fletcher, P. T., Speidel, S., Li, S. SPRINGER INTERNATIONAL PUBLISHING AG. 2022: 130-139

View details for DOI 10.1007/978-3-031-16452-1_13

View details for Web of Science ID 000867418200013
MOMA-LRG: Language-Refined Graphs for Multi-Object Multi-Actor Activity Parsing Luo, Z., Durante, Z., Li, L., Xie, W., Liu, R., Jin, E., Huang, Z., Li, L., Wu, J., Niebles, J., Adeli, E., Li Fei-Fei edited by Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2022

View details for Web of Science ID 001215469505045
Rethinking Architecture Design for Tackling Data Heterogeneity in Federated Learning Qu, L., Zhou, Y., Liang, P., Xia, Y., Wang, F., Adeli, E., Li Fei-Fei, Rubin, D., IEEE COMP SOC IEEE COMPUTER SOC. 2022: 10051-10061

View details for DOI 10.1109/CVPR52688.2022.00982

View details for Web of Science ID 000870759103013
WTM: Weighted Temporal Attention Module for Group Activity Recognition Yadav, S., Agrawal, P., Tiwari, K., Adeli, E., Pandey, H., Akbar, S., IEEE IEEE. 2022

View details for DOI 10.1109/IJCNN55064.2022.9892215

View details for Web of Science ID 000867070902098
PrivHAR: Recognizing Human Actions from Privacy-Preserving Lens Hinojosa, C., Marquez, M., Arguello, H., Adeli, E., Fei-Fei, L., Niebles, J. edited by Avidan, S., Brostow, G., Cisse, M., Farinella, G. M., Hassner, T. SPRINGER INTERNATIONAL PUBLISHING AG. 2022: 314-332

View details for DOI 10.1007/978-3-031-19772-7_19

View details for Web of Science ID 000898297000019
TransDeepLab: Convolution-Free Transformer-Based DeepLab v3+ for Medical Image Segmentation Azad, R., Heidari, M., Shariatnia, M., Aghdam, E., Karimijafarbigloo, S., Adeli, E., Merhof, D. edited by Rekik, Adeli, E., Park, S. H., Cintas, C. SPRINGER INTERNATIONAL PUBLISHING AG. 2022: 91-102

View details for DOI 10.1007/978-3-031-16919-9_9

View details for Web of Science ID 000867616800008
Intervertebral Disc Labeling with Learning Shape Information, a Look once Approach Azad, R., Heidari, M., Cohen-Adad, J., Adeli, E., Merhof, D. edited by Rekik, Adeli, E., Park, S. H., Cintas, C. SPRINGER INTERNATIONAL PUBLISHING AG. 2022: 49-59

View details for DOI 10.1007/978-3-031-16919-9_5

View details for Web of Science ID 000867616800005
Joint Graph Convolution for Analyzing Brain Structural and Functional Connectome Li, Y., Wei, Q., Adeli, E., Pohl, K. M., Zhao, Q. edited by Wang, L., Dou, Q., Fletcher, P. T., Speidel, S., Li, S. SPRINGER INTERNATIONAL PUBLISHING AG. 2022: 231-240

Abstract

The white-matter (micro-)structural architecture of the brain promotes synchrony among neuronal populations, giving rise to richly patterned functional connections. A fundamental problem for systems neuroscience is determining the best way to relate structural and functional networks quantified by diffusion tensor imaging and resting-state functional MRI. As one of the state-of-the-art approaches for network analysis, graph convolutional networks (GCN) have been separately used to analyze functional and structural networks, but have not been applied to explore inter-network relationships. In this work, we propose to couple the two networks of an individual by adding inter-network edges between corresponding brain regions, so that the joint structure-function graph can be directly analyzed by a single GCN. The weights of inter-network edges are learnable, reflecting non-uniform structure-function coupling strength across the brain. We apply our Joint-GCN to predict age and sex of 662 participants from the public dataset of the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA) based on their functional and micro-structural white-matter networks. Our results support that the proposed Joint-GCN outperforms existing multi-modal graph learning approaches for analyzing structural and functional networks.
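
The coupling described in this abstract, adding inter-network edges between corresponding brain regions so that one graph holds both networks, can be sketched as the construction of a 2N x 2N adjacency matrix. In the paper the inter-network weights are learnable; here they are fixed numbers, and the function name and toy matrices are assumptions for illustration rather than the Joint-GCN code.

```python
def joint_adjacency(structural, functional, coupling):
    """Build a 2N x 2N adjacency matrix joining two N-node networks."""
    n = len(structural)
    if len(functional) != n or len(coupling) != n:
        raise ValueError("both networks and the coupling vector must have N entries")
    joint = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            joint[i][j] = structural[i][j]          # structural block (top left)
            joint[n + i][n + j] = functional[i][j]  # functional block (bottom right)
        # Inter-network edge linking region i across the two networks.
        joint[i][n + i] = coupling[i]
        joint[n + i][i] = coupling[i]
    return joint

# Toy example with two regions; all values are hypothetical.
structural = [[0.0, 0.8], [0.8, 0.0]]
functional = [[0.0, 0.3], [0.3, 0.0]]
coupling = [0.5, 0.9]  # one weight per region, non-uniform across the brain
for row in joint_adjacency(structural, functional, coupling):
    print(row)
```

A single graph convolutional network can then operate on the joint matrix, with the off-diagonal blocks carrying the region-specific structure-function coupling.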

View details for DOI 10.1007/978-3-031-16431-6_22

View details for Web of Science ID 000867524300022

View details for PubMedID 36321855

View details for PubMedCentralID PMC9620868
Bridging the Gap Between Deep Learning and Hypothesis-Driven Analysis via Permutation Testing Paschali, M., Zhao, Q., Adeli, E., Pohl, K. M. edited by Rekik, Adeli, E., Park, S. H., Cintas, C. SPRINGER INTERNATIONAL PUBLISHING AG. 2022: 13-23

View details for DOI 10.1007/978-3-031-16919-9_2

View details for Web of Science ID 000867616800002
Multiple Instance Neuroimage Transformer Singla, A., Zhao, Q., Do, D. K., Zhou, Y., Pohl, K. M., Adeli, E. edited by Rekik, Adeli, E., Park, S. H., Cintas, C. SPRINGER INTERNATIONAL PUBLISHING AG. 2022: 36-48

View details for DOI 10.1007/978-3-031-16919-9_4

View details for Web of Science ID 000867616800004
A Penalty Approach for Normalizing Feature Distributions to Build Confounder-Free Models Vento, A., Zhao, Q., Paul, R., Pohl, K. M., Adeli, E. edited by Wang, L., Dou, Q., Fletcher, P. T., Speidel, S., Li, S. SPRINGER INTERNATIONAL PUBLISHING AG. 2022: 387-397

View details for DOI 10.1007/978-3-031-16437-8_37

View details for Web of Science ID 000867397400037
Switching Structured Prediction for Simple and Complex Human Activity Recognition IEEE TRANSACTIONS ON CYBERNETICS Arzani, M. M., Fathy, M., Azirani, A. A., Adeli, E. 2021; 51 (12): 5859-5870

Abstract

Automatic human activity recognition is an integral part of any interactive application involving humans (e.g., human-robot interaction systems). One of the main challenges for activity recognition is the diversity in the way individuals often perform activities. Furthermore, changes in any of the environment factors (i.e., illumination, complex background, human body shapes, viewpoint, etc.) intensify this challenge. In addition, there are different types of activities that robots need to interpret for seamless interaction with humans. Some activities are short, quick, and simple (e.g., sitting), while others may be detailed/complex, and spread throughout a long span of time (e.g., washing mouth). In this article, we recognize the activities within the context of graphical models in a sequence-labeling framework based on skeleton data. We propose a new structured prediction strategy based on probabilistic graphical models (PGMs) to recognize both types of activities (i.e., complex and simple). These activity types are often spanned in very diverse subspaces in the space of all possible activities, which would require different model parameterizations. In order to deal with these parameterization and structural breaks across models, a category-switching scheme is proposed to switch over the models based on the activity types. For parameter optimization, we utilize a distributed structured prediction technique to implement our model in a distributed setting. The method is tested on three widely used datasets (CAD-60, UT-Kinect, and Florence 3-D) that cover both activity types. The results illustrate that our proposed method is able to recognize simple and complex activities while the previous work concentrated on only one of these two main types.

View details for DOI 10.1109/TCYB.2019.2960481

View details for Web of Science ID 000733232400023

View details for PubMedID 31945007
Multi-label, multi-domain learning identifies compounding effects of HIV and cognitive impairment. Medical image analysis Zhang, J., Zhao, Q., Adeli, E., Pfefferbaum, A., Sullivan, E. V., Paul, R., Valcour, V., Pohl, K. M. 2021; 75: 102246

Abstract

Older individuals infected by Human Immunodeficiency Virus (HIV) are at risk for developing HIV-Associated Neurocognitive Disorder (HAND), i.e., from reduced cognitive functioning similar to HIV-negative individuals with Mild Cognitive Impairment (MCI) or to Alzheimer's Disease (AD) if more severely affected. Incompletely understood is how brain structure can serve to differentiate cognitive impairment (CI) in the HIV-positive (i.e., HAND) from the HIV-negative cohort (i.e., MCI and AD). To that end, we designed a multi-label classifier that labels the structural magnetic resonance images (MRI) of individuals by their HIV and CI status via two binary variables. Proper training of such an approach traditionally requires well-curated datasets containing a large number of samples for each of the corresponding four cohorts (healthy controls, CI HIV-negative adults a.k.a. CI-only, HIV-positive patients without CI a.k.a. HIV-only, and HAND). Because of the rarity of such datasets, we proposed to improve training of the multi-label classifier via a multi-domain learning scheme that also incorporates domain-specific classifiers on auxiliary single-label datasets specific to either binary label. Specifically, we complement the training dataset of MRIs of the four cohorts (Control: 156, CI-only: 335, HIV-only: 37, HAND: 145) acquired by the Memory and Aging Center at the University of California - San Francisco with a CI-specific dataset only containing MRIs of HIV-negative subjects (Controls: 229, CI-only: 397) from the Alzheimer's Disease Neuroimaging Initiative and an HIV-specific dataset (Controls: 75, HIV-only: 75) provided by SRI International. Based on cross-validation on the UCSF dataset, the multi-domain and multi-label learning strategy leads to superior classification accuracy compared with one-domain or multi-class learning approaches, specifically for the undersampled HIV-only cohort. The 'prediction logits' of CI computed by the multi-label formulation also successfully stratify motor performance among the HIV-positive subjects (including HAND). Finally, brain patterns driving the subject-level predictions across all four cohorts characterize the independent and compounding effects of HIV and CI in the HAND cohort.

View details for DOI 10.1016/j.media.2021.102246

View details for PubMedID 34706304
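
The multi-label formulation in the abstract above, where two binary variables jointly define four cohorts, can be illustrated with a minimal sketch. The function name, the use of raw logits, and the threshold at 0 are assumptions made for illustration; they are not taken from the authors' implementation.

```python
# Hypothetical mapping from the two binary labels (HIV, CI) to the four
# cohorts named in the abstract.
COHORTS = {
    (0, 0): "Control",
    (0, 1): "CI-only",
    (1, 0): "HIV-only",
    (1, 1): "HAND",
}

def cohort_from_logits(hiv_logit, ci_logit):
    """Threshold each logit at 0 (probability 0.5) and look up the cohort."""
    hiv = int(hiv_logit > 0)
    ci = int(ci_logit > 0)
    return COHORTS[(hiv, ci)]

# One classifier output per binary label, rather than one 4-way output.
print(cohort_from_logits(2.0, -1.0))
print(cohort_from_logits(0.7, 1.3))
```

Because each label has its own output, a dataset that carries only one of the two labels (for example, CI status without HIV-positive subjects) can still supervise the corresponding output, which is the property the multi-domain scheme relies on.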
Longitudinal Correlation Analysis for Decoding Multi-modal Brain Development. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention Zhao, Q., Adeli, E., Pohl, K. M. 2021; 12907: 400-409

Abstract

Starting from childhood, the human brain restructures and rewires throughout life. Characterizing such complex brain development requires effective analysis of longitudinal and multi-modal neuroimaging data. Here, we propose such an analysis approach named Longitudinal Correlation Analysis (LCA). LCA couples the data of two modalities by first reducing the input from each modality to a latent representation based on autoencoders. A self-supervised strategy then relates the two latent spaces by jointly disentangling two directions, one in each space, such that the longitudinal changes in latent representations along those directions are maximally correlated between modalities. We applied LCA to analyze the longitudinal T1-weighted and diffusion-weighted MRIs of 679 youths from the National Consortium on Alcohol and Neurodevelopment in Adolescence. Unlike existing approaches that focus on either cross-sectional or single-modal modeling, LCA successfully unraveled coupled macrostructural and microstructural brain development from morphological and diffusivity features extracted from the data. A retesting of LCA on raw 3D image volumes of those subjects successfully replicated the findings from the feature-based analysis. Lastly, the developmental effects revealed by LCA were in line with the current understanding of maturational patterns of the adolescent brain.

View details for DOI 10.1007/978-3-030-87234-2_38

View details for PubMedID 35253021

View details for PubMedCentralID PMC8896397
Self-Supervised Longitudinal Neighbourhood Embedding. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention Ouyang, J., Zhao, Q., Adeli, E., Sullivan, E. V., Pfefferbaum, A., Zaharchuk, G., Pohl, K. M. 2021; 12902: 80-89

Abstract

Longitudinal MRIs are often used to capture the gradual deterioration of brain structure and function caused by aging or neurological diseases. Analyzing this data via machine learning generally requires a large number of ground-truth labels, which are often missing or expensive to obtain. Reducing the need for labels, we propose a self-supervised strategy for representation learning named Longitudinal Neighborhood Embedding (LNE). Motivated by concepts in contrastive learning, LNE explicitly models the similarity between trajectory vectors across different subjects. We do so by building a graph in each training iteration defining neighborhoods in the latent space so that the progression direction of a subject follows the direction of its neighbors. This results in a smooth trajectory field that captures the global morphological change of the brain while maintaining the local continuity. We apply LNE to longitudinal T1w MRIs of two neuroimaging studies: a dataset composed of 274 healthy subjects, and Alzheimer's Disease Neuroimaging Initiative (ADNI, N = 632). The visualization of the smooth trajectory vector field and superior performance on downstream tasks demonstrate the strength of the proposed method over existing self-supervised methods in extracting information associated with normal aging and in revealing the impact of neurodegenerative disorders. The code is available at https://github.com/ouyangjiahong/longitudinal-neighbourhood-embedding.

View details for DOI 10.1007/978-3-030-87196-3_8

View details for PubMedID 35727732

View details for PubMedCentralID PMC9204645
Representation Disentanglement for Multi-modal Brain MRI Analysis. Information processing in medical imaging : proceedings of the ... conference Ouyang, J., Adeli, E., Pohl, K. M., Zhao, Q., Zaharchuk, G. 2021; 12729: 321-333

Abstract

Multi-modal MRIs are widely used in neuroimaging applications since different MR sequences provide complementary information about brain structures. Recent works have suggested that multi-modal deep learning analysis can benefit from explicitly disentangling anatomical (shape) and modality (appearance) information into separate image presentations. In this work, we challenge mainstream strategies by showing that they do not naturally lead to representation disentanglement both in theory and in practice. To address this issue, we propose a margin loss that regularizes the similarity in relationships of the representations across subjects and modalities. To enable robust training, we further use a conditional convolution to design a single model for encoding images of all modalities. Lastly, we propose a fusion function to combine the disentangled anatomical representations as a set of modality-invariant features for downstream tasks. We evaluate the proposed method on three multi-modal neuroimaging datasets. Experiments show that our proposed method can achieve superior disentangled representations compared to existing disentanglement strategies. Results also indicate that the fused anatomical representation has potential in the downstream task of zero-dose PET reconstruction and brain tumor segmentation.

View details for DOI 10.1007/978-3-030-78191-0_25

View details for PubMedID 35173402
Longitudinal Pooling & Consistency Regularization to Model Disease Progression From MRIs. IEEE journal of biomedical and health informatics Ouyang, J., Zhao, Q., Sullivan, E. V., Pfefferbaum, A., Tapert, S. F., Adeli, E., Pohl, K. M. 2021; 25 (6): 2082-2092

Abstract

Many neurological diseases are characterized by gradual deterioration of brain structure and function. Large longitudinal MRI datasets have revealed such deterioration, in part, by applying machine and deep learning to predict diagnosis. A popular approach is to apply Convolutional Neural Networks (CNN) to extract informative features from each visit of the longitudinal MRI and then use those features to classify each visit via Recurrent Neural Networks (RNNs). Such modeling neglects the progressive nature of the disease, which may result in clinically implausible classifications across visits. To avoid this issue, we propose to combine features across visits by coupling feature extraction with a novel longitudinal pooling layer and enforce consistency of the classification across visits in line with disease progression. We evaluate the proposed method on the longitudinal structural MRIs from three neuroimaging datasets: Alzheimer's Disease Neuroimaging Initiative (ADNI, N=404), a dataset composed of 274 normal controls and 329 patients with Alcohol Use Disorder (AUD), and 255 youths from the National Consortium on Alcohol and NeuroDevelopment in Adolescence (NCANDA). In all three experiments our method is superior to other widely used approaches for longitudinal classification, thus making a unique contribution towards more accurate tracking of the impact of conditions on the brain. The code is available at https://github.com/ouyangjiahong/longitudinal-pooling.

View details for DOI 10.1109/JBHI.2020.3042447

View details for PubMedID 33270567
Multi-view representation learning and understanding MULTIMEDIA TOOLS AND APPLICATIONS Zhou, T., Zhang, Y., Thung, K., Adeli, E., Rekik, I., Zhao, Q., Zhang, C. 2021; 80 (15): 22865

View details for DOI 10.1007/s11042-021-10504-z

View details for Web of Science ID 000669314100025
Going Beyond Saliency Maps: Training Deep Models to Interpret Deep Models. Information processing in medical imaging : proceedings of the ... conference Liu, Z., Adeli, E., Pohl, K. M., Zhao, Q. 2021; 12729: 71-82

Abstract

Interpretability is a critical factor in applying complex deep learning models to advance the understanding of brain disorders in neuroimaging studies. To interpret the decision process of a trained classifier, existing techniques typically rely on saliency maps to quantify the voxel-wise or feature-level importance for classification through partial derivatives. Despite providing some level of localization, these maps are not human-understandable from the neuroscience perspective as they often do not inform the specific type of morphological changes linked to the brain disorder. Inspired by the image-to-image translation scheme, we propose to train simulator networks to inject (or remove) patterns of the disease into a given MRI based on a warping operation, such that the classifier increases (or decreases) its confidence in labeling the simulated MRI as diseased. To increase the robustness of training, we propose to couple the two simulators into a unified model based on conditional convolution. We applied our approach to interpreting classifiers trained on a synthetic dataset and two neuroimaging datasets to visualize the effect of Alzheimer's disease and alcohol dependence. Compared to the saliency maps generated by baseline approaches, our simulations and visualizations based on the Jacobian determinants of the warping field reveal meaningful and understandable patterns related to the diseases.

View details for DOI 10.1007/978-3-030-78191-0_6

View details for PubMedID 34548772
Longitudinal self-supervised learning. Medical image analysis Zhao, Q., Liu, Z., Adeli, E., Pohl, K. M. 2021; 71: 102051

Abstract

Machine learning analysis of longitudinal neuroimaging data is typically based on supervised learning, which requires a large number of ground-truth labels to be informative. As ground-truth labels are often missing or expensive to obtain in neuroscience, we avoid them in our analysis by combining factor disentanglement with self-supervised learning to identify changes and consistencies across the multiple MRIs acquired of each individual over time. Specifically, we propose a new definition of disentanglement by formulating a multivariate mapping between factors (e.g., brain age) associated with an MRI and a latent image representation. Then, factors that evolve across acquisitions of longitudinal sequences are disentangled from that mapping by self-supervised learning in such a way that changes in a single factor induce change along one direction in the representation space. We implement this model, named Longitudinal Self-Supervised Learning (LSSL), via a standard autoencoding structure with a cosine loss to disentangle brain age from the image representation. We apply LSSL to two longitudinal neuroimaging studies to highlight its strength in extracting the brain-age information from MRI and revealing informative characteristics associated with neurodegenerative and neuropsychological disorders. Moreover, the representations learned by LSSL facilitate supervised classification by recording faster convergence and higher (or similar) prediction accuracy compared to several other representation learning techniques.

View details for DOI 10.1016/j.media.2021.102051

View details for PubMedID 33882336
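
The cosine loss mentioned in the LSSL abstract above can be illustrated with a minimal sketch. It assumes each visit is encoded as a latent vector and that a single direction vector, here called `tau`, stands for the brain-age direction in the representation space. The function names, the exact form of the loss, and the toy values are hypothetical and are not taken from the paper's code.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def direction_loss(z_early, z_late, tau):
    """1 - cosine between the latent change (z_late - z_early) and tau.

    The loss is 0 when the change points exactly along tau and 2 when it
    points in the opposite direction.
    """
    delta = [b - a for a, b in zip(z_early, z_late)]
    return 1.0 - cosine(delta, tau)

# Toy latent codes for two visits of one subject (hypothetical values).
z_visit1 = [0.0, 1.0]
z_visit2 = [2.0, 1.0]
tau = [1.0, 0.0]
print(direction_loss(z_visit1, z_visit2, tau))
```

Minimizing such a term over all visit pairs pushes every subject's longitudinal change toward one shared direction, which matches the abstract's description of a factor inducing change along one direction in the representation space.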
Cascaded MultiTask 3-D Fully Convolutional Networks for Pancreas Segmentation IEEE TRANSACTIONS ON CYBERNETICS Xue, J., He, K., Nie, D., Adeli, E., Shi, Z., Lee, S., Zheng, Y., Liu, X., Li, D., Shen, D. 2021; 51 (4): 2153-2165

Abstract

Automatic pancreas segmentation is crucial to the diagnostic assessment of diabetes or pancreatic cancer. However, the relatively small size of the pancreas in the upper body, as well as large variations of its location and shape in retroperitoneum, make the segmentation task challenging. To alleviate these challenges, in this article, we propose a cascaded multitask 3-D fully convolution network (FCN) to automatically segment the pancreas. Our cascaded network is composed of two parts. The first part focuses on fast locating the region of the pancreas, and the second part uses a multitask FCN with dense connections to refine the segmentation map for fine voxel-wise segmentation. In particular, our multitask FCN with dense connections is implemented to simultaneously complete tasks of the voxel-wise segmentation and skeleton extraction from the pancreas. These two tasks are complementary, that is, the extracted skeleton provides rich information about the shape and size of the pancreas in retroperitoneum, which can boost the segmentation of pancreas. The multitask FCN is also designed to share the low- and mid-level features across the tasks. A feature consistency module is further introduced to enhance the connection and fusion of different levels of feature maps. Evaluations on two pancreas datasets demonstrate the robustness of our proposed method in correctly segmenting the pancreas in various settings. Our experimental results outperform both baseline and state-of-the-art methods. Moreover, the ablation study shows that our proposed parts/modules are critical for effective multitask learning.

View details for DOI 10.1109/TCYB.2019.2955178

View details for Web of Science ID 000631201900034

View details for PubMedID 31869812
Deep End-to-End One-Class Classifier IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS Sabokrou, M., Fathy, M., Zhao, G., Adeli, E. 2021; 32 (2): 675–84

Abstract

One-class classification (OCC) poses as an essential component in many machine learning and computer vision applications, including novelty, anomaly, and outlier detection systems. With a known definition for a target or normal set of data, one-class classifiers can determine if any given new sample spans within the distribution of the target class. Solving for this task in a general setting is particularly challenging, due to the high diversity of samples from the target class and the absence of any supervising signal over the novelty (nontarget) concept, which makes designing end-to-end models unattainable. In this article, we propose an adversarial training approach to detect out-of-distribution samples in an end-to-end trainable deep model. To this end, we jointly train two deep neural networks, R and D. The latter plays as the discriminator while the former, during training, helps D characterize a probability distribution for the target class by creating adversarial examples and, during testing, collaborates with it to detect novelties. Using our OCC, we first test outlier detection on two image data sets, Modified National Institute of Standards and Technology (MNIST) and Caltech-256. Then, several experiments for video anomaly detection are performed on University of Minnesota (UMN) and University of California, San Diego (UCSD) data sets. Our proposed method can successfully learn the target class underlying distribution and outperforms other approaches.

View details for DOI 10.1109/TNNLS.2020.2979049

View details for Web of Science ID 000616310400017

View details for PubMedID 32275608
MetricUNet: Synergistic image- and voxel-level learning for precise prostate segmentation via online sampling. Medical image analysis He, K., Lian, C., Adeli, E., Huo, J., Gao, Y., Zhang, B., Zhang, J., Shen, D. 2021; 71: 102039

Abstract

Fully convolutional networks (FCNs), including UNet and VNet, are widely-used network architectures for semantic segmentation in recent studies. However, conventional FCN is typically trained by the cross-entropy or Dice loss, which only calculates the error between predictions and ground-truth labels for pixels individually. This often results in non-smooth neighborhoods in the predicted segmentation. This problem becomes more serious in CT prostate segmentation as CT images are usually of low tissue contrast. To address this problem, we propose a two-stage framework, with the first stage to quickly localize the prostate region, and the second stage to precisely segment the prostate by a multi-task UNet architecture. We introduce a novel online metric learning module through voxel-wise sampling in the multi-task network. Therefore, the proposed network has a dual-branch architecture that tackles two tasks: (1) a segmentation sub-network aiming to generate the prostate segmentation, and (2) a voxel-metric learning sub-network aiming to improve the quality of the learned feature space supervised by a metric loss. Specifically, the voxel-metric learning sub-network samples tuples (including triplets and pairs) at the voxel level through the intermediate feature maps. Unlike conventional deep metric learning methods that generate triplets or pairs at the image level before the training phase, our proposed voxel-wise tuples are sampled in an online manner and operated in an end-to-end fashion via multi-task learning. To evaluate the proposed method, we implement extensive experiments on a real CT image dataset consisting of 339 patients. The ablation studies show that our method can effectively learn more representative voxel-level features compared with the conventional learning methods with cross-entropy or Dice loss. And the comparisons show that the proposed method outperforms the state-of-the-art methods by a reasonable margin.

View details for DOI 10.1016/j.media.2021.102039

View details for PubMedID 33831595
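
The MetricUNet abstract above describes a metric loss over voxel-level triplets without giving its form. As a rough illustration, the sketch below uses the common triplet margin loss on per-voxel feature vectors. The Euclidean distance, the margin value, and all names are assumptions, and the toy features are invented for the example.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss that is zero once the negative is at least `margin`
    farther from the anchor than the positive is."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)

# Hypothetical voxel features: anchor and positive from prostate voxels,
# negative from a background voxel.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]
far_negative = [3.0, 4.0]
near_negative = [0.0, 1.5]
print(triplet_loss(anchor, positive, far_negative))
print(triplet_loss(anchor, positive, near_negative))
```

In an online setting such triplets would be drawn from the intermediate feature maps during each training step rather than fixed in advance, which is the distinction the abstract draws against image-level sampling.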
Metadata Normalization Lu, M., Zhao, Q., Zhang, J., Pohl, K. M., Li Fei-Fei, Niebles, J., Adeli, E., IEEE COMP SOC IEEE COMPUTER SOC. 2021: 10912-10922

Abstract

Batch Normalization (BN) and its variants have delivered tremendous success in combating the covariate shift induced by the training step of deep learning methods. While these techniques normalize feature distributions by standardizing with batch statistics, they do not correct the influence on features from extraneous variables or multiple distributions. Such extra variables, referred to as metadata here, may create bias or confounding effects (e.g., race when classifying gender from face images). We introduce the Metadata Normalization (MDN) layer, a new batch-level operation which can be used end-to-end within the training framework, to correct the influence of metadata on feature distributions. MDN adopts a regression analysis technique traditionally used for preprocessing to remove (regress out) the metadata effects on model features during training. We utilize a metric based on distance correlation to quantify the distribution bias from the metadata and demonstrate that our method successfully removes metadata effects on four diverse settings: one synthetic, one 2D image, one video, and one 3D medical image dataset.

View details for DOI 10.1109/CVPR46437.2021.01077

View details for Web of Science ID 000742075001011

View details for PubMedID 34776724

View details for PubMedCentralID PMC8589298
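
The "regress out" step described in the Metadata Normalization abstract above can be illustrated with a minimal sketch. It assumes a single feature and a single metadata variable per sample and fits ordinary least squares over one batch; the actual MDN layer works on full feature maps inside a network, and the function name and toy data here are hypothetical.

```python
def regress_out(features, metadata):
    """Remove the linear effect of one metadata variable from one feature.

    Fits feature = intercept + slope * metadata by ordinary least squares
    over the batch, then subtracts only the centered slope term, so the
    batch mean of the feature is preserved.
    """
    n = len(features)
    mean_f = sum(features) / n
    mean_m = sum(metadata) / n
    cov = sum((m - mean_m) * (f - mean_f) for m, f in zip(metadata, features))
    var = sum((m - mean_m) ** 2 for m in metadata)
    slope = cov / var if var > 0 else 0.0
    return [f - slope * (m - mean_m) for f, m in zip(features, metadata)]

# Toy batch: the feature equals 3 * metadata plus a part that is
# uncorrelated with the metadata.
metadata = [0.0, 1.0, 2.0, 3.0]
signal = [1.0, -1.0, -1.0, 1.0]
features = [3.0 * m + s for m, s in zip(metadata, signal)]
cleaned = regress_out(features, metadata)
print(cleaned)
```

After the correction, the cleaned feature has zero sample covariance with the metadata while the metadata-independent part of the signal is left intact.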
MOMA: Multi-Object Multi-Actor Activity Parsing Luo, Z., Xie, W., Kapoor, S., Liang, Y., Cooper, M., Niebles, J., Adeli, E., Li Fei-Fei edited by Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P. S., Vaughan, J. W. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2021

View details for Web of Science ID 000922928208019
Longitudinal Correlation Analysis for Decoding Multi-modal Brain Development Zhao, Q., Adeli, E., Pohl, K. M. edited by DeBruijne, M., Cattin, P. C., Cotin, S., Padoy, N., Speidel, S., Zheng, Y., Essert, C. SPRINGER INTERNATIONAL PUBLISHING AG. 2021: 400-409

View details for DOI 10.1007/978-3-030-87234-2_38

View details for Web of Science ID 000712024400038
Metadata Normalization. Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition Lu, M., Zhao, Q., Zhang, J., Pohl, K. M., Fei-Fei, L., Niebles, J. C., Adeli, E. 2021; 2021: 10912-10922


View details for DOI 10.1109/cvpr46437.2021.01077

View details for PubMedID 34776724

View details for PubMedCentralID PMC8589298
Self-supervised Longitudinal Neighbourhood Embedding Ouyang, J., Zhao, Q., Adeli, E., Sullivan, E., Pfefferbaum, A., Zaharchuk, G., Pohl, K. M. edited by DeBruijne, M., Cattin, P. C., Cotin, S., Padoy, N., Speidel, S., Zheng, Y., Essert, C. SPRINGER INTERNATIONAL PUBLISHING AG. 2021: 80-89

View details for DOI 10.1007/978-3-030-87196-3_8

View details for Web of Science ID 000712020700008
CoCon: Cooperative-Contrastive Learning Rai, N., Adeli, E., Lee, K., Gaidon, A., Niebles, J., IEEE Comp Soc IEEE COMPUTER SOC. 2021: 3379-3388

View details for DOI 10.1109/CVPRW53098.2021.00377

View details for Web of Science ID 000705890203052
TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild Adeli, V., Ehsanpour, M., Reid, I., Niebles, J., Savarese, S., Adeli, E., Rezatofighi, H., IEEE IEEE. 2021: 13370-13380

View details for DOI 10.1109/ICCV48922.2021.01314

View details for Web of Science ID 000798743203055
Home Action Genome: Cooperative Compositional Action Understanding Rai, N., Chen, H., Ji, J., Desai, R., Kozuka, K., Ishizaka, S., Adeli, E., Niebles, J., IEEE COMP SOC IEEE COMPUTER SOC. 2021: 11179-11188

View details for DOI 10.1109/CVPR46437.2021.01103

View details for Web of Science ID 000742075001037
Representation Learning with Statistical Independence to Mitigate Bias. IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision Adeli, E., Zhao, Q., Pfefferbaum, A., Sullivan, E. V., Fei-Fei, L., Niebles, J. C., Pohl, K. M. 2021; 2021: 2512-2522

Abstract

Presence of bias (in datasets or tasks) is inarguably one of the most critical challenges in machine learning applications that has alluded to pivotal debates in recent years. Such challenges range from spurious associations between variables in medical studies to the bias of race in gender or face recognition systems. Controlling for all types of biases in the dataset curation stage is cumbersome and sometimes impossible. The alternative is to use the available data and build models incorporating fair representation learning. In this paper, we propose such a model based on adversarial training with two competing objectives to learn features that have (1) maximum discriminative power with respect to the task and (2) minimal statistical mean dependence with the protected (bias) variable(s). Our approach does so by incorporating a new adversarial loss function that encourages a vanished correlation between the bias and the learned features. We apply our method to synthetic data, medical images (containing task bias), and a dataset for gender classification (containing dataset bias). Our results show that the learned features by our method not only result in superior prediction performance but also are unbiased.

View details for DOI 10.1109/wacv48630.2021.00256

View details for PubMedID 34522832
Is Frailty Associated with Adverse Outcomes After Orthopaedic Surgery?: A Systematic Review and Assessment of Definitions. JBJS reviews Lemos, J. L., Welch, J. M., Xiao, M., Shapiro, L. M., Adeli, E., Kamal, R. N. 2021; 9 (12)

Abstract

BACKGROUND: There is increasing evidence supporting the association between frailty and adverse outcomes after surgery. There is, however, no consensus on how frailty should be assessed and used to inform treatment. In this review, we aimed to synthesize the current literature on the use of frailty as a predictor of adverse outcomes following orthopaedic surgery by (1) identifying the frailty instruments used and (2) evaluating the strength of the association between frailty and adverse outcomes after orthopaedic surgery. METHODS: A systematic review was performed using PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. PubMed, Scopus, and the Cochrane Central Register of Controlled Trials were searched to identify articles that reported on outcomes after orthopaedic surgery within frail populations. Only studies that defined frail patients using a frailty instrument were included. The methodological quality of studies was assessed using the Newcastle-Ottawa Scale (NOS). Study demographic information, frailty instrument information (e.g., number of items, domains included), and clinical outcome measures (including mortality, readmissions, and length of stay) were collected and reported. RESULTS: The initial search yielded 630 articles. Of these, 177 articles underwent full-text review; 82 articles were ultimately included and analyzed. The modified frailty index (mFI) was the most commonly used frailty instrument (38% of the studies used the mFI-11 [11-item mFI], and 24% of the studies used the mFI-5 [5-item mFI]), although a large variety of instruments were used (24 different instruments identified). Total joint arthroplasty (22%), hip fracture management (17%), and adult spinal deformity management (15%) were the most frequently studied procedures. Complications (71%) and mortality (51%) were the most frequently reported outcomes; 17% of studies reported on a functional outcome. CONCLUSIONS: There is no consensus on the best approach to defining frailty among orthopaedic surgery patients, although instruments based on the accumulation-of-deficits model (such as the mFI) were the most common. Frailty was highly associated with adverse outcomes, but the majority of the studies were retrospective and did not identify frailty prospectively in a prediction model. Although many outcomes were described (complications and mortality being the most common), there was a considerable amount of heterogeneity in measurement strategy and subsequent strength of association. Future investigations evaluating the association between frailty and orthopaedic surgical outcomes should focus on prospective study designs, long-term outcomes, and assessments of patient-reported outcomes and/or functional recovery scores. CLINICAL RELEVANCE: Preoperatively identifying high-risk orthopaedic surgery patients through frailty instruments has the potential to improve patient outcomes. Frailty screenings can create opportunities for targeted intervention efforts and guide patient-provider decision-making.

View details for DOI 10.2106/JBJS.RVW.21.00065

View details for PubMedID 34936580
3D CNNs with Adaptive Temporal Feature Resolutions Fayyaz, M., Bahrami, E., Diba, A., Noroozi, M., Adeli, E., Van Gool, L., Gall, J., IEEE COMP SOC IEEE COMPUTER SOC. 2021: 4729-4738

View details for DOI 10.1109/CVPR46437.2021.00470

View details for Web of Science ID 000739917304090
Scalable Differential Privacy with Sparse Network Finetuning Luo, Z., Wu, D. J., Adeli, E., Li Fei-Fei, IEEE COMP SOC IEEE COMPUTER SOC. 2021: 5057-5066

View details for DOI 10.1109/CVPR46437.2021.00502

View details for Web of Science ID 000739917305026
Quantifying Parkinson's disease motor severity under uncertainty using MDS-UPDRS videos. Medical image analysis Lu, M., Zhao, Q., Poston, K. L., Sullivan, E. V., Pfefferbaum, A., Shahid, M., Katz, M., Kouhsari, L. M., Schulman, K., Milstein, A., Niebles, J. C., Henderson, V. W., Fei-Fei, L., Pohl, K. M., Adeli, E. 2021; 73: 102179

Abstract

Parkinson's disease (PD) is a brain disorder that primarily affects motor function, leading to slow movement, tremor, and stiffness, as well as postural instability and difficulty with walking/balance. The severity of PD motor impairments is clinically assessed by part III of the Movement Disorder Society Unified Parkinson's Disease Rating Scale (MDS-UPDRS), a universally-accepted rating scale. However, experts often disagree on the exact scoring of individuals. In the presence of label noise, training a machine learning model using only scores from a single rater may introduce bias, while training models with multiple noisy ratings is a challenging task due to the inter-rater variabilities. In this paper, we introduce an ordinal focal neural network to estimate the MDS-UPDRS scores from input videos, to leverage the ordinal nature of MDS-UPDRS scores and combat class imbalance. To handle multiple noisy labels per exam, the training of the network is regularized via rater confusion estimation (RCE), which encodes the rating habits and skills of raters via a confusion matrix. We apply our pipeline to estimate MDS-UPDRS test scores from their video recordings including gait (with multiple Raters, R=3) and finger tapping scores (single rater). On a sizable clinical dataset for the gait test (N=55), we obtained a classification accuracy of 72% with majority vote as ground-truth, and an accuracy of ∼84% of our model predicting at least one of the raters' scores. Our work demonstrates how computer-assisted technologies can be used to track patients and their motor impairments, even when there is uncertainty in the clinical ratings. The latest version of the code will be available at https://github.com/mlu355/PD-Motor-Severity-Estimation.

View details for DOI 10.1016/j.media.2021.102179

View details for PubMedID 34340101
Association of Heavy Drinking With Deviant Fiber Tract Development in Frontal Brain Systems in Adolescents. JAMA psychiatry Zhao, Q., Sullivan, E. V., Honnorat, N., Adeli, E., Podhajsky, S., De Bellis, M. D., Voyvodic, J., Nooner, K. B., Baker, F. C., Colrain, I. M., Tapert, S. F., Brown, S. A., Thompson, W. K., Nagel, B. J., Clark, D. B., Pfefferbaum, A., Pohl, K. M. 2020

Abstract

Importance: Maturation of white matter fiber systems subserves cognitive, behavioral, emotional, and motor development during adolescence. Hazardous drinking during this active neurodevelopmental period may alter the trajectory of white matter microstructural development, potentially increasing risk for developing alcohol-related dysfunction and alcohol use disorder in adulthood. Objective: To identify disrupted adolescent microstructural brain development linked to drinking onset and to assess whether the disruption is more pronounced in younger rather than older adolescents. Design, Setting, and Participants: This case-control study, conducted from January 13, 2013, to January 15, 2019, consisted of an analysis of 451 participants from the National Consortium on Alcohol and Neurodevelopment in Adolescence cohort. Participants were aged 12 to 21 years at baseline and had at least 2 usable magnetic resonance diffusion tensor imaging (DTI) scans and up to 5 examination visits spanning 4 years. Participants with a youth-adjusted Cahalan score of 0 were labeled as no-to-low drinkers; those with a score of greater than 1 for at least 2 consecutive visits were labeled as heavy drinkers. Exploratory analysis was conducted between no-to-low and heavy drinkers. A between-group analysis was conducted between age- and sex-matched youths, and a within-participant analysis was performed before and after drinking. Exposures: Self-reported alcohol consumption in the past year summarized by categorical drinking levels. Main Outcomes and Measures: Diffusion tensor imaging measurement of fractional anisotropy (FA) in the whole brain and fiber systems quantifying the developmental change of each participant as a slope. Results: Analysis of whole-brain FA of 451 adolescents included 291 (64.5%) no-to-low drinkers and 160 (35.5%) heavy drinkers who indicated the potential for a deleterious association of alcohol with microstructural development. 
Among the no-to-low drinkers, 142 (48.4%) were boys with mean (SD) age of 16.5 (2.2) years and 149 (51.2%) were girls with mean (SD) age of 16.5 (2.1) years and 192 (66.0%) were White participants. Among the heavy drinkers, 86 (53.8%) were boys with mean (SD) age of 20.1 (1.5) years and 74 (46.3%) were girls with mean (SD) age of 20.5 (2.0) years and 142 (88.8%) were White participants. A group analysis revealed FA reduction in heavy-drinking youth compared with age- and sex-matched controls (t154=-2.7, P=.008). The slope of this reduction correlated with log of days of drinking since the baseline visit (r156=-0.21, 2-tailed P=.008). A within-participant analysis contrasting developmental trajectories of youths before and after they initiated heavy drinking supported the prediction that drinking onset was associated with and potentially preceded disrupted white matter integrity. Age-alcohol interactions (t152=3.0, P=.004) observed for the FA slopes indicated that the alcohol-associated disruption was greater in younger than older adolescents and was most pronounced in the genu and body of the corpus callosum, regions known to continue developing throughout adolescence. Conclusions and Relevance: This case-control study of adolescents found a deleterious association of alcohol use with white matter microstructural integrity. These findings support the concept of heightened vulnerability to environmental agents, including alcohol, associated with attenuated development of major white matter tracts in early adolescence.

View details for DOI 10.1001/jamapsychiatry.2020.4064

View details for PubMedID 33377940
Ethical issues in using ambient intelligence in health-care settings. The Lancet. Digital health Martinez-Martin, N., Luo, Z., Kaushal, A., Adeli, E., Haque, A., Kelly, S. S., Wieten, S., Cho, M. K., Magnus, D., Fei-Fei, L., Schulman, K., Milstein, A. 2020

Abstract

Ambient intelligence is increasingly finding applications in health-care settings, such as helping to ensure clinician and patient safety by monitoring staff compliance with clinical best practices or relieving staff of burdensome documentation tasks. Ambient intelligence involves using contactless sensors and contact-based wearable devices embedded in health-care settings to collect data (eg, imaging data of physical spaces, audio data, or body temperature), coupled with machine learning algorithms to efficiently and effectively interpret these data. Despite the promise of ambient intelligence to improve quality of care, the continuous collection of large amounts of sensor data in health-care settings presents ethical challenges, particularly in terms of privacy, data management, bias and fairness, and informed consent. Navigating these ethical issues is crucial not only for the success of individual uses, but for acceptance of the field as a whole.

View details for DOI 10.1016/S2589-7500(20)30275-2

View details for PubMedID 33358138
Multiview Feature Learning With Multiatlas-Based Functional Connectivity Networks for MCI Diagnosis. IEEE transactions on cybernetics Zhang, Y., Zhang, H., Adeli, E., Chen, X., Liu, M., Shen, D. 2020; PP

Abstract

Functional connectivity (FC) networks built from resting-state functional magnetic resonance imaging (rs-fMRI) have shown promising results for the diagnosis of Alzheimer's disease and its prodromal stage, that is, mild cognitive impairment (MCI). FC is usually estimated as a temporal correlation of regional mean rs-fMRI signals between any pair of brain regions, and these regions are traditionally parcellated with a particular brain atlas. Most existing studies have adopted a predefined brain atlas for all subjects. However, the constructed FC networks inevitably ignore the potentially important subject-specific information, particularly, the subject-specific brain parcellation. Similar to the drawback of the "single view" (versus the "multiview" learning) in medical image-based classification, FC networks constructed based on a single atlas may not be sufficient to reveal the underlying complicated differences between normal controls and disease-affected patients due to the potential bias from that particular atlas. In this study, we propose a multiview feature learning method with multiatlas-based FC networks to improve MCI diagnosis. Specifically, a three-step transformation is implemented to generate multiple individually specified atlases from the standard automated anatomical labeling template, from which a set of atlas exemplars is selected. Multiple FC networks are constructed based on these preselected atlas exemplars, providing multiple views of the FC network-based feature representations for each subject. We then devise a multitask learning algorithm for joint feature selection from the constructed multiple FC networks. The selected features are jointly fed into a support vector machine classifier for multiatlas-based MCI diagnosis. Extensive experimental comparisons are carried out between the proposed method and other competing approaches, including the traditional single-atlas-based method. 
The results indicate that our method significantly improves the MCI classification, demonstrating its promise in the brain connectome-based individualized diagnosis of brain diseases.
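
The abstract above describes FC as the temporal correlation of regional mean rs-fMRI signals between every pair of brain regions. A minimal sketch of that step, assuming Pearson correlation and using only the Python standard library, might look as follows. The function names and the toy time series are hypothetical, and the atlas-generation and feature-selection steps of the paper are not shown.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length time series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def functional_connectivity(region_signals):
    """Region-by-region correlation matrix from per-region mean time series."""
    k = len(region_signals)
    return [[pearson(region_signals[i], region_signals[j]) for j in range(k)]
            for i in range(k)]

# Toy mean signals for three regions defined by one (hypothetical) atlas.
signals = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.1, 3.9, 6.2, 8.0, 9.9],  # rises together with region 0
    [5.0, 4.0, 3.0, 2.0, 1.0],  # mirror image of region 0
]
fc = functional_connectivity(signals)
```

Under this reading, each atlas yields a different set of regions and therefore a different matrix, which is what would give one "view" per atlas.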

View details for DOI 10.1109/TCYB.2020.3016953

View details for PubMedID 33306476
Guest Editorial: AI-Powered 3D Vision IET IMAGE PROCESSING Yang, Y., Yang, J., Adeli, E. 2020; 14 (12): 2627–29

View details for DOI 10.1049/iet-ipr.2020.1194

View details for Web of Science ID 000582146100001
Depth map artefacts reduction: a review IET IMAGE PROCESSING Ibrahim, M., Liu, Q., Khan, R., Yang, J., Adeli, E., Yang, Y. 2020; 14 (12): 2630–44

View details for DOI 10.1049/iet-ipr.2019.1622

View details for Web of Science ID 000582146100002
Spatio-Temporal Graph Convolution for Resting-State fMRI Analysis. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention Gadgil, S., Zhao, Q., Pfefferbaum, A., Sullivan, E. V., Adeli, E., Pohl, K. M. 2020; 12267: 528–38

Abstract

The Blood-Oxygen-Level-Dependent (BOLD) signal of resting-state fMRI (rs-fMRI) records the temporal dynamics of intrinsic functional networks in the brain. However, existing deep learning methods applied to rs-fMRI either neglect the functional dependency between different brain regions in a network or discard the information in the temporal dynamics of brain activity. To overcome those shortcomings, we propose to formulate functional connectivity networks within the context of spatio-temporal graphs. We train a spatio-temporal graph convolutional network (ST-GCN) on short sub-sequences of the BOLD time series to model the non-stationary nature of functional connectivity. Simultaneously, the model learns the importance of graph edges within ST-GCN to gain insight into the functional connectivities contributing to the prediction. In analyzing the rs-fMRI of the Human Connectome Project (HCP, N = 1,091) and the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA, N = 773), ST-GCN is significantly more accurate than common approaches in predicting gender and age based on BOLD signals. Furthermore, the brain regions and functional connections significantly contributing to the predictions of our model are important markers according to the neuroscience literature.

View details for DOI 10.1007/978-3-030-59728-3_52

View details for PubMedID 33257918
Inpainting Cropped Diffusion MRI using Deep Generative Models. PRedictive Intelligence in MEdicine. PRIME (Workshop) Ayub, R., Zhao, Q., Meloy, M. J., Sullivan, E. V., Pfefferbaum, A., Adeli, E., Pohl, K. M. 2020; 12329: 91-100

Abstract

Minor artifacts introduced during image acquisition are often negligible to the human eye, such as a confined field of view resulting in MRI missing the top of the head. This cropping artifact, however, can cause suboptimal processing of the MRI resulting in data omission or decreasing the power of subsequent analyses. We propose to avoid data or quality loss by restoring these missing regions of the head via variational autoencoders (VAE), a deep generative model that has been previously applied to high resolution image reconstruction. Based on diffusion weighted images (DWI) acquired by the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA), we evaluate the accuracy of inpainting the top of the head by common autoencoder models (U-Net, VQVAE, and VAE-GAN) and a custom model proposed herein called U-VQVAE. Our results show that U-VQVAE not only achieved the highest accuracy, but also resulted in MRI processing producing lower fractional anisotropy (FA) in the supplementary motor area than FA derived from the original MRIs. Lower FA implies that inpainting reduces noise in processing DWI and thus increases the quality of the generated results. The code is available at https://github.com/RdoubleA/DWIinpainting.

View details for DOI 10.1007/978-3-030-59354-4_9

View details for PubMedID 33997866
Socially and Contextually Aware Human Motion and Pose Forecasting IEEE ROBOTICS AND AUTOMATION LETTERS Adeli, V., Adeli, E., Reid, I., Niebles, J., Rezatofighi, H. 2020; 5 (4): 6033–40

View details for DOI 10.1109/LRA.2020.3010742

View details for Web of Science ID 000554894900027
Vision-based Estimation of MDS-UPDRS Gait Scores for Assessing Parkinson's Disease Motor Severity. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention Lu, M., Poston, K., Pfefferbaum, A., Sullivan, E. V., Fei-Fei, L., Pohl, K. M., Niebles, J. C., Adeli, E. 2020; 12263: 637–47

Abstract

Parkinson's disease (PD) is a progressive neurological disorder primarily affecting motor function resulting in tremor at rest, rigidity, bradykinesia, and postural instability. The physical severity of PD impairments can be quantified through the Movement Disorder Society Unified Parkinson's Disease Rating Scale (MDS-UPDRS), a widely used clinical rating scale. Accurate and quantitative assessment of disease progression is critical to developing a treatment that slows or stops further advancement of the disease. Prior work has mainly focused on dopamine transport neuroimaging for diagnosis or costly and intrusive wearables evaluating motor impairments. For the first time, we propose a computer vision-based model that observes non-intrusive video recordings of individuals, extracts their 3D body skeletons, tracks them through time, and classifies the movements according to the MDS-UPDRS gait scores. Experimental results show that our proposed method performs significantly better than chance and competing methods with an F1-score of 0.83 and a balanced accuracy of 81%. This is the first benchmark for classifying PD patients based on MDS-UPDRS gait severity and could be an objective biomarker for disease severity. Our work demonstrates how computer-assisted technologies can be used to non-intrusively monitor patients and their motor impairments. The code is available at https://github.com/mlu355/PD-Motor-Severity-Estimation.
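
The abstract above reports a balanced accuracy, a metric that averages per-class recall so that a frequent score class cannot dominate the result. The sketch below shows the standard definition on made-up labels; the data and function name are illustrative assumptions and do not reproduce the paper's evaluation.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

# Hypothetical, imbalanced gait scores: class 0 is much more common.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 0, 1, 0, 2, 0]

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)
print(plain, balanced)
```

On this toy example the plain accuracy is 0.8, while the balanced accuracy is lower because half of the examples in each rare class are misclassified.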

View details for DOI 10.1007/978-3-030-59716-0_61

View details for PubMedID 33103164
Deep Learning Identifies Morphological Determinants of Sex Differences in the Pre-Adolescent Brain. NeuroImage Adeli, E., Zhao, Q., Zahr, N. M., Goldstone, A., Pfefferbaum, A., Sullivan, E. V., Pohl, K. M. 2020: 117293

Abstract

The application of data-driven deep learning to identify sex differences in developing brain structures of pre-adolescents has heretofore not been accomplished. Here, the approach identifies sex differences by analyzing the minimally processed MRIs of the first 8,144 participants (age 9 and 10 years) recruited by the Adolescent Brain Cognitive Development (ABCD) study. The identified pattern accounted for confounding factors (i.e., head size, age, puberty development, socioeconomic status) and comprised cerebellar (corpus medullare, lobules III, IV/V, and VI) and subcortical (pallidum, amygdala, hippocampus, parahippocampus, insula, putamen) structures. While these have been individually linked to expressing sex differences, a novel discovery was that their grouping accurately predicted the sex in individual pre-adolescents. Another novelty was relating differences specific to the cerebellum to pubertal development. Finally, we found that reducing the pattern to a single score not only accurately predicted sex but also correlated with cognitive behavior linked to working memory. The predictive power of this score and the constellation of identified brain structures provide evidence for sex differences in pre-adolescent neurodevelopment and may augment understanding of sex-specific vulnerability or resilience to psychiatric disorders and presage sex-linked learning disabilities.

View details for DOI 10.1016/j.neuroimage.2020.117293

View details for PubMedID 32841716
Segmenting the Future IEEE ROBOTICS AND AUTOMATION LETTERS Chiu, H., Adeli, E., Niebles, J. 2020; 5 (3): 4202–9

View details for DOI 10.1109/LRA.2020.2992184

View details for Web of Science ID 000541731600001
Image-to-Images Translation for Multi-Task Organ Segmentation and Bone Suppression in Chest X-Ray Radiography IEEE TRANSACTIONS ON MEDICAL IMAGING Eslami, M., Tabarestani, S., Albarqouni, S., Adeli, E., Navab, N., Adjouadi, M. 2020; 39 (7): 2553–65

Abstract

Chest X-ray radiography is one of the earliest medical imaging technologies and remains one of the most widely-used for diagnosis, screening, and treatment follow up of diseases related to lungs and heart. The literature in this field of research reports many interesting studies dealing with the challenging tasks of bone suppression and organ segmentation but performed separately, limiting any learning that comes with the consolidation of parameters that could optimize both processes. This study introduces, for the first time, a multitask deep learning model that generates simultaneously the bone-suppressed image and the organ-segmented image, enhancing the accuracy of tasks, minimizing the number of parameters needed by the model and optimizing the processing time, all by exploiting the interplay between the network parameters to benefit the performance of both tasks. The architectural design of this model, which relies on a conditional generative adversarial network, reveals the process on how the well-established pix2pix network (image-to-image network) is modified to fit the need for multitasking and extending it to the new image-to-images architecture. The developed source code of this multitask model is shared publicly on GitHub as the first attempt for providing the two-task pix2pix extension, a supervised/paired/aligned/registered image-to-images translation which would be useful in many multitask applications. Dilated convolutions are also used to improve the results through a more effective receptive field assessment. The comparison with state-of-the-art algorithms along with ablation study and a demonstration video are provided to evaluate the efficacy and gauge the merits of the proposed approach.

View details for DOI 10.1109/TMI.2020.2974159

View details for Web of Science ID 000545410200024

View details for PubMedID 32078541
Skeleton-based structured early activity prediction MULTIMEDIA TOOLS AND APPLICATIONS Arzani, M. M., Fathy, M., Azirani, A. A., Adeli, E. 2020

View details for DOI 10.1007/s11042-020-08875-w

View details for Web of Science ID 000528439200002
Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction IEEE ROBOTICS AND AUTOMATION LETTERS Liu, B., Adeli, E., Cao, Z., Lee, K., Shenoi, A., Gaidon, A., Niebles, J. 2020; 5 (2): 3485–92

View details for DOI 10.1109/LRA.2020.2976305

View details for Web of Science ID 000520954200034
Mammographic mass segmentation using multichannel and multiscale fully convolutional networks INTERNATIONAL JOURNAL OF IMAGING SYSTEMS AND TECHNOLOGY Xu, S., Adeli, E., Cheng, J., Xiang, L., Li, Y., Lee, S., Shen, D. 2020

View details for DOI 10.1002/ima.22423

View details for Web of Science ID 000521597700001
Editorial: Predictive Intelligence in Biomedical and Health Informatics IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS Adeli, E., Rekik, S. H., Park, S. H., Shen, D. 2020; 24 (2): 333-335

View details for DOI 10.1109/JBHI.2019.2962852

View details for Web of Science ID 000516606600001
FCN Based Label Correction for Multi-Atlas Guided Organ Segmentation. Neuroinformatics Zhu, H., Adeli, E., Shi, F., Shen, D. 2020

Abstract

Segmentation of medical images using multiple atlases has recently gained immense attention due to their augmented robustness against variabilities across different subjects. These atlas-based methods typically comprise of three steps: atlas selection, image registration, and finally label fusion. Image registration is one of the core steps in this process, accuracy of which directly affects the final labeling performance. However, due to inter-subject anatomical variations, registration errors are inevitable. The aim of this paper is to develop a deep learning-based confidence estimation method to alleviate the potential effects of registration errors. We first propose a fully convolutional network (FCN) with residual connections to learn the relationship between the image patch pair (i.e., patches from the target subject and the atlas) and the related label confidence patch. With the obtained label confidence patch, we can identify the potential errors in the warped atlas labels and correct them. Then, we use two label fusion methods to fuse the corrected atlas labels. The proposed methods are validated on a publicly available dataset for hippocampus segmentation. Experimental results demonstrate that our proposed methods outperform the state-of-the-art segmentation methods.

View details for DOI 10.1007/s12021-019-09448-5

View details for PubMedID 31898145
It Is Not the Journey But the Destination: Endpoint Conditioned Trajectory Prediction Mangalam, K., Girase, H., Agarwal, S., Lee, K., Adeli, E., Malik, J., Gaidon, A. edited by Vedaldi, A., Bischof, H., Brox, T., Frahm, J. M. SPRINGER INTERNATIONAL PUBLISHING AG. 2020: 759-776

View details for DOI 10.1007/978-3-030-58536-5_45

View details for Web of Science ID 001500572000045
Procedure Planning in Instructional Videos Chang, C., Huang, D., Xu, D., Adeli, E., Fei-Fei, L., Niebles, J. edited by Vedaldi, A., Bischof, H., Brox, T., Frahm, J. M. SPRINGER INTERNATIONAL PUBLISHING AG. 2020: 334-350

View details for DOI 10.1007/978-3-030-58621-8_20

View details for Web of Science ID 001500594500020
Spatio-Temporal Graph for Video Captioning with Knowledge Distillation Pan, B., Cai, H., Huang, D., Lee, K., Gaidon, A., Adeli, E., Niebles, J., IEEE IEEE COMPUTER SOC. 2020: 10867-10876

View details for DOI 10.1109/CVPR42600.2020.01088

View details for Web of Science ID 001309199903074
Adversarial Cross-Domain Action Recognition with Co-Attention Pan, B., Cao, Z., Adeli, E., Niebles, J., Assoc Advancement Artificial Intelligence ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. 2020: 11815-11822

View details for Web of Science ID 000668126804033
Adolescent alcohol use disrupts functional neurodevelopment in sensation seeking girls. Addiction biology Zhao, Q., Sullivan, E. V., Müller-Oehring, E. M., Honnorat, N., Adeli, E., Podhajsky, S., Baker, F. C., Colrain, I. M., Prouty, D., Tapert, S. F., Brown, S. A., Meloy, M. J., Brumback, T., Nagel, B. J., Morales, A. M., Clark, D. B., Luna, B., De Bellis, M. D., Voyvodic, J. T., Nooner, K. B., Pfefferbaum, A., Pohl, K. M. 2020: e12914

Abstract

Exogenous causes, such as alcohol use, and endogenous factors, such as temperament and sex, can modulate developmental trajectories of adolescent neurofunctional maturation. We examined how these factors affect sexual dimorphism in brain functional networks in youth drinking below diagnostic threshold for alcohol use disorder (AUD). Based on the 3-year, annually acquired, longitudinal resting-state functional magnetic resonance imaging (MRI) data of 526 adolescents (12-21 years at baseline) from the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA) cohort, developmental trajectories of 23 intrinsic functional networks (IFNs) were analyzed for (1) sexual dimorphism in 259 participants who were no-to-low drinkers throughout this period; (2) sex-alcohol interactions in two age- and sex-matched NCANDA subgroups (N = 76 each), half no-to-low, and half moderate-to-heavy drinkers; and (3) moderating effects of gender-specific alcohol dose effects and a multifactorial impulsivity measure on IFN connectivity in all NCANDA participants. Results showed that sex differences in no-to-low drinkers diminished with age in the inferior-occipital network, yet girls had weaker within-network connectivity than boys in six other networks. Effects of adolescent alcohol use were more pronounced in girls than boys in three IFNs. In particular, girls showed greater within-network connectivity in two motor networks with more alcohol consumption, and these effects were mediated by sensation-seeking only in girls. Our results implied that drinking might attenuate the naturally diminishing sexual differences by disrupting the maturation of network efficiency more severely in girls. The sex-alcohol-dose effect might explain why women are at higher risk of alcohol-related health and psychosocial consequences than men.

View details for DOI 10.1111/adb.12914

View details for PubMedID 32428984
Disentangling Human Dynamics for Pedestrian Locomotion Forecasting with Noisy Supervision Mangalam, K., Adeli, E., Lee, K., Gaidon, A., Niebles, J., IEEE Comp Soc IEEE COMPUTER SOC. 2020: 2773–82

View details for Web of Science ID 000578444802088
Training confounder-free deep learning models for medical applications. Nature communications Zhao, Q., Adeli, E., Pohl, K. M. 2020; 11 (1): 6010

Abstract

The presence of confounding effects (or biases) is one of the most critical challenges in using deep learning to advance discovery in medical imaging studies. Confounders affect the relationship between input data (e.g., brain MRIs) and output variables (e.g., diagnosis). Improper modeling of those relationships often results in spurious and biased associations. Traditional machine learning and statistical models minimize the impact of confounders by, for example, matching data sets, stratifying data, or residualizing imaging measurements. Alternative strategies are needed for state-of-the-art deep learning models that use end-to-end training to automatically extract informative features from large sets of images. In this article, we introduce an end-to-end approach for deriving features invariant to confounding factors while accounting for intrinsic correlations between the confounder(s) and prediction outcome. The method does so by exploiting concepts from traditional statistical methods and recent fair machine learning schemes. We evaluate the method on predicting the diagnosis of HIV solely from Magnetic Resonance Images (MRIs), identifying morphological sex differences in adolescence from those of the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA), and determining the bone age from X-ray images of children. The results show that our method can accurately predict while reducing biases associated with confounders. The code is available at https://github.com/qingyuzhao/br-net.
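
Of the traditional strategies listed in the abstract above, residualization is the simplest to illustrate: regress an imaging measurement on the confounder and keep only the residuals. The sketch below shows this classical baseline for a single confounder with ordinary least squares. It is not the end-to-end method introduced in the article, and the variable names and toy values are assumptions.

```python
def residualize(measurements, confounder):
    """Remove the linear effect of one confounder via ordinary least squares."""
    n = len(measurements)
    mean_c = sum(confounder) / n
    mean_m = sum(measurements) / n
    slope = (sum((c - mean_c) * (m - mean_m)
                 for c, m in zip(confounder, measurements))
             / sum((c - mean_c) ** 2 for c in confounder))
    intercept = mean_m - slope * mean_c
    return [m - (intercept + slope * c)
            for m, c in zip(measurements, confounder)]

# Hypothetical data: a regional volume that grows with age plus a small offset.
age = [10.0, 12.0, 14.0, 16.0, 18.0]
offset = [1.0, -1.0, 0.0, -1.0, 1.0]
volume = [100.0 + 2.0 * a + e for a, e in zip(age, offset)]

residuals = residualize(volume, age)
```

After this step the residuals are linearly uncorrelated with age, so a downstream model cannot exploit the linear age effect in this measurement.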

View details for DOI 10.1038/s41467-020-19784-9

View details for PubMedID 33243992
Population-guided large margin classifier for high-dimension low-sample-size problems PATTERN RECOGNITION Yin, Q., Adeli, E., Shen, L., Shen, D. 2020; 97

View details for DOI 10.1016/j.patcog.2019.107030

View details for Web of Science ID 000491609400009
Confounder-Aware Visualization of ConvNets. Machine learning in medical imaging. MLMI (Workshop) Zhao, Q., Adeli, E., Pfefferbaum, A., Sullivan, E. V., Pohl, K. M. 2019; 11861: 328–36

Abstract

With recent advances in deep learning, neuroimaging studies increasingly rely on convolutional networks (ConvNets) to predict diagnosis based on MR images. To gain a better understanding of how a disease impacts the brain, the studies visualize the salience maps of the ConvNet highlighting voxels within the brain majorly contributing to the prediction. However, these salience maps are generally confounded, i.e., some salient regions are more predictive of confounding variables (such as age) than the diagnosis. To avoid such misinterpretation, we propose in this paper an approach that aims to visualize confounder-free saliency maps that only highlight voxels predictive of the diagnosis. The approach incorporates univariate statistical tests to identify confounding effects within the intermediate features learned by ConvNet. The influence from the subset of confounded features is then removed by a novel partial back-propagation procedure. We use this two-step approach to visualize confounder-free saliency maps extracted from synthetic and two real datasets. These experiments reveal the potential of our visualization in producing unbiased model-interpretation.

View details for DOI 10.1007/978-3-030-32692-0_38

View details for PubMedID 32549051
Covariance Shrinkage for Dynamic Functional Connectivity. Connectomics in neuroImaging : third International Workshop, CNI 2019, held in conjunction with MICCAI 2019, Shenzhen, China, October 13, 2019, Proceedings. CNI (Workshop) (3rd : 2019 : Shenzhen Shi, China) Honnorat, N., Adeli, E., Zhao, Q., Pfefferbaum, A., Sullivan, E. V., Pohl, K. 2019; 11848: 32–41

Abstract

The tracking of dynamic functional connectivity (dFC) states in resting-state fMRI scans aims to reveal how the brain sequentially processes stimuli and thoughts. Despite the recent advances in statistical methods, estimating the high dimensional dFC states from a small number of available time points remains a challenge. This paper shows that the challenge is reduced by linear covariance shrinkage, a statistical method used for the estimation of large covariance matrices from small number of samples. We present a computationally efficient formulation of our approach that scales dFC analysis up to full resolution resting-state fMRI scans. Experiments on synthetic data demonstrate that our approach produces dFC estimates that are closer to the ground-truth than state-of-the-art estimation approaches. When comparing methods on the rs-fMRI scans of 162 subjects, we found that our approach is better at extracting functional networks and capturing differences in rs-fMRI acquisition and diagnosis.
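
Linear covariance shrinkage, as named in the abstract above, blends the sample covariance with a simple, well-conditioned target. A common form, assumed here, shrinks toward a scaled identity: (1 - alpha) * S + alpha * mu * I, with mu the average variance. The fixed `alpha`, the function names, and the toy window below are illustrative assumptions; the paper's own formulation and its choice of shrinkage intensity are not given in the abstract.

```python
def sample_covariance(samples):
    """Unbiased sample covariance; rows are time points, columns are regions."""
    n, p = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(p)]
    return [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in samples)
             / (n - 1) for j in range(p)] for i in range(p)]

def shrink(cov, alpha):
    """Blend cov with mu * I, where mu is the mean of the diagonal of cov."""
    p = len(cov)
    mu = sum(cov[i][i] for i in range(p)) / p
    return [[(1 - alpha) * cov[i][j] + (alpha * mu if i == j else 0.0)
             for j in range(p)] for i in range(p)]

# A short sliding window: 3 time points, 3 regions (toy values).
window = [[1.0, 2.0, 0.5], [2.0, 4.1, 0.4], [3.0, 5.9, 0.6]]
cov = sample_covariance(window)
shrunk = shrink(cov, alpha=0.2)
```

With so few time points the sample covariance is rank-deficient; any alpha greater than zero makes the estimate full rank while leaving its trace unchanged.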

View details for DOI 10.1007/978-3-030-32391-2_4

View details for PubMedID 32924030
Data Augmentation Based on Substituting Regional MRIs Volume Scores. Large-Scale Annotation of Biomedical Data and Expert Label Synthesis and Hardware Aware Learning for Medical Imaging and Computer Assisted Intervention : International Workshops, LABELS 2019, HAL-MICCAI 2019, and CuRIOUS 2019, held in c... Leng, T., Zhao, Q., Yang, C., Lu, Z., Adeli, E., Pohl, K. M. 2019; 11851: 32–41

Abstract

Due to difficulties in collecting sufficient training data, recent advances in neural-network-based methods have not been fully explored in the analysis of brain Magnetic Resonance Imaging (MRI). A possible solution to the limited-data issue is to augment the training set with synthetically generated data. In this paper, we propose a data augmentation strategy based on regional feature substitution. We demonstrate the advantages of this strategy with respect to training a simple neural-network-based classifier in predicting when individual youth transition from no-to-low to medium-to-heavy alcohol drinkers solely based on their volumetric MRI measurements. Based on 20-fold cross-validation, we generate more than one million synthetic samples from less than 500 subjects for each training run. The classifier achieves an accuracy of 74.1% in correctly distinguishing non-drinkers from drinkers at baseline and a 43.2% weighted accuracy in predicting the transition over a three year period (5-group classification task). Both accuracy scores are significantly better than training the classifier on the original dataset.

View details for DOI 10.1007/978-3-030-33642-4_4

View details for PubMedID 32924031
High-Resolution Encoder-Decoder Networks for Low-Contrast Medical Image Segmentation. IEEE transactions on image processing : a publication of the IEEE Signal Processing Society Zhou, S., Nie, D., Adeli, E., Yin, J., Lian, J., Shen, D. 2019

Abstract

Automatic image segmentation is an essential step for many medical image analysis applications, including computer-aided radiation therapy, disease diagnosis, and treatment effect evaluation. One of the major challenges for this task is the blurry nature of medical images (e.g., CT, MR, and microscopic images), which can often result in low-contrast and vanishing boundaries. With the recent advances in convolutional neural networks, vast improvements have been made for image segmentation, mainly based on the skip-connection-linked encoder-decoder deep architectures. However, in many applications (with adjacent targets in blurry images), these models often fail to accurately locate complex boundaries and properly segment tiny isolated parts. In this paper, we aim to provide a method for blurry medical image segmentation and argue that skip connections are not enough to help accurately locate indistinct boundaries. Accordingly, we propose a novel high-resolution multi-scale encoder-decoder network (HMEDN), in which multi-scale dense connections are introduced for the encoder-decoder structure to finely exploit comprehensive semantic information. Besides skip connections, extra deeply-supervised high-resolution pathways (comprised of densely connected dilated convolutions) are integrated to collect high-resolution semantic information for accurate boundary localization. These pathways are paired with a difficulty-guided cross-entropy loss function and a contour regression task to enhance the quality of boundary detection. Extensive experiments on a pelvic CT image dataset, a multi-modal brain tumor dataset, and a cell segmentation dataset show the effectiveness of our method for 2D/3D semantic segmentation and 2D instance segmentation, respectively. Our experimental results also show that besides increasing the network complexity, raising the resolution of semantic feature maps can largely affect the overall model performance. 
For different tasks, finding a balance between these two factors can further improve the performance of the corresponding network.

View details for DOI 10.1109/TIP.2019.2919937

View details for PubMedID 31226074
Variational Autoencoder with Truncated Mixture of Gaussians for Functional Connectivity Analysis. Information processing in medical imaging : proceedings of the ... conference Zhao, Q., Honnorat, N., Adeli, E., Pohl, K. M. 2019; 11492: 867-879

Abstract

Resting-state functional connectivity states are often identified as clusters of dynamic connectivity patterns. However, existing clustering approaches do not distinguish major states from rarely occurring minor states and hence are sensitive to noise. To address this issue, we propose to model major states using a non-linear generative process guided by a Gaussian-mixture distribution in a low-dimensional latent space, while separately modeling the connectivity patterns of minor states by a non-informative uniform distribution. We embed this truncated Gaussian-Mixture model in a Variational Autoencoder framework to obtain a general joint clustering and outlier detection approach, tGM-VAE. When applied to synthetic data with known ground-truth, tGM-VAE is more accurate in clustering connectivity patterns than existing approaches. On the rs-fMRI of 593 healthy adolescents, tGM-VAE identifies meaningful major connectivity states. The dwell time of these states significantly correlates with age.

View details for DOI 10.1007/978-3-030-20351-1_68

View details for PubMedID 32699491

View details for PubMedCentralID PMC7375028
Joint Classification and Regression via Deep Multi-Task Multi-Channel Learning for Alzheimer's Disease Diagnosis IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING Liu, M., Zhang, J., Adeli, E., Shen, D. 2019; 66 (5): 1195–1206

Abstract

In the field of computer-aided Alzheimer's disease (AD) diagnosis, jointly identifying brain diseases and predicting clinical scores using magnetic resonance imaging (MRI) has attracted increasing attention, since these two tasks are highly correlated. Most existing joint learning approaches require hand-crafted feature representations for MR images. Since hand-crafted features of MRI and classification/regression models may not coordinate well with each other, conventional methods may lead to sub-optimal learning performance. In addition, demographic information (e.g., age, gender, and education) of subjects may be related to brain status, and thus can help improve the diagnostic performance. However, conventional joint learning methods seldom incorporate such demographic information into the learning models. To this end, we propose a deep multi-task multi-channel learning (DM²L) framework for simultaneous brain disease classification and clinical score regression, using MRI data and demographic information of subjects. Specifically, we first identify the discriminative anatomical landmarks from MR images in a data-driven manner, and then extract multiple image patches around these detected landmarks. We then propose a deep multi-task multi-channel convolutional neural network for joint classification and regression. Our DM²L framework can not only automatically learn discriminative features for MR images, but also explicitly incorporate the demographic information of subjects into the learning process. We evaluate the proposed method on four large multi-center cohorts with 1984 subjects, and the experimental results demonstrate that DM²L is superior to several state-of-the-art joint learning methods in both disease classification and clinical score regression.

View details for DOI 10.1109/TBME.2018.2869989

View details for Web of Science ID 000466024600001

View details for PubMedID 30222548

View details for PubMedCentralID PMC6764421
Infant Brain Development Prediction With Latent Partial Multi-View Representation Learning IEEE TRANSACTIONS ON MEDICAL IMAGING Zhang, C., Adeli, E., Wu, Z., Li, G., Lin, W., Shen, D. 2019; 38 (4): 909–18

Abstract

The early postnatal period witnesses rapid and dynamic brain development. However, the relationship between brain anatomical structure and cognitive ability is still unknown. Currently, there is no explicit model to characterize this relationship in the literature. In this paper, we explore this relationship by investigating the mapping between morphological features of the cerebral cortex and cognitive scores. To this end, we introduce a multi-view multi-task learning approach to intuitively explore complementary information from different time-points and handle the missing data issue in longitudinal studies simultaneously. Accordingly, we establish a novel model, latent partial multi-view representation learning. Our approach regards data from different time-points as different views and constructs a latent representation to capture the complementary information from incomplete time-points. The latent representation explores the complementarity across different time-points and improves the accuracy of prediction. The minimization problem is solved by the alternating direction method of multipliers. Experimental results on both synthetic and real data validate the effectiveness of our proposed algorithm.

View details for DOI 10.1109/TMI.2018.2874964

View details for Web of Science ID 000463608000004

View details for PubMedID 30307859

View details for PubMedCentralID PMC6450718
Novel Machine Learning Identifies Brain Patterns Distinguishing Diagnostic Membership of Human Immunodeficiency Virus, Alcoholism, and Their Comorbidity of Individuals. Biological psychiatry. Cognitive neuroscience and neuroimaging Adeli, E., Zahr, N. M., Pfefferbaum, A., Sullivan, E. V., Pohl, K. M. 2019

Abstract

The incidence of alcohol use disorder (AUD) in human immunodeficiency virus (HIV) infection is twice that of the rest of the population. This study documents complex, radiologically identified neuroanatomical effects of AUD+HIV comorbidity by identifying structural brain systems that predicted diagnosis on an individual basis. Applying novel machine learning analysis to 549 participants (199 control subjects, 222 with AUD, 68 with HIV, 60 with AUD+HIV), 298 magnetic resonance imaging brain measurements were automatically reduced to small subsets per group. Significance of each diagnostic pattern was inferred from its accuracy in predicting diagnosis and performance on six cognitive measures. While all three diagnostic patterns predicted the learning and memory score, the AUD+HIV pattern was the largest and had the highest prediction accuracy (78.1%). Providing a roadmap for analyzing large, multimodal datasets, the machine learning analysis revealed imaging phenotypes that predicted diagnostic membership of magnetic resonance imaging scans of individuals with AUD, HIV, and their comorbidity.

View details for DOI 10.1016/j.bpsc.2019.02.003

View details for PubMedID 30982583
3-D Fully Convolutional Networks for Multimodal Isointense Infant Brain Image Segmentation IEEE TRANSACTIONS ON CYBERNETICS Nie, D., Wang, L., Adeli, E., Lao, C., Lin, W., Shen, D. 2019; 49 (3): 1123–36

Abstract

Accurate segmentation of infant brain images into different regions of interest is one of the most important fundamental steps in studying early brain development. In the isointense phase (approximately 6-8 months of age), white matter and gray matter exhibit similar levels of intensities in magnetic resonance (MR) images, due to the ongoing myelination and maturation. This results in extremely low tissue contrast and thus makes tissue segmentation very challenging. Existing methods for tissue segmentation in this isointense phase usually employ patch-based sparse labeling on a single modality. To address the challenge, we propose a novel 3-D multimodal fully convolutional network (FCN) architecture for segmentation of isointense phase brain MR images. Specifically, we extend the conventional FCN architectures from 2-D to 3-D and, rather than directly using FCN, we integrate coarse (naturally high-resolution) and dense (highly semantic) feature maps to better model tiny tissue regions. In addition, we propose a transformation module to better connect the aggregating layers and a fusion module to better serve the fusion of feature maps. We compare the performance of our approach with several baseline and state-of-the-art methods on two sets of isointense phase brain images. The comparison results show that our proposed 3-D multimodal FCN model outperforms all previous methods by a large margin in terms of segmentation accuracy. In addition, the proposed framework also achieves faster segmentation results compared to all other methods. Our experiments further demonstrate that: 1) carefully integrating coarse and dense feature maps can considerably improve the segmentation performance; 2) batch normalization can speed up the convergence of the networks, especially when hierarchical feature aggregations occur; and 3) integrating multimodal information can further boost the segmentation performance.

View details for DOI 10.1109/TCYB.2018.2797905

View details for Web of Science ID 000458655900033

View details for PubMedID 29994385

View details for PubMedCentralID PMC6230311
Semi-Supervised Discriminative Classification Robust to Sample-Outliers and Feature-Noises IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE Adeli, E., Thung, K., An, L., Wu, G., Shi, F., Wang, T., Shen, D. 2019; 41 (2): 515–22

Abstract

Discriminative methods commonly produce models with relatively good generalization abilities. However, this advantage is challenged in real-world applications (e.g., medical image analysis problems), in which there often exist outlier data points (sample-outliers) and noise in the predictor values (feature-noises). Methods robust to both types of deviations are somewhat overlooked in the literature. We further argue that denoising can be more effective if we learn the model using all the available labeled and unlabeled samples, as the intrinsic geometry of the sample manifold can be better constructed using more data points. In this paper, we propose a semi-supervised robust discriminative classification method based on the least-squares formulation of linear discriminant analysis to detect sample-outliers and feature-noises simultaneously, using both labeled training and unlabeled testing data. We conduct several experiments on a synthetic dataset, several benchmark semi-supervised learning datasets, and two neurodegenerative disease diagnosis datasets (for Parkinson's and Alzheimer's diseases). Specifically for the application of neurodegenerative disease diagnosis, incorporating robust machine learning methods can be of great benefit, due to the noisy nature of neuroimaging data. Our results show that our method outperforms the baseline and several state-of-the-art methods, in terms of both accuracy and the area under the ROC curve.

View details for DOI 10.1109/TPAMI.2018.2794470

View details for Web of Science ID 000456150600018

View details for PubMedID 29994560

View details for PubMedCentralID PMC6050136
Multi-Channel 3D Deep Feature Learning for Survival Time Prediction of Brain Tumor Patients Using Multi-Modal Neuroimages SCIENTIFIC REPORTS Nie, D., Lu, J., Zhang, H., Adeli, E., Wang, J., Yu, Z., Liu, L., Wang, Q., Wu, J., Shen, D. 2019; 9: 1103

Abstract

High-grade gliomas are the most aggressive malignant brain tumors. Accurate pre-operative prognosis for this cohort can lead to better treatment planning. Conventional survival prediction based on clinical information is subjective and could be inaccurate. Recent radiomics studies have shown better prognosis by using carefully-engineered image features from magnetic resonance images (MRI). However, feature engineering is usually time-consuming, laborious and subjective. Most importantly, the engineered features cannot effectively encode other predictive but implicit information provided by multi-modal neuroimages. We propose a two-stage learning-based method to predict the overall survival (OS) time of patients with high-grade gliomas. At the first stage, we adopt deep learning, a recently dominant technique of artificial intelligence, to automatically extract implicit and high-level features from multi-modal, multi-channel preoperative MRI such that the features are capable of predicting survival time. Specifically, we utilize not only contrast-enhanced T1 MRI, but also diffusion tensor imaging (DTI) and resting-state functional MRI (rs-fMRI), for computing multiple metric maps (including various diffusivity metric maps derived from DTI, and also the frequency-specific brain fluctuation amplitude maps and local functional connectivity anisotropy-related metric maps derived from rs-fMRI) from 68 high-grade glioma patients with different survival times. We propose a multi-channel architecture of 3D convolutional neural networks (CNNs) for deep learning upon those metric maps, from which high-level predictive features are extracted for each individual patch of these maps. At the second stage, those deeply learned features along with the pivotal limited demographic and tumor-related features (such as age, tumor size and histological type) are fed into a support vector machine (SVM) to generate the final prediction result (i.e., long or short overall survival time). The experimental results demonstrate that this multi-model, multi-channel deep survival prediction framework achieves an accuracy of 90.66%, outperforming all the competing methods. This study indicates the effectiveness of deep learning techniques for prognosis in neuro-oncological applications, supporting better individualized treatment planning towards precision medicine.

View details for DOI 10.1038/s41598-018-37387-9

View details for Web of Science ID 000457287000091

View details for PubMedID 30705340

View details for PubMedCentralID PMC6355868
Multi-task prediction of infant cognitive scores from longitudinal incomplete neuroimaging data NEUROIMAGE Adeli, E., Meng, Y., Li, G., Lin, W., Shen, D. 2019; 185: 783–92

Abstract

The early postnatal brain undergoes a stunning period of development. Over the past few years, research on dynamic infant brain development has received increased attention, showing how important the early stages of a child's life are in terms of brain development. To precisely chart the early brain developmental trajectories, longitudinal studies with data acquired over a long-enough period of infants' early life are essential. However, in practice, missing data from different time point(s) during the data gathering procedure is often inevitable. This leads to an incomplete set of longitudinal data, which poses a major challenge for such studies. In this paper, prediction of multiple future cognitive scores with incomplete longitudinal imaging data is modeled as a multi-task machine learning framework. To efficiently learn this model, we account for selection of informative features (i.e., neuroimaging morphometric measurements for different time points), while preserving the structural information and the interrelation between these multiple cognitive scores. Several experiments are conducted on a carefully acquired in-house dataset, and the results affirm that we can predict the cognitive scores measured at the age of four years old, using the imaging data of earlier time points, as early as 24 months of age, with a reasonable performance (i.e., root mean square error of 0.18).
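
The root mean square error quoted above is a standard regression metric. As a minimal sketch of how such a value is computed, the following uses made-up scores (the `observed` and `predicted` lists are hypothetical and are not data from the study) and a helper name, `rmse`, chosen here for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean of the squared differences, then the square root.
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical normalized cognitive scores, for illustration only.
observed = [0.50, 0.70, 0.20, 0.90]
predicted = [0.60, 0.50, 0.20, 0.80]
error = rmse(observed, predicted)
```

Because the error is expressed in the units of the score itself, a value such as 0.18 is only interpretable relative to the range of the scores being predicted.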

View details for DOI 10.1016/j.neuroimage.2018.04.052

View details for Web of Science ID 000451628200066

View details for PubMedID 29709627

View details for PubMedCentralID PMC6204112
Difficulty-Aware Attention Network with Confidence Learning for Medical Image Segmentation Nie, D., Wang, L., Xiang, L., Zhou, S., Adeli, E., Shen, D., AAAI ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. 2019: 1085–92

View details for Web of Science ID 000485292601012
Generative Adversarial Irregularity Detection in Mammography Images Ahmadi, M., Sabokrou, M., Fathy, M., Berangi, R., Adeli, E. edited by Rekik, Adeli, E., Park, S. H. SPRINGER INTERNATIONAL PUBLISHING AG. 2019: 94-104

View details for DOI 10.1007/978-3-030-32281-6_10

View details for Web of Science ID 000865800400009
Self-Supervised Representation Learning via Neighborhood-Relational Encoding Sabokrou, M., Khalooei, M., Adeli, E., IEEE IEEE. 2019: 8009–18

View details for DOI 10.1109/ICCV.2019.00810

View details for Web of Science ID 000548549203013
Imitation Learning for Human Pose Prediction Wang, B., Adeli, E., Chiu, H., Huang, D., Niebles, J., IEEE IEEE. 2019: 7123–32

View details for DOI 10.1109/ICCV.2019.00722

View details for Web of Science ID 000548549202023
Variational AutoEncoder for Regression: Application to Brain Aging Analysis Zhao, Q., Adeli, E., Honnorat, N., Leng, T., Pohl, K. M. edited by Shen, D., Liu, T., Peters, T. M., Staib, L. H., Essert, C., Zhou, S., Yap, P. T., Khan, A. SPRINGER INTERNATIONAL PUBLISHING AG. 2019: 823–31

Abstract

While unsupervised variational autoencoders (VAE) have become a powerful tool in neuroimage analysis, their application to supervised learning is under-explored. We aim to close this gap by proposing a unified probabilistic model for learning the latent space of imaging data and performing supervised regression. Based on recent advances in learning disentangled representations, the novel generative process explicitly models the conditional distribution of latent representations with respect to the regression target variable. Performing a variational inference procedure on this model leads to joint regularization between the VAE and a neural-network regressor. In predicting the age of 245 subjects from their structural Magnetic Resonance (MR) images, our model is more accurate than state-of-the-art methods when applied to either region-of-interest (ROI) measurements or raw 3D volume images. More importantly, unlike simple feed-forward neural-networks, disentanglement of age in latent representations allows for intuitive interpretation of the structural developmental patterns of the human brain.

View details for DOI 10.1007/978-3-030-32245-8_91

View details for Web of Science ID 000548438900091

View details for PubMedID 32705091

View details for PubMedCentralID PMC7377006
Logistic Regression Confined by Cardinality-Constrained Sample and Feature Selection. IEEE transactions on pattern analysis and machine intelligence Adeli, E., Li, X., Kwon, D., Zhang, Y., Pohl, K. 2019

Abstract

Many vision-based applications rely on logistic regression for embedding classification within a probabilistic context, such as recognition in images and videos or identifying disease-specific image phenotypes from neuroimages. Logistic regression, however, often performs poorly when the training data are noisy, contain irrelevant features, or are distributed across the classes in an imbalanced way, a common occurrence in visual recognition tasks. To deal with those issues, researchers generally rely on ad-hoc regularization techniques or model a subset of these issues. We instead propose a mathematically sound logistic regression model that selects a subset of (relevant) features and an (informative and balanced) set of samples during the training process. The model does so by applying cardinality constraints (via ℓ0-'norm' sparsity) on the features and samples. ℓ0 defines sparsity in mathematical settings but in practice has mostly been approximated (e.g., via ℓ1 or its variations) for computational simplicity. We prove that a local minimum of the non-convex optimization problems induced by cardinality constraints can be computed by combining block coordinate descent with penalty decomposition. On synthetic, image recognition, and neuroimaging datasets, we furthermore show that the accuracy of the method is higher than that of alternative methods and classifiers commonly used in the literature.
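
To make the notion of a cardinality constraint concrete, here is a minimal sketch of the standard projection onto the set of vectors with at most k nonzero entries (often called hard thresholding). It is not the penalty decomposition procedure described in the abstract; the function names `project_cardinality` and `l0_norm` and the example weights are assumptions made for illustration.

```python
def l0_norm(w):
    """Number of nonzero entries (the l0 'norm' is a count, not a true norm)."""
    return sum(1 for x in w if x != 0)


def project_cardinality(w, k):
    """Keep the k largest-magnitude entries of w and set the rest to zero."""
    if k <= 0:
        return [0.0] * len(w)
    # Indices ordered by decreasing absolute value; ties keep original order.
    order = sorted(range(len(w)), key=lambda i: -abs(w[i]))
    keep = set(order[:k])
    return [w[i] if i in keep else 0.0 for i in range(len(w))]


# Hypothetical feature weights, for illustration only.
weights = [0.05, -1.2, 0.3, 0.0, 2.5]
sparse = project_cardinality(weights, 2)
```

Unlike an ℓ1 penalty, which shrinks every weight toward zero, this projection leaves the retained weights unchanged and enforces the sparsity level exactly, which is the property a cardinality constraint expresses.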

View details for DOI 10.1109/TPAMI.2019.2901688

View details for PubMedID 30835210
Variational Autoencoder with Truncated Mixture of Gaussians for Functional Connectivity Analysis Zhao, Q., Honnorat, N., Adeli, E., Pfefferbaum, A., Sullivan, E. V., Pohl, K. M. edited by Chung, A. C., Gee, J. C., Yushkevich, P. A., Bao, S. SPRINGER INTERNATIONAL PUBLISHING AG. 2019: 867–79

View details for DOI 10.1007/978-3-030-20351-1_68

View details for Web of Science ID 000493380900068
Action-Agnostic Human Pose Forecasting Chiu, H., Adeli, E., Wang, B., Huang, D., Niebles, J., IEEE IEEE. 2019: 1423–32

View details for DOI 10.1109/WACV.2019.00156

View details for Web of Science ID 000469423400149
UNSUPERVISED FEATURE RANKING AND SELECTION BASED ON AUTOENCODERS Sharifipour, S., Fayyazi, H., Sabokrou, M., Adeli, E., IEEE IEEE. 2019: 3172–76

View details for Web of Science ID 000482554003079
AVID: Adversarial Visual Irregularity Detection Sabokrou, M., Pourreza, M., Fayyaz, M., Entezari, R., Fathy, M., Gall, J., Adeli, E. edited by Jawahar, C. V., Li, H., Mori, G., Schindler, K. SPRINGER INTERNATIONAL PUBLISHING AG. 2019: 488–505

View details for DOI 10.1007/978-3-030-20876-9_31

View details for Web of Science ID 000492905500031
Chained regularization for identifying brain patterns specific to HIV infection NEUROIMAGE Adeli, E., Kwon, D., Zhao, Q., Pfefferbaum, A., Zahr, N. M., Sullivan, E. V., Pohl, K. M. 2018; 183: 425–37

View details for DOI 10.1016/j.neuroimage.2018.08.022

View details for Web of Science ID 000447750200038
Multi-Label Transduction for Identifying Disease Comorbidity Patterns. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention Adeli, E., Kwon, D., Pohl, K. M. 2018; 11072: 575-583

Abstract

Study of the untoward effects associated with the comorbidity of multiple diseases on brain morphology requires identifying differences across multiple diagnostic groupings. To identify such effects and differentiate between groups of patients and normal subjects, conventional methods often compare each patient group with healthy subjects using binary or multi-class classifiers. However, testing inferences across multiple diagnostic groupings of complex disorders commonly yields inconclusive or conflicting findings when the classifier is confined to modeling two cohorts at a time or treats class labels as mutually exclusive (as in multi-class classifiers). These shortcomings are potentially caused by the difficulties associated with modeling compounding factors of diseases with these approaches. Multi-label classifiers, on the other hand, can appropriately model disease comorbidity, as each subject can be assigned to two or more labels. In this paper, we propose a multi-label transductive (MLT) method based on low-rank matrix completion that is able not only to classify the data into multiple labels but also to identify patterns from MRI data unique to each cohort. To evaluate the method, we use a dataset containing individuals with Alcohol Use Disorder (AUD) and human immunodeficiency virus (HIV) infection (specifically 244 healthy controls, 227 AUD, 70 HIV, and 61 AUD+HIV). On this dataset, our proposed method is more accurate in correctly labeling subjects than common approaches. Furthermore, our method identifies patterns specific to each disease and to the AUD+HIV comorbidity, showing that the comorbidity is characterized by a compounding effect of AUD and HIV infection.

View details for DOI 10.1007/978-3-030-00931-1_66

View details for PubMedID 33688637

View details for PubMedCentralID PMC7938692
End-To-End Alzheimer's Disease Diagnosis and Biomarker Identification. Machine learning in medical imaging. MLMI (Workshop) Esmaeilzadeh, S., Belivanis, D. I., Pohl, K. M., Adeli, E. 2018; 11046: 337-345

Abstract

As shown in computer vision, the power of deep learning lies in automatically learning relevant and powerful features for any prediction task, which is made possible through end-to-end architectures. However, deep learning approaches applied for classifying medical images do not adhere to this architecture as they rely on several pre- and post-processing steps. This shortcoming can be explained by the relatively small number of available labeled subjects, the high dimensionality of neuroimaging data, and difficulties in interpreting the results of deep learning methods. In this paper, we propose a simple 3D Convolutional Neural Network and exploit its model parameters to tailor the end-to-end architecture for the diagnosis of Alzheimer's disease (AD). Our model can diagnose AD with an accuracy of 94.1% on the popular ADNI dataset using only MRI data, which outperforms the previous state-of-the-art. Based on the learned model, we identify the disease biomarkers, the results of which were in accordance with the literature. We further transfer the learned model to diagnose mild cognitive impairment (MCI), the prodromal stage of AD, which yields better results compared to other methods.

View details for DOI 10.1007/978-3-030-00919-9_39

View details for PubMedID 32832936

View details for PubMedCentralID PMC7440044
Chained regularization for identifying brain patterns specific to HIV infection. NeuroImage Adeli, E., Kwon, D., Zhao, Q., Pfefferbaum, A., Zahr, N. M., Sullivan, E. V., Pohl, K. M. 2018

Abstract

Human Immunodeficiency Virus (HIV) infection continues to have major adverse public health and clinical consequences despite the effectiveness of combination Antiretroviral Therapy (cART) in reducing HIV viral load and improving immune function. As successfully treated individuals with HIV infection age, their cognition declines faster than reported for normal aging. This phenomenon underlines the importance of improving long-term care, which requires better understanding of the impact of HIV on the brain. In this paper, automated identification of patients and brain regions affected by HIV infection is modeled as a classification problem, whose solution is determined in two steps within our proposed Chained-Regularization framework. The first step focuses on selecting the HIV pattern (i.e., the most informative constellation of brain region measurements for distinguishing HIV infected subjects from healthy controls) by constraining the search for the optimal parameter setting of the classifier via group sparsity (ℓ2,1-norm). The second step improves classification accuracy by constraining the parameterization with respect to the selected measurements and the Euclidean regularization (ℓ2-norm). When applied to the cortical and subcortical structural Magnetic Resonance Imaging (MRI) measurements of 65 controls and 65 HIV infected individuals, this approach is more accurate in distinguishing the two cohorts than more common models. Finally, the brain regions of the identified HIV pattern concur with the HIV literature that uses traditional group analysis models.
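
The two regularizers named above have simple closed forms. The sketch below computes the standard ℓ2,1-norm (the sum of the Euclidean norms of a matrix's rows) and the ℓ2-norm; how the paper groups parameters into rows is not stated in the abstract, so the layout of `W` (one row per brain measurement) is an assumption, and the function names are illustrative.

```python
import math


def l2_norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))


def l21_norm(rows):
    """Sum of the Euclidean norms of the rows of a matrix."""
    return sum(l2_norm(row) for row in rows)


# Hypothetical parameter matrix: one row per measurement (assumed layout).
W = [
    [3.0, 4.0],  # row norm 5.0
    [0.0, 0.0],  # an all-zero row contributes nothing: measurement dropped
    [1.0, 0.0],  # row norm 1.0
]
group_penalty = l21_norm(W)
```

Penalizing the sum of row norms pushes entire rows to zero together, which is why the ℓ2,1-norm acts as a selection step over measurements, whereas a plain ℓ2 penalty shrinks all parameters without removing any.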

View details for PubMedID 30138676
Exploring diagnosis and imaging biomarkers of Parkinson's disease via iterative canonical correlation analysis based feature selection COMPUTERIZED MEDICAL IMAGING AND GRAPHICS Liu, L., Wang, Q., Adeli, E., Zhang, L., Zhang, H., Shen, D. 2018; 67: 21–29

Abstract

Parkinson's disease (PD) is a neurodegenerative disorder that progressively hampers brain function and leads to various movement and non-motor symptoms. However, it is difficult to attain early-stage PD diagnosis based on the subjective judgment of physicians in clinical routines. Therefore, automatic and accurate diagnosis of PD is in high demand, so that the corresponding treatment can be implemented more appropriately. In this paper, we focus on finding the most discriminative features from different brain regions in PD through T1-weighted MR images, which can help the subsequent PD diagnosis. Specifically, we propose a novel iterative canonical correlation analysis (ICCA) feature selection method, aiming at exploiting MR images in a more comprehensive manner and fusing features of different types into a common space. Briefly, we first extract the feature vectors from the gray matter and the white matter tissues separately, which represent two different anatomical feature spaces for the subject's brain. The ICCA feature selection method aims at iteratively finding the optimal feature subset from two sets of features that have inherent high correlation with each other. In experiments, we have conducted thorough investigations on the optimal feature set extracted by our ICCA method. We also demonstrate that, using the proposed feature selection method, the PD diagnosis performance is further improved and outperforms many state-of-the-art methods.

View details for DOI 10.1016/j.compmedimag.2018.04.002

View details for Web of Science ID 000447358800003

View details for PubMedID 29702348
Anatomy-guided joint tissue segmentation and topological correction for 6-month infant brain MRI with risk of autism HUMAN BRAIN MAPPING Wang, L., Li, G., Adeli, E., Liu, M., Wu, Z., Meng, Y., Lin, W., Shen, D. 2018; 39 (6): 2609–23

Abstract

Tissue segmentation of infant brain MRIs with risk of autism is critically important for characterizing early brain development and identifying biomarkers. However, it is challenging due to low tissue contrast caused by inherent ongoing myelination and maturation. In particular, at around 6 months of age, the voxel intensities in both gray matter and white matter are within similar ranges, thus leading to the lowest image contrast in the first postnatal year. Previous studies typically employed intensity images and tentatively estimated tissue probabilities to train a sequence of classifiers for tissue segmentation. However, the important prior knowledge of brain anatomy is largely ignored during the segmentation. Consequently, the segmentation accuracy is still limited and topological errors frequently exist, which will significantly degrade the performance of subsequent analyses. Although topological errors could be partially handled by retrospective topological correction methods, their results may still be anatomically incorrect. To address these challenges, in this article, we propose an anatomy-guided joint tissue segmentation and topological correction framework for isointense infant MRI. Particularly, we adopt a signed distance map with respect to the outer cortical surface as anatomical prior knowledge, and incorporate such prior information into the proposed framework to guide segmentation in ambiguous regions. Experimental results on subjects from the National Database for Autism Research demonstrate the framework's effectiveness in handling topological errors and some level of robustness to motion. Comparisons with the state-of-the-art methods further demonstrate the advantages of the proposed method in terms of both segmentation accuracy and topological correctness.

View details for DOI 10.1002/hbm.24027

View details for Web of Science ID 000438015400025

View details for PubMedID 29516625

View details for PubMedCentralID PMC5951769
Conversion and time-to-conversion predictions of mild cognitive impairment using low-rank affinity pursuit denoising and matrix completion MEDICAL IMAGE ANALYSIS Thung, K., Yap, P., Adeli, E., Lee, S., Shen, D., Alzheimers Dis Neuroimaging Init 2018; 45: 68–82

Abstract

In this paper, we aim to predict conversion and time-to-conversion of mild cognitive impairment (MCI) patients using multi-modal neuroimaging data and clinical data, via cross-sectional and longitudinal studies. However, such data are often heterogeneous, high-dimensional, noisy, and incomplete. We thus propose a framework that includes sparse feature selection, low-rank affinity pursuit denoising (LRAD), and low-rank matrix completion (LRMC). Specifically, we first use sparse linear regressions to remove unrelated features. Then, considering the heterogeneity of the MCI data, which can be assumed to be a union of multiple subspaces, we propose to use a low-rank subspace method (i.e., LRAD) to denoise the data. Finally, we employ an LRMC algorithm with three data fitting terms and one inequality constraint for joint conversion and time-to-conversion predictions. Our framework aims to answer an important yet rarely explored question in AD study, i.e., when will the MCI convert to AD? This is different from survival analysis, which provides the probabilities of conversion at different time points that are mainly used for global analysis, while our time-to-conversion prediction is for each individual subject. Evaluations using the ADNI dataset indicate that our method outperforms conventional LRMC and other state-of-the-art methods. Our method achieves a maximal pMCI classification accuracy of 84% and time prediction correlation of 0.665.

View details for DOI 10.1016/j.media.2018.01.002

View details for Web of Science ID 000427664400006

View details for PubMedID 29414437

View details for PubMedCentralID PMC6892173
Adversarially Learned One-Class Classifier for Novelty Detection Sabokrou, M., Khalooei, M., Fathy, M., Adeli, E., IEEE IEEE. 2018: 3379–88

View details for DOI 10.1109/CVPR.2018.00356

View details for Web of Science ID 000457843603054
End-To-End Alzheimer's Disease Diagnosis and Biomarker Identification Esmaeilzadeh, S., Belivanis, D., Pohl, K. M., Adeli, E. edited by Shi, Y., Suk, H. I., Liu, M. SPRINGER INTERNATIONAL PUBLISHING AG. 2018: 337–45

View details for DOI 10.1007/978-3-030-00919-9_39

View details for Web of Science ID 000477767800039
INFANT BRAIN DEVELOPMENT PREDICTION WITH LATENT PARTIAL MULTI-VIEW REPRESENTATION LEARNING. Proceedings. IEEE International Symposium on Biomedical Imaging Zhang, C., Adeli, E., Wu, Z., Li, G., Lin, W., Shen, D. 2018; 2018: 1048–51

Abstract

The early postnatal period witnesses rapid and dynamic brain development. Understanding the cognitive development patterns can help identify various disorders at early ages of life and is essential for the health and well-being of children. This inspires us to investigate the relation between cognitive ability and the cerebral cortex by exploiting brain images in a longitudinal study. Specifically, we aim to predict the infant brain development status based on the morphological features of the cerebral cortex. For this goal, we introduce a multi-view multi-task learning approach to dexterously explore complementary information from different time points and handle the missing data simultaneously. Specifically, we establish a novel model termed as Latent Partial Multi-view Representation Learning. The approach regards data of different time points as different views, and constructs a latent representation to capture the complementary underlying information from different and even incomplete time points. It uncovers the latent representation that can be jointly used to learn the prediction model. This formulation elegantly explores the complementarity, effectively reduces the redundancy of different views, and improves the accuracy of prediction. The minimization problem is solved by the Alternating Direction Method of Multipliers (ADMM). Experimental results on real data validate the proposed method.

View details for PubMedID 30464798

View details for PubMedCentralID PMC6242279
Landmark-based deep multi-instance learning for brain disease diagnosis MEDICAL IMAGE ANALYSIS Liu, M., Zhang, J., Adeli, E., Shen, D. 2018; 43: 157–68

Abstract

In conventional Magnetic Resonance (MR) image based methods, two stages are often involved to capture brain structural information for disease diagnosis, i.e., 1) manually partitioning each MR image into a number of regions-of-interest (ROIs), and 2) extracting pre-defined features from each ROI for diagnosis with a certain classifier. However, these pre-defined features often limit the performance of the diagnosis, due to challenges in 1) defining the ROIs and 2) extracting effective disease-related features. In this paper, we propose a landmark-based deep multi-instance learning (LDMIL) framework for brain disease diagnosis. Specifically, we first adopt a data-driven learning approach to discover disease-related anatomical landmarks in the brain MR images, along with their nearby image patches. Then, our LDMIL framework learns an end-to-end MR image classifier for capturing both the local structural information conveyed by image patches located by landmarks and the global structural information derived from all detected landmarks. We have evaluated our proposed framework on 1526 subjects from three public datasets (i.e., ADNI-1, ADNI-2, and MIRIAD), and the experimental results show that our framework can achieve superior performance over state-of-the-art approaches.

View details for DOI 10.1016/j.media.2017.10.005

View details for Web of Science ID 000418627400012

View details for PubMedID 29107865

View details for PubMedCentralID PMC6203325
Multi-Layer Multi-View Classification for Alzheimer's Disease Diagnosis Zhang, C., Adeli, E., Zhou, T., Chen, X., Shen, D., AAAI ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. 2018: 4406–13

Abstract

In this paper, we propose a novel multi-view learning method for Alzheimer's Disease (AD) diagnosis, using neuroimaging and genetics data. Generally, there are several major challenges associated with traditional classification methods on multi-source imaging and genetics data. First, the correlation between the extracted imaging features and class labels is generally complex, which often makes the traditional linear models ineffective. Second, medical data may be collected from different sources (i.e., multiple modalities of neuroimaging data, clinical scores or genetics measurements); therefore, how to effectively exploit the complementarity among multiple views is of great importance. In this paper, we propose a Multi-Layer Multi-View Classification (ML-MVC) approach, which regards the multi-view input as the first layer, and constructs a latent representation to explore the complex correlation between the features and class labels. This captures the high-order complementarity among different views, as we exploit the underlying information with a low-rank tensor regularization. Intrinsically, our formulation elegantly explores the nonlinear correlation together with complementarity among different views, and thus improves the accuracy of classification. Finally, the minimization problem is solved by the Alternating Direction Method of Multipliers (ADMM). Experimental results on Alzheimer's Disease Neuroimaging Initiative (ADNI) data sets validate the effectiveness of our proposed method.

View details for Web of Science ID 000485488904061

View details for PubMedID 30416868

View details for PubMedCentralID PMC6223635
Fine-Grained Segmentation Using Hierarchical Dilated Neural Networks Zhou, S., Nie, D., Adeli, E., Gao, Y., Wang, L., Yin, J., Shen, D. edited by Frangi, A. F., Schnabel, J. A., Davatzikos, C., AlberolaLopez, C., Fichtinger, G. SPRINGER INTERNATIONAL PUBLISHING AG. 2018: 488–96

View details for DOI 10.1007/978-3-030-00937-3_56

View details for Web of Science ID 000477769100056
Predictive Modeling of Longitudinal Data for Alzheimer's Disease Diagnosis Using RNNs Aghili, M., Tabarestani, S., Adjouadi, M., Adeli, E. edited by Rekik, Unal, G., Adeli, E., Park, S. H. SPRINGER INTERNATIONAL PUBLISHING AG. 2018: 112–19

View details for DOI 10.1007/978-3-030-00320-3_14

View details for Web of Science ID 000477923900014
Joint Sparse and Low-Rank Regularized MultiTask Multi-Linear Regression for Prediction of Infant Brain Development with Incomplete Data. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention Adeli, E., Meng, Y., Li, G., Lin, W., Shen, D. 2017; 10433: 40–48

Abstract

Studies involving dynamic infant brain development have received increasing attention in the past few years. For such studies, a complete longitudinal dataset is often required to precisely chart the early brain developmental trajectories. In practice, however, we often face missing data at different time point(s) for different subjects. In this paper, we propose a new method for prediction of infant brain development scores at future time points based on longitudinal imaging measures at early time points with possible missing data. We treat this as a multi-dimensional regression problem, for predicting multiple brain development scores (multi-task) from multiple previous time points (multi-linear). To solve this problem, we propose an objective function with a joint ℓ1 and low-rank regularization on the mapping weight tensor, to enforce feature selection, while preserving the structural information from multiple dimensions. Also, based on the bag-of-words model, we propose to extract features from longitudinal imaging data. The experimental results reveal that we can effectively predict the brain development scores assessed at the age of four years, using the imaging data as early as two years of age.

View details for DOI 10.1007/978-3-319-66182-7_5

View details for PubMedID 30159549
Maximum Mean Discrepancy Based Multiple Kernel Learning for Incomplete Multimodality Neuroimaging Data. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention Zhu, X., Thung, K., Adeli, E., Zhang, Y., Shen, D. 2017; 10435: 72–80

Abstract

It is challenging to use incomplete multimodality data for Alzheimer's Disease (AD) diagnosis. The current methods to address this challenge, such as low-rank matrix completion (i.e., imputing the missing values and unknown labels simultaneously) and multi-task learning (i.e., defining one regression task for each combination of modalities and then learning them jointly), are unable to model the complex data-to-label relationship in AD diagnosis and also ignore the heterogeneity among the modalities. In light of this, we propose a new Maximum Mean Discrepancy (MMD) based Multiple Kernel Learning (MKL) method for AD diagnosis using incomplete multimodality data. Specifically, we map all the samples from different modalities into a Reproducing Kernel Hilbert Space (RKHS), by devising a new MMD algorithm. The proposed MMD method incorporates data distribution matching, pair-wise sample matching and feature selection in a unified formulation, thus alleviating the modality heterogeneity issue and making all the samples comparable to share a common classifier in the RKHS. The resulting classifier obviously captures the nonlinear data-to-label relationship. We have tested our method using MRI and PET data from Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset for AD diagnosis. The experimental results show that our method outperforms other methods.

View details for DOI 10.1007/978-3-319-66179-7_9

View details for PubMedID 29392246
Deep Multi-Task Multi-Channel Learning for Joint Classification and Regression of Brain Status. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention Liu, M., Zhang, J., Adeli, E., Shen, D. 2017; 10435: 3–11

Abstract

Jointly identifying brain diseases and predicting clinical scores have attracted increasing attention in the domain of computer-aided diagnosis using magnetic resonance imaging (MRI) data, since these two tasks are highly correlated. Although several joint learning models have been developed, most existing methods focus on using human-engineered features extracted from MRI data. Due to the possible heterogeneous property between human-engineered features and subsequent classification/regression models, those methods may lead to sub-optimal learning performance. In this paper, we propose a deep multi-task multi-channel learning (DM2L) framework for simultaneous classification and regression for brain disease diagnosis, using MRI data and personal information (i.e., age, gender, and education level) of subjects. Specifically, we first identify discriminative anatomical landmarks from MR images in a data-driven manner, and then extract multiple image patches around these detected landmarks. A deep multi-task multi-channel convolutional neural network is then developed for joint disease classification and clinical score regression. We train our model on a large multi-center cohort (i.e., ADNI-1) and test it on an independent cohort (i.e., ADNI-2). Experimental results demonstrate that DM2L is superior to the state-of-the-art approaches in brain disease diagnosis.

View details for DOI 10.1007/978-3-319-66179-7_1

View details for PubMedID 29756129
Multi-modal classification of neurodegenerative disease by progressive graph-based transductive learning MEDICAL IMAGE ANALYSIS Wang, Z., Zhu, X., Adeli, E., Zhu, Y., Nie, F., Munsell, B., Wu, G., ADNI PPMI 2017; 39: 218–30

Abstract

Graph-based transductive learning (GTL) is a powerful machine learning technique that is used when sufficient training data is not available. In particular, conventional GTL approaches first construct a fixed inter-subject relation graph that is based on similarities in voxel intensity values in the feature domain, which can then be used to propagate the known phenotype data (i.e., clinical scores and labels) from the training data to the testing data in the label domain. However, this type of graph is exclusively learned in the feature domain, and primarily due to outliers in the observed features, may not be optimal for label propagation in the label domain. To address this limitation, a progressive GTL (pGTL) method is proposed that gradually finds an intrinsic data representation that more accurately aligns imaging features with the phenotype data. In general, optimal feature-to-phenotype alignment is achieved using an iterative approach that: (1) refines inter-subject relationships observed in the feature domain by using the learned intrinsic data representation in the label domain, (2) updates the intrinsic data representation from the refined inter-subject relationships, and (3) verifies the intrinsic data representation on the training data to guarantee an optimal classification when applied to testing data. Additionally, the iterative approach is extended to multi-modal imaging data to further improve pGTL classification accuracy. Using Alzheimer's disease and Parkinson's disease study data, the classification accuracy of the proposed pGTL method is compared to several state-of-the-art classification methods, and the results show pGTL can more accurately identify subjects, even at different progression stages, in these two study data sets.

View details for DOI 10.1016/j.media.2017.05.003

View details for Web of Science ID 000404200900016

View details for PubMedID 28551556

View details for PubMedCentralID PMC5901767
A Hierarchical Feature and Sample Selection Framework and Its Application for Alzheimer's Disease Diagnosis SCIENTIFIC REPORTS An, L., Adeli, E., Liu, M., Zhang, J., Lee, S., Shen, D. 2017; 7: 45269

Abstract

Classification is one of the most important tasks in machine learning. Due to feature redundancy or outliers in samples, using all available data for training a classifier may be suboptimal. For example, the Alzheimer's disease (AD) is correlated with certain brain regions or single nucleotide polymorphisms (SNPs), and identification of relevant features is critical for computer-aided diagnosis. Many existing methods first select features from structural magnetic resonance imaging (MRI) or SNPs and then use those features to build the classifier. However, with the presence of many redundant features, the most discriminative features are difficult to be identified in a single step. Thus, we formulate a hierarchical feature and sample selection framework to gradually select informative features and discard ambiguous samples in multiple steps for improved classifier learning. To positively guide the data manifold preservation process, we utilize both labeled and unlabeled data during training, making our method semi-supervised. For validation, we conduct experiments on AD diagnosis by selecting mutually informative features from both MRI and SNP, and using the most discriminative samples for training. The superior classification results demonstrate the effectiveness of our approach, as compared with the rivals.

View details for DOI 10.1038/srep45269

View details for Web of Science ID 000397815500001

View details for PubMedID 28358032

View details for PubMedCentralID PMC5372170
Kernel-based Joint Feature Selection and Max-Margin Classification for Early Diagnosis of Parkinson's Disease SCIENTIFIC REPORTS Adeli, E., Wu, G., Saghafi, B., An, L., Shi, F., Shen, D. 2017; 7: 41069

Abstract

Feature selection methods usually select the most compact and relevant set of features based on their contribution to a linear regression model. Thus, these features might not be the best for a non-linear classifier. This is especially crucial for the tasks, in which the performance is heavily dependent on the feature selection techniques, like the diagnosis of neurodegenerative diseases. Parkinson's disease (PD) is one of the most common neurodegenerative disorders, which progresses slowly while dramatically affecting quality of life. In this paper, we use multi-modal neuroimaging data to diagnose PD by investigating the brain regions known to be affected at the early stages. We propose a joint kernel-based feature selection and classification framework. Unlike conventional feature selection techniques that select features based on their performance in the original input feature space, we select features that best benefit the classification scheme in the kernel space. We further propose kernel functions, specifically designed for our non-negative feature types. We use MRI and SPECT data of 538 subjects from the PPMI database, and obtain a diagnosis accuracy of 97.5%, which outperforms all baseline and state-of-the-art methods.

View details for DOI 10.1038/srep41069

View details for Web of Science ID 000392663200001

View details for PubMedID 28120883

View details for PubMedCentralID PMC5264393
Structured Prediction with Short/Long-Range Dependencies for Human Activity Recognition from Depth Skeleton Data Arzani, M. M., Fathy, M., Aghajan, H., Azirani, A. A., Raahemifar, K., Adeli, E. edited by Bicchi, A., Okamura, A. IEEE. 2017: 560–67

View details for Web of Science ID 000426978200079
Deep Relative Attributes Souri, Y., Noury, E., Adeli, E. edited by Lai, S. H., Lepetit, Nishino, K., Sato, Y. SPRINGER INTERNATIONAL PUBLISHING AG. 2017: 118–33

View details for DOI 10.1007/978-3-319-54193-8_8

View details for Web of Science ID 000426209200008
Consciousness Level and Recovery Outcome Prediction Using High-Order Brain Functional Connectivity Network Jia, X., Zhang, H., Adeli, E., Shen, D. edited by Wu, G., Laurienti, P., Bonilha, L., Munsell, B. C. SPRINGER INTERNATIONAL PUBLISHING AG. 2017: 17–24

Abstract

Based on the neuroimaging data from a large set of acquired brain injury patients, we investigate the feasibility of using machine learning for automatic prediction of individual consciousness level. Rather than using the traditional Pearson's correlation-based brain functional network, which measures only the simple temporal synchronization of the BOLD signals from each pair of brain regions, we construct a high-order brain functional network that is capable of characterizing topographical information-based high-level functional associations among brain regions. In such a high-order brain network, each node represents the community of a brain region, described by a set of this region's low-order functional associations with other brain regions, and each edge characterizes topographical similarity between a pair of such communities. Experimental results show that the high-order brain functional network enables significantly better classification for consciousness level and recovery outcome prediction.

View details for DOI 10.1007/978-3-319-67159-8_3

View details for Web of Science ID 000463626800003

View details for PubMedID 30345427

View details for PubMedCentralID PMC6193499
Joint feature-sample selection and robust diagnosis of Parkinson's disease from MRI data NEUROIMAGE Adeli, E., Shi, F., An, L., Wee, C., Wu, G., Wang, T., Shen, D. 2016; 141: 206–19

Abstract

Parkinson's disease (PD) is an overwhelming neurodegenerative disorder caused by deterioration of a neurotransmitter, known as dopamine. Lack of this chemical messenger impairs several brain regions and yields various motor and non-motor symptoms. Incidence of PD is predicted to double in the next two decades, which urges more research to focus on its early diagnosis and treatment. In this paper, we propose an approach to diagnose PD using magnetic resonance imaging (MRI) data. Specifically, we first introduce a joint feature-sample selection (JFSS) method for selecting an optimal subset of samples and features, to learn a reliable diagnosis model. The proposed JFSS model effectively discards poor samples and irrelevant features. As a result, the selected features play an important role in PD characterization, which will help identify the most relevant and critical imaging biomarkers for PD. Then, a robust classification framework is proposed to simultaneously de-noise the selected subset of features and samples, and learn a classification model. Our model can also de-noise testing samples based on the cleaned training data. Unlike many previous works that perform de-noising in an unsupervised manner, we perform supervised de-noising for both training and testing data, thus boosting the diagnostic accuracy. Experimental results on both synthetic and publicly available PD datasets show promising results. To evaluate the proposed method, we use the popular Parkinson's progression markers initiative (PPMI) database. Our results indicate that the proposed method can differentiate between PD and normal control (NC), and outperforms the competing methods by a relatively large margin. It is noteworthy to mention that our proposed framework can also be used for diagnosis of other brain disorders. To show this, we have also conducted experiments on the widely-used ADNI database. The obtained results indicate that our proposed method can identify the imaging biomarkers and diagnose the disease with favorable accuracies compared to the baseline methods.

View details for DOI 10.1016/j.neuroimage.2016.05.054

View details for Web of Science ID 000384074500018

View details for PubMedID 27296013

View details for PubMedCentralID PMC5866718
Feature Selection Based on Iterative Canonical Correlation Analysis for Automatic Diagnosis of Parkinson's Disease. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention Liu, L., Wang, Q., Adeli, E., Zhang, L., Zhang, H., Shen, D. 2016; 9901: 1–8

Abstract

Parkinson's disease (PD) is a major progressive neurodegenerative disorder. Accurate diagnosis of PD is crucial to control the symptoms appropriately. However, its clinical diagnosis mostly relies on the subjective judgment of physicians and the clinical symptoms that often appear late. Recent neuroimaging techniques, along with machine learning methods, provide alternative solutions for PD screening. In this paper, we propose a novel feature selection technique, based on iterative canonical correlation analysis (ICCA), to investigate the roles of different brain regions in PD through T1-weighted MR images. First of all, gray matter and white matter tissue volumes in brain regions of interest are extracted as two feature vectors. Then, a small group of significant features are selected using the iterative structure of our proposed ICCA framework from both feature vectors. Finally, the selected features are used to build a robust classifier for automatic diagnosis of PD. Experimental results show that the proposed feature selection method results in better diagnosis accuracy, compared to the baseline and state-of-the-art methods.

View details for DOI 10.1007/978-3-319-46723-8_1

View details for PubMedID 28593202
Progressive Graph-Based Transductive Learning for Multi-modal Classification of Brain Disorder Disease. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention Wang, Z., Zhu, X., Adeli, E., Zhu, Y., Zu, C., Nie, F., Shen, D., Wu, G. 2016; 9900: 291–99

Abstract

Graph-based Transductive Learning (GTL) is a powerful tool in computer-assisted diagnosis, especially when the training data is not sufficient to build reliable classifiers. Conventional GTL approaches first construct a fixed subject-wise graph based on the similarities of observed features (i.e., extracted from imaging data) in the feature domain, and then follow the established graph to propagate the existing labels from training to testing data in the label domain. However, such a graph is exclusively learned in the feature domain and may not be necessarily optimal in the label domain. This may eventually undermine the classification accuracy. To address this issue, we propose a progressive GTL (pGTL) method to progressively find an intrinsic data representation. To achieve this, our pGTL method iteratively (1) refines the subject-wise relationships observed in the feature domain using the learned intrinsic data representation in the label domain, (2) updates the intrinsic data representation from the refined subject-wise relationships, and (3) verifies the intrinsic data representation on the training data, in order to guarantee an optimal classification on the new testing data. Furthermore, we extend our pGTL to incorporate multi-modal imaging data, to improve the classification accuracy and robustness as multi-modal imaging data can provide complementary information. Promising classification results in identifying Alzheimer's disease (AD), Mild Cognitive Impairment (MCI), and Normal Control (NC) subjects are achieved using MRI and PET data.

View details for DOI 10.1007/978-3-319-46720-7_34

View details for PubMedID 28386606
3D Deep Learning for Multi-modal Imaging-Guided Survival Time Prediction of Brain Tumor Patients. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention Nie, D., Zhang, H., Adeli, E., Liu, L., Shen, D. 2016; 9901: 212–20

Abstract

High-grade glioma is the most aggressive and severe brain tumor that leads to death of almost 50% of patients in 1-2 years. Thus, accurate prognosis for glioma patients would provide essential guidelines for their treatment planning. Conventional survival prediction generally utilizes clinical information and limited handcrafted features from magnetic resonance images (MRI), which is often time consuming, laborious and subjective. In this paper, we propose using deep learning frameworks to automatically extract features from multi-modal preoperative brain images (i.e., T1 MRI, fMRI and DTI) of high-grade glioma patients. Specifically, we adopt 3D convolutional neural networks (CNNs) and also propose a new network architecture for using multi-channel data and learning supervised features. Along with the pivotal clinical features, we finally train a support vector machine to predict if the patient has a long or short overall survival (OS) time. Experimental results demonstrate that our methods can achieve an accuracy as high as 89.9%. We also find that the learned features from fMRI and DTI play more important roles in accurately predicting the OS time, which provides valuable insights into functional neuro-oncological applications.

View details for DOI 10.1007/978-3-319-46723-8_25

View details for PubMedID 28149967
Semi-supervised Hierarchical Multimodal Feature and Sample Selection for Alzheimer's Disease Diagnosis. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention An, L., Adeli, E., Liu, M., Zhang, J., Shen, D. 2016; 9901: 79–87

Abstract

Alzheimer's disease (AD) is a progressive neurodegenerative disease that impairs a patient's memory and other important mental functions. In this paper, we leverage the mutually informative and complementary features from both structural magnetic resonance imaging (MRI) and single nucleotide polymorphism (SNP) for improving the diagnosis. Due to the feature redundancy and sample outliers, direct use of all training data may lead to suboptimal performance in classification. In addition, as redundant features are involved, the most discriminative feature subset may not be identified in a single step, as commonly done in most existing feature selection approaches. Therefore, we formulate a hierarchical multimodal feature and sample selection framework to gradually select informative features and discard ambiguous samples in multiple steps. To positively guide the data manifold preservation, we utilize both labeled and unlabeled data in the learning process, making our method semi-supervised. The finally selected features and samples are then used to train support vector machine (SVM) based classification models. Our method is evaluated on 702 subjects from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, and the superior classification results in AD related diagnosis demonstrate the effectiveness of our approach as compared to other methods.

View details for DOI 10.1007/978-3-319-46723-8_10

View details for PubMedID 30101233
Inherent Structure-Based Multiview Learning With Multitemplate Feature Representation for Alzheimer's Disease Diagnosis IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING Liu, M., Zhang, D., Adeli, E., Shen, D. 2016; 63 (7): 1473–82

Abstract

Multitemplate-based brain morphometric pattern analysis using magnetic resonance imaging has been recently proposed for automatic diagnosis of Alzheimer's disease (AD) and its prodromal stage (i.e., mild cognitive impairment or MCI). In such methods, multiview morphological patterns generated from multiple templates are used as feature representation for brain images. However, existing multitemplate-based methods often simply assume that each class is represented by a specific type of data distribution (i.e., a single cluster), while in reality, the underlying data distribution is actually not preknown. In this paper, we propose an inherent structure-based multiview learning method using multiple templates for AD/MCI classification. Specifically, we first extract multiview feature representations for subjects using multiple selected templates and then cluster subjects within a specific class into several subclasses (i.e., clusters) in each view space. Then, we encode those subclasses with unique codes by considering both their original class information and their own distribution information, followed by a multitask feature selection model. Finally, we learn an ensemble of view-specific support vector machine classifiers based on their respective selected features in each view and fuse their results to draw the final decision. Experimental results on the Alzheimer's Disease Neuroimaging Initiative database demonstrate that our method achieves promising results for AD/MCI classification, compared to the state-of-the-art multitemplate-based methods.

View details for DOI 10.1109/TBME.2015.2496233

View details for Web of Science ID 000380323800013

View details for PubMedID 26540666

View details for PubMedCentralID PMC4851920
Multi-Level Canonical Correlation Analysis for Standard-Dose PET Image Estimation IEEE TRANSACTIONS ON IMAGE PROCESSING An, L., Zhang, P., Adeli, E., Wang, Y., Ma, G., Shi, F., Lalush, D. S., Lin, W., Shen, D. 2016; 25 (7): 3303–15

Abstract

Positron emission tomography (PET) images are widely used in many clinical applications, such as tumor detection and brain disorder diagnosis. To obtain PET images of diagnostic quality, a sufficient amount of radioactive tracer has to be injected into a living body, which will inevitably increase the risk of radiation exposure. On the other hand, if the tracer dose is considerably reduced, the quality of the resulting images would be significantly degraded. It is of great interest to estimate a standard-dose PET (S-PET) image from a low-dose one in order to reduce the risk of radiation exposure and preserve image quality. This may be achieved through mapping both S-PET and low-dose PET data into a common space and then performing patch-based sparse representation. However, a one-size-fits-all common space built from all training patches is unlikely to be optimal for each target S-PET patch, which limits the estimation accuracy. In this paper, we propose a data-driven multi-level canonical correlation analysis scheme to solve this problem. In particular, a subset of training data that is most useful in estimating a target S-PET patch is identified in each level, and then used in the next level to update common space and improve estimation. In addition, we also use multi-modal magnetic resonance images to help improve the estimation with complementary information. Validations on phantom and real human brain data sets show that our method effectively estimates S-PET images and well preserves critical clinical quantification measures, such as standard uptake value.

View details for DOI 10.1109/TIP.2016.2567072

View details for Web of Science ID 000377371700002

View details for PubMedID 27187957

View details for PubMedCentralID PMC5106345
Joint Feature-Sample Selection and Robust Classification for Parkinson's Disease Diagnosis Adeli-Mosabbeb, E., Wee, C., An, L., Shi, F., Shen, D. edited by Menze, B., Langs, G., Montillo, A., Kelm, M., Muller, H., Zhang, S., Cai, W., Metaxas, D. SPRINGER INTERNATIONAL PUBLISHING AG. 2016: 127–36

View details for DOI 10.1007/978-3-319-42016-5_12

View details for Web of Science ID 000389404000012
Relationship Induced Multi-atlas Learning for Alzheimer's Disease Diagnosis Liu, M., Zhang, D., Adeli-Mosabbeb, E., Shen, D. edited by Menze, B., Langs, G., Montillo, A., Kelm, M., Muller, H., Zhang, S., Cai, W., Metaxas, D. SPRINGER INTERNATIONAL PUBLISHING AG. 2016: 24–33

View details for DOI 10.1007/978-3-319-42016-5_3

View details for Web of Science ID 000389404000003
Stability-Weighted Matrix Completion of Incomplete Multi-modal Data for Disease Diagnosis. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention Thung, K., Adeli, E., Yap, P., Shen, D. 2016; 9901: 88–96

Abstract

Effective utilization of heterogeneous multi-modal data for Alzheimer's Disease (AD) diagnosis and prognosis has always been hampered by incomplete data. One method to deal with this is low-rank matrix completion (LRMC), which simultaneously imputes missing data features and target values of interest. Although LRMC yields reasonable results, it implicitly weights features from all the modalities equally, ignoring the differences in discriminative power of features from different modalities. In this paper, we propose stability-weighted LRMC (swLRMC), an LRMC improvement that weights features and modalities according to their importance and reliability. We introduce a method, called stability weighting, to utilize subsampling techniques and outcomes from a range of hyper-parameters of sparse feature learning to obtain a stable set of weights. Incorporating these weights into LRMC, swLRMC can better account for differences in features and modalities for improving diagnosis. Experimental results confirm that the proposed method outperforms the conventional LRMC, feature-selection based LRMC, and other state-of-the-art methods.

View details for DOI 10.1007/978-3-319-46723-8_11

View details for PubMedID 28286884
Non-negative matrix completion for action detection IMAGE AND VISION COMPUTING Adeli-Mosabbeb, E., Fathy, M. 2015; 39: 38-51

View details for DOI 10.1016/j.imavis.2015.04.006

View details for Web of Science ID 000357543400004
Joint Diagnosis and Conversion Time Prediction of Progressive Mild Cognitive Impairment (pMCI) Using Low-Rank Subspace Clustering and Matrix Completion Thung, K., Yap, P., Adeli-M, E., Shen, D. edited by Navab, N., Hornegger, J., Wells, W. M., Frangi, A. F. SPRINGER INTERNATIONAL PUBLISHING AG. 2015: 527-534

Abstract

Identifying progressive mild cognitive impairment (pMCI) patients and predicting when they will convert to Alzheimer's disease (AD) are important for early medical intervention. Multi-modality and longitudinal data provide a great amount of information for improving diagnosis and prognosis, but these data are often incomplete and noisy. To improve the utility of these data for prediction purposes, we propose an approach to denoise the data, impute missing values, and cluster the data into low-dimensional subspaces for pMCI prediction. We assume that the data reside in a space formed by a union of several low-dimensional subspaces and that similar MCI conditions reside in similar subspaces. Therefore, we first use incomplete low-rank representation (ILRR) and spectral clustering to cluster the data according to their representative low-rank subspaces. At the same time, we denoise the data and impute missing values. Then we utilize a low-rank matrix completion (LRMC) framework to identify pMCI patients and their time of conversion. Evaluations using the ADNI dataset indicate that our method outperforms the conventional LRMC method.

View details for DOI 10.1007/978-3-319-24574-4_63

View details for Web of Science ID 000365963800063

View details for PubMedID 27054201

View details for PubMedCentralID PMC4820009
Multi-label Discriminative Weakly-Supervised Human Activity Recognition and Localization Mosabbeb, E., Cabral, R., De la Torre, F., Fathy, M. edited by Cremers, D., Reid, Saito, H., Yang, M. H. SPRINGER-VERLAG BERLIN. 2015: 241-258

View details for DOI 10.1007/978-3-319-16814-2_16

View details for Web of Science ID 000362446300016
Medical Image Retrieval Using Multi-graph Learning for MCI Diagnostic Assistance Gao, Y., Adeli-M, E., Kim, M., Giannakopoulos, P., Haller, S., Shen, D. edited by Navab, N., Hornegger, J., Wells, W. M., Frangi, A. F. SPRINGER INTERNATIONAL PUBLISHING AG. 2015: 86-93

Abstract

Alzheimer's disease (AD) is an irreversible neurodegenerative disorder that can lead to progressive memory loss and cognitive impairment. Therefore, diagnosing AD during the risk stage, a.k.a. Mild Cognitive Impairment (MCI), has attracted ever-increasing interest. Besides the automated diagnosis of MCI, it is important to provide physicians with related MCI cases with visually similar imaging data for case-based reasoning or evidence-based medicine in clinical practice. To this end, we propose a multi-graph learning based medical image retrieval technique for MCI diagnostic assistance. Our method comprises two stages: query category prediction and ranking. In the first stage, the query is formulated into a multi-graph structure with a set of selected subjects in the database to learn the relevance between the query subject and the existing subject categories through learning the multi-graph combination weights. This predicts the category that the query belongs to, based on which a set of subjects in the database are selected as candidate retrieval results. In the second stage, the relationship between these candidates and the query is further learned with a new multi-graph, which is used to rank the candidates. The returned subjects can be presented to physicians as reference cases for MCI diagnosis. We evaluated the proposed method on a cohort of 60 consecutive MCI subjects and 350 normal controls with MRI data under three imaging parameters: T1 weighted imaging (T1), Diffusion Tensor Imaging (DTI) and Arterial Spin Labeling (ASL). The proposed method achieves an average of 3.45 relevant samples among the top 5 returned results, significantly outperforming the compared baseline methods.

View details for DOI 10.1007/978-3-319-24571-3_11

View details for Web of Science ID 000366206800011

View details for PubMedID 27054200

View details for PubMedCentralID PMC4820016
Distributed matrix completion for large-scale multi-label classification INTELLIGENT DATA ANALYSIS Mosabbeb, E., Fathy, M. 2014; 18 (6): 1137-1151

View details for DOI 10.3233/IDA-140688

View details for Web of Science ID 000345307800009
Multi-View Human Activity Recognition in Distributed Camera Sensor Networks SENSORS Mosabbeb, E., Raahemifar, K., Fathy, M. 2013; 13 (7): 8750-8770

Abstract

With the increasing demand for smart and networked cameras in intelligent and ambient technology environments, the development of algorithms for such resource-distributed networks is of great interest. Multi-view action recognition addresses many challenges dealing with view-invariance and occlusion, but due to the huge amount of data to be processed and communicated in real-life applications, it is not easy to adapt these methods for use in smart camera networks. In this paper, we propose a distributed activity classification framework, in which we assume that several camera sensors are observing the scene. Each camera processes its own observations, and while communicating with other cameras, they come to an agreement about the activity class. Our method is based on recovering a low-rank matrix over consensus to perform a distributed matrix completion via convex optimization. Then, it is applied to the problem of human activity classification. We test our approach on the IXMAS and MuHAVi datasets to show the performance and the feasibility of the method.

View details for DOI 10.3390/s130708750

View details for Web of Science ID 000328612800038

View details for PubMedID 23881136

View details for PubMedCentralID PMC3758620
Distributed Activity Recognition in Camera Networks via Low-Rank Matrix Recovery Mosabbeb, E., Raahemifar, K., Fathy, M., IEEE IEEE. 2013

View details for Web of Science ID 000352861800042
Multi-view Support Vector Machines for Distributed Activity Recognition Mosabbeb, E., Raahemifar, K., Fathy, M., IEEE IEEE. 2013

View details for Web of Science ID 000352861800041
Model-based human gait tracking, 3D reconstruction and recognition in uncalibrated monocular video IMAGING SCIENCE JOURNAL Adeli-Mosabbeb, E., Fathy, M., Zargari, F. 2012; 60 (1): 9-28

View details for DOI 10.1179/1743131X11Y.0000000002

View details for Web of Science ID 000298664300003
A non-parametric heuristic algorithm for convex and non-convex data clustering based on equipotential surfaces EXPERT SYSTEMS WITH APPLICATIONS Bayat, F., Mosabbeb, E., Jalali, A., Bayat, F. 2010; 37 (4): 3318-3325

View details for DOI 10.1016/j.eswa.2009.10.019

View details for Web of Science ID 000274202900070
Prediction of significant wave height using regressive support vector machines OCEAN ENGINEERING Mahjoobi, J., Mosabbeb, E. 2009; 36 (5): 339-347

View details for DOI 10.1016/j.oceaneng.2009.01.001

View details for Web of Science ID 000265813900004
SHABaN Multi-agent Team To Herd Cows Rahmani, A. T., Saberi, A., Mohammadi, M., Nikanjam, A., Mosabbeb, E., Abdoos, M. edited by Hindriks, K. V., Pokahr, A., Sardina, S. SPRINGER-VERLAG BERLIN. 2009: 248-252

View details for Web of Science ID 000270329400020
A Novel Approach for Branch Buffer Consuming Power Reduction Zamani, B., Adeli, E., Gharedaghi, H., Soryani, M., IEEE edited by Xie, Y., Li, W., Zhou, J. IEEE COMPUTER SOC. 2008: 436-+

View details for DOI 10.1109/ICCEE.2008.48

View details for Web of Science ID 000263155500089
A low-cost strong shadow-based segmentation approach for vehicle tracking in congested traffic scenes Mosabbeb, E., Sadeghi, M., Fathy, M., Bahekmat, M., IEEE IEEE. 2007: 147-+

View details for Web of Science ID 000254143900027
A new approach for vehicle detection in congested traffic scenes based on strong shadow segmentation Mosabbeb, E., Sadeghi, M., Fathy, M. edited by Bebis, G., Boyle, R., Parvin, B., Koracin, D., Paragios, N., Tanveer, S. M., Ju, T., Liu, Z., Coquillart, S., CruzNeira, C., Muller, T., Malzbender, T. SPRINGER-VERLAG BERLIN. 2007: 427-+

View details for Web of Science ID 000251785200042
