Enhanced Alzheimer’s Diagnosis Using Multimodal Data: A Comparative Study of CNN Architectures

DOI:

https://doi.org/10.3991/ijoe.v21i14.57471

Keywords:

Alzheimer's, Classification, CNN Architecture, Multimodal

Abstract


Alzheimer’s disease (AD) is a progressive neurodegenerative disorder that leads to severe cognitive decline, making early and accurate diagnosis essential for effective patient management. Traditional diagnostic approaches often rely on unimodal imaging data such as magnetic resonance imaging (MRI) or positron emission tomography (PET) scans, yet these methods are insufficient to capture the heterogeneous and multimodal characteristics of AD biomarkers. Similarly, conventional convolutional neural network (CNN) models demonstrate strong image recognition capabilities but remain limited in integrating diverse sources of clinical evidence. To address this gap, this study proposes an enhanced multimodal diagnostic framework that combines MRI, PET, and clinical metadata to provide a more holistic representation of disease progression. The framework is evaluated through a comparative analysis of state-of-the-art CNN architectures, including densely connected convolutional networks (DenseNet), residual networks (ResNet), InceptionNet, VGG, MobileNet, and EfficientNet. Experiments were conducted on the ADNI dataset, which includes 372 subjects and a total of 63,777 image slices. The results clearly demonstrate that multimodal models outperform their unimodal counterparts in classification performance. EfficientNetB1 emerged as the best-performing model, achieving 98.6% across accuracy, precision, recall, and F1-score, highlighting the significant contribution of clinical metadata integration. However, this superior accuracy comes with higher computational requirements, as EfficientNetB1 required 22.16 seconds for prediction with a memory load of 31.6 GB. In contrast, lightweight models such as MobileNet offered faster inference but sacrificed accuracy, reaching only about 76%. These findings emphasize the critical trade-off between computational efficiency and diagnostic performance in real-world clinical scenarios.
Overall, the study provides strong evidence that multimodal CNN-based architectures offer robust and accurate tools for AD detection, while also underscoring the need to balance model complexity with resource constraints in practical healthcare implementation.
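The fusion strategy the abstract describes, combining CNN-derived imaging features with tabular clinical metadata before classification, can be illustrated with a minimal sketch. This is not the authors' code: the feature dimensions, metadata fields (age, sex, MMSE), class count, and random weights below are all illustrative assumptions standing in for trained backbones and a trained classification head.

```python
# Hypothetical sketch of multimodal fusion by concatenation (illustrative only):
# per-modality CNN backbones yield feature vectors, which are joined with
# clinical metadata and passed through a shared linear classification head.
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Stand-ins for pooled backbone outputs (e.g., an EfficientNetB1-style embedding).
mri_features = rng.standard_normal(1280)   # MRI branch embedding (assumed size)
pet_features = rng.standard_normal(1280)   # PET branch embedding (assumed size)
metadata = np.array([72.0, 1.0, 27.0])     # e.g., age, sex, MMSE (hypothetical fields)

# Fusion: concatenate all modalities into one joint representation.
fused = np.concatenate([mri_features, pet_features, metadata])

# Untrained linear head over the fused vector; 3 classes as an assumption
# (e.g., cognitively normal / mild cognitive impairment / AD).
n_classes = 3
W = rng.standard_normal((n_classes, fused.size)) * 0.01
b = np.zeros(n_classes)
probs = softmax(W @ fused + b)

print(probs.shape)  # one probability per diagnostic class
```

In practice the head would be trained end-to-end (or the backbones fine-tuned), and the metadata would typically be normalized or embedded before concatenation; the sketch only shows where the modalities meet.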

Author Biographies

Lailil Muflikhah, Brawijaya University, Malang, Indonesia

Department of Informatics Engineering

Faculty of Computer Science

Galih Restu Baihaqi, Brawijaya University, Malang, Indonesia

Department of Informatics Engineering

Faculty of Computer Science

Shafatyra Shalsadilla, Brawijaya University, Malang, Indonesia

Department of Informatics Engineering

Faculty of Computer Science

Achmad Ridok, Brawijaya University, Malang, Indonesia

Department of Informatics Engineering

Faculty of Computer Science

Sri Soenarti, Brawijaya University, Malang, Indonesia

Department of Internal Medicine

Faculty of Medicine

Published

2025-12-12

How to Cite

Muflikhah, L., Restu Baihaqi, G., Shalsadilla, S., Ridok, A., & Soenarti, S. (2025). Enhanced Alzheimer’s Diagnosis Using Multimodal Data: A Comparative Study of CNN Architectures. International Journal of Online and Biomedical Engineering (iJOE), 21(14), pp. 182–198. https://doi.org/10.3991/ijoe.v21i14.57471

Issue

Section

Papers