Dissecting Retinal Disease: A Multi-Modal Deep Learning Approach with Explainable AI for Disease Classification across Various Classes
DOI:
DOI: https://doi.org/10.3991/ijoe.v21i02.51409

Keywords: Deep Learning, Retinal Diseases Classification, Explainable AI (XAI), ResNet-18, VGG-16, Grad-CAM, DenseNet

Abstract
This study investigates the efficacy of various deep learning (DL) models in detecting retinal diseases, with a specific focus on cataract detection. Using a pre-processed dataset of fundus images labeled as normal or cataract, we evaluate the performance of ResNet, VGG-16, and VGG-19 models in classifying fundus images by accuracy, sensitivity, and specificity. The primary objective of this work is to explain the predictions made by these DL models so that they can be verified against the ground truth. The explanations are produced with an explainable artificial intelligence (XAI) technique, gradient-weighted class activation mapping (Grad-CAM), which visualizes and interprets the decision-making process of these models. Through comprehensive exploratory data analysis (EDA), model training, and evaluation, VGG-19 emerged as the superior model, achieving the highest accuracy, precision, and recall. Grad-CAM heat maps provide insight into where the models attend within an image, highlighting the impact of cataracts on retinal structure. The study underscores the potential of DL in retinal disease detection and the pivotal role of XAI in enhancing model interpretability. Future directions include exploring more advanced DL architectures and extending the application of XAI techniques to improve the accuracy and transparency of detection systems.
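For readers unfamiliar with the Grad-CAM technique named in the abstract, its mechanics can be sketched as follows. This is a minimal illustration, not the authors' code: the tiny CNN and the 32x32 input are hypothetical stand-ins for the ResNet/VGG models and fundus images used in the study. Grad-CAM weights the last convolutional feature maps by the spatially averaged gradient of the target-class score, then applies a ReLU to keep only positively contributing regions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyCNN(nn.Module):
    """Hypothetical stand-in for the paper's ResNet/VGG classifiers."""
    def __init__(self, n_classes=2):  # e.g. normal vs. cataract
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Linear(16, n_classes)

    def forward(self, x):
        fmaps = self.features(x)                         # (B, 16, H, W)
        pooled = F.adaptive_avg_pool2d(fmaps, 1).flatten(1)
        return self.head(pooled), fmaps

def grad_cam(model, x, target_class):
    """Heat map = ReLU(sum_k alpha_k * A^k), where alpha_k is the
    mean gradient of the target-class logit w.r.t. feature map A^k."""
    model.eval()
    logits, fmaps = model(x)
    fmaps.retain_grad()                                  # keep grad on non-leaf tensor
    logits[0, target_class].backward()
    weights = fmaps.grad.mean(dim=(2, 3), keepdim=True)  # alpha_k, per channel
    cam = F.relu((weights * fmaps).sum(dim=1)).squeeze(0)
    return cam / (cam.max() + 1e-8)                      # normalize to [0, 1]

model = TinyCNN()
x = torch.randn(1, 3, 32, 32)                            # stand-in for a fundus image
cam = grad_cam(model, x, target_class=1)
print(cam.shape)                                         # heat map at feature-map resolution
```

In practice the heat map is upsampled to the input resolution and overlaid on the fundus image, which is how the heat maps discussed in the abstract localize the image regions driving each prediction.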
License
Copyright (c) 2024 Jeya Mala D, Ayush Gupta, Vishal Kumar Yadav, Mayank Arora

This work is licensed under a Creative Commons Attribution 4.0 International License.