Comparative Evaluation of PD Detection Using Deep Learning on IMFCCs Extracted from VMD
DOI:
https://doi.org/10.3991/ijoe.v20i15.51327
Keywords:
IMFCC, MFCC, LSTM, CNN, VMD
Abstract
This paper presents a new method for extracting vocal features for the diagnosis of Parkinson’s disease (PD) from voice recordings, based on variational mode decomposition (VMD). The classical extraction of mel-frequency cepstral coefficients (MFCC) is compared with a new approach that produces coefficients called intrinsic mel-frequency cepstral coefficients (IMFCC). Two audio databases were used: the SAKAR database, containing 38 recordings, and the PC-GITA database, comprising 50 recordings. Signal preprocessing consists of frame segmentation, pre-emphasis, and filtering. The voice signal is then decomposed into intrinsic modes using VMD, and the log-energy of specific components of these modes is computed to obtain the IMFCC. Two types of classifiers were evaluated: convolutional neural networks (CNN) and long short-term memory (LSTM) networks. The results show that IMFCC offers a new way of representing vocal signals and captures features distinct from those of classical MFCC. Notably, IMFCC2 attained the highest accuracy, 100%, with the CNN classifier. This approach could improve the performance of systems that identify PD through voice analysis, offering a robust and complementary alternative to existing feature extraction methods.
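The steps named in the abstract (pre-emphasis, frame segmentation, per-mode log-energy, cepstral coefficients) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact IMFCC definition, so taking a DCT across per-mode log-energies is an assumption, as are the function names, the pre-emphasis factor of 0.97, and the frame sizes. The VMD step itself is not implemented; the modes are assumed to come from a separate VMD routine and are replaced here by synthetic sinusoids.

```python
import math

def pre_emphasis(signal, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1]; alpha = 0.97 is an assumed, commonly used value."""
    return signal[:1] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]

def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames; trailing samples that do not fill a frame are dropped."""
    return [signal[s:s + frame_len] for s in range(0, len(signal) - frame_len + 1, hop)]

def log_energy(frame, eps=1e-12):
    """Natural log of the frame energy; eps avoids log(0) on silent frames."""
    return math.log(sum(x * x for x in frame) + eps)

def dct_ii(values):
    """Unnormalized DCT-II, the transform conventionally used to turn log-energies into cepstral coefficients."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n)) for i, v in enumerate(values))
        for k in range(n)
    ]

def cepstral_from_modes(modes, frame_len, hop):
    """Hypothetical IMFCC-like features: for each frame, DCT of the log-energies of all modes.

    `modes` is a list of equal-length signals, assumed to be the output of a VMD routine.
    Returns one coefficient vector per frame.
    """
    framed = [frame_signal(m, frame_len, hop) for m in modes]
    n_frames = len(framed[0])
    return [
        dct_ii([log_energy(framed[k][t]) for k in range(len(modes))])
        for t in range(n_frames)
    ]

# Stand-in "modes": two pre-emphasized sinusoids, NOT real VMD output.
fs = 16000
modes = [
    pre_emphasis([math.sin(2 * math.pi * f * n / fs) for n in range(1600)])
    for f in (200.0, 1500.0)
]
coeffs = cepstral_from_modes(modes, frame_len=400, hop=160)
```

Because pre-emphasis is a linear filter, applying it to each stand-in mode is equivalent to applying it to their sum; in a real pipeline it would be applied to the recording before decomposition. The resulting per-frame vectors would then be stacked into the input of a classifier such as the CNN or LSTM mentioned above.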
License
Copyright (c) 2024 Nouhaila BOUALOULOU, Taoufiq BELHOUSSINE DRISSI, Benayad NSIRI

This work is licensed under a Creative Commons Attribution 4.0 International License.

