High Performance of LSTM on Dengue Shock Syndrome Detection Using DNA Sequence Encoding Methods

Authors

DOI:

https://doi.org/10.3991/ijoe.v21i04.53383

Keywords:

LSTM, dengue shock syndrome, DNA sequence, encoding method

Abstract


Dengue fever (DF) is a significant global health challenge, affecting approximately 390 million people annually and imposing substantial public health and economic burdens. Accurate DNA sequence classification is crucial for identifying genetic factors in diseases such as DF. However, many machine learning (ML) models for disease detection rely on basic encoding methods such as one-hot encoding, which fail to fully exploit the sequential and contextual nature of DNA data. To address this limitation, this study applies long short-term memory (LSTM) networks, a neural architecture suited to sequential data, to classify DNA sequences for detecting dengue and dengue shock syndrome (DSS). The study evaluates three encoding techniques (one-hot encoding, term frequency-inverse document frequency (TF-IDF), and Word2Vec) using datasets of 3,458 DNA sequences sourced from genomics repositories. Preprocessing removed duplicates and sequences containing non-ACGT characters to ensure data integrity, and then applied under-sampling to address class imbalance. Experimental results show that the LSTM model with Word2Vec encoding achieved the highest accuracy (0.98), significantly outperforming the other two encoding techniques. Word2Vec captures contextual and semantic relationships within DNA sequences, which enables superior classification performance. These findings highlight the potential of combining advanced encoding techniques with LSTM networks to improve the accuracy of disease detection models. The study’s approach has promising implications for genomic diagnostics, particularly in resource-limited settings, and lays the foundation for future research applying similar methodologies to other diseases or datasets.
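The preprocessing and encoding steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the k-mer length of 3, the fixed random seed, and the "DF"/"DSS" label strings are all assumptions made for illustration, and the abstract does not state how sequences were tokenized before TF-IDF or Word2Vec encoding.

```python
import random
from collections import defaultdict

VALID_BASES = set("ACGT")


def clean(records):
    """Drop sequences with non-ACGT symbols, then drop exact duplicates.

    `records` is a list of (sequence, label) pairs. When a sequence repeats,
    the first occurrence (and its label) is kept; this is a simplification.
    """
    seen = set()
    kept = []
    for seq, label in records:
        seq = seq.upper()
        if not seq or set(seq) - VALID_BASES:
            continue  # contains an ambiguous or invalid base
        if seq in seen:
            continue  # exact duplicate
        seen.add(seq)
        kept.append((seq, label))
    return kept


def undersample(records, seed=0):
    """Randomly shrink every class to the size of the smallest class."""
    by_label = defaultdict(list)
    for seq, label in records:
        by_label[label].append((seq, label))
    smallest = min(len(group) for group in by_label.values())
    rng = random.Random(seed)  # assumed seed, for reproducibility only
    balanced = []
    for label in sorted(by_label):
        balanced.extend(rng.sample(by_label[label], smallest))
    return balanced


def one_hot(seq):
    """Encode each base as a 4-dimensional indicator vector (A, C, G, T)."""
    index = {"A": 0, "C": 1, "G": 2, "T": 3}
    return [[1 if index[base] == i else 0 for i in range(4)] for base in seq]


def kmers(seq, k=3):
    """Split a sequence into overlapping k-mers.

    These act as the 'words' that a TF-IDF or Word2Vec model could consume;
    k=3 is an assumed value, not one taken from the paper.
    """
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]
```

In a pipeline of this shape, the k-mer lists would be passed to a TF-IDF vectorizer or a Word2Vec trainer, and the resulting vectors fed to the LSTM. One-hot encoding treats each base independently, whereas a k-mer embedding can place k-mers that occur in similar contexts close together, which is the contextual advantage the abstract attributes to Word2Vec.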

Author Biographies

Agustin Iskandar, Brawijaya University, Malang, Indonesia

Faculty of Medicine

Novanto Yudistira, Brawijaya University, Malang, Indonesia

Department of Informatics Engineering

Faculty of Computer Science

Brawijaya University

Bambang Nurdewanto, Universitas Merdeka, Malang, Indonesia

Department of Information System

Published

2025-03-25

How to Cite

Muflikhah, L., Iskandar, A., Yudistira, N., & Nurdewanto, B. (2025). High Performance of LSTM on Dengue Shock Syndrome Detection Using DNA Sequence Encoding Methods. International Journal of Online and Biomedical Engineering (iJOE), 21(04), pp. 79–98. https://doi.org/10.3991/ijoe.v21i04.53383

Section

Papers