Early CKD Prediction Using Ensemble and Basic Machine Learning Models
DOI:
https://doi.org/10.3991/ijoe.v22i05.58661
Keywords:
Chronic Kidney Disease, Machine Learning, Prediction, Gradient Boosting, Cross Validation
Abstract
Chronic kidney disease (CKD) is a progressive illness that often remains undiagnosed until advanced stages and represents a significant global health burden. Accurate and timely diagnosis of CKD can significantly improve patient prognosis and reduce treatment costs. This study evaluates several machine learning (ML) models, including support vector machine (SVM), random forest (RF), gradient boosting (GB), Naïve Bayes (NB), AdaBoost, and a multilayer perceptron (MLP) neural network. It also proposes a stacking ensemble model that combines RF and GB for CKD prediction, using a publicly available Kaggle dataset. Data preprocessing handles missing values and normalises features, and model performance is evaluated using an 80:20 train–test split with metrics such as the area under the curve (AUC), classification accuracy (CA), F1-score, precision, recall, and Matthews correlation coefficient (MCC). Experimental results indicate that RF and GB achieve the strongest individual performance, while the proposed stacking ensemble attains the highest CA of 99.4%. These findings highlight the potential of artificial intelligence (AI)-driven predictive models to support proactive CKD diagnosis and enhance clinical decision-making in healthcare systems.
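The evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names `binary_metrics` and `roc_auc` are hypothetical, and the confusion-matrix counts and scores in the usage lines are invented for illustration and are not results from the study. The sketch assumes a binary task (CKD vs. not CKD) with the positive class labelled 1.

```python
from math import sqrt


def binary_metrics(tp, fp, fn, tn):
    """Compute CA, precision, recall, F1 and MCC from confusion-matrix counts.

    Each ratio falls back to 0.0 when its denominator is zero.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    # MCC uses all four cells, so it stays informative on imbalanced data.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative counts and scores only (not taken from the paper).
example = binary_metrics(tp=40, fp=5, fn=10, tn=45)
example_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
print(example)
print(example_auc)
```

The pairwise AUC loop is quadratic in the number of samples, which is acceptable for a small held-out test set; a sort-based rank formulation would be preferable for larger data.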
License
Copyright (c) 2026 Raghavendra Srinivasaiah, Santosh Kumar Jankatti, Niranjana Shravanabelagola Jinachandra, Manjunath Ramanna Lamani, Ravikumar Hodikehosahally Channegowda

This work is licensed under a Creative Commons Attribution 4.0 International License.

