An Integrated Ensemble Learning Framework for Predicting Liver Disease




Liver disease, ensemble machine learning, preprocessing, feature selection


Liver disease has become a pressing global health issue, with a sharp increase in reported cases worldwide. It is difficult to detect because it often produces few noticeable symptoms, so by the time it is diagnosed it may already have reached an advanced stage, and many patients die without knowing they had it. Early detection is therefore crucial: it allows treatment to begin sooner and can save lives. This study assesses the efficacy of five ensemble machine learning (ML) methods, namely Random Forest (RF), XGBoost, Extra Trees, bagging, and stacking, in predicting liver disease on the Indian Liver Patient Dataset (ILPD). To reduce overfitting and dataset bias, several statistical preprocessing techniques were applied to handle missing data, outliers, and class imbalance. The results underline the importance of Recursive Feature Elimination (RFE) for feature selection, which restricts the model to the most relevant features and may have improved both its accuracy and its efficiency. The highest testing accuracy, 93%, was achieved by the proposed model, which combines the improved preprocessing approach with a stacking ensemble classifier and RFE feature selection. These results suggest that ensemble ML can help medical professionals build models better equipped to handle the complexity and variability of medical data, supporting more accurate diagnoses, more effective treatment plans, and better patient outcomes.
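As a rough illustration of the kind of pipeline the abstract describes (imputation, RFE feature selection, then a stacking ensemble), the following is a minimal sketch using scikit-learn. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: the synthetic data standing in for ILPD, the choice of six retained features, the logistic-regression meta-learner, the hyperparameters, and the function names `build_model` and `run_demo`. XGBoost is left out of the base learners to avoid an extra dependency, outlier handling is omitted, and `class_weight="balanced"` is used only as a simple stand-in for whatever balancing technique the study applied.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    BaggingClassifier,
    ExtraTreesClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.feature_selection import RFE
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def build_model(n_features_to_select=6, seed=0):
    """Imputation -> scaling -> RFE -> stacking ensemble (all settings assumed)."""
    # Base learners: three of the ensemble methods named in the abstract.
    base_learners = [
        ("rf", RandomForestClassifier(
            n_estimators=100, class_weight="balanced", random_state=seed)),
        ("et", ExtraTreesClassifier(
            n_estimators=100, class_weight="balanced", random_state=seed)),
        ("bag", BaggingClassifier(n_estimators=50, random_state=seed)),
    ]
    # Meta-learner is an assumption; the abstract does not name one.
    stack = StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )
    # RFE repeatedly drops the least important feature, ranked here by
    # random-forest feature importances, until the requested number remains.
    selector = RFE(
        RandomForestClassifier(n_estimators=100, random_state=seed),
        n_features_to_select=n_features_to_select,
    )
    return Pipeline([
        ("impute", SimpleImputer(strategy="median")),  # fill missing values
        ("scale", StandardScaler()),
        ("rfe", selector),
        ("stack", stack),
    ])


def run_demo(seed=0):
    """Fit the pipeline on synthetic, imbalanced data and return (model, accuracy)."""
    X, y = make_classification(
        n_samples=600, n_features=10, n_informative=5,
        weights=[0.3, 0.7], random_state=seed,
    )
    # Blank out about 1% of the values to mimic missing measurements.
    rng = np.random.default_rng(seed)
    X[rng.random(X.shape) < 0.01] = np.nan

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed,
    )
    model = build_model(seed=seed)
    model.fit(X_train, y_train)
    return model, model.score(X_test, y_test)


if __name__ == "__main__":
    fitted, accuracy = run_demo()
    kept = fitted.named_steps["rfe"].support_
    print("features kept by RFE:", np.flatnonzero(kept).tolist())
    print("test accuracy on synthetic data:", round(accuracy, 3))
```

Because the data here are synthetic, the accuracy this sketch prints says nothing about the 93% reported in the study. Placing imputation, scaling, and RFE inside the `Pipeline` means they are fitted on the training split only, which avoids leaking test information into feature selection.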




How to Cite

Ardchir, S., Ouassit, Y., Ounacer, S., EL Ghoumari, M. Y., & Azzouazi, M. (2023). An Integrated Ensemble Learning Framework for Predicting Liver Disease. International Journal of Online and Biomedical Engineering (iJOE), 19(13), pp. 138–152.