Forecasting of Post-Graduate Students’ Late Dropout Based on the Optimal Probability Threshold Adjustment Technique for Imbalanced Data
DOI:
https://doi.org/10.3991/ijet.v18i04.34825Keywords:
optimal likelihood threshold,, imbalanced data, student dropout prediction, resample techniques, distance learning coursesAbstract
The purpose of this research article was to contrast the benefits of the optimal probability threshold adjustment technique with other imbalanced data processing techniques, in its application to the prediction of post-graduate students’ late dropout from distance learning courses in two universities in the Ibero-American space. In this context, the optimization of the Logistic Regression, Random Forest, and Neural Network classifiers, together with different techniques, attributes, and algorithms (Hyperparameters, SMOTE, SMOTE_SVM, and ADASYN) resulted in a set of metrics for decision-making, prioritizing the reduction of false negatives. The best model was the Neural Network model in combination with SMOTE_SVM, obtaining a recall index of 0.75 and an f1-Score of 0.60. Likewise, the robustness of the Random Forest classifier for imbalanced data was demonstrated by achieving, with an optimal threshold of 0.427, very similar metrics to those obtained by the consensus of the three best models found. This demonstrates that, for Random Forest, the optimal prediction probability threshold is an excellent alternative to resampling techniques with different optimal thresholds. Finally, it is hoped that this research paper will contribute to boost the application of this simple but powerful technique, which is highly underrated with respect to data resampling techniques for imbalanced data.
Downloads
Published
How to Cite
Issue
Section
License
Copyright (c) 2023 Msc. Carmen Lili Rodríguez Velasco, Dr. Eduardo García Villena, Msc. Julien Brito Ballester, Dr. Frigdiano Álvaro Durántez Prados, Dr. Eduardo Silva Alvarado, Dr. Jorge Crespo Álvarez
This work is licensed under a Creative Commons Attribution 4.0 International License.