Improving Penalized Logistic Regression Model with Missing Values in High-Dimensional Data
DOI:
https://doi.org/10.3991/ijoe.v18i02.25047
Keywords:
High-dimensional data, feature selection, missing data, multiple imputations, penalized regression
Abstract
Analysis without adequate handling of missing values may lead to inconsistent and biased estimates. Although multiple imputation has become a widely used approach to handling missing data, researchers still routinely encounter missing data in their studies. In high-dimensional data, penalized regression is a popular technique for performing feature selection and coefficient estimation simultaneously. However, one of the most important issues with high-dimensional data is that it often contains large amounts of missing data, for which common multiple imputation approaches may not work correctly. Therefore, this study uses imputation-based penalized regression models, as an extension of the penalized methods, to impute missing values and improve performance in high-dimensional data. To evaluate its efficiency, the method was applied to real-life high-dimensional datasets with different numbers of features, sample sizes, and missing-data rates. The method was also compared with other existing imputation-based penalized methods for high-dimensional data. The comparative experimental results indicate that the proposed method outperforms its competitors by achieving higher sensitivity, specificity, and classification accuracy.
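To make the general idea in the abstract concrete, the following is a minimal sketch of "impute, then fit a penalized logistic regression". It is not the authors' method: it assumes single column-mean imputation in place of multiple imputation, a plain L1 (lasso) penalty, and a simple proximal-gradient solver. All function names, the toy data, and the tuning values (`lam`, `step`, `n_iter`) are hypothetical and chosen only for illustration.

```python
import math


def impute_column_means(X):
    """Replace None entries with the mean of the observed values in each column.

    Assumption: single mean imputation, a simplified stand-in for multiple imputation.
    """
    n_cols = len(X[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in X if row[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in X]


def soft_threshold(z, t):
    """Proximal operator of the L1 penalty: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l1_logistic(X, y, lam=0.1, step=0.1, n_iter=500):
    """L1-penalized logistic regression via proximal gradient descent.

    The intercept is not penalized. Coefficients shrunk exactly to zero
    correspond to features that are dropped (feature selection).
    """
    n, p = len(X), len(X[0])
    intercept, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        probs = [sigmoid(intercept + sum(b * v for b, v in zip(beta, row))) for row in X]
        resid = [pr - yi for pr, yi in zip(probs, y)]
        grad0 = sum(resid) / n
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(p)]
        intercept -= step * grad0
        beta = [soft_threshold(b - step * g, step * lam) for b, g in zip(beta, grad)]
    return intercept, beta


# Toy data (hypothetical): feature 0 separates the classes, feature 1 carries
# no signal and has one missing entry.
X_raw = [[1.0, 0.0], [1.0, None], [-1.0, 0.0], [-1.0, 0.0]]
y = [1, 1, 0, 0]

X_filled = impute_column_means(X_raw)
intercept, beta = fit_l1_logistic(X_filled, y, lam=0.05, step=0.5, n_iter=200)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("selected features:", selected)
```

In this sketch the penalty does both jobs the abstract attributes to penalized regression: it estimates the coefficients and, by setting some of them exactly to zero, selects features. A full multiple-imputation workflow would instead fit the model on several imputed copies of the data and combine the results.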
Published
2022-02-16
How to Cite
Alharthi, A. M., Lee, M. H., & Algamal, Z. Y. (2022). Improving Penalized Logistic Regression Model with Missing Values in High-Dimensional Data. International Journal of Online and Biomedical Engineering (iJOE), 18(02), pp. 40–54. https://doi.org/10.3991/ijoe.v18i02.25047
Section
Papers
License
Copyright (c) 2022 Aiedh Mrisi Alharthi, Muhammad Hisyam Lee, Zakariya Yahya Algamal
This work is licensed under a Creative Commons Attribution 4.0 International License.