Causal-Invariant Multi-Criteria Feature Selection with Graph-Guided Filtering and Ensemble Classification for Robust Prostate Cancer Diagnosis across TCGA and GEO Platforms
DOI:
https://doi.org/10.3991/ijoe.v22i03.58593
Keywords:
Prostate cancer classification, Gene expression analysis, Invariant feature selection, Ensemble learning, Interpretability, Cross-platform reproducibility
Abstract
Prostate cancer remains a major cause of mortality, and reliable molecular diagnostics are needed to complement PSA testing and Gleason grading. Although machine learning (ML) and deep learning (DL) models show promise, they often lack cross-platform reproducibility and interpretability. We present an invariant gene-selection and ensemble-learning framework that combines statistical dependency measures, graph-based filtering, and three classifiers: elastic-net logistic regression, LightGBM, and a shallow attention-guided neural network. Trained on TCGA-PRAD and externally validated on GSE21034, the model achieved an AUC of 0.92, outperforming classical and deep learning baselines while offering improved calibration, robustness, and reduced cross-platform divergence. SHAP values and pathway enrichment highlighted key drivers (AR, KLK3, and MYC) and confirmed enrichment in androgen signaling, PI3K–AKT, MAPK, and DNA repair pathways. Overall, the integration of invariant feature selection with interpretable ensembling provides both strong predictive accuracy and biologically meaningful insight, supporting reproducible molecular diagnostics for prostate cancer.
License
Copyright (c) 2026 SARA HADDOU BOUAZZA

This work is licensed under a Creative Commons Attribution 4.0 International License.

