Decoding Breast Cancer Mutational Signatures: A Hybrid ElasticNet–XGBoost Approach Using Gene Expression Data

NU author(s): Dr Anurag Sharma

Licence

This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).


Abstract

© 2026, International Journal of Prognostics and Health Management. All rights reserved.

Somatic mutations in TP53, PIK3CA, and MUC16 are informative for breast cancer progression and prognosis, but direct sequencing-based mutation profiling is not always practicable. Gene expression data can carry indirect transcriptomic patterns associated with the underlying mutational state. This paper proposes an expression-based machine learning model that predicts mutation status in the METABRIC breast cancer cohort. Rather than detecting genetic changes directly, the method estimates statistical relationships between transcriptomic phenotypes and binary somatic mutation states. A multi-stage gene selection pipeline combining variance filtering, mutual information ranking, and correlation pruning was used to reduce the initial set of 19,000 genes. The selected features were used to train a hybrid predictive architecture that combines ElasticNet logistic regression with XGBoost, balancing linear regularization against nonlinear interaction modeling. Under five-fold stratified cross-validation, the hybrid model achieved mean ROC-AUC values of 0.94 (TP53), 0.92 (PIK3CA), and 0.90 (MUC16), with stable calibration and balanced error rates. Coefficient analysis and SHAP-based explanations were used to examine model interpretability and to describe how expression patterns relate to mutation status. The proposed framework is a hypothesis-generating, complementary approach to transcriptomic analysis, and it requires external validation to establish its wider generalizability.
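The three-stage gene selection named in the abstract (variance filtering, mutual information ranking, correlation pruning) can be illustrated with a small sketch. This is not the authors' implementation. The thresholds (`min_var`, `top_k`, `max_abs_corr`), the median binarization used to estimate mutual information, the function names, and the toy expression values are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from statistics import median, pvariance

# Toy expression matrix: gene name -> expression values across 8 samples.
# All values are made up for illustration; they are not METABRIC data.
EXPR = {
    "GENE_A": [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1],      # informative
    "GENE_B": [2.0, 2.4, 1.8, 2.2, 6.0, 6.4, 5.8, 6.2],      # copy of A, rescaled
    "GENE_C": [5.0, 5.01, 5.0, 4.99, 5.0, 5.01, 5.0, 4.99],  # nearly constant
    "GENE_D": [1.0, 3.0, 1.0, 3.0, 1.0, 3.0, 1.0, 3.0],      # variable, uninformative
    "GENE_E": [1.0, 1.5, 0.5, 4.0, 4.5, 5.0, 3.5, 1.2],      # partly informative
}
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]  # hypothetical binary mutation status


def variance_filter(expr, min_var):
    """Stage 1: keep genes whose population variance is at least min_var."""
    return [g for g, values in expr.items() if pvariance(values) >= min_var]


def mutual_information(values, labels):
    """MI in nats between median-binarized expression and binary labels."""
    cut = median(values)
    bits = [1 if v > cut else 0 for v in values]
    n = len(labels)
    mi = 0.0
    for a in (0, 1):
        p_a = sum(1 for x in bits if x == a) / n
        for b in (0, 1):
            p_b = sum(1 for y in labels if y == b) / n
            p_ab = sum(1 for x, y in zip(bits, labels) if x == a and y == b) / n
            if p_ab > 0:
                mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi


def pearson(u, v):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su > 0 and sv > 0 else 0.0


def correlation_prune(expr, ranked, max_abs_corr):
    """Stage 3: walk genes in rank order, dropping any gene that is too
    correlated with a gene already kept."""
    kept = []
    for g in ranked:
        if all(abs(pearson(expr[g], expr[k])) < max_abs_corr for k in kept):
            kept.append(g)
    return kept


def select_genes(expr, labels, min_var=0.05, top_k=3, max_abs_corr=0.9):
    """Run the three stages in sequence and return the surviving gene names."""
    survivors = variance_filter(expr, min_var)
    scores = {g: mutual_information(expr[g], labels) for g in survivors}
    # Stage 2: rank by MI (descending); ties are broken by gene name.
    ranked = sorted(survivors, key=lambda g: (-scores[g], g))[:top_k]
    return correlation_prune(expr, ranked, max_abs_corr)


print(select_genes(EXPR, LABELS))  # ['GENE_A', 'GENE_E']
```

In this toy run, GENE_C is removed by the variance filter, GENE_D falls outside the top-ranked genes, and GENE_B is pruned because it is almost perfectly correlated with GENE_A. On real expression data, library routines such as scikit-learn's `VarianceThreshold` and `mutual_info_classif` would be natural substitutes for the hand-written stages.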


Publication metadata

Author(s): Porwal O, Upreti K, Kshirsagar PR, Panwar S, Sharma A, Radhakrishnan GV, Jain R

Publication type: Article

Publication status: Published

Journal: International Journal of Prognostics and Health Management

Year: 2026

Volume: 17

Issue: 1

Online publication date: 18/04/2026

Acceptance date: 02/04/2018

Date deposited: 08/06/2026

ISSN (electronic): 2153-2648

Publisher: Prognostics and Health Management Society

URL: https://doi.org/10.36001/ijphm.2026.v17i1.4714

DOI: 10.36001/ijphm.2026.v17i1.4714

