ePrints (Open Access)

Enhancing Sampling Performance in XGBoost by Ensemble Feature Engineering

NU author(s): Dr Varun Ojha


Licence

This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).


Abstract

Feature engineering is crucial to model performance, yet combining multiple feature transformations effectively to maximize their benefits remains a key challenge. In this study, we propose an approach that integrates several feature engineering techniques into the boosting steps of the XGBoost algorithm and adapts gradient-based one-sided sampling, forming an enhanced classifier named Feat-XGBoost. Feat-XGBoost aims to improve data representation and separation during model learning by iteratively applying feature transformations. We evaluated this approach on 61 diverse datasets, comparing its performance with 12 baseline classifiers, including standard XGBoost. Feat-XGBoost improved accuracy on 36 datasets, with notable accuracy increases of 0.31 on the Balloon dataset and 13.5% on the hill-valley dataset. Across all 61 datasets, the method shows an average accuracy increase of 0.9080%. These findings indicate that integrating multiple feature engineering strategies within the boosting framework can yield significant gains in model accuracy and robustness. We also propose a simple ensemble, the Mix-XGBoost classifier, which selects the final classifier based on validation results from both Feat-XGBoost and the baseline model. The results indicate that Mix-XGBoost enhances performance by leveraging the strengths of both classifiers. The source code will be publicly accessible after acceptance at https://github.com/lingping-fuzzy.
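The abstract describes Mix-XGBoost as choosing its final classifier from validation results. The idea of validation-based selection between two candidates can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy decision rules standing in for trained models, and the tie-breaking rule are all assumptions made for illustration.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical type: a trained classifier is any callable mapping features to a label.
Classifier = Callable[[Sequence[float]], int]


def accuracy(model: Classifier, X: Sequence[Sequence[float]], y: Sequence[int]) -> float:
    """Fraction of validation samples the model labels correctly."""
    if len(X) == 0:
        raise ValueError("validation set is empty")
    correct = sum(1 for features, label in zip(X, y) if model(features) == label)
    return correct / len(X)


def select_by_validation(
    candidates: Dict[str, Classifier],
    X_val: Sequence[Sequence[float]],
    y_val: Sequence[int],
) -> Tuple[str, float]:
    """Return the name and validation accuracy of the best candidate.

    Ties go to the candidate listed first (an assumption, not from the paper).
    """
    best_name, best_acc = "", -1.0
    for name, model in candidates.items():
        acc = accuracy(model, X_val, y_val)
        if acc > best_acc:  # strict comparison keeps the earlier candidate on ties
            best_name, best_acc = name, acc
    return best_name, best_acc


# Toy stand-ins for the two trained classifiers (hypothetical decision rules).
def baseline(x: Sequence[float]) -> int:
    return int(x[0] > 0.5)


def feat_variant(x: Sequence[float]) -> int:
    return int(x[0] + x[1] > 1.0)


X_val = [[0.2, 0.9], [0.6, 0.1], [0.7, 0.8], [0.1, 0.2]]
y_val = [1, 0, 1, 0]

chosen, acc = select_by_validation(
    {"baseline": baseline, "feat": feat_variant}, X_val, y_val
)
print(chosen, acc)
```

In practice the two candidates would be fitted models evaluated on a held-out split, and the selected model would then be used for test-time prediction.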


Publication metadata

Author(s): Kong L, Suganthan PN, Snášel V, Ojha V, Pan JS

Publication type: Article

Publication status: Published

Journal: Pattern Recognition

Year: 2026

Volume: 176

Online publication date: 28/01/2026

Acceptance date: 26/01/2026

Date deposited: 26/01/2026

ISSN (print): 0031-3203

ISSN (electronic): 1873-5142

Publisher: Elsevier BV

URL: https://doi.org/10.1016/j.patcog.2026.113169

DOI: 10.1016/j.patcog.2026.113169

ePrints DOI: 10.57711/fq87-7f04



Funding

Funder reference / Funder name
EP/Y028813/1
European Union under the REFRESH—project number CZ.10.03.01/00/22 /0000048
EPSRC
European Union’s HORIZON EUROPE
Ministry of Education, Youth, and Sports of the Czech Republic
Qatar National Library
