Journal of Multidisciplinary Applied Natural Science

ISSN (electronic): 2774-3047


DOI: https://doi.org/10.47352/jmans.2774-3047.360

Variable Selection in Kernel Ridge Regression based on Sparrow Search Algorithm with Application QSAR Modeling

Zainab Modhfer Ali Al-Shabaki (https://orcid.org/0009-0005-8353-0481)

Zakariya Yahya Algamal (https://orcid.org/0000-0002-0229-7958)

Published: March 01, 2026

[1]
Z. M. A. Al-Shabaki and Z. Y. Algamal, “Variable Selection in Kernel Ridge Regression based on Sparrow Search Algorithm with Application QSAR Modeling”, J. Multidiscip. Appl. Nat. Sci., Mar. 2026.

Abstract

Variable selection plays a critical role in improving the predictive accuracy, interpretability, and computational efficiency of kernel ridge regression (KRR) models, especially on high-dimensional datasets such as those used in quantitative structure-activity relationship (QSAR) modeling. This study investigates improved variants of the binary sparrow search algorithm (BSSA) that use different transfer functions for variable selection in KRR. The variants were evaluated on seven benchmark biopharmaceutical datasets with thousands of molecular descriptors, comparing prediction accuracy, variable subset compactness, and computational cost against a baseline KRR without variable selection. All BSSA variants significantly outperformed the baseline KRR in terms of mean squared error (MSE) and coefficient of determination. The quadratic-BSSA (Q-BSSA) variant consistently achieved the best predictive performance, reducing MSE by up to 30% and raising the coefficient of determination above 0.95 on several datasets while selecting the fewest variables, which reflects effective and parsimonious variable selection. The BSSA variants also substantially reduced model training time compared with the baseline KRR, with Q-BSSA showing the lowest runtime across datasets. Statistical validation with the Wilcoxon signed-rank test confirmed that the improvements of all BSSA variants over the baseline KRR were statistically significant. The findings highlight the efficacy of binary metaheuristic algorithms for variable selection in kernel-based models and underscore their potential in computational chemistry and related domains where high dimensionality and nonlinear interactions complicate predictive modeling.
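The core idea in the abstract, scoring a binary descriptor mask by the error of a KRR model fitted on the selected columns, can be illustrated with a minimal sketch. The abstract does not give the transfer functions, kernel, hyperparameters, or fitness function, so everything below is an assumption made for illustration: a Gaussian (RBF) kernel, a standard S-shaped sigmoid transfer function (not the paper's quadratic variant), the values of `lam` and `gamma`, and a hypothetical `penalty` weight on subset size. The sparrow search position update itself is not reproduced; a random continuous position stands in for it.

```python
import math
import random

def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def krr_mse(X_train, y_train, X_test, y_test, mask, lam=1e-2, gamma=0.5):
    """Fit KRR on the columns where mask == 1 and return the test MSE."""
    cols = [j for j, m in enumerate(mask) if m]
    if not cols:
        return float("inf")  # an empty subset cannot be evaluated
    tr = [[row[j] for j in cols] for row in X_train]
    te = [[row[j] for j in cols] for row in X_test]
    n = len(tr)
    # Regularized kernel matrix K + lam * I
    K = [[rbf_kernel(tr[i], tr[j], gamma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, y_train)
    preds = [sum(a * rbf_kernel(t, x, gamma) for a, t in zip(alpha, tr))
             for x in te]
    return sum((p - y) ** 2 for p, y in zip(preds, y_test)) / len(y_test)

def s_shaped(v):
    """Sigmoid transfer function: maps a continuous position to a probability."""
    return 1.0 / (1.0 + math.exp(-v))

def binarize(position, rng):
    """Turn a continuous position vector into a 0/1 descriptor mask."""
    return [1 if rng.random() < s_shaped(v) else 0 for v in position]

def fitness(mask, data, penalty=0.01):
    """Hypothetical fitness: test MSE plus a small penalty on subset size."""
    return krr_mse(*data, mask) + penalty * sum(mask) / len(mask)

# Synthetic stand-in for a descriptor matrix: only descriptor 0 drives y.
rng = random.Random(0)
X = [[rng.uniform(-1, 1) for _ in range(6)] for _ in range(40)]
y = [math.sin(3 * row[0]) for row in X]
data = (X[:30], y[:30], X[30:], y[30:])

candidate = binarize([rng.gauss(0, 1) for _ in range(6)], rng)
print("candidate mask:", candidate, "fitness:", fitness(candidate, data))
print("all descriptors:", fitness([1] * 6, data))
```

In a full metaheuristic, each search agent would hold a continuous position, the transfer function would convert it to a mask at every iteration, and the fitness value would guide the next position update. Swapping `s_shaped` for another transfer function changes only how positions map to selection probabilities.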
