TY - GEN
T1 - A Filter-Based Feature Selection and Ranking Approach to Enhance Genetic Programming for High-Dimensional Data Analysis
AU - Khorshidi, Mohammad Sadegh
AU - Yazdani, Danial
AU - Mandziuk, Jacek
AU - Nikoo, Mohammad Reza
AU - Gandomi, Amir H.
N1 - Publisher Copyright:
© 2023 IEEE.
DBLP License: DBLP's bibliographic metadata records provided through http://dblp.org/ are distributed under a Creative Commons CC0 1.0 Universal Public Domain Dedication. Although the bibliographic metadata records are provided consistent with CC0 1.0 Dedication, the content described by the metadata records is not. Content may be subject to copyright, rights of privacy, rights of publicity and other restrictions.
PY - 2023/7/1
Y1 - 2023/7/1
N2 - Genetic programming (GP), as a predictive data analytic tool, has difficulties dealing with high-dimensional problems. Therefore, some GP variants have been proposed for this type of problem, such as multi-stage GP (MSGP). Filter-based feature selection is commonly used in the literature for various machine learning purposes. However, its application for GP is overlooked due to GP's capability to operate as a wrapper-based feature selection while trying to find an optimal expression of the target variable via a functional combination of predictors. The effectiveness of wrapper- and filter-based feature selection approaches in machine learning has been the subject of a long-standing debate in the literature. This study aims to introduce an efficient feature selection approach and couple it with MSGP in order to handle high-dimensional problems. In addition, the stages of the GP are systematically ordered based on the variables' information. The proposed approach is tested against five real high-dimensional datasets. The results show that GP's inherent wrapper feature selection ability can be advanced further by using a filter-based feature selection approach to shrink the search space, which results in improving computational costs, expression complexity and the accuracy of MSGP.
AB - Genetic programming (GP), as a predictive data analytic tool, has difficulties dealing with high-dimensional problems. Therefore, some GP variants have been proposed for this type of problem, such as multi-stage GP (MSGP). Filter-based feature selection is commonly used in the literature for various machine learning purposes. However, its application for GP is overlooked due to GP's capability to operate as a wrapper-based feature selection while trying to find an optimal expression of the target variable via a functional combination of predictors. The effectiveness of wrapper- and filter-based feature selection approaches in machine learning has been the subject of a long-standing debate in the literature. This study aims to introduce an efficient feature selection approach and couple it with MSGP in order to handle high-dimensional problems. In addition, the stages of the GP are systematically ordered based on the variables' information. The proposed approach is tested against five real high-dimensional datasets. The results show that GP's inherent wrapper feature selection ability can be advanced further by using a filter-based feature selection approach to shrink the search space, which results in improving computational costs, expression complexity and the accuracy of MSGP.
KW - Data Analytics
KW - Feature Ranking
KW - Feature Selection
KW - High-Dimensional Data
KW - Information Theory
KW - Multi-Stage Genetic Programming
UR - http://www.scopus.com/inward/record.url?scp=85174508312&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85174508312&partnerID=8YFLogxK
UR - https://www.mendeley.com/catalogue/b9825706-4aab-3e6a-9460-1cac8bad87b7/
U2 - 10.1109/cec53210.2023.10254048
DO - 10.1109/cec53210.2023.10254048
M3 - Conference contribution
AN - SCOPUS:85174508312
SN - 9798350314588
T3 - 2023 IEEE Congress on Evolutionary Computation (CEC)
SP - 1
EP - 9
BT - 2023 IEEE Congress on Evolutionary Computation, CEC 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 IEEE Congress on Evolutionary Computation, CEC 2023
Y2 - 1 July 2023 through 5 July 2023
ER -