TY - JOUR
T1 - Modifying linearly non-separable support vector machine binary classifier to account for the centroid mean vector
AU - Al-Shukeili, Mubarak
AU - Wesonga, Ronald
N1 - Funding Information:
There is currently no financial support for this research.
Publisher Copyright:
© 2023 The Korean Statistical Society, and Korean International Statistical Society. All rights reserved.
PY - 2023/5/31
Y1 - 2023/5/31
N2 - This study proposes a modification to the objective function of the support vector machine for the linearly non-separable case of a binary classifier yi ∈ {-1, 1}. The modification takes into account the position of each data item xi relative to its corresponding class centroid. The resulting optimization function involves the centroid mean vector and the spread of data besides the support vectors, which should be minimized by the choice of hyperplane β. Theoretical assumptions have been tested to derive an optimal separable hyperplane that yields the minimal misclassification rate. The proposed method has been evaluated using simulation studies and real-life COVID-19 patient outcome hospitalization data. Results show that the proposed method performs better than the classical linear SVM classifier as the sample size increases and is preferred in the presence of correlations among predictors as well as among extreme values.
AB - This study proposes a modification to the objective function of the support vector machine for the linearly non-separable case of a binary classifier yi ∈ {-1, 1}. The modification takes into account the position of each data item xi relative to its corresponding class centroid. The resulting optimization function involves the centroid mean vector and the spread of data besides the support vectors, which should be minimized by the choice of hyperplane β. Theoretical assumptions have been tested to derive an optimal separable hyperplane that yields the minimal misclassification rate. The proposed method has been evaluated using simulation studies and real-life COVID-19 patient outcome hospitalization data. Results show that the proposed method performs better than the classical linear SVM classifier as the sample size increases and is preferred in the presence of correlations among predictors as well as among extreme values.
KW - centroid mean vector
KW - linearly nonseparable
KW - misclassification rate
KW - quadratic cost function
KW - support vector machine
UR - http://www.scopus.com/inward/record.url?scp=85161541772&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85161541772&partnerID=8YFLogxK
U2 - 10.29220/CSAM.2023.30.3.245
DO - 10.29220/CSAM.2023.30.3.245
M3 - Article
AN - SCOPUS:85161541772
SN - 2287-7843
VL - 30
SP - 245
EP - 258
JO - Communications for Statistical Applications and Methods
JF - Communications for Statistical Applications and Methods
IS - 3
ER -