TY - JOUR
T1 - Prediction of Heart Disease Using a Hybrid XGBoost-GA Algorithm with Principal Component Analysis
T2 - A Real Case Study
AU - Ozcan, Tuncay
AU - Ozmen, Ebru Pekel
N1 - Publisher Copyright: © 2023 World Scientific Publishing Company.
PY - 2023/3/1
Y1 - 2023/3/1
N2 - Cardiovascular diseases are among the most common causes of death worldwide, so early diagnosis of heart disease is critically important. The aim of this study is to predict heart disease using feature selection, classification and optimization algorithms. First, principal component analysis (PCA) is used to build the feature selection model and determine the effective attributes. Then, an Extreme Gradient Boosting (XGBoost) classification model is proposed to predict heart disease. Finally, a genetic algorithm (GA) is applied to optimize the parameters of XGBoost and improve classification accuracy. The developed hybrid PCA-XGBoost-GA approach is compared with XGBoost, PCA-XGBoost, XGBoost-GA, an artificial neural network (ANN) and a support vector machine (SVM). The effectiveness of these approaches is illustrated in a case study using actual data from a university hospital in Turkey. The numerical results show that the proposed PCA-XGBoost-GA model outperforms the other classification models in terms of accuracy, recall, precision and F-measure. Moreover, feature selection and parameter optimization improve the classification performance of the XGBoost model.
KW - Heart disease diagnosis
KW - classification
KW - extreme gradient boosting
KW - genetic algorithm
KW - principal component analysis
UR - http://www.scopus.com/inward/record.url?scp=85152953691&partnerID=8YFLogxK
U2 - 10.1142/S0218213023400092
DO - 10.1142/S0218213023400092
M3 - Article
AN - SCOPUS:85152953691
SN - 0218-2130
VL - 32
JO - International Journal on Artificial Intelligence Tools
JF - International Journal on Artificial Intelligence Tools
IS - 2
M1 - 2340009
ER -