TY - JOUR
T1 - Diagnostic Accuracy of a Machine Learning-Derived Appendicitis Score in Children
T2 - A Multicenter Validation Study
AU - Aydın, Emrah
AU - Sarnıç, Taha Eren
AU - Türkmen, İnan Utku
AU - Khanmammadova, Narmina
AU - Ateş, Ufuk
AU - Öztan, Mustafa Onur
AU - Sekmenli, Tamer
AU - Aras, Necip Fazıl
AU - Öztaş, Tülin
AU - Yalçınkaya, Ali
AU - Özbek, Murat
AU - Gökçe, Deniz
AU - Yalçın Cömert, Hatice Sonay
AU - Uzunlu, Osman
AU - Kandırıcı, Aliye
AU - Ertürk, Nazile
AU - Süzen, Alev
AU - Akova, Fatih
AU - Paşaoğlu, Mehmet
AU - Eroğlu, Egemen
AU - Göllü Bahadır, Gülnur
AU - Çakmak, Ahmet Murat
AU - Bilici, Salim
AU - Karabulut, Ramazan
AU - İmamoğlu, Mustafa
AU - Sarıhan, Haluk
AU - Karakuş, Süleyman Cüneyt
N1 - Publisher Copyright: © 2025 by the authors.
PY - 2025/7
Y1 - 2025/7
AB - Background: Accurate diagnosis of acute appendicitis in children remains challenging due to variable presentations and limitations of existing clinical scoring systems. While machine learning (ML) offers a promising approach to enhance diagnostic precision, most prior studies have been limited by small sample sizes, single-center data, or a lack of external validation. Methods: This prospective, multicenter study included 8586 pediatric patients to develop a machine learning-based diagnostic model using routinely available clinical and hematological parameters. A separate, prospectively collected external validation cohort of 3000 patients was used to assess model performance. The Random Forest algorithm was selected based on its superior performance during model comparison. Diagnostic accuracy, sensitivity, specificity, Area Under Curve (AUC), and calibration metrics were evaluated and compared with traditional scoring systems such as Pediatric Appendicitis Score (PAS), Alvarado, and Appendicitis Inflammatory Response Score (AIRS). Results: The ML model outperformed traditional clinical scores in both development and validation cohorts. In the external validation set, the Random Forest model achieved an AUC of 0.996, accuracy of 0.992, sensitivity of 0.998, and specificity of 0.993. Feature-importance analysis identified white blood cell count, red blood cell count, and mean platelet volume as key predictors. Conclusions: This large, prospectively validated study demonstrates that a machine learning-based scoring system using commonly accessible data can significantly improve the diagnosis of pediatric appendicitis. The model offers high accuracy and clinical interpretability and has the potential to reduce diagnostic delays and unnecessary imaging.
KW - appendicitis
KW - clinical decision support
KW - diagnosis
KW - machine learning
KW - pediatrics
KW - random forest
UR - https://www.scopus.com/pages/publications/105011511522
U2 - 10.3390/children12070937
DO - 10.3390/children12070937
M3 - Article
AN - SCOPUS:105011511522
SN - 2227-9067
VL - 12
JO - Children
JF - Children
IS - 7
M1 - 937
ER -