TY - JOUR
T1 - Quantitative evaluation of saliency-based explainable artificial intelligence (XAI) methods in deep learning-based mammogram analysis
AU - Cerekci, Esma
AU - Alis, Deniz
AU - Denizoglu, Nurper
AU - Camurdan, Ozden
AU - Ege Seker, Mustafa
AU - Ozer, Caner
AU - Hansu, Muhammed Yusuf
AU - Tanyel, Toygar
AU - Oksuz, Ilkay
AU - Karaarslan, Ercan
N1 - Publisher Copyright:
© 2024 Elsevier B.V.
PY - 2024/4
Y1 - 2024/4
N2 - Background: Explainable Artificial Intelligence (XAI) plays a prominent role in explaining the decisions of opaque deep learning (DL) models, especially in medical imaging. Saliency methods are commonly used, yet there is a lack of quantitative evidence regarding their performance. Objectives: To quantitatively evaluate the performance of widely used saliency XAI methods in the task of breast cancer detection on mammograms. Methods: Three radiologists drew ground-truth bounding boxes on a balanced mammogram dataset of women (n = 1,496 cancer-positive and 1,496 cancer-negative scans) from three centers. A modified, pre-trained DL model was employed for breast cancer detection, using mediolateral oblique (MLO) and craniocaudal (CC) images. Saliency XAI methods, including Gradient-weighted Class Activation Mapping (Grad-CAM), Grad-CAM++, and Eigen-CAM, were evaluated. We assessed these methods with the Pointing Game, which determines whether the maximum of a saliency map falls within the ground-truth bounding boxes; the resulting score, ranging from 0 to 1, is the fraction of correctly localized lesions among all cancer patients. Results: The development sample included 2,244 women (75%), with the remaining 748 women (25%) held out as the testing set for unbiased XAI evaluation. The model's recall, precision, accuracy, and F1-score in identifying cancer on the testing set were 69%, 88%, 80%, and 0.77, respectively. The Pointing Game scores for Grad-CAM, Grad-CAM++, and Eigen-CAM were 0.41, 0.30, and 0.35 in women with cancer, and increased marginally to 0.41, 0.31, and 0.36 when only true-positive samples were considered. Conclusions: While saliency-based methods provide some degree of explainability, they fall short of delineating how DL models arrive at their decisions in a considerable number of instances.
KW - Breast Cancer
KW - Deep Learning
KW - Mammogram
KW - XAI
UR - http://www.scopus.com/inward/record.url?scp=85185459618&partnerID=8YFLogxK
U2 - 10.1016/j.ejrad.2024.111356
DO - 10.1016/j.ejrad.2024.111356
M3 - Article
C2 - 38364587
AN - SCOPUS:85185459618
SN - 0720-048X
VL - 173
JO - European Journal of Radiology
JF - European Journal of Radiology
M1 - 111356
ER -