Comparison of semi-automatic and deep learning-based automatic methods for liver segmentation in living liver transplant donors

A. Emre Kavur, Naciye Sinem Gezer, Mustafa Barış, Yusuf Şahin, Savaş Özkan, Bora Baydar, Ulaş Yüksel, Çağlar Kılıkçıer, Şahin Olut, Gözde Bozdağı Akar, Gözde Ünal, Oğuz Dicle, M. Alper Selver*

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

63 Citations (Scopus)

Abstract

PURPOSE We aimed to compare the accuracy and repeatability of emerging machine learning-based (i.e., deep learning) automatic segmentation algorithms with those of well-established interactive semi-automatic methods for determining liver volume in living liver transplant donors at computed tomography (CT) imaging. METHODS A total of 12 methods (6 semi-automatic, 6 full-automatic) were evaluated. The semi-automatic segmentation algorithms were based on both traditional iterative models including watershed, fast marching, region growing, active contours and modern techniques including robust statistics segmenter and super-pixels. These methods entailed some sort of interaction mechanism such as placing initialization seeds on images or determining a parameter range. The automatic methods were based on deep learning and included three framework templates (DeepMedic, NiftyNet and U-Net), the first two of which were applied with default parameter sets and the last two involved adapted novel model designs. For 20 living donors (8 training and 12 test data-sets), a group of imaging scientists and radiologists created ground truths by performing manual segmentations on contrast-enhanced CT images. Each segmentation was evaluated using five metrics (i.e., volume overlap and relative volume errors, average/root-mean-square/maximum symmetrical surface distances). The results were mapped to a scoring system and a final grade was calculated by taking their average. Accuracy and repeatability were evaluated using slice-by-slice comparisons and volumetric analysis. Diversity and complementarity were observed through heatmaps. Majority voting (MV) and simultaneous truth and performance level estimation (STAPLE) algorithms were utilized to obtain the fusion of the individual results. RESULTS The top four methods were automatic deep learning models, with scores of 79.63, 79.46, 77.15, and 74.50. Intra-user score was determined as 95.14. Overall, automatic deep learning segmentation outperformed interactive techniques on all metrics. The mean volume of liver of ground truth was 1409.93±271.28 mL, while it was calculated as 1342.21±231.24 mL using automatic and 1201.26±258.13 mL using interactive methods, showing higher accuracy and less variation with automatic methods. The qualitative analysis of segmentation results showed significant diversity and complementarity, enabling the idea of using ensembles to obtain superior results. The fusion score of automatic methods reached 83.87 with MV and 86.20 with STAPLE, which were only slightly less than fusion of all methods (MV, 86.70) and (STAPLE, 88.74). CONCLUSION Use of the new deep learning-based automatic segmentation algorithms substantially increases the accuracy and repeatability for segmentation and volumetric measurements of liver. Fusion of automatic methods based on ensemble approaches exhibits best results with almost no additional time cost due to potential parallel execution of multiple models.

Original languageEnglish
Pages (from-to)11-21
Number of pages11
JournalDiagnostic and Interventional Radiology
Volume26
Issue number1
DOIs
Publication statusPublished - 2020

Bibliographical note

Publisher Copyright:
© Turkish Society of Radiology 2020.

Fingerprint

Dive into the research topics of 'Comparison of semi-automatic and deep learning-based automatic methods for liver segmentation in living liver transplant donors'. Together they form a unique fingerprint.

Cite this