TY - JOUR
T1 - Multimodal Detection and Classification of Robot Manipulation Failures
AU - Inceoglu, Arda
AU - Aksoy, Eren Erdal
AU - Sariel, Sanem
N1 - Publisher Copyright:
© 2016 IEEE.
PY - 2024/2/1
Y1 - 2024/2/1
AB - An autonomous service robot should be able to interact with its environment safely and robustly without requiring human assistance. Unstructured environments are challenging for robots since the exact prediction of outcomes is not always possible. Even when robot behaviors are well designed, the unpredictable nature of physical robot-object interaction may lead to failures in object manipulation. In this letter, we focus on detecting and classifying both manipulation and post-manipulation phase failures using the same exteroception setup. We cover a diverse set of failure types for primary tabletop manipulation actions. To detect these failures, we propose FINO-Net (Inceoglu et al., 2021), a deep multimodal sensor-fusion-based classifier network architecture. FINO-Net accurately detects and classifies failures from raw sensory data without any additional information on the task description or the scene state. In this work, we use our extended FAILURE dataset (Inceoglu et al., 2021) with 99 new multimodal manipulation recordings and annotate them with their corresponding failure types. FINO-Net achieves F1 scores of 0.87 for failure detection and 0.80 for failure classification. Experimental results show that FINO-Net is also appropriate for real-time use.
KW - Deep learning methods
KW - data sets for robot learning
KW - failure detection and recovery
KW - sensor fusion
UR - http://www.scopus.com/inward/record.url?scp=85181561810&partnerID=8YFLogxK
U2 - 10.1109/LRA.2023.3346270
DO - 10.1109/LRA.2023.3346270
M3 - Article
AN - SCOPUS:85181561810
SN - 2377-3766
VL - 9
SP - 1396
EP - 1403
JO - IEEE Robotics and Automation Letters
JF - IEEE Robotics and Automation Letters
IS - 2
ER -