Abstract
As with all natural language processing tasks, the lack of open-source training data for developing dialogue agents is a major obstacle to research in the field. Languages that are not widely studied, such as Turkish, suffer especially from this problem. This article compares the Wizard-of-Oz and self-play data collection techniques for building Turkish goal-oriented dialogue systems. Three data sets have been prepared with these techniques and made available to researchers. As the first publicly available human-to-human Turkish dialogue data sets, the resources created for the restaurant domain, although open to further development, are valuable for future research on Turkish dialogue systems. The two methods are compared quantitatively on the produced data sets in terms of dialogue act classification and slot identification scores. Since collecting data in every domain with methods like Wizard-of-Oz is costly, an open-source, flexible, and easy-to-use framework implementing self-play is also provided; it can be used to create machine-to-machine dialogue outlines and speed up data collection for low-resource languages like Turkish. In addition, the annotation screen templates designed for crowdsourcing are provided for future studies.
Original language | English |
---|---|
Title of host publication | 2021 International Conference on INnovations in Intelligent SysTems and Applications, INISTA 2021 - Proceedings |
Editors | Zeynep Hilal Kilimci, Tulay Yildirim, Vincenzo Piuri, Ireneusz Czarnowski, David Camacho, Yannis Manolopoulos, Serdar Solak |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
ISBN (Electronic) | 9781665436038 |
Publication status | Published - 25 Aug 2021 |
Event | 2021 International Conference on INnovations in Intelligent SysTems and Applications, INISTA 2021 - Kocaeli, Turkey; Duration: 25 Aug 2021 → 27 Aug 2021 |
Bibliographical note
Publisher Copyright: © 2021 IEEE.
Keywords
- Goal-oriented dialogue agent
- Self-play
- Wizard-of-Oz