Gamified Crowd-sourcing for Word Sense Disambiguation of Turkish

  • Dilara Torunoğlu Selamet*
  • Ali Şentaş
  • Gülşen Eryiğit

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Word sense disambiguation (WSD) is the process of determining the correct meaning of a word based on its context in a sentence, a task that remains one of the core challenges in natural language processing (NLP) despite the advancements made by LLMs. Although LLMs have improved WSD performance, challenges remain, especially in nuanced contexts, low-resource languages, and domain-specific terminology. These gaps can hinder the accuracy of LLMs in high-stakes applications and multilingual settings. In response to the need for WSD, our study introduces a novel approach to WSD data collection through gamified crowdsourcing, which, to our knowledge, has not been previously applied in this field. A messaging bot is used to engage a wide, diverse audience and generate high-quality WSD data. In a multiplayer game format, native speakers provide examples for different senses of ambiguous words and rate others’ contributions. Together with the suggested enhancements, this platform attracted a diverse range of participants spanning ages, backgrounds, and genders, 20 times as many as a similar crowdsourcing method applied to a different NLP task, and sustained their engagement over an extended period. Unlike conventional academic crowdsourcing pools, this approach focuses on drawing individuals from varied backgrounds beyond academia or AI-focused communities. It not only gathered data but also encouraged participants to evaluate each other’s contributions, creating a rich and reliable dataset. Our findings suggest that gamified crowdsourcing can be a powerful tool for creating WSD corpora. Our approach not only supports future WSD research but also contributes valuable training data for developing more precise and resilient language models.

Original language: English
Article number: 130
Journal: ACM Transactions on Asian and Low-Resource Language Information Processing
Volume: 24
Issue number: 11
Publication status: Published - 17 Nov 2025

Bibliographical note

Publisher Copyright:
© 2025 Copyright held by the owner/author(s).

Keywords

  • Crowdsourcing
  • game with a purpose (GWAP)
  • gamification
  • language resources
  • word sense disambiguation
