Abstract
Word sense disambiguation (WSD) is the process of determining the correct meaning of a word from its context in a sentence. It remains one of the core challenges in natural language processing (NLP) despite the advances made by large language models (LLMs). Although LLMs have improved WSD performance, challenges remain, especially in nuanced contexts, low-resource languages, and domain-specific terminology. These gaps can reduce the accuracy of LLMs in high-stakes applications and multilingual settings. In response to the need for WSD data, our study introduces a novel approach to WSD data collection through gamified crowdsourcing, which, to our knowledge, has not previously been applied in this field. We use a messaging bot to engage a wide, diverse audience in generating high-quality WSD data. In a multiplayer game format, native speakers provide examples for different senses of ambiguous words and rate others’ contributions. Together with the suggested enhancements, this platform attracted a diverse range of participants (spanning ages, backgrounds, and genders), 20 times more than a similar crowdsourcing method applied to a different NLP task, and sustained their engagement over an extended period. Unlike conventional academic crowdsourcing pools, this approach focuses on drawing individuals from varied backgrounds beyond academia or AI-focused communities. It not only gathered data but also encouraged participants to evaluate each other’s contributions, creating a rich and reliable dataset. Our findings suggest that gamified crowdsourcing can be a powerful tool for creating WSD corpora. Our approach supports future WSD research and contributes valuable training data for developing more precise and resilient language models.
| Original language | English |
|---|---|
| Article number | 130 |
| Journal | ACM Transactions on Asian and Low-Resource Language Information Processing |
| Volume | 24 |
| Issue number | 11 |
| Publication status | Published - 17 Nov 2025 |
Bibliographical note
Publisher Copyright: © 2025 Copyright held by the owner/author(s).
Keywords
- Crowdsourcing
- game with a purpose (GWAP)
- gamification
- language resources
- word sense disambiguation
Fingerprint
Dive into the research topics of 'Gamified Crowd-sourcing for Word Sense Disambiguation of Turkish'. Together they form a unique fingerprint.