Building up lexical sample dataset for Turkish word sense disambiguation

Bahar Ilgen*, Eşref Adalı, A. Cüneyd Tantuğ

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Citations (Scopus)

Abstract

Word Sense Disambiguation (WSD) has become an even more important research area in recent years with the widespread use of Natural Language Processing (NLP) applications. The WSD task has two variants: the "Lexical Sample" and "All Words" approaches. The Lexical Sample approach disambiguates the occurrences of a small sample of previously selected target words, while the All Words approach disambiguates all the words in a piece of text. In this work, a Lexical Sample dataset for Turkish has been prepared. As a first step, highly ambiguous Turkish words were selected. Text samples were then collected for the chosen words, and five taggers annotated the word senses. This paper summarizes the step-by-step process of building a Lexical Sample dataset for Turkish and presents the results of some experiments on it.

Original language: English
Title of host publication: INISTA 2012 - International Symposium on INnovations in Intelligent SysTems and Applications
Publication status: Published - 2012
Event: International Symposium on INnovations in Intelligent SysTems and Applications, INISTA 2012 - Trabzon, Turkey
Duration: 2 Jul 2012 - 4 Jul 2012

Publication series

Name: INISTA 2012 - International Symposium on INnovations in Intelligent SysTems and Applications

Conference

Conference: International Symposium on INnovations in Intelligent SysTems and Applications, INISTA 2012
Country/Territory: Turkey
City: Trabzon
Period: 2/07/12 - 4/07/12

Keywords

  • Feature Selection
  • Lexical Sample
  • Machine Learning
  • Natural Language Processing
  • Word Sense Disambiguation
