
A comparative study to determine the effective window size of Turkish word sense disambiguation systems

  • Bahar Ilgen*
  • Eşref Adali
  • A. Cüneyd Tantuǧ
  • *Corresponding author for this work
  • Istanbul Kultur University
  • Istanbul Technical University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

This paper presents the effect of different windowing schemes on word sense disambiguation accuracy. The Turkish Lexical Sample Dataset was used in the experiments. We took the samples of ambiguous verbs and nouns from the dataset and used bag-of-words properties as context information. The experiments were repeated for different window sizes with several machine learning algorithms. We follow a 2/3 splitting strategy (2/3 for training, 1/3 for testing) and determine the most frequently used words in the training part. After removing stop words, we repeated the experiments using the most frequent 100, 75, 50 and 25 content words of the training data. Our findings show that using the most frequent 75 words as features improves accuracy for Turkish verbs. Similar results were obtained for Turkish nouns when using the most frequent 100 words of the training set. Considering this information, the selected algorithms were tested on varying window sizes {30, 15, 10 and 5}. Our findings show that the Naïve Bayes and Functional Tree methods yielded better accuracy, and that window size 5 gives the best average results for both the noun and the verb groups. The best results of the two groups are 65.8 and 56 % points above the most frequent sense baseline of the verb and noun groups, respectively.
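
The feature pipeline described in the abstract (context window, stop-word removal, most frequent content words as bag-of-words features, most frequent sense baseline) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the tiny stop-word list, whitespace tokenization, binary features, and the reading of "window size" as the number of tokens on each side of the target word are all assumptions, since the abstract does not specify them.

```python
from collections import Counter

# Hypothetical stop-word list; the list used in the paper is not given in the abstract.
STOP_WORDS = {"ve", "bir", "bu", "de", "da"}


def context_window(tokens, target_index, size):
    """Return up to `size` tokens on each side of the target word.

    Assumption: the window is symmetric and excludes the target itself.
    """
    left = tokens[max(0, target_index - size):target_index]
    right = tokens[target_index + 1:target_index + 1 + size]
    return left + right


def top_content_words(train_contexts, n):
    """Select the n most frequent non-stop-words over the training contexts."""
    counts = Counter(
        word
        for context in train_contexts
        for word in context
        if word not in STOP_WORDS
    )
    return [word for word, _ in counts.most_common(n)]


def bag_of_words(context, vocabulary):
    """Binary feature vector: 1 if the vocabulary word occurs in the context."""
    present = set(context)
    return [1 if word in present else 0 for word in vocabulary]


def most_frequent_sense_accuracy(train_senses, test_senses):
    """Accuracy obtained by always predicting the most common training sense."""
    most_frequent = Counter(train_senses).most_common(1)[0][0]
    hits = sum(sense == most_frequent for sense in test_senses)
    return hits / len(test_senses)
```

Under these assumptions, each sample would be reduced to `bag_of_words(context_window(tokens, i, 5), vocabulary)`, with `vocabulary` built by `top_content_words` from the training two-thirds only, and the resulting vectors passed to a classifier such as Naïve Bayes.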

Original language: English
Title of host publication: Information Sciences and Systems 2013 - Proceedings of the 28th International Symposium on Computer and Information Sciences
Publisher: Springer Verlag
Pages: 169-176
Number of pages: 8
ISBN (Print): 9783319016030
Publication status: Published - 2014
Event: 28th International Symposium on Computer and Information Sciences, ISCIS 2013 - Paris, France
Duration: 28 Oct 2013 to 29 Oct 2013

Publication series

Name: Lecture Notes in Electrical Engineering
Volume: 264 LNEE
ISSN (Print): 1876-1100
ISSN (Electronic): 1876-1119

Conference

Conference: 28th International Symposium on Computer and Information Sciences, ISCIS 2013
Country/Territory: France
City: Paris
Period: 28/10/13 to 29/10/13
