Comparison of semantic and single term similarity measures for clustering Turkish documents

Bülent Yücesoy*, Şule Gündüz Öǧüdücü

*Corresponding author for this work

Research output: Conference contribution (peer-reviewed)

5 Citations (Scopus)

Abstract

With the rapid growth of the World Wide Web (WWW), it has become critical to organize the vast number of online documents on the web according to their topic. Even for search engines, grouping similar documents is very important for improving performance when a query is submitted to the system. Clustering is useful for taxonomy design and similarity search of documents in such a domain. Similarity is fundamental to many clustering applications on hypertext. In this paper, we study how measures of similarity are used to cluster a collection of documents on a web site. Most document clustering techniques rely on single-term analysis of text, such as the vector space model. To better group related documents, we propose a new semantic similarity measure. We compare our measure with Wu-Palmer similarity and cosine similarity. Experimental results show that cosine similarity performs better than the semantic similarities. We demonstrate our results on Turkish documents. This is the first study that considers semantic similarities between Turkish documents.
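The two baseline measures named in the abstract can be illustrated with a minimal sketch. It assumes raw term-frequency vectors for cosine similarity and a tiny hand-made concept hierarchy for Wu-Palmer similarity (with the root at depth 1). The taxonomy `PARENT`, the helper names, and the sample tokens are hypothetical and are not taken from the paper. The paper's own proposed semantic measure is not described in this record, so it is not reproduced here.

```python
import math
from collections import Counter


def cosine_similarity(doc_a, doc_b):
    """Cosine similarity between raw term-frequency vectors of two token lists."""
    tf_a, tf_b = Counter(doc_a), Counter(doc_b)
    # Only terms shared by both documents contribute to the dot product.
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy taxonomy (child -> parent), for illustration only.
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "cat": "animal",
    "dog": "animal",
    "car": "vehicle",
}


def path_to_root(concept):
    """Return the list of concepts from `concept` up to the root, inclusive."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Depth of a concept, counting the root as depth 1."""
    return len(path_to_root(concept))


def wu_palmer(c1, c2):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(c1) + depth(c2))."""
    ancestors_2 = set(path_to_root(c2))
    # The first ancestor of c1 that is also an ancestor of c2 is the
    # least common subsumer (deepest shared ancestor) in a tree.
    lcs = next(c for c in path_to_root(c1) if c in ancestors_2)
    return 2 * depth(lcs) / (depth(c1) + depth(c2))


if __name__ == "__main__":
    print(cosine_similarity(["kedi", "evcil", "hayvan"], ["köpek", "evcil", "hayvan"]))
    print(wu_palmer("cat", "dog"))
    print(wu_palmer("cat", "car"))
```

In this sketch, cosine similarity only rewards exact term overlap, while Wu-Palmer similarity scores two different concepts by how deep their shared ancestor lies in the hierarchy.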

Original language: English
Title of host publication: Proceedings - 6th International Conference on Machine Learning and Applications, ICMLA 2007
Pages: 393-398
Number of pages: 6
Publication status: Published - 2007
Event: 6th International Conference on Machine Learning and Applications, ICMLA 2007 - Cincinnati, OH, United States
Duration: 13 Dec 2007 to 15 Dec 2007

Publication series

Name: Proceedings - 6th International Conference on Machine Learning and Applications, ICMLA 2007

Conference

Conference: 6th International Conference on Machine Learning and Applications, ICMLA 2007
Country/Territory: United States
City: Cincinnati, OH
Period: 13/12/07 to 15/12/07
