Disambiguating main POS tags for Turkish

Razieh Ehsani, Muzaffer Ege Alper, Gülşen Eryiğit, Eşref Adali

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Citations (Scopus)

Abstract

This paper presents the results of main part-of-speech tagging of Turkish sentences using Conditional Random Fields (CRFs). Although CRFs are applied to many different languages for part-of-speech (POS) tagging, Turkish poses interesting challenges to be modeled with them. The challenges include issues related to the statistical model of the problem as well as issues related to computational complexity and scaling. In this paper, we propose a novel model for main-POS tagging in Turkish. Furthermore, we propose some approaches to reduce the computational complexity and allow better scaling characteristics or improve the performance without increased complexity. These approaches are discussed with respect to their advantages and disadvantages. We show that the best approach is competitive with the current state of the art in accuracy and also in training and test durations. The good results obtained imply a good first step towards full morphological disambiguation.

Original language: English
Title of host publication: Proceedings of the 24th Conference on Computational Linguistics and Speech Processing, ROCLING 2012
Pages: 202-213
Number of pages: 12
Publication status: Published - 2012
Event: 24th Conference on Computational Linguistics and Speech Processing, ROCLING 2012 - Chung-Li, Taiwan, Province of China
Duration: 21 Sept 2012 to 22 Sept 2012

Publication series

Name: Proceedings of the 24th Conference on Computational Linguistics and Speech Processing, ROCLING 2012

Conference

Conference: 24th Conference on Computational Linguistics and Speech Processing, ROCLING 2012
Country/Territory: Taiwan, Province of China
City: Chung-Li
Period: 21/09/12 to 22/09/12
