Initial explorations on using CRFs for Turkish named entity recognition

Gökhan Akın Şeker*, Gülşen Eryiğit

*Corresponding author for this work

Research output: Contribution to conference › Paper › peer-review

52 Citations (Scopus)

Abstract

This paper reports the highest results in the literature for Turkish named entity recognition (95% in the MUC metric and 92% in the CoNLL metric), specifically for the task of detecting person, location and organization entities in general news texts. We give an in-depth analysis of previously reported results and make comparisons with them whenever possible. We use conditional random fields (CRFs) as our statistical model. The paper presents initial explorations of using the rich morphological structure of the Turkish language as features for CRFs, together with some basic and generative gazetteers.

Original language: English
Pages: 2459-2474
Number of pages: 16
Publication status: Published - 2012
Event: 24th International Conference on Computational Linguistics, COLING 2012 - Mumbai, India
Duration: 8 Dec 2012 to 15 Dec 2012

Conference

Conference: 24th International Conference on Computational Linguistics, COLING 2012
Country/Territory: India
City: Mumbai
Period: 8/12/12 to 15/12/12

Keywords

  • Conditional random fields
  • ENAMEX
  • Named entity recognition
  • Turkish
