Enhancing Turkish Coreference Resolution: Insights from deep learning, dropped pronouns, and multilingual transfer learning

Tuğba Pamay Arslan, Gülşen Eryiğit*

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

Abstract

Coreference resolution (CR), which is the identification of in-text mentions that refer to the same entity, is a crucial step in natural language understanding. While CR in English has been studied for quite a long time, studies for pro-dropped and morphologically rich languages is an active research area which has yet to reach sufficient maturity. Turkish, a morphologically highly-rich language, poses interesting challenges for natural language processing tasks, including CR, due to its agglutinative nature and consequent pronoun-dropping phenomenon. This article explores the use of different neural CR architectures (i.e., mention-pair, mention-ranking, and end-to-end) on Turkish, a morphologically highly-rich language, by formulating multiple research questions around the impacts of dropped pronouns, data quality, and interlingual transfer. The preparations made to explore these research questions and the findings obtained as a result of our explorations revealed the first Turkish CR dataset that includes dropped pronoun annotations (of size 4K entities/22K mentions), new state-of-the-art results on Turkish CR, the first neural end-to-end Turkish CR results (70.4% F-score), the first multilingual end-to-end CR results including Turkish (yielding 1.0 percentage points improvement on Turkish) and the demonstration of the positive impact of dropped pronouns on CR of pro-dropped and morphologically rich languages, for the first time in the literature. Our research has brought Turkish end-to-end CR performances (72.0% F-score) to similar levels with other languages, surpassing the baseline scores by 32.1 percentage points.

Original languageEnglish
Article number101681
JournalComputer Speech and Language
Volume89
DOIs
Publication statusPublished - Jan 2025

Bibliographical note

Publisher Copyright:
© 2024 Elsevier Ltd

Keywords

  • Coreference resolution
  • Deep learning
  • Natural language processing
  • Turkish coreference resolution

Fingerprint

Dive into the research topics of 'Enhancing Turkish Coreference Resolution: Insights from deep learning, dropped pronouns, and multilingual transfer learning'. Together they form a unique fingerprint.

Cite this