Boosting Dependency Parsing Performance by Incorporating Additional Features for Agglutinative Languages

Mücahit Altıntaş, A. Cüneyd Tantuğ

Research output: Contribution to journal › Conference article › peer-review

Abstract

In recent studies, the use of language models has increased noticeably and has yielded good results. However, choosing a proper representation and accounting for complementary components remain open issues. In this research, we demonstrate the impact of sub-word level, sentence piece based word representation on dependency parsing performance for agglutinative languages. Furthermore, we propose using a sentence representation that captures the meaning of the whole sentence as an additional feature to improve dependency parsing. We evaluate the proposed enhancements on nine agglutinative languages: Estonian, Finnish, Hungarian, Indonesian, Japanese, Kazakh, Korean, Turkish, and Uyghur. We found that sentence piece based token encoding improved parsing performance for the majority of the languages tested. Using the meaning of the entire sentence as a complementary feature improved parsing performance for six of the nine languages.

Original language: English
Pages (from-to): 61-70
Number of pages: 10
Journal: CEUR Workshop Proceedings
Volume: 3315
Publication status: Published - 2022
Event: 2022 International Conference and Workshop on Agglutinative Language Technologies as a Challenge of Natural Language Processing, ALTNLP 2022 - Virtual, Online, Slovenia
Duration: 7 Jun 2022 to 8 Jun 2022

Bibliographical note

Publisher Copyright:
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Keywords

  • agglutinative languages
  • dependency parsing
  • sentence piece
  • sentence representation
