TFEEC: Turkish Financial Event Extraction Corpus

Kadir Şinas Kaynak*, Ahmet Cüneyd Tantuğ

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Extracting events from news is essential for making accurate financial decisions, and the task has long been studied in many languages. However, to the best of our knowledge, no study has addressed Turkish financial and economic text mining. To fill this gap, we created an ontology and present a well-defined, high-quality, company-specific event corpus of Turkish economic and financial news. Using our dataset, we conducted a preliminary evaluation of an event extraction model to serve as a baseline for further work. Most approaches to event extraction rely on machine learning and require large amounts of labeled data, but building a training corpus with manually annotated events is time-consuming and labor-intensive. To address this problem, we applied active learning and weak supervision to reduce human effort and automatically produce more labeled data without degrading model performance. Experiments on our dataset show that both methods are useful. Furthermore, training the model on the manually annotated dataset combined with the automatically labeled dataset increased performance by 2.91% for event classification and 13.76% for argument classification.

Original language: English
Title of host publication: Distributed Computing and Artificial Intelligence, Special Sessions, 19th International Conference
Editors: José Manuel Machado, Pablo Chamoso, Guillermo Hernández, Grzegorz Bocewicz, Roussanka Loukanova, Esteban Jove, Angel Martin del Rey, Michela Ricca
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 49-58
Number of pages: 10
ISBN (Print): 9783031232091
Publication status: Published - 2023
Event: 19th International Symposium on Distributed Computing and Artificial Intelligence, DCAI 2022 - L'Aquila, Italy
Duration: 13 Jul 2022 to 15 Jul 2022

Publication series

Name: Lecture Notes in Networks and Systems
Volume: 585 LNNS
ISSN (Print): 2367-3370
ISSN (Electronic): 2367-3389

Conference

Conference: 19th International Symposium on Distributed Computing and Artificial Intelligence, DCAI 2022
Country/Territory: Italy
City: L'Aquila
Period: 13/07/22 to 15/07/22

Bibliographical note

Publisher Copyright:
© 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Keywords

  • Active learning
  • Corpus generation
  • Event extraction
  • Semi-supervised
  • Weak supervision
