Abstract
Information extraction (IE), the task of turning text into structured form, is also employed in the finance domain to extract information that is important for financial concepts such as markets, stocks, and indices. As in many other Natural Language Processing (NLP) applications, annotated corpora containing entities that represent the characteristics of the domain are essential resources for training and evaluating IE models. Unfortunately, creating these resources is difficult, so the scarcity of annotated language resources is one of the most prominent problems for lesser-studied languages such as Turkish. In this paper, we present an ontology of financial concepts and an effort to produce a high-quality corpus of 500 Turkish news documents annotated with these concepts. We use the dataset to train a baseline entity recognition model, which achieves an F-score of 64.5% on the dataset.
Translated title of the contribution | Annotation of Financial Entities Using A Comprehensive Scheme in Turkish |
---|---|
Original language | Turkish |
Title of host publication | 2022 30th Signal Processing and Communications Applications Conference, SIU 2022 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
ISBN (Electronic) | 9781665450928 |
Publication status | Published - 2022 |
Event | 30th Signal Processing and Communications Applications Conference, SIU 2022 - Safranbolu, Turkey. Duration: 15 May 2022 → 18 May 2022 |
Publication series
Name | 2022 30th Signal Processing and Communications Applications Conference, SIU 2022 |
---|---|
Conference
Conference | 30th Signal Processing and Communications Applications Conference, SIU 2022 |
---|---|
Country/Territory | Turkey |
City | Safranbolu |
Period | 15/05/22 → 18/05/22 |
Bibliographical note
Publisher Copyright: © 2022 IEEE.