Information Extraction from Text Intensive and Visually Rich Banking Documents

Berke Oral, Erdem Emekligil, Seçil Arslan, Gülşen Eryiğit*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

39 Citations (Scopus)

Abstract

Document types in which visual and textual information both play an important role in analysis and understanding pose a new and attractive area for information extraction research. Although cheques, invoices, and receipts have been studied in some previous multi-modal studies, banking documents remain an unexplored area because of the naturalness of the text they contain in addition to their visual richness. This article presents the first study that uses visual and textual information for deep-learning-based information extraction on text-intensive and visually rich scanned documents, in this instance unstructured banking documents, or more precisely, money transfer orders. It investigates the impact of different neural word representations (i.e., FastText, ELMo, and BERT) on the IE subtasks (namely, the named entity recognition and relation extraction stages), of the positional features of words on document images, and of auxiliary learning with other tasks. The article proposes a new relation extraction algorithm based on graph factorization to solve the complex relation extraction problem in which the relations within documents are n-ary, nested, document-level, and of a number not known in advance. Our experiments revealed that the use of deep learning algorithms yielded an improvement of around 10 percentage points on the IE subtasks. The inclusion of word positional features yielded around 3 percentage points of improvement in some specific information fields. Similarly, our auxiliary learning experiments yielded around 2 percentage points of improvement on some information fields associated with the specific transaction type detected by our auxiliary task. The integration of the information extraction system into a real banking environment reduced cycle times substantially. Compared to the manual workflow, the document processing pipeline shortened book-to-book money transfers to 10 minutes (from 29 minutes) and electronic fund transfers (EFT) to 17 minutes (from 41 minutes).

Original language: English
Article number: 102361
Journal: Information Processing and Management
Volume: 57
Issue number: 6
Publication status: Published - Nov 2020

Bibliographical note

Publisher Copyright:
© 2020 Elsevier Ltd

Funding

This research was partially supported financially by The Scientific and Technological Research Council of Turkey (TUBITAK) and by Yapı Kredi Technology through a TUBITAK TEYDEB 1505 project (grant no. 5190073). The authors especially thank Yapı Kredi Technology for the collection, analysis, and interpretation of data from the banking domain. We offer our special thanks to all of our reviewers for their very valuable comments, which we believe substantially improved the final version of the article. We also thank our colleagues Deniz Engin, Mehmet Yasin Akpınar, and Mustafa İşbilen for their valuable discussions and support.

Funders (funder number)
TUBITAK (5190073)
Yapı Kredi Technology
Türkiye Bilimsel ve Teknolojik Araştırma Kurumu

    Keywords

    • Banking Documents
    • Deep Learning
    • Information Extraction
    • Named Entity Recognition
    • NLP in Finance
    • Relation Extraction
    • Text Intensive Documents
    • Visually Rich Documents
