TY - JOUR
T1 - Error annotation
T2 - a review and faceted taxonomy
AU - Eryiğit, Gülşen
AU - Golynskaia, Anna
AU - Sayar, Elif
AU - Türker, Tolgahan
N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer Nature B.V. 2025.
PY - 2025
Y1 - 2025
N2 - Classification of errors in language use plays a crucial role in language learning and teaching, error analysis studies, and language technology development. However, no standard, inclusive error classification method has been agreed upon across disciplines, which leads to duplicated effort and hinders a common understanding in the field. This article brings a new and holistic perspective to error classifications and annotation schemes across different fields (i.e., learner corpora research, error analysis, grammar error correction, and machine translation), all of which serve the same purpose but employ different methods and approaches. The article first reviews previous error annotation efforts from different fields for nineteen languages with different characteristics, including morphologically rich languages that pose diverse challenges for language technologies. It then introduces a faceted taxonomy for errors in language use, comprising multidimensional and hierarchical facets that can be used to create both fine- and coarse-grained error annotation schemes depending on specific requirements. We believe that the proposed taxonomy, based on the principles of universality and diversity, will address the emerging need for a common framework in error annotation.
AB - Classification of errors in language use plays a crucial role in language learning and teaching, error analysis studies, and language technology development. However, no standard, inclusive error classification method has been agreed upon across disciplines, which leads to duplicated effort and hinders a common understanding in the field. This article brings a new and holistic perspective to error classifications and annotation schemes across different fields (i.e., learner corpora research, error analysis, grammar error correction, and machine translation), all of which serve the same purpose but employ different methods and approaches. The article first reviews previous error annotation efforts from different fields for nineteen languages with different characteristics, including morphologically rich languages that pose diverse challenges for language technologies. It then introduces a faceted taxonomy for errors in language use, comprising multidimensional and hierarchical facets that can be used to create both fine- and coarse-grained error annotation schemes depending on specific requirements. We believe that the proposed taxonomy, based on the principles of universality and diversity, will address the emerging need for a common framework in error annotation.
KW - Error classification
KW - Learner corpus
KW - Taxonomy
UR - http://www.scopus.com/inward/record.url?scp=85214416918&partnerID=8YFLogxK
U2 - 10.1007/s10579-024-09794-0
DO - 10.1007/s10579-024-09794-0
M3 - Article
AN - SCOPUS:85214416918
SN - 1574-020X
JO - Language Resources and Evaluation
JF - Language Resources and Evaluation
ER -