Detecting Code-Switching between Turkish-English Language Pair

Zeynep Yirmibeşoğlu, Gülşen Eryiğit

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

19 Citations (Scopus)

Abstract

Code-switching (alternating between different languages within a single conversation) is an increasingly common phenomenon in social media and colloquial usage, and it poses distinct challenges for natural language processing. This paper introduces the first study on the detection of Turkish-English code-switching, along with a small test data set collected from social media to pave the way for further studies. The proposed system, which uses character-level n-grams and conditional random fields (CRFs), obtains a 95.6% micro-averaged F1-score on the introduced test data set.
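
The abstract names two ingredients, character-level n-gram features and a micro-averaged F1-score, without giving details. The following is only a minimal sketch of what such pieces might look like, not the authors' implementation: the function names, the boundary markers `<` and `>`, the n-gram range, and the `TR`/`EN` labels are all assumptions made for illustration, and the CRF itself is not shown.

```python
def char_ngrams(token, n_min=1, n_max=3):
    """Return the character n-grams of a token (hypothetical helper).

    The token is lowercased and wrapped in '<' and '>' so that prefixes
    and suffixes yield distinct n-grams; this padding is an assumption.
    """
    padded = f"<{token.lower()}>"
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


def token_features(tokens, i):
    """Build a feature dict for token i, in the style a CRF toolkit might accept."""
    feats = {f"ng={g}": 1.0 for g in char_ngrams(tokens[i])}
    feats["is_first"] = float(i == 0)
    if i > 0:
        # Context from the previous token (an illustrative choice).
        feats["prev=" + tokens[i - 1].lower()] = 1.0
    return feats


def micro_f1(gold, pred):
    """Micro-averaged F1: pool TP/FP/FN over all labels, then compute F1."""
    tp = fp = fn = 0
    for label in set(gold) | set(pred):
        for g, p in zip(gold, pred):
            tp += int(g == label and p == label)
            fp += int(g != label and p == label)
            fn += int(g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    sentence = ["bugün", "meeting", "var"]  # made-up example tokens
    print(token_features(sentence, 1))
    print(micro_f1(["TR", "EN", "TR", "EN"], ["TR", "EN", "EN", "EN"]))
```

When every token receives exactly one label, pooling the counts this way makes micro-averaged F1 equal to token-level accuracy; whether the paper computes its score over all labels or a subset is not stated in the abstract.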

Original language: English
Title of host publication: 4th Workshop on Noisy User-Generated Text, W-NUT 2018 - Proceedings of the Workshop
Publisher: Association for Computational Linguistics (ACL)
Pages: 110-115
Number of pages: 6
ISBN (Electronic): 9781948087797
Publication status: Published - 2018
Event: 4th Workshop on Noisy User-Generated Text, W-NUT 2018 - Brussels, Belgium
Duration: 1 Nov 2018 → …

Publication series

Name: 4th Workshop on Noisy User-Generated Text, W-NUT 2018 - Proceedings of the Workshop

Conference

Conference: 4th Workshop on Noisy User-Generated Text, W-NUT 2018
Country/Territory: Belgium
City: Brussels
Period: 1/11/18 → …

Bibliographical note

Publisher Copyright:
© 2018 Association for Computational Linguistics.
