Abstract
Code-switching (the alternating use of different languages within a single conversation) is an increasingly common phenomenon in social media and colloquial usage, and it poses a range of challenges for natural language processing. This paper introduces the first study on the detection of Turkish-English code-switching, together with a small test data set collected from social media, in order to pave the way for further studies. The proposed system, which uses character-level n-grams and conditional random fields (CRFs), obtains a 95.6% micro-averaged F1-score on the introduced test data set.
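As a rough illustration of the two technical terms in the abstract, the sketch below extracts character-level n-gram features for a token and computes a micro-averaged F1-score. It is not the paper's implementation: the function names, the boundary markers, the n-gram range, and the `TR`/`EN` labels are all assumptions made for this example, and CRF training itself is left out.

```python
def char_ngrams(token, n_min=1, n_max=3):
    """Return character n-grams of a token (hypothetical helper).

    The token is lowercased and wrapped in '<' and '>' so that
    prefixes and suffixes yield distinct n-grams.
    """
    padded = f"<{token.lower()}>"
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


def token_features(tokens, i):
    """Build a feature dict for token i, the shape CRF toolkits commonly accept.

    The feature set here (n-grams plus neighbouring tokens) is an assumption,
    not the feature set reported in the paper.
    """
    feats = {f"ng={g}": 1.0 for g in char_ngrams(tokens[i])}
    feats["prev"] = tokens[i - 1].lower() if i > 0 else "<BOS>"
    feats["next"] = tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>"
    return feats


def micro_f1(gold, pred, labels):
    """Micro-averaged F1: pool TP, FP and FN over all labels, then compute F1."""
    tp = fp = fn = 0
    for label in labels:
        for g, p in zip(gold, pred):
            if p == label and g == label:
                tp += 1
            elif p == label and g != label:
                fp += 1
            elif g == label and p != label:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical token-level language labels for a four-token message.
gold = ["TR", "TR", "EN", "TR"]
pred = ["TR", "EN", "EN", "TR"]
score = micro_f1(gold, pred, labels=["TR", "EN"])
```

Because the counts are pooled before the score is computed, frequent labels weigh more heavily in a micro-averaged F1 than they would in a macro average.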
Original language | English |
---|---|
Title of host publication | 4th Workshop on Noisy User-Generated Text, W-NUT 2018 - Proceedings of the Workshop |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 110-115 |
Number of pages | 6 |
ISBN (Electronic) | 9781948087797 |
Publication status | Published - 2018 |
Event | 4th Workshop on Noisy User-Generated Text, W-NUT 2018 - Brussels, Belgium Duration: 1 Nov 2018 → … |
Publication series
Name | 4th Workshop on Noisy User-Generated Text, W-NUT 2018 - Proceedings of the Workshop |
---|
Conference
Conference | 4th Workshop on Noisy User-Generated Text, W-NUT 2018 |
---|---|
Country/Region | Belgium |
City | Brussels |
Period | 1 Nov 2018 → … |
Bibliographic note
Publisher Copyright: © 2018 Association for Computational Linguistics.