Multiword Expressions in Statistical Dependency Parsing

Gülşen Eryiğit, Tugay İlbay, Ozan Arkan Can

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

38 Citations (Scopus)

Abstract

In this paper, we investigated the impact of extracting different types of multiword expressions (MWEs) on the accuracy of a data-driven dependency parser for a morphologically rich language (Turkish). We showed that unifying MWEs of one type, namely compound verb and noun formations, in the training stage has a negative effect on parsing accuracy because it increases lexical sparsity. Training on a variant of the treebank that excludes this MWE type gave a statistically significant improvement. Our extrinsic evaluation of an ideal MWE recognizer (one that extracts only named entities, duplications, numbers, dates and a predefined list of compound prepositions) showed that preprocessing the test data in this way would improve the labeled parsing accuracy by 1.5%.

Original language: English
Title of host publication: 2nd Workshop on Statistical Parsing of Morphologically Rich Languages, SPMRL 2011 - collocate with the International Workshop on Parsing Technologies, IWPT 2011 - Proceedings
Editors: Djame Seddah, Reut Tsarfaty, Jennifer Foster
Publisher: Association for Computational Linguistics (ACL)
Pages: 45-55
Number of pages: 11
ISBN (Electronic): 9781932432732
Publication status: Published - 2011
Event: 2nd Workshop on Statistical Parsing of Morphologically Rich Languages, SPMRL 2011 - Dublin, Ireland
Duration: 6 Oct 2011 → …

Publication series

Name: 2nd Workshop on Statistical Parsing of Morphologically Rich Languages, SPMRL 2011 - collocate with the International Workshop on Parsing Technologies, IWPT 2011 - Proceedings

Conference

Conference: 2nd Workshop on Statistical Parsing of Morphologically Rich Languages, SPMRL 2011
Country/Territory: Ireland
City: Dublin
Period: 6/10/11 → …

Bibliographical note

Publisher Copyright:
© 2011 Association for Computational Linguistics
