Automatic Unsupervised Extraction of Unigrams of Terms and Named Entities Using the K-Means Clustering Algorithm

  • Aliya Kalykulova
  • Bilal Saoud*
  • Ibraheem Shayea
  • Dauren Sagidullauly

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Effective extraction of domain-specific terms and named entities is a key challenge in text mining. This paper investigates the use of the k-means clustering algorithm for unsupervised extraction of unigrams and named entities from text data. The approach groups terms based on their vector representations, enabling the identification of semantically similar words without labeled data. Experiments conducted on the ACTER (Annotated Corpora for Term Extraction Research) corpus evaluate the method using precision, recall, and F1-score. Results show average scores of 25.79% precision, 40.05% recall and 30.47% F1-score, with optimal performance achieved using 40 to 60 clusters. Future work will explore algorithm optimization and comparisons with alternative extraction techniques.
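The pipeline the abstract describes (cluster term vectors with k-means, then score the extracted candidates against gold annotations) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the words, the 2-D vectors, the gold list, the first-k centroid seeding, and the hand-picked rule that the cluster containing "stent" is treated as the term cluster. The paper reports using 40 to 60 clusters on ACTER; this toy uses two.

```python
import math


def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's k-means. Seeds centroids with the first k vectors (toy choice)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


def precision_recall_f1(extracted, gold):
    """Set-based precision, recall, and F1 of extracted terms against gold terms."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical unigrams with made-up 2-D embeddings (real vectors would come
# from an embedding model and have many more dimensions).
words = ["stent", "the", "catheter", "and", "angioplasty", "of"]
vectors = [(0.9, 0.8), (0.1, 0.0), (1.0, 0.9), (0.0, 0.1), (0.8, 1.0), (0.1, 0.1)]

labels = kmeans(vectors, k=2)

# Assumption: the cluster that contains "stent" is taken as the term cluster.
term_cluster = labels[words.index("stent")]
extracted = {w for w, lab in zip(words, labels) if lab == term_cluster}

# Hypothetical gold annotations; "balloon" is a gold term the toy never extracts.
gold = {"stent", "catheter", "angioplasty", "balloon"}
p, r, f1 = precision_recall_f1(extracted, gold)
print(f"precision={p:.2%} recall={r:.2%} F1={f1:.2%}")
```

In this toy run every extracted word is a gold term, so precision is perfect while recall is lowered by the one gold term that was missed. On a real corpus the choice of which clusters count as term clusters, and the number of clusters, would drive both figures.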

Original language: English
Title of host publication: Selected Papers from the International Conference on Artificial Intelligence - FICAILY2025 - Current Research, Industry Trends, and Innovations
Editors: Ali Othman Albaji
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 375-386
Number of pages: 12
ISBN (Print): 9783032002310
DOIs:
Publication status: Published - 2026
Event: International Conference on AI: Current Research, Industry Trends, and Innovations, FICAILY 2025 - Tripoli, Libya
Duration: 9 Jul 2025 to 10 Jul 2025

Publication series

Name: Studies in Computational Intelligence
Volume: 1229 SCI
ISSN (Print): 1860-949X
ISSN (Electronic): 1860-9503

Conference

Conference: International Conference on AI: Current Research, Industry Trends, and Innovations, FICAILY 2025
Country/Region: Libya
City: Tripoli
Period: 9/07/25 to 10/07/25

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
