Performance analysis of Naïve Bayes classification, Support Vector Machines and Neural Networks for spam categorization

A. Cüneyd Tantuǧ*, Gülşen Eryiǧit

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Chapter; peer-reviewed

9 Citations (Scopus)

Abstract

Spam mail recognition is a new and growing field that brings together natural language processing and machine learning, since it is in essence a two-class classification of natural language texts. An important feature of spam recognition is that it is a cost-sensitive classification task: misclassifying a non-spam mail as spam is generally a more severe error than misclassifying a spam mail as non-spam. To be comparable, the methods applied to this field should all be evaluated on the same corpus and within the same cost-sensitive framework. In this paper, the performance of Support Vector Machines (SVM), Neural Networks (NN) and Naïve Bayes (NB) techniques is compared using a publicly available corpus (LINGSPAM) under different cost scenarios. The training time complexities of the methods are also evaluated. The results show that NN performs significantly better than the other two methods while having acceptable training times. NB gives better results than SVM when the cost is extremely high, while in all other cases SVM outperforms NB.
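
To make the cost-sensitive framing concrete, the following is a minimal sketch of a weighted accuracy measure in which every legitimate (non-spam) message counts `cost` times as much as a spam message. The function name `weighted_accuracy`, the parameter `cost`, the label encoding, and the toy data are all assumptions made for illustration; the abstract does not state which cost-sensitive measure or which cost values the authors used.

```python
def weighted_accuracy(y_true, y_pred, cost=1.0):
    """Accuracy in which every legitimate message counts `cost` times.

    Assumed label encoding: 1 = spam, 0 = legitimate (non-spam).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    legit_total = sum(1 for t in y_true if t == 0)
    spam_total = sum(1 for t in y_true if t == 1)
    legit_correct = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    spam_correct = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)

    denom = cost * legit_total + spam_total
    if denom == 0:
        raise ValueError("no messages to evaluate")
    return (cost * legit_correct + spam_correct) / denom


# Hypothetical toy data: 4 legitimate messages followed by 4 spam messages.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
blocks_one_legit = [0, 0, 0, 1, 1, 1, 1, 1]  # one legitimate message flagged as spam
misses_one_spam = [0, 0, 0, 0, 1, 1, 1, 0]   # one spam message let through

for cost in (1, 9):
    print(cost,
          weighted_accuracy(y_true, blocks_one_legit, cost),
          weighted_accuracy(y_true, misses_one_spam, cost))
```

With equal costs the two filters score the same, because each makes exactly one error. Once legitimate messages are weighted more heavily, the filter that blocks a legitimate message scores clearly lower than the one that lets a spam message through, which is the asymmetry the abstract describes.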

Original language: English
Host publication title: Applied Soft Computing Technologies
Host publication subtitle: The Challenge of Complexity
Editors: Ajith Abraham, Bernard Baets, Mario Koeppen, Bertram Nickolay
Pages: 495-504
Number of pages: 10
Publication status: Published - 2006

Publication series

Name: Advances in Soft Computing
Volume: 34
ISSN (Print): 1615-3871
ISSN (Electronic): 1860-0794
