Abstract
Naive Bayes (NB) classification is one of the most widely used algorithms in data mining and machine learning because of its high efficiency and its structural simplicity, which rests on the conditional independence of attributes. In this paper, we present a dependence metric that quantifies the dependence among attributes and between attributes and the class attribute, and we propose feature-feature significance (FFS) and feature-class significance (FCS) to distinguish highly predictive attributes from less predictive ones in NB classification. We show how to derive feature weights from FFS and FCS and propose a novel dependent feature weighted (DFW) NB classifier. To increase performance further, we recommend clustering the random sample of interest, because the dependence among features is non-homogeneous, and then using feature weighting to relax the conditional independence assumption. Accordingly, we propose a cluster-based DFW (CDFW) NB, obtained by weighting the DFW filters of random sub-samples by their accuracy and then merging them to improve performance. The experimental results show that NB with the DFW filter performs well compared with conventional NB and other feature weighting techniques.
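To make the idea of feature weighting in NB concrete, the following is a minimal sketch, not the paper's DFW method. The abstract does not define the dependence metric, FFS, or FCS, so the sketch assumes empirical mutual information as a stand-in for both, and assumes a weight of the form sigmoid(relevance minus mean redundancy). The function names (`mutual_information`, `feature_weights`, `fit_weighted_nb`, `predict`) and the toy data are hypothetical. Each weight is applied as an exponent on the corresponding conditional probability, that is, as a multiplier on its log-likelihood term.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def feature_weights(X, y):
    """Assumed weighting: class relevance minus mean redundancy, squashed to (0, 1)."""
    d = len(X[0])
    cols = [[row[i] for row in X] for i in range(d)]
    weights = []
    for i in range(d):
        relevance = mutual_information(cols[i], y)  # stand-in for feature-class significance
        others = [mutual_information(cols[i], cols[j]) for j in range(d) if j != i]
        redundancy = sum(others) / len(others) if others else 0.0  # stand-in for feature-feature significance
        weights.append(1.0 / (1.0 + math.exp(-(relevance - redundancy))))
    return weights

def fit_weighted_nb(X, y, alpha=1.0):
    """Collect the counts needed by a categorical NB with Laplace smoothing."""
    d = len(X[0])
    return {
        "classes": sorted(set(y)),
        "n_values": [len({row[i] for row in X}) for i in range(d)],
        "n": len(X),
        "class_counts": Counter(y),
        "cond": Counter((i, row[i], c) for row, c in zip(X, y) for i in range(d)),
        "weights": feature_weights(X, y),
        "alpha": alpha,
    }

def predict(model, x):
    """Return argmax_c of log P(c) + sum_i w_i * log P(x_i | c)."""
    best, best_score = None, -math.inf
    for c in model["classes"]:
        score = math.log(model["class_counts"][c] / model["n"])
        for i, v in enumerate(x):
            num = model["cond"][(i, v, c)] + model["alpha"]
            den = model["class_counts"][c] + model["alpha"] * model["n_values"][i]
            score += model["weights"][i] * math.log(num / den)
        if score > best_score:
            best, best_score = c, score
    return best

# Toy data: feature 0 determines the class, feature 1 is mostly noise.
X = [("a", "p"), ("a", "q"), ("b", "p"), ("b", "q"), ("a", "p"), ("b", "q")]
y = ["yes", "yes", "no", "no", "yes", "no"]
model = fit_weighted_nb(X, y)
print(model["weights"])
print(predict(model, ("a", "q")), predict(model, ("b", "p")))
```

In this toy example the informative feature receives the larger weight, so it dominates the decision. A cluster-based variant in the spirit of CDFW would fit one such model per cluster or sub-sample and combine them with accuracy-based weights; that step is not sketched here because the abstract does not specify how the filters are merged.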
Original language | English |
---|---|
Title of host publication | 2022 30th Signal Processing and Communications Applications Conference, SIU 2022 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
ISBN (Electronic) | 9781665450928 |
Publication status | Published - 2022 |
Externally published | Yes |
Event | 30th Signal Processing and Communications Applications Conference, SIU 2022 - Safranbolu, Turkey; Duration: 15 May 2022 → 18 May 2022 |
Publication series
Name | 2022 30th Signal Processing and Communications Applications Conference, SIU 2022 |
---|
Conference
Conference | 30th Signal Processing and Communications Applications Conference, SIU 2022 |
---|---|
Country/Territory | Turkey |
City | Safranbolu |
Period | 15/05/22 → 18/05/22 |
Bibliographical note
Publisher Copyright: © 2022 IEEE.
Keywords
- cluster-based dependence
- feature weighting
- mutual dependence
- naive Bayes classification