Using Attribute-based Feature Selection Approaches and Machine Learning Algorithms for Detecting Fraudulent Website URLs

Mustafa Aydin, Ismail Butun, Kemal Bicakci, Nazife Baykal

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

14 Citations (Scopus)

Abstract

Phishing is a malicious form of online theft, and preventing it is necessary to increase public trust in the Internet. To that end, this study presents the authors' findings on methods for detecting phishing websites. Data mining algorithms are used together with classifier algorithms to achieve a satisfactory result. The classifiers used are Naïve Bayes, SMO, and J48; the feature selection algorithms are Gain Ratio Attribute and ReliefF Attribute. The results are presented comparatively: SMO and J48 gave satisfactory results in detecting phishing websites, whereas Naïve Bayes performed poorly and is the least recommended of the three methods.
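As an illustration of the gain-ratio criterion named in the abstract, the following is a minimal sketch in plain Python. It is not the authors' implementation: the abstract gives no implementation details, and the feature names (`has_ip_address`, `uses_https`) and the four-row toy dataset are hypothetical, chosen only to show how the score ranks an informative attribute above an uninformative one. Gain ratio is computed here in the usual way, as information gain divided by the split information (the entropy of the attribute itself).

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def gain_ratio(feature, labels):
    """Gain ratio of one discrete feature with respect to the class labels."""
    total = len(labels)
    # Group the class labels by the value the feature takes.
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Information gain = H(labels) - expected H(labels | feature).
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(feature)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: two binary URL attributes and a class label.
labels = ["phish", "phish", "legit", "legit"]
features = {
    "has_ip_address": [1, 1, 0, 0],  # separates the two classes
    "uses_https": [1, 0, 1, 0],      # carries no class information
}

# Rank attributes from most to least informative.
ranking = sorted(features,
                 key=lambda name: gain_ratio(features[name], labels),
                 reverse=True)
for name in ranking:
    print(name, round(gain_ratio(features[name], labels), 3))
```

In a feature selection step of this kind, the top-ranked attributes would be kept and passed to the classifiers; how many to keep is a separate choice that the abstract does not specify.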

Original language: English
Title of host publication: 2020 10th Annual Computing and Communication Workshop and Conference, CCWC 2020
Editors: Satyajit Chakrabarti, Rajashree Paul
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 774-779
Number of pages: 6
ISBN (Electronic): 9781728137834
Publication status: Published - Jan 2020
Externally published: Yes
Event: 10th Annual Computing and Communication Workshop and Conference, CCWC 2020 - Las Vegas, United States
Duration: 6 Jan 2020 to 8 Jan 2020

Publication series

Name: 2020 10th Annual Computing and Communication Workshop and Conference, CCWC 2020

Conference

Conference: 10th Annual Computing and Communication Workshop and Conference, CCWC 2020
Country/Territory: United States
City: Las Vegas
Period: 6/01/20 to 8/01/20

Bibliographical note

Publisher Copyright:
© 2020 IEEE.

Funding

ACC: Overall Accuracy
CAR: Cumulative Abnormal Return
CCH: Contrast Context Histogram
DOM: Document Object Model
DM: Data Mining
DT: Decision Tree
FP: False Positive
LR: Logistic Regression
PII: Personal Identification Information
MLP: Multi-Layer Perceptron
NB: Naïve Bayes
NN: Neural Network
SVM: Support Vector Machines
TP: True Positive
TSVM: Transductive SVM
WEKA: Waikato Environment for Knowledge Analysis

Acknowledgements

This research has been partially supported by the Swedish Civil Contingencies Agency (MSB) through the projects RICS, by the EU Horizon 2020 Framework Programme under grant agreement 773717, and by the STINT grant IB2019-8185.

Funders (with funder number, where given)
Horizon 2020 Framework Programme: 773717
Swedish Foundation for International Cooperation in Research and Higher Education: IB2019-8185
Myndigheten för Samhällsskydd och Beredskap

    Keywords

    • Attribute-based feature selection
    • Cyber theft
    • Data analysis
    • Fraudulent website detection
    • Machine learning algorithms
