TY - JOUR
T1 - Web Based Anomaly Detection Using Zero-Shot Learning With CNN
AU - Demirel, Dilek Yilmazer
AU - Sandikkaya, Mehmet Tahir
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2023
Y1 - 2023
AB - In recent years, attacks targeting websites have become a persistent threat, making web application security a significant issue. The biggest obstacle to securing web applications is unbalanced data: malicious requests are far fewer than benign ones. This paper proposes a novel Zero-Shot Learning method employing a Convolutional Neural Network (ZSL-CNN) to address the unbalanced data problem and high false positive rates. The approach is trained only on benign data and predicts unseen malicious requests. Five web request datasets are used to validate the method on a diverse set of samples. The first is a novel dataset containing Internet banking web request logs provided by Yapi Kredi Teknoloji. The others are (i) an open-source WAF dataset, (ii) the CSIC 2010 HTTP dataset, (iii) the HTTP Params 2015 dataset, and (iv) a hybrid dataset. URIs are extracted from these datasets and fed to the ZSL-CNN model after code embedding. The same datasets are also tested with other well-known models such as Isolation Forest, Autoencoder, Denoising Autoencoder with Dropout, and One-Class SVM. In this comparison, the ZSL-CNN model achieves the highest true positive rate, reaching 99.29%.
KW - CNN
KW - Zero-shot learning
KW - anomaly detection
KW - attack detection
KW - web attacks
UR - http://www.scopus.com/inward/record.url?scp=85167789964&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2023.3303845
DO - 10.1109/ACCESS.2023.3303845
M3 - Article
AN - SCOPUS:85167789964
SN - 2169-3536
VL - 11
SP - 91511
EP - 91525
JO - IEEE Access
JF - IEEE Access
ER -