Two Stage Deep Learning Based Stacked Ensemble Model for Web Application Security

SEVRİ, MEHMET; KARACAN, HACER

doi:10.3837/tiis.2022.02.014

SEVRİ M., KARACAN H.

KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, vol.16, no.2, pp.632-657, 2022 (SCI-Expanded)

Publication Type: Article / Full Article
Volume: 16 Issue: 2
Publication Date: 2022
DOI: 10.3837/tiis.2022.02.014
Journal Name: KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Applied Science & Technology Source, Compendex, Computer & Applied Sciences
Pages: pp.632-657
Keywords: Anomaly detection, deep learning, ensemble learning, web application firewall, web security, SYSTEM
Open Archive Collection: AVESİS Open Access Collection
Affiliated with Recep Tayyip Erdoğan University: No

Abstract

Detecting web attacks is a major challenge, and simple models tend to suffer from low sensitivity or high false-positive rates. In this study, we aim to develop a robust two-stage, deep learning based stacked ensemble web application firewall (WAF). The first stage of the proposed WAF model classifies traffic as normal or abnormal. Classification of the types of abnormal traffic is postponed to the second stage and carried out using an integrated stacked ensemble model. In this way, clients' requests can be served without delay, and attack types can be detected with high sensitivity. In addition to its high accuracy, the ensemble model achieves high generalization through the statistical similarity and diversity analyses used in the study. As part of the study, a comprehensive, up-to-date, and robust multi-class web anomaly dataset named GAZI-HTTP is created to reflect real-world conditions. The performance of the proposed WAF model is compared with state-of-the-art deep learning models and previous studies using the benchmark dataset. The proposed two-stage model achieved multi-class detection rates of 97.43% and 94.77% on GAZI-HTTP and ECML-PKDD, respectively.
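
The two-stage control flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the keyword rules stand in for the deep learning models, the fixed-weight sum stands in for a trained meta-learner, and every name here (`stage1_is_abnormal`, `sqli_leaning_model`, `xss_leaning_model`, `stacked_predict`, `classify`, and the two attack labels) is a hypothetical placeholder chosen for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical aliases: a base model maps a raw request string to per-class scores.
Scores = Dict[str, float]
BaseModel = Callable[[str], Scores]

ATTACK_TYPES = ["sqli", "xss"]  # illustrative labels only, not the paper's class set


def stage1_is_abnormal(request: str) -> bool:
    """Stand-in for the first-stage binary classifier (normal vs. abnormal)."""
    suspicious = ("'", "<script", "--", " or 1=1")
    return any(token in request.lower() for token in suspicious)


def sqli_leaning_model(request: str) -> Scores:
    """Toy base model that scores SQL-injection-like requests highly."""
    text = request.lower()
    hit = any(t in text for t in ("'", "--", "union select", "or 1=1"))
    return {"sqli": 0.9, "xss": 0.1} if hit else {"sqli": 0.3, "xss": 0.7}


def xss_leaning_model(request: str) -> Scores:
    """Toy base model that scores script-injection-like requests highly."""
    text = request.lower()
    hit = "<script" in text or "onerror=" in text
    return {"sqli": 0.1, "xss": 0.9} if hit else {"sqli": 0.6, "xss": 0.4}


def stacked_predict(request: str, base_models: List[BaseModel],
                    meta_weights: List[float]) -> str:
    """Combine base-model scores with a weighted sum.

    Assumption: a real stacked ensemble would train a meta-learner on the
    base models' outputs; fixed weights are used here only to keep the
    sketch short.
    """
    combined = {label: 0.0 for label in ATTACK_TYPES}
    for model, weight in zip(base_models, meta_weights):
        for label, score in model(request).items():
            combined[label] += weight * score
    return max(combined, key=combined.get)


def classify(request: str) -> str:
    """Stage 1 decides normal vs. abnormal; stage 2 runs only for abnormal traffic."""
    if not stage1_is_abnormal(request):
        return "normal"  # normal requests skip the ensemble entirely
    return stacked_predict(request, [sqli_leaning_model, xss_leaning_model],
                           [0.5, 0.5])
```

In this sketch, a request that passes the first stage never reaches the ensemble, which mirrors the abstract's point that attack-type classification is deferred so that normal clients are not delayed.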