
SmartiPhish: a reinforcement learning-based intelligent anti-phishing solution to detect spoofed website attacks

Published: 21 November 2023

Abstract

Phishing, a well-known cyberattack that cannot be completely eradicated from the Internet, has increased dramatically since the COVID-19 pandemic. Despite previous efforts to reduce this prevalent threat, constantly changing attacks make phishing detection a difficult task. Existing solutions make detection harder still because they lack continuous learning support and a systematic knowledge acquisition process. SmartiPhish is introduced in this context as the first anti-phishing solution with integrated continuous learning support and an innovative knowledge acquisition process. SmartiPhish combines deep learning and reinforcement learning to detect phishing. The deep learning model predicts a phishing probability for a given web page from its URL and HTML content; this probability is then passed to a reinforcement learning environment, which makes the final decision based on the popularity of the web page and prior knowledge of it. SmartiPhish achieves a detection accuracy of 96.40% with a detection time of 4.3 s. It also performs well in an imbalanced environment, and its zero-day attack detection results are noteworthy. Furthermore, SmartiPhish demonstrated a 5.65% performance improvement in just six weeks, in contrast to the declining performance trend of existing anti-phishing tools over time.
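
The two-stage decision flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the reinforcement learning algorithm, so a tabular, single-step Q-learning update stands in for it here. The deep learning model is not reproduced either; its output is simply passed in as `phish_prob`. The state layout (a probability bucket, a `popular` flag and a `seen_before` flag), the +1/-1 reward and all function names are assumptions made for illustration only.

```python
import random

ACTIONS = ("legitimate", "phishing")


def make_state(phish_prob, popular, seen_before, buckets=10):
    """Discretise the model's phishing probability and attach context flags.

    `popular` and `seen_before` are hypothetical stand-ins for the page's
    popularity and the prior knowledge mentioned in the abstract.
    """
    bucket = min(int(phish_prob * buckets), buckets - 1)
    return (bucket, bool(popular), bool(seen_before))


class QAgent:
    """Tabular agent for one-step episodes (assumed, not from the paper)."""

    def __init__(self, alpha=0.1, epsilon=0.1, seed=0):
        self.q = {}                 # state -> [value(legitimate), value(phishing)]
        self.alpha = alpha          # learning rate
        self.epsilon = epsilon      # exploration rate
        self.rng = random.Random(seed)

    def values(self, state):
        return self.q.setdefault(state, [0.0, 0.0])

    def act(self, state, explore=True):
        # Epsilon-greedy choice; ties resolve to "legitimate".
        if explore and self.rng.random() < self.epsilon:
            return self.rng.randrange(len(ACTIONS))
        v = self.values(state)
        return 0 if v[0] >= v[1] else 1

    def update(self, state, action, reward):
        # Each decision ends the episode, so the target is the reward itself.
        v = self.values(state)
        v[action] += self.alpha * (reward - v[action])


def train(feedback, agent=None):
    """Learn from (phish_prob, popular, seen_before, is_phishing) records."""
    if agent is None:
        agent = QAgent()
    for phish_prob, popular, seen_before, is_phishing in feedback:
        state = make_state(phish_prob, popular, seen_before)
        action = agent.act(state)
        correct = (ACTIONS[action] == "phishing") == is_phishing
        agent.update(state, action, 1.0 if correct else -1.0)
    return agent


def decide(agent, phish_prob, popular, seen_before):
    """Greedy decision for a page, without exploration."""
    state = make_state(phish_prob, popular, seen_before)
    return ACTIONS[agent.act(state, explore=False)]
```

Because new feedback records can be passed to `train` at any time with an existing agent, the sketch also hints at how a continuous learning loop might be wired up. With feedback in which a borderline probability (say 0.6) turns out legitimate for popular, previously seen pages and phishing for unknown ones, `decide` returns different verdicts for the same probability depending on the context flags.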

Published In

International Journal of Information Security, Volume 23, Issue 2
Apr 2024
838 pages

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 21 November 2023
Accepted: 22 October 2023

Author Tags

  1. Continuous learning
  2. Cyberattack
  3. Internet security
  4. Knowledge acquisition
  5. Machine learning
  6. Phishing detection

Qualifiers

  • Research-article
