Extended Abstract: Assessing Language Models for Semantic Textual Similarity in Cybersecurity

  • Conference paper
Detection of Intrusions and Malware, and Vulnerability Assessment (DIMVA 2024)

Abstract

In light of the significant strides made by large language models (LLMs) in the field of natural language processing (NLP) [5], our research seeks to evaluate and contrast their proficiency in establishing associations within the realm of cybersecurity. Our experimental framework involves juxtaposing actual connections from various cybersecurity knowledge graphs (including MITRE CAPEC, D3FEND, and CVE connections to ATT&CK) against predictions made by LLMs using semantic textual similarity (STS). These connections span a broad spectrum, encapsulating diverse abstractions of threat descriptions, attack patterns, defense strategies, and vulnerabilities. The language models chosen for this study are varied, comprising state-of-the-art models from STS leaderboards, LLMs (GPT-3.5 and PaLM), and ATTACK BERT [1], a cybersecurity domain-specific language model. Our experiments provide valuable insights into the differentiation between language models and data sources, thereby facilitating the broader application of STS in cybersecurity.
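
The comparison described in the abstract, ranking candidate links by semantic textual similarity and checking them against links recorded in a knowledge graph, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: `embed` is a toy bag-of-words stand-in for a real sentence embedding model, the descriptions and identifiers are hand-written rather than taken from CVE, CAPEC, D3FEND, or ATT&CK, and `hit_rate_at_k` is one plausible metric, not necessarily the one used in the paper.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_targets(source_text, targets):
    """Return target ids sorted by descending similarity to source_text."""
    src = embed(source_text)
    scores = {tid: cosine(src, embed(text)) for tid, text in targets.items()}
    # Ties are broken by id so the ranking is deterministic.
    return sorted(scores, key=lambda tid: (-scores[tid], tid))


def hit_rate_at_k(sources, targets, true_links, k):
    """Fraction of sources with at least one true link among the top-k predictions."""
    hits = 0
    for sid, text in sources.items():
        top_k = rank_targets(text, targets)[:k]
        if true_links[sid] & set(top_k):
            hits += 1
    return hits / len(sources)


# Hypothetical, hand-written descriptions and ids (not real knowledge graph entries).
sources = {
    "vuln-A": "SQL injection in the login form allows attackers to read database records",
    "vuln-B": "Crafted email attachment runs a malicious macro when the user opens it",
}
targets = {
    "tech-1": "Attackers exploit injection flaws in a web form to query the database",
    "tech-2": "A phishing email delivers an attachment containing a malicious macro",
    "tech-3": "Brute force guessing of account passwords",
}
# Ground-truth links, playing the role of the knowledge graph connections.
true_links = {"vuln-A": {"tech-1"}, "vuln-B": {"tech-2"}}

print(rank_targets(sources["vuln-A"], targets))
print(hit_rate_at_k(sources, targets, true_links, k=1))
```

Swapping `embed` for a real model's encoder (and `cosine` for the matching dense-vector similarity) would leave the ranking and evaluation logic unchanged, which is what makes this setup convenient for comparing models across data sources.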

References

  1. Abdeen, B., Al-Shaer, E., Singhal, A., Khan, L., Hamlen, K.: SMET: semantic mapping of CVE to ATT&CK and its application to cybersecurity. In: Atluri, V., Ferrara, A.L. (eds.) DBSec 2023. LNCS, vol. 13942, pp. 243–260. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-37586-6_15

  2. Aghaei, E., Niu, X., Shadid, W., Al-Shaer, E.: SecureBERT: a domain-specific language model for cybersecurity. In: Li, F., Liang, K., Lin, Z., Katsikas, S.K. (eds.) Security and Privacy in Communication Systems. LNICST, vol. 462, pp. 39–56. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-25538-0_3

  3. Akbar, K.A., Halim, S.M., Hu, Y., Singhal, A., Khan, L., Thuraisingham, B.: Knowledge mining in cybersecurity: from attack to defense. In: Sural, S., Lu, H. (eds.) DBSec 2022. LNCS, vol. 13383, pp. 110–122. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-10684-2_7

  4. Al-Hawawreh, M., Aljuhani, A., Jararweh, Y.: ChatGPT for cybersecurity: practical applications, challenges, and future directions. Clust. Comput. 26(6), 3421–3436 (2023)

  5. Bubeck, S., et al.: Sparks of artificial general intelligence: early experiments with GPT-4. arXiv preprint arXiv:2303.12712 (2023)

  6. Crumpler, W., Lewis, J.A.: The Cybersecurity Workforce Gap. JSTOR (2019)

  7. Gupta, M., Akiri, C., Aryal, K., Parker, E., Praharaj, L.: From ChatGPT to ThreatGPT: impact of generative AI in cybersecurity and privacy. IEEE Access 11, 80218–80245 (2023)

  8. Huggingface: MTEB Leaderboard (2023). https://huggingface.co/spaces/mteb/leaderboard. Accessed 1 Dec 2023

  9. Kaiser, F.K., Andris, L.J., Tennig, T.F., Iser, J.M., Wiens, M., Schultmann, F.: Cyber threat intelligence enabled automated attack incident response. In: 2022 3rd International Conference on Next Generation Computing Applications (NextComp), pp. 1–6. IEEE (2022)

  10. Kanakogi, K., et al.: Tracing CVE vulnerability information to CAPEC attack patterns using natural language processing techniques. Information 12(8), 298 (2021)

  11. Kuppa, A., Aouad, L., Le-Khac, N.A.: Linking CVE’s to MITRE ATT&CK techniques. In: Proceedings of the 16th International Conference on Availability, Reliability and Security, pp. 1–12 (2021)

  12. McKenna, N., Li, T., Cheng, L., Hosseini, M.J., Johnson, M., Steedman, M.: Sources of hallucination by large language models on inference tasks. arXiv preprint arXiv:2305.14552 (2023)

  13. Min, B., et al.: Recent advances in natural language processing via large pre-trained language models: a survey. ACM Comput. Surv. 56(2), 1–40 (2023)

  14. Ranade, P., Piplai, A., Joshi, A., Finin, T.: CyBERT: contextualized embeddings for the cybersecurity domain. In: 2021 IEEE International Conference on Big Data (Big Data), pp. 3334–3342. IEEE (2021)

  15. Roy, S., Panaousis, E., Noakes, C., Laszka, A., Panda, S., Loukas, G.: SoK: the MITRE ATT&CK framework in research and practice. arXiv preprint arXiv:2304.07411 (2023)

  16. Sarker, I.H., Furhad, M.H., Nowrozy, R.: AI-driven cybersecurity: an overview, security intelligence modeling and research directions. SN Comput. Sci. 2, 1–18 (2021)

  17. Venturebeat: Mental Health: 66% of cybersecurity analysts experienced burnout this year (2023). https://venturebeat.com/security/mental-health-cybersecurity-analysts/. Accessed 19 July 2023

  18. Wåreus, E., Hell, M.: Automated CPE labeling of CVE summaries with machine learning. In: Maurice, C., Bilge, L., Stringhini, G., Neves, N. (eds.) DIMVA 2020. LNCS, vol. 12223, pp. 3–22. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-52683-2_1

Author information

Corresponding author

Correspondence to Arian Soltani.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Soltani, A., Nkashama, D.K., Masakuna, J.F., Frappier, M., Tardif, PM., Kabanza, F. (2024). Extended Abstract: Assessing Language Models for Semantic Textual Similarity in Cybersecurity. In: Maggi, F., Egele, M., Payer, M., Carminati, M. (eds) Detection of Intrusions and Malware, and Vulnerability Assessment. DIMVA 2024. Lecture Notes in Computer Science, vol 14828. Springer, Cham. https://doi.org/10.1007/978-3-031-64171-8_19

  • DOI: https://doi.org/10.1007/978-3-031-64171-8_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-64170-1

  • Online ISBN: 978-3-031-64171-8

  • eBook Packages: Computer Science, Computer Science (R0)
