Detecting Cyber Threat Event from Twitter Using IDCNN and BiLSTM
Figure 1. The architecture of multitask learning in our method.
Figure 2. The architecture of BiLSTM.
Figure 3. The architecture of Named Entity Recognition (NER) in our method.
Figure 4. The dilated convolution diagram. (a) The standard convolution, with convolution kernels of size 3 × 3. (b) The dilated convolution, with convolution kernels of size 3 × 3 and a dilation rate of 1.
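
The idea behind the dilated convolution in Figure 4 can be illustrated in one dimension, which is the form used for token sequences. The following is a minimal sketch, not the authors' implementation: the function names `dilated_conv1d` and `receptive_field` are hypothetical, and it assumes the Keras-style convention in which a dilation of 1 reduces to a standard convolution (conventions for numbering the dilation rate differ between sources).

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1-D dilated convolution in cross-correlation form.

    With dilation d, neighbouring kernel taps read inputs d positions apart,
    so a kernel of size k spans (k - 1) * d + 1 input positions.
    """
    span = (len(kernel) - 1) * dilation + 1
    out = []
    for start in range(len(signal) - span + 1):
        acc = 0
        for j, weight in enumerate(kernel):
            acc += weight * signal[start + j * dilation]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field of a stack of dilated layers (stride 1)."""
    field = 1
    for d in dilations:
        field += (kernel_size - 1) * d
    return field


# Toy input and an all-ones kernel (illustrative values only).
signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 1, 1]

standard = dilated_conv1d(signal, kernel, dilation=1)  # adjacent inputs
dilated = dilated_conv1d(signal, kernel, dilation=2)   # every second input
stacked_field = receptive_field(3, [1, 2, 4])
```

With the same three weights, the dilated kernel covers five input positions instead of three. Stacking layers with growing dilations widens the context seen by each output without adding parameters per layer, which is the motivation usually given for iterated dilated convolutions.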
Abstract
1. Introduction
- We combine existing machine learning algorithms and propose an approach to detect cyber threat events from tweets effectively. The proposed model achieved an F1-score of 96.4% under 5-fold cross-validation on cyber threat event detection.
- We propose a multi-task learning (MTL) model that improves the performance of the cyber threat event detection task while maintaining NER performance on the dataset. Compared with previous work, the proposed model achieves a higher F1-score.
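
The 5-fold cross-validation mentioned in the contributions can be sketched as follows. This is an illustration under assumptions, not the authors' procedure: the text here does not say whether the folds were shuffled or stratified, so the sketch uses contiguous, unshuffled folds, and the helper names `k_fold_indices` and `mean_score` are hypothetical.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs; each sample is held out exactly once.

    Assumption: contiguous, unshuffled folds. The first (n_samples % k)
    folds receive one extra sample so that sizes differ by at most one.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def mean_score(fold_scores):
    """Average a per-fold metric (for example, the F1-score of each fold)."""
    return sum(fold_scores) / len(fold_scores)


# Toy example: 12 samples split into 5 folds.
folds = list(k_fold_indices(12, k=5))
test_sizes = [len(test_idx) for _, test_idx in folds]
```

A model would be trained on `train_idx` and scored on `test_idx` for each fold, and the reported figure would be the mean of the five fold scores.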
2. Related Work
2.1. Event Detection on Twitter
2.2. Cyber Threat Event Detection
2.3. Multi-Task
3. The Proposed Model Architecture
3.1. Cyber Threat Event Detection
3.2. Named Entity Recognition
4. Experimental Setup
4.1. The Datasets and Experimental Environment
4.2. Metric
4.3. Training Process
4.4. Evaluation and Result
5. Discussion
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
- World Economic Forum. The Global Risks Report 2019. Available online: http://www3.weforum.org/docs/WEF_Global_Risks_Report_2019.pdf (accessed on 25 August 2020).
- Satyapanich, T.; Ferraro, F.; Finin, T. CASIE: Extracting Cybersecurity Event Information from Text. Umbc Fac. Collect. 2020, 34, 8749–8757.
- Noor, U.; Anwar, Z.; Amjad, T.; Choo, K.K.R. A machine learning-based FinTech cyber threat attribution framework using high-level indicators of compromise. Future Gener. Comput. Syst. 2019, 96, 227–242.
- Yagcioglu, S.; Seyfioglu, M.S.; Citamak, B.; Bardak, B.; Guldamlasioglu, S.; Yuksel, A.; Tatli, E.I. Detecting Cybersecurity Events from Noisy Short Text. arXiv 2019, arXiv:1904.05054.
- Mazoyer, B.; Cagé, J.; Hervé, N.; Hudelot, C. A French Corpus for Event Detection on Twitter. In Proceedings of the 12th Language Resources and Evaluation Conference, Marseille, France, 11–16 May 2020; pp. 6220–6227.
- Da Costa Abreu, M.; Araujo De Souza, G. Automatic offensive language detection from Twitter data using machine learning and feature selection of metadata. In Proceedings of the IEEE World Congress on Computational Intelligence (IEEE WCCI), Glasgow, UK, 19–24 July 2020.
- Ruder, S. An Overview of Multi-Task Learning in Deep Neural Networks. arXiv 2017, arXiv:1706.05098.
- Caruana, R. Multitask learning. Mach. Learn. 1997, 28, 41–75.
- Baxter, J. A Bayesian/information theoretic model of learning to learn via multiple task sampling. Mach. Learn. 1997, 28, 7–39.
- Blei, D.M.; Ng, A.Y.; Jordan, M.I. Latent dirichlet allocation. J. Mach. Learn. Res. 2003, 3, 993–1022.
- Popescu, A.M.; Pennacchiotti, M. Detecting controversial events from twitter. In Proceedings of the 19th ACM International Conference on Information and Knowledge Management, Toronto, ON, Canada, 19–23 October 2010; pp. 1873–1876.
- Lanagan, J.; Smeaton, A.F. Using Twitter to detect and tag important events in sports media. In Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media, Barcelona, Catalonia, Spain, 17–21 July 2011.
- Nichols, J.; Mahmud, J.; Drews, C. Summarizing sporting events using twitter. In Proceedings of the 2012 ACM International Conference on Intelligent User Interfaces, Lisbon, Portugal, 14–17 February 2012; pp. 189–198.
- Walther, M.; Kaisser, M. Geo-spatial event detection in the twitter stream. In Proceedings of the European Conference on Information Retrieval, Moscow, Russia, 24–27 March 2013; pp. 356–367.
- Zhou, X.; Chen, L. Event detection over twitter social media streams. VLDB J. 2014, 23, 381–400.
- D’Andrea, E.; Ducange, P.; Lazzerini, B.; Marcelloni, F. Real-time detection of traffic from twitter stream analysis. IEEE Trans. Intell. Transp. Syst. 2015, 16, 2269–2283.
- Pierce, C.E.; Bouri, K.; Pamer, C.; Proestel, S.; Rodriguez, H.W.; Van Le, H.; Freifeld, C.C.; Brownstein, J.S.; Walderhaug, M.; Edwards, I.R.; et al. Evaluation of Facebook and Twitter monitoring to detect safety signals for medical products: An analysis of recent FDA safety alerts. Drug Saf. 2017, 40, 317–331.
- Hasan, M.; Orgun, M.A.; Schwitter, R. Real-time event detection from the Twitter data stream using the TwitterNews+ Framework. Inf. Process. Manag. 2019, 56, 1146–1165.
- Phuvipadawat, S.; Murata, T. Breaking news detection and tracking in Twitter. In Proceedings of the 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, Washington, DC, USA, 31 August–3 September 2010; pp. 120–123.
- Cordeiro, M. Twitter event detection: Combining wavelet analysis and topic inference summarization. In Proceedings of the Doctoral Symposium on Informatics Engineering, Porto, Portugal, 26–27 January 2012; pp. 11–16.
- Kaleel, S.B.; Abhari, A. Cluster-discovery of Twitter messages for event detection and trending. J. Comput. Sci. 2015, 6, 47–57.
- Yılmaz, Y.; Hero, A.O. Multimodal event detection in Twitter hashtag networks. J. Signal Process. Syst. 2018, 90, 185–200.
- Dabiri, S.; Heaslip, K. Developing a Twitter-based traffic event detection model using deep learning architectures. Expert Syst. Appl. 2019, 118, 425–439.
- Saeed, Z.; Abbasi, R.A.; Razzak, M.I.; Xu, G. Event detection in Twitter stream using weighted dynamic heartbeat graph approach. arXiv 2019, arXiv:1902.08522.
- Nazir, F.; Ghazanfar, M.A.; Maqsood, M.; Aadil, F.; Rho, S.; Mehmood, I. Social media signal detection using tweets volume, hashtag, and sentiment analysis. Multimed. Tools Appl. 2019, 78, 3553–3586.
- Sani, A.M.; Moeini, A. Real-time Event Detection in Twitter: A Case Study. In Proceedings of the 2020 6th International Conference on Web Research (ICWR), Tehran, Iran, 22–23 April 2020; pp. 48–51.
- Kang, M.H.; Mayfield, T. A cyber-event correlation framework and metrics. In System Diagnosis and Prognosis: Security and Condition Monitoring Issues III. International Society for Optics and Photonics. SPIE 2003, 5107, 72–82.
- Qiu, X.; Lin, X.; Qiu, L. Feature representation models for cyber attack event extraction. In Proceedings of the 2016 IEEE/WIC/ACM International Conference on Web Intelligence Workshops (WIW), Omaha, NE, USA, 13–16 October 2016; pp. 29–32.
- Khandpur, R.P.; Ji, T.; Jan, S.; Wang, G.; Lu, C.T.; Ramakrishnan, N. Crowdsourcing cybersecurity: Cyber attack detection using social media. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, Singapore, 6–10 November 2017; pp. 1049–1057.
- Le Sceller, Q.; Karbab, E.B.; Debbabi, M.; Iqbal, F. Sonar: Automatic detection of cyber security events over the twitter stream. In Proceedings of the 12th International Conference on Availability, Reliability and Security, Reggio Calabria, Italy, 25–28 August 2017; pp. 1–11.
- Bose, A.; Behzadan, V.; Aguirre, C.; Hsu, W.H. A novel approach for detection and ranking of trendy and emerging cyber threat events in twitter streams. In Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Vancouver, BC, Canada, 27–30 August 2019; pp. 871–878.
- Ji, T.; Zhang, X.; Self, N.; Fu, K.; Lu, C.T.; Ramakrishnan, N. Feature driven learning framework for cybersecurity event detection. In Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, Vancouver, BC, Canada, 27–30 August 2019; pp. 196–203.
- Zhang, Z.; Luo, P.; Loy, C.C.; Tang, X. Facial landmark detection by deep multi-task learning. In Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; pp. 94–108.
- Søgaard, A.; Goldberg, Y. Deep multi-task learning with low level tasks supervised at lower layers. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics Volume 2: Short Papers, Berlin, Germany, 7–12 August 2016; pp. 231–235.
- Wehrmann, J.; Becker, W.E.; Barros, R.C. A multi-task neural network for multilingual sentiment classification and language detection on twitter. In Proceedings of the 33rd Annual ACM Symposium on Applied Computing, Pau, France, 9–13 April 2018; pp. 1805–1812.
- Pennington, J.; Socher, R.; Manning, C.D. Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1532–1543.
- Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient estimation of word representations in vector space. arXiv 2013, arXiv:1301.3781.
- Strubell, E.; Verga, P.; Belanger, D.; McCallum, A. Fast and accurate entity recognition with iterated dilated convolutions. arXiv 2017, arXiv:1702.02098.
- Dionísio, N.; Alves, F.; Ferreira, P.M.; Bessani, A. Towards end-to-end Cyberthreat Detection from Twitter using Multi-Task Learning. In Proceedings of the IJCNN 2020, Glasgow, UK, 19–24 July 2020.
Items | Configuration |
---|---|
OS | Ubuntu 16.04.3 LTS |
System configuration | CPU: Intel i7-7700, RAM: 16 GB, GPU: GeForce GTX 2080 8 GB |
Python libraries | keras, Scikit-learn, gensim, Matplotlib, pandas, numpy |
Predicted / Actual | With Threat | Without Threat |
---|---|---|
Predicted with threat | TP | FP |
Predicted without threat | FN | TN |
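
The confusion-matrix cells above feed the precision, recall, and F1-score columns of the result tables. The sketch below uses the standard definitions of these metrics, which is presumably what the Metric section applies. The function name `precision_recall_f1` and the counts are hypothetical and are not taken from the paper's experiments.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    precision = TP / (TP + FP)
    recall    = TP / (TP + FN)
    F1        = 2 * precision * recall / (precision + recall)
    Zero denominators are mapped to 0.0 to avoid division errors.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for the "with threat" class (illustration only).
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
```

TN does not appear in any of the three formulas, so these metrics describe performance on the positive class only. That is why the result tables report the cyber threat and non-cyber threat classes separately.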
Method | Class | Precision (%) | Recall (%) | F1-Score (%) |
---|---|---|---|---|
Without LDA | non-cyber threat | 95.5 | 95.3 | 95.3 |
Without LDA | cyber threat | 95.4 | 95.5 | 95.4 |
Without LDA | avg/total | 95.4 | 96.4 | 95.4 |
With LDA | non-cyber threat | 96.4 | 96.4 | 96.6 |
With LDA | cyber threat | 96.3 | 96.5 | 96.5 |
With LDA | avg/total | 96.4 | 96.4 | 96.4 |
Method | Precision (%) | Recall (%) | F1-Score (%) |
---|---|---|---|
BiLSTM+CRF/avg | 91.7 | 93.6 | 92.7 |
IDCNN+CRF/avg | 91.4 | 94.5 | 93.1 |
BiLSTM+IDCNN+CRF/avg | 92.4 | 94.5 | 93.4 |
Method | Task | Precision (%) | Recall (%) | F1-Score (%) |
---|---|---|---|---|
BiLSTM (detection), BiLSTM+CRF (NER) | Cyber threat detection | 95.9 | 95.9 | 95.9 |
BiLSTM (detection), BiLSTM+CRF (NER) | NER | 91.6 | 93.4 | 92.5 |
BiLSTM (detection), IDCNN+CRF (NER) | Cyber threat detection | 95.7 | 95.7 | 95.7 |
BiLSTM (detection), IDCNN+CRF (NER) | NER | 91.6 | 94.7 | 93.2 |
BiLSTM (detection), BiLSTM+IDCNN+CRF (NER) | Cyber threat detection | 96.6 | 96.6 | 96.6 |
BiLSTM (detection), BiLSTM+IDCNN+CRF (NER) | NER | 92.4 | 95.4 | 93.8 |
Method | Task | Precision (%) | Recall (%) | F1-Score (%) |
---|---|---|---|---|
WordRNN + CharRNN [39] | Cyber threat detection | - | - | 92.2 |
WordRNN + CharRNN [39] | NER | - | - | 94.0 |
Our method | Cyber threat detection | 95.1 | 97.4 | 96.2 |
Our method | NER | 94.1 | 94.0 | 94.1 |
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Fang, Y.; Gao, J.; Liu, Z.; Huang, C. Detecting Cyber Threat Event from Twitter Using IDCNN and BiLSTM. Appl. Sci. 2020, 10, 5922. https://doi.org/10.3390/app10175922