Abstract
Cyber Threat hunting is a proactive search for known attack behaviors in the organizational information system. It is an important component to mitigate advanced persistent threats (APTs). However, the attack behaviors recorded in provenance data may not be completely consistent with the known attack behaviors. In this paper, we propose DeepHunter, a graph neural network (GNN) based graph pattern matching approach that can match provenance data against known attack behaviors in a robust way. Specifically, we design a graph neural network architecture with two novel networks: attribute embedding networks that could incorporate Indicators of Compromise (IOCs) information, and graph embedding networks that could capture the relationships between IOCs. To evaluate DeepHunter, we choose five real and synthetic APT attack scenarios. Results show that DeepHunter can hunt all attack behaviors, and the accuracy and robustness of DeepHunter outperform the state-of-the-art method, Poirot.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Event tracing. https://docs.microsoft.com/en-us/windows/win32/etw/event-tracing-portal
Causal, adaptive, distributed, and efficient tracing system (cadets) (2018). https://www.cl.cam.ac.uk/research/security/cadets/. Accessed 21 Sept 2020
Trace: Preventing advanced persistent threat cyberattacks (2018). https://archive.sri.com/work/projects/trace-preventing-advanced-persisten-threat-cyberattacks. Accessed 21 Sept 2020
Bai, Y., Ding, H., Bian, S., Chen, T., Sun, Y., Wang, W.: SimGNN: a neural network approach to fast graph similarity computation. In: Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, pp. 384–392 (2019)
Bai, Y., Ding, H., Gu, K., Sun, Y., Wang, W.: Learning-based efficient graph similarity computation via multi-scale convolutional set matching. In: AAAI, pp. 3219–3226 (2020)
Bates, A., Tian, D.J., Butler, K.R., Moyer, T.: Trustworthy whole-system provenance for the linux kernel. In: 24th \(\{\)USENIX\(\}\) Security Symposium (\(\{\)USENIX\(\}\) Security 15), pp. 319–334 (2015)
Bromley, J., Guyon, I., LeCun, Y., Säckinger, E., Shah, R.: Signature verification using a “siamese” time delay neural network. In: Advances in Neural Information Processing Systems, pp. 737–744 (1994)
FireEye (2018). https://openioc.org. openIOC
Fyrbiak, M., Wallat, S., Reinhard, S., Bissantz, N., Paar, C.: Graph similarity and its applications to hardware security. IEEE Trans. Comput. 69(4), 505–519 (2019)
Gehani, A., Tariq, D.: SPADE: support for provenance auditing in distributed environments. In: Narasimhan, P., Triantafillou, P. (eds.) Middleware 2012. LNCS, vol. 7662, pp. 101–120. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35170-9_6
Gibson, T., Schuchardt, K., Stephan, E.G.: Application of named graphs towards custom provenance views. In: Workshop on the Theory and Practice of Provenance (2009)
Graeber, M.: Abusing Windows Management Instrumentation (WMI) to Build a Persistent, Asyncronous, and Fileless Backdoor. Black Hat, Las Vegas (2015)
Hassan, W.U., Bates, A., Marino, D.: Tactical provenance analysis for endpoint detection and response systems. In: Proceedings of the IEEE Symposium on Security and Privacy (2020)
Hassan, W.U., et al.: NODOZE: combatting threat alert fatigue with automated provenance triage. In: NDSS (2019)
Hassan, W.U., Noureddine, M.A., Datta, P., Bates, A.: OmegaLog: high-fidelity attack investigation via transparent multi-layer log analysis. In: Proceedings NDSS (2020)
Homayoun, S., Dehghantanha, A., Ahmadzadeh, M., Hashemi, S., Khayami, R.: Know abnormal, find evil: frequent pattern mining for ransomware threat hunting and intelligence. IEEE Trans. Emerg. Top. Comput. 8, 341–351 (2017)
Hossain, M.N., et al.: \(\{\)SLEUTH\(\}\): real-time attack scenario reconstruction from \(\{\)COTS\(\}\) audit data. In: 26th \(\{\)USENIX\(\}\) Security Symposium (\(\{\)USENIX\(\}\) Security 17), pp. 487–504 (2017)
Hossain, M.N., Sheikhi, S., Sekar, R.: Combating dependence explosion in forensic analysis using alternative tag propagation semantics. In: 2020 IEEE Symposium on Security and Privacy (SP). IEEE (2020)
Husari, G., Al-Shaer, E., Ahmed, M., Chu, B., Niu, X.: TTPDrill: automatic and accurate extraction of threat actions from unstructured text of CTI sources. In: Proceedings of the 33rd Annual Computer Security Applications Conference, pp. 103–115 (2017)
Kaspar, R.: https://github.com/dzambon/graph-matching-toolkit (2018). mig-logcleaner-resurrected
Khan, A., Wu, Y., Aggarwal, C.C., Yan, X.: NeMa: fast graph search with label similarity. Proc. VLDB Endowment 6(3), 181–192 (2013)
Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016)
Le, Q., Mikolov, T.: Distributed representations of sentences and documents. In: International Conference on Machine Learning, pp. 1188–1196 (2014)
Lee, K.H., Zhang, X., Xu, D.: High accuracy attack provenance via binary-based execution partition. In: NDSS (2013)
Li, Y., Gu, C., Dullien, T., Vinyals, O., Kohli, P.: Graph matching networks for learning the similarity of graph structured objects. In: Chaudhuri, K., Salakhutdinov, R. (eds.) Proceedings of the 36th International Conference on Machine Learning. Proceedings of Machine Learning Research, Long Beach, California, USA,09–15 Jun 2019, vol. 97, pp. 3835–3845. PMLR (2019). http://proceedings.mlr.press/v97/li19d.html
Li, Y., Gu, C., Dullien, T., Vinyals, O., Kohli, P.: Graph matching networks for learning the similarity of graph structured objects. arXiv preprint arXiv:1904.12787 (2019)
Liao, X., Yuan, K., Wang, X., Li, Z., Xing, L., Beyah, R.: Acing the IOC game: toward automatic discovery and analysis of open-source cyber threat intelligence. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp. 755–766 (2016)
Ma, S., Zhai, J., Wang, F., Lee, K.H., Zhang, X., Xu, D.: \(\{\)MPI\(\}\): Multiple perspective attack investigation with semantic aware execution partitioning. In: 26th \(\{\)USENIX\(\}\) Security Symposium (\(\{\)USENIX\(\}\) Security 17), pp. 1111–1128 (2017)
Manzoor, E., Milajerdi, S.M., Akoglu, L.: Fast memory-efficient anomaly detection in streaming heterogeneous graphs. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1035–1044 (2016)
Micro, T.: cryptocurrency Miner Uses WMI and EternalBlue To Spread Filelessly (2017). https://blog.trendmicro.com/trendlabs-security-intelligence/cryptocurrency-miner-uses-wmi-eternalblue-spread-filelessly/. Accessed 4 May 2020
Milajerdi, S.M., Eshete, B., Gjomemo, R., Venkatakrishnan, V.: POIROT: aligning attack behavior with kernel audit records for cyber threat hunting. In: Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, pp. 1795–1812 (2019)
Milajerdi, S.M., Gjomemo, R., Eshete, B., Sekar, R., Venkatakrishnan, V.: HOLMES: real-time apt detection through correlation of suspicious information flows. In: 2019 IEEE Symposium on Security and Privacy (SP), pp. 1137–1152. IEEE (2019)
MISP: Open Source Threat Intelligence Platform & Open Standards For Threat Information Sharing (2019). https://www.misp-project.org/
Mitre: Structured Threat Information eXpression (STIX) (2018). https://stixproject.github.io
Oprea, A., Li, Z., Yen, T.F., Chin, S.H., Alrwais, S.: Detection of early-stage enterprise infection by mining large-scale log data. In: 2015 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, pp. 45–56. IEEE (2015)
Pasquier, T., et al.: Practical whole-system provenance capture. In: Proceedings of the 2017 Symposium on Cloud Computing, pp. 405–418 (2017)
Paszke, A., et al.: Pytorch: an imperative style, high-performance deep learning library. In: Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 32, pp. 8024–8035. Curran Associates, Inc. (2019). http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf
Pei, K., et al.: HERCULE: attack story reconstruction via community discovery on correlated log graph. In: Proceedings of the 32nd Annual Conference on Computer Security Applications, pp. 583–595 (2016)
Pienta, R., Tamersoy, A., Tong, H., Chau, D.H.: MAGE: matching approximate patterns in richly-attributed graphs. In: 2014 IEEE International Conference on Big Data (Big Data), pp. 585–590. IEEE (2014)
Pohly, D.J., McLaughlin, S., McDaniel, P., Butler, K.: Hi-fi: collecting high-fidelity whole-system provenance. In: Proceedings of the 28th Annual Computer Security Applications Conference, pp. 259–268 (2012)
Qureshi, R.J., Ramel, J.-Y., Cardot, H.: Graph based shapes representation and recognition. In: Escolano, F., Vento, M. (eds.) GbRPR 2007. LNCS, vol. 4538, pp. 49–60. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-72903-7_5
Řehůřek, R., Sojka, P.: Software framework for topic modelling with large corpora. In: Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, Valletta, Malta, pp. 45–50. ELRA, May 2010. http://is.muni.cz/publication/884893/en
Riesen, K., Emmenegger, S., Bunke, H.: A novel software toolkit for graph edit distance computation. In: Kropatsch, W.G., Artner, N.M., Haxhimusa, Y., Jiang, X. (eds.) GbRPR 2013. LNCS, vol. 7877, pp. 142–151. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38221-5_15
Smith, J. (2021). https://libraetd.lib.virginia.edu/public_view/5138jf509. Accessed 4 Mar 2021
Socher, R., Chen, D., Manning, C.D., Ng, A.: Reasoning with neural tensor networks for knowledge base completion. In: Advances in Neural Information Processing Systems, pp. 926–934 (2013)
Song, W., Yin, H., Liu, C., Song, D.: DeepMem: learning graph neural network models for fast and robust memory forensic analysis. In: Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, pp. 606–618 (2018)
Sun, X., Dai, J., Liu, P., Singhal, A., Yen, J.: Using bayesian networks for probabilistic identification of zero-day attack paths. IEEE Trans. Inf. Forensics Secur. 13(10), 2506–2521 (2018)
Tong, H., Faloutsos, C., Gallagher, B., Eliassi-Rad, T.: Fast best-effort pattern matching in large attributed graphs. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 737–746 (2007)
Wang, Q., et al.: You are what you do: Hunting stealthy malware via data provenance analysis. In: Proceedings of the Symposium on Network and Distributed System Security (NDSS) (2020)
Wang, S., et al.: Heterogeneous graph matching networks for unknown malware detection. In: Proceedings of the 28th International Joint Conference on Artificial Intelligence, pp. 3762–3770. AAAI Press (2019)
Xiong, C., et al.: CONAN: a practical real-time APT detection system with high accuracy and efficiency. IEEE Trans. Depend. Secur. Comput. (2020)
Xu, X., Liu, C., Feng, Q., Yin, H., Song, L., Song, D.: Neural network-based graph embedding for cross-platform binary code similarity detection. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 363–376 (2017)
Zhu, Z., Dumitras, T.: ChainSmith: automatically learning the semantics of malicious campaigns by mining threat intelligence reports. In: 2018 IEEE European Symposium on Security and Privacy (EuroS&P), pp. 458–472. IEEE (2018)
Acknowledgment
This work is supported by the Strategic Priority Research Program of Chinese Academy of Sciences, Grant No. XDC02040200.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Wei, R., Cai, L., Zhao, L., Yu, A., Meng, D. (2021). DeepHunter: A Graph Neural Network Based Approach for Robust Cyber Threat Hunting. In: Garcia-Alfaro, J., Li, S., Poovendran, R., Debar, H., Yung, M. (eds) Security and Privacy in Communication Networks. SecureComm 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 398. Springer, Cham. https://doi.org/10.1007/978-3-030-90019-9_1
Download citation
DOI: https://doi.org/10.1007/978-3-030-90019-9_1
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-90018-2
Online ISBN: 978-3-030-90019-9
eBook Packages: Computer ScienceComputer Science (R0)