Abstract
Our society faces a growing threat from data breaches in which confidential information is stolen from computer servers. To steal data, hackers must first gain entry into the targeted systems, and commercial off-the-shelf intrusion detection systems are unable to defend against these intruders effectively. This research uses cyber behavior analytics to study and report how anomalies compare to normal behavior. In this paper, we present machine learning methods that detect intruders from the file access patterns within a user file directory. We propose a set of behavioral features that describe a user’s file access patterns in a file system. We validate the effectiveness of the features by conducting experiments on an existing file system dataset with four classification algorithms. To limit false alarms, we train and test the classifiers by optimizing performance within the lower range of the false positive rate. The results of our experiments show that our approach detects intruders with an F1 score of 0.94 and a false positive rate of less than 3%.
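The two reported metrics, and the idea of scoring a classifier only in the low false-positive range, can be sketched in a few lines. The sketch below is illustrative and is not the authors' code: the function names are hypothetical, and the use of a trapezoidal partial area under the ROC curve (the measure discussed by Jiang et al. and Ma et al. in the reference list) is an assumption about how "the lower range of the false positive rate" might be scored. Labels use 1 for intruder and 0 for legitimate user.

```python
from typing import List, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 means intruder."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def false_positive_rate(fp: int, tn: int) -> float:
    """FPR = FP / (FP + TN): the share of legitimate sessions flagged as intruders."""
    denom = fp + tn
    return fp / denom if denom else 0.0


def roc_points(y_true: Sequence[int], scores: Sequence[float]) -> List[Tuple[float, float]]:
    """ROC curve as (fpr, tpr) points, lowering the threshold one distinct score at a time."""
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative example")
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        # Examples with tied scores cross the threshold together.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def partial_auc(y_true: Sequence[int], scores: Sequence[float], max_fpr: float) -> float:
    """Unnormalized trapezoidal area under the ROC curve for FPR in [0, max_fpr]."""
    pts = roc_points(y_true, scores)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Cut the segment at max_fpr by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2
    return area
```

Because the area is left unnormalized, a perfect detector scores `max_fpr` rather than 1.0, so values are comparable only between classifiers evaluated at the same `max_fpr`.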
References
Altmann, A., Tolosi, L., Sander, O., Lengauer, T.: Permutation importance: a corrected feature importance measure. Bioinformatics 26(10), 1340–1347 (2010)
Anderson, J.P.: Computer security threat monitoring and surveillance. Technical report, James P. Anderson Company, Fort Washington, Pennsylvania (1980)
Atkinson, E.J., Therneau, T.M.: An Introduction to Recursive Partitioning Using the Rpart Routines. Mayo Foundation, Rochester (2000)
Bowen, B.M., Hershkop, S., Keromytis, A.D., Stolfo, S.J.: Baiting inside attackers using decoy documents. In: Chen, Y., Dimitriou, T.D., Zhou, J. (eds.) SecureComm 2009. LNICST, vol. 19, pp. 51–70. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-05284-2_4
Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Wadsworth & Brooks, Monterey (1984)
Camiña, J.B., Hernández-Gracidas, C., Monroy, R., Trejo, L.: The Windows-users and-intruder simulations logs dataset (WUIL): an experimental framework for masquerade detection mechanisms. Expert Syst. Appl. 41, 919–930 (2014)
Camiña, J.B., Monroy, R., Trejo, L.A., Medina-Perez, M.A.: Temporal and spatial locality: an abstraction for masquerade detection. IEEE Trans. Inf. Forensics Secur. 11(9), 2036–2051 (2016)
Camiña, B., Monroy, R., Trejo, L.A., Sánchez, E.: Towards building a masquerade detection method based on user file system navigation. In: Batyrshin, I., Sidorov, G. (eds.) MICAI 2011. LNCS (LNAI), vol. 7094, pp. 174–186. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-25324-9_15
Camiña, J.B., Rodríguez, J., Monroy, R.: Towards a masquerade detection system based on user’s tasks. In: Stavrou, A., Bos, H., Portokalidis, G. (eds.) RAID 2014. LNCS, vol. 8688, pp. 447–465. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-11379-1_22
Chen, Y.W., Lin, C.J.: Combining SVMs with various feature selection strategies. In: Guyon, I., Nikravesh, M., Gunn, S., Zadeh, L.A. (eds.) Feature Extraction. STUDFUZZ, vol. 207, pp. 315–324. Springer, Berlin (2006). https://doi.org/10.1007/978-3-540-35488-8_13
Denning, D.E.: An intrusion-detection model. IEEE Trans. Softw. Eng. SE-13(2), 222–232 (1987)
D’haeseleer, P., Forrest, S., Helman, P.: An immunological approach to change detection: algorithms, analysis, and implications. In: IEEE Symposium on Security and Privacy (1996)
Eskin, E., Arnold, A., Prerau, M., Portnoy, L., Stolfo, S.: A geometric framework for unsupervised anomaly detection. In: Barbará, D., Jajodia, S. (eds.) Applications of Data Mining in Computer Security. ADIS, vol. 6, pp. 77–101. Springer, Boston, MA (2002). https://doi.org/10.1007/978-1-4615-0953-0_4
Gregorutti, B., Michel, B., Saint-Pierre, P.: Correlation and variable importance in random forests. Stat. Comput. 27(3), 659–678 (2017)
Gupta, B., Rawat, A., Jain, A., Arora, A., Dhami, N.: Analysis of various decision tree algorithms for classification in data mining. Int. J. Comput. Appl. 163(8), 15–19 (2017)
Guyon, I., Weston, J., Barnhill, S., Vapnik, V.: Gene selection for cancer classification using support vector machines. Mach. Learn. 46, 389–422 (2002)
Javitz, H.S., Valdes, A.: The SRI IDES statistical anomaly detector. In: Proceedings of IEEE Computer Society Symposium on Research in Security and Privacy, Oakland, CA, USA, pp. 316–326 (1991)
Jiang, Y., Metz, C.E., Nishikawa, R.M.: A receiver operating characteristic partial area index for highly sensitive diagnostic tests. Radiology 201(3), 745–750 (1996)
Killourhy, K., Maxion, R.: Why did my detector do That?! In: Jha, S., Sommer, R., Kreibich, C. (eds.) RAID 2010. LNCS, vol. 6307, pp. 256–276. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15512-3_14
Koerner, B.I.: Inside the cyberattack that shocked the US government, October 2016. https://www.wired.com/2016/10/inside-cyberattack-shocked-us-government/. Accessed 21 Mar 2018
Kuo, Y., Huang, S.S.: Detecting stepping-stone connection using association rule mining. In: Proceedings of International Conference on Availability, Reliability, and Security, Fukuoka, pp. 90–97 (2009)
Lunt, T.F.: A survey of intrusion detection techniques. Comput. Secur. 12, 405–418 (1993)
Ma, H., Bandos, A.I., Gur, D.: On the use of partial area under the ROC curve for comparison of two diagnostic tests. Biometrical J. 57, 304–320 (2015)
Newman, L.H.: How to protect yourself from that massive Equifax breach, September 2017. https://www.wired.com/story/how-to-protect-yourself-from-that-massive-equifax-breach/. Accessed 21 Mar 2018
Provost, F., Fawcett, T., Kohavi, R.: The case against accuracy estimation for comparing induction algorithms. In: Proceedings of the Fifteenth International Conference on Machine Learning, pp. 445–453 (1998)
Pusara, M., Brodley, C.E.: User re-authentication via mouse movements. In: Proceedings of ACM Workshop on Visualization and Data Mining Computer Security (VizSEC/DMSEC), pp. 1–8 (2004)
Quinlan, J.R.: Induction of decision trees. Mach. Learn. 1(1), 81–106 (1986)
Salem, M.B., Stolfo, S.J.: Modeling user search behavior for masquerade detection. In: Sommer, R., Balzarotti, D., Maier, G. (eds.) RAID 2011. LNCS, vol. 6961, pp. 181–200. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-23644-0_10
Salzberg, S.L.: C4.5: programs for machine learning by J. Ross Quinlan. Morgan Kaufmann Publishers, Inc., 1993. Mach. Learn. 16(3), 235–240 (1994)
Schonlau, M., DuMouchel, W., Ju, W.-H., Karr, A.F., Theus, M., Vardi, Y.: Computer intrusion: detecting masquerades. Stat. Sci. 16(1), 58–74 (2001)
Stolfo, S.J., Hershkop, S., Bui, L.H., Ferster, R., Wang, K.: Anomaly detection in computer security and an application to file system accesses. In: Hacid, M.-S., Murray, N.V., Raś, Z.W., Tsumoto, S. (eds.) ISMIS 2005. LNCS (LNAI), vol. 3488, pp. 14–28. Springer, Heidelberg (2005). https://doi.org/10.1007/11425274_2
Strobl, C., Boulesteix, A.L., Kneib, T., Augustin, T., Zeileis, A.: Conditional variable importance for random forests. BMC Bioinf. 9(1), 307 (2008)
Wu, H., Huang, S.S.: User behavior analysis in masquerade detection using principal component analysis. In: Proceedings of 8th International Conference on Intelligent Systems Design and Applications (ISDA), pp. 201–206 (2008)
Yang, J., Huang, S.S.: Mining TCP/IP packets to detect stepping-stone intrusion. Comput. Secur. 26(7), 479–484 (2007)
Yuill, J., Zappe, M., Denning, D., Feer, F.: Honeyfiles: deceptive files for intrusion detection. In: Proceedings of the 5th Annual IEEE SMC Information Assurance Workshop (IAW 2004), pp. 116–122 (2004)
Zanero, S.: Behavioral intrusion detection. In: Aykanat, C., Dayar, T., Körpeoğlu, İ. (eds.) ISCIS 2004. LNCS, vol. 3280, pp. 657–666. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-30182-0_66
Zhang, F., Wang, Y., Wang, H.: Gradient correlation: are ensemble classifiers more robust against evasion attacks in practical settings? In: Hacid, H., Cellary, W., Wang, H., Paik, H.-Y., Zhou, R. (eds.) WISE 2018. LNCS, vol. 11233, pp. 96–110. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-02922-7_7
Acknowledgment
We would like to thank Raúl Monroy for creating and sharing the WUIL dataset [6]. This work was supported in part by the National Science Foundation (NSF) under grants NSF-1659755, NSF-1433817, and NSF-1356705.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Huang, SH.S., Cao, Z., Raines, C.E., Yang, M.N., Simon, C. (2019). Detecting Intruders by User File Access Patterns. In: Liu, J., Huang, X. (eds) Network and System Security. NSS 2019. Lecture Notes in Computer Science, vol 11928. Springer, Cham. https://doi.org/10.1007/978-3-030-36938-5_19
DOI: https://doi.org/10.1007/978-3-030-36938-5_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36937-8
Online ISBN: 978-3-030-36938-5
eBook Packages: Computer Science, Computer Science (R0)