Abstract
Android malware detection has been an active area of research. In the past decade, several machine learning-based approaches based on different types of features that may characterize Android malware behaviors have been proposed. The usually-analyzed features include API usages and sequences at various abstraction levels (e.g., class and package), extracted using static or dynamic analysis. Additionally, features that characterize permission uses, native API calls and reflection have also been analyzed. Initial works used conventional classifiers such as Random Forest to learn on those features. In recent years, deep learning-based classifiers such as Recurrent Neural Network have been explored. Considering various types of features, analyses, and classifiers proposed in literature, there is a need of comprehensive evaluation on performances of current state-of-the-art Android malware classification based on a common benchmark. In this study, we evaluate the performance of different types of features and the performance between a conventional classifier, Random Forest (RF) and a deep learning classifier, Recurrent Neural Network (RNN). To avoid temporal and spatial biases, we evaluate the performances in a time- and space-aware setting in which classifiers are trained with older apps and tested on newer apps, and the distribution of test samples is representative of in-the-wild malware-to-benign ratio. Features are extracted from a common benchmark of 7,860 benign samples and 5,912 malware, whose release years span from 2010 to 2020. Among other findings, our study shows that permission use features perform the best among the features we investigated; package-level features generally perform better than class-level features; static features generally perform better than dynamic features; and RNN classifier performs better than RF classifier when trained on sequence-type features
Similar content being viewed by others
Notes
our crawling tool is available in https://github.com/Jesper20/msoftx
note that package level features and class level features result in distinct datasets.
\(L_{cg}\)=85000, \(L_{tr}\)=21000
To cross validate the results, we also ran Logistic Regression and Support Vector Machines for one of the datasets. The results are briefly discussed in Section 4.3.
Reasonable range is determined according to the time budget of 5 hours.
if the size of malware samples does not amount to 18% (which is the case for year 2018 dataset), we use all available malware instances without sampling.
In MaMaDroid’s experiments Onwuzurike et al. (2019), class-level and package-level features produced comparable performance
using feature importance library in Scikit-learn
References
Aafer Y, Du W, Yin H (2013) Droidapiminer: Mining api-level features for robust malware detection in android. In: International conference on security and privacy in communication systems, pp. 86–103. Springer
Afonso VM, de Amorim MF, Grégio ARA, Junquera GB, de Geus PL (2015) Identifying android malware using dynamically obtained features. Journal of Computer Virology and Hacking Techniques 11(1):9–17
Akiba T, Sano S, Yanase T, Ohta T, Koyama M (2019) Optuna: A next-generation hyperparameter optimization framework. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, pp. 2623–2631
Allix K, Bissyandé TF, Jérome Q, Klein J, Le Traon Y et al (2016) Empirical assessment of machine learning-based malware detectors for android. Empirical Software Engineering 21(1):183–211
Allix K, Bissyandé TF, Klein J, Le Traon Y (2015) Are your training datasets yet relevant? In: International Symposium on Engineering Secure Software and Systems, pp. 51–67. Springer
Allix K, Bissyandé TF, Klein J, Le Traon Y (2016) Androzoo: Collecting millions of android apps for the research community. In: Proceedings of the 13th International Conference on Mining Software Repositories, pp. 468–471. ACM
Alshahrani H, Mansourt H, Thorn S, Alshehri A, Alzahrani A, Fu H (2019) Ddefender: Android application threat detection using static and dynamic analysis. In: 2018 IEEE International Conference on Consumer Electronics (ICCE), pp. 1–6. IEEE (2018)
Android (2019) UI/Application Exerciser Monkey. https://developer.android.com/studio/test/monkey
Arp D, Spreitzenbarth M, Hubner M, Gascon H, Rieck K, Siemens C (2014) Drebin: Effective and explainable detection of android malware in your pocket. Ndss 14:23–26
Arzt S, Rasthofer S, Fritz C, Bodden E, Bartel A, Klein J, Le Traon Y, Octeau D, McDaniel P (2014) Flowdroid: Precise context, flow, field, object-sensitive and lifecycle-awaretaint analysis for Android apps. In: Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI’14, pp. 259-269. ACM, New York, NY, USA. https://doi.org/10.1145/2594291.2594299
Au KWY, Zhou YF, Huang Z, Lie D (2012) Pscout: analyzing the Android permission specification. In: Proceedings of the 2012 ACM conference on Computer and communications security, pp. 217–228. ACM
Bai Y, Xing Z, Li X, Feng Z, Ma D (2020) Unsuccessful story about few shot malware family classification and siamese network to the rescue. In: 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE), pp. 1560–1571. IEEE
Barandiaran I (1998) The random subspace method for constructing decision forests. IEEE Trans Pattern Anal Mach Intell 20(8):1–22
Bläsing T, Batyuk L, Schmidt AD, Camtepe SA, Albayrak S (2010) An android application sandbox system for suspicious software detection. In: 2010 5th International Conference on Malicious and Unwanted Software, pp. 55–62. IEEE
Cai H (2020) Assessing and improving malware detection sustainability through app evolution studies. ACM Transactions on Software Engineering and Methodology (TOSEM) 29(2):1–28
Chan PP, Song WK (2014) Static detection of android malware by using permissions and api calls. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 1, pp. 82–87. IEEE
Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP (2002) Smote: synthetic minority over-sampling technique. Journal of artificial intelligence research 16:321–357
Chen S, Xue M, Tang Z, Xu L, Zhu H (2016) Stormdroid: A streaminglized machine learning-based system for detecting android malware. In: Proceedings of the 11th ACM on Asia Conference on Computer and Communications Security, pp. 377–388
Choudhary SR, Gorla A, Orso A (2015) Automated test input generation for android: Are we there yet?(e). In: 2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 429–440. IEEE
Demissie BF, Ceccato M, Shar LK (2018) Anflo: Detecting anomalous sensitive informa41 tion flows in android apps. In: 2018 IEEE/ACM 5th International Conference on Mobile Software Engineering and Systems (MOBILESoft), pp. 24–34. IEEE
Demissie BF, Ceccato M, Shar LK (2020) Security analysis of permission re-delegation vulnerabilities in android apps. Empir Softw Eng 25(6):5084–5136
Deng L, Yu D et al (2014) Deep learning: methods and applications. Foundations and Trends® in Signal Processing 7(3–4):197–387
Dini G, Martinelli F, Saracino A, Sgandurra D (2012) Madam: a multi-level anomaly detec tor for android malware. In: International Conference on Mathematical Methods, Models, and Architectures for Computer Network Security, pp. 240–253. Springer
Enck W, Ongtang M, McDaniel P (2009) On lightweight mobile phone application certifi51 cation. In: Proceedings of the 16th ACM conference on Computer and communications security, pp. 235–245. ACM
Eskandari M, Hashemi S (2012) A graph mining approach for detecting unknown malwares. J Vis Lang & Comput 23(3):154–162
Fan M, Liu J, Luo X, Chen K, Chen T, Tian Z, Zhang X, Zheng Q, Liu T (2016) Frequent subgraph based familial classification of android malware. In: 2016 IEEE 27th International Symposium on Software Reliability Engineering (ISSRE), pp. 24–35. IEEE
Fu X, Cai H (2019) On the deterioration of learning-based malware detectors for android. In: 2019 IEEE/ACM 41st International Conference on Software Engineering: Companion Proceedings (ICSE-Companion), pp. 272–273. IEEE
Garcia J, Hammad M, Malek S (2018) Lightweight, obfuscation-resilient detection and family identification of android malware. ACM Transactions on Software Engineering and Methodology (TOSEM) 26(3):11
Grover A, Leskovec J (2016) node2vec: Scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 855–864
Huang CY, Tsai YT, Hsu CH (2013) Performance evaluation on permission-based detection for android malware. In: Advances in Intelligent Systems and Applications-Volume 2, pp.111–120. Springer
Huang L, Joseph AD, Nelson B, Rubinstein BI, Tygar JD (2011) Adversarial machine learning. In: Proceedings of the 4th ACM workshop on Security and artificial intelligence, pp. 43–58
Ikram M, Beaume P, Kaafar MA (2019) Dadidroid: An obfuscation resilient tool for detecting android malware via weighted directed call graph modelling. arXiv:1905.09136
Karbab EB, Debbabi M, Derhab A, Mouheb D (2018) Maldozer: Automatic framework for android malware detection using deep learning. Digital Investigation 24:S48–S59
Kim T, Kang B, Rho M, Sezer S, Im EG (2018) A multimodal deep learning method for android malware detection using various features. IEEE Transactions on Information Forensics and Security 14(3):773–788
Lindorfer M, Neugschwandtner M, Platzer C (2015) Marvin: Efficient and comprehensive mobile app classification through static and dynamic analysis. In: 2015 IEEE 39th annual computer software and applications conference, vol. 2, pp. 422–433. IEEE
Lindorfer M, Neugschwandtner M, Weichselbaum L, Fratantonio Y, Van Der Veen V, Platzer C (2014) Andrubis-1,000,000 apps later: A view on current android malware behaviors. In: 2014 third international workshop on building analysis datasets and gathering experience returns for security (BADGERS), pp. 3–17. IEEE
Liu X, Liu J (2014) A two-layered permission-based android malware detection scheme. In: 2014 2nd IEEE International Conference on Mobile Cloud Computing, Services, and Engineering, pp. 142–148. IEEE
Liu Y, Tantithamthavorn C, Li L, Liu Y (2022) Deep learning for android malware defenses: a systematic literature review. ACM Journal of the ACM (JACM)
Liu Z, Xia X, Hassan AE, Lo D, Xing Z, Wang X (2018) Neural-machine-translation based commit message generation: how far are we? In: Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, pp. 373–384
Ma Z, Ge H, Liu Y, Zhao M, Ma J (2019) A combination method for android malware detection based on control flow graphs and machine learning algorithms. IEEE access 7:21235–21245
McLaughlin N, Martinez del Rincon J, Kang B, Yerima S, Miller P, Sezer S, Safaei Y, Trickel E, Zhao Z, Doupé A et al (2017) Deep android malware detection. In: Proceedings of the Seventh ACM on Conference on Data and Application Security and Privacy, pp.301–308. ACM
Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J (2013) Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26
Narayanan A, Chandramohan M, Venkatesan R, Chen L, Liu Y, Jaiswal S (2017) graph2vec: Learning distributed representations of graphs. arXiv:1707.05005
Narayanan A, Soh C, Chen L, Liu Y, Wang L (2018) apk2vec: Semi-supervised multi-view representation learning for profiling android applications. In: 2018 IEEE International Conference on Data Mining (ICDM), pp. 357–366. IEEE
Narudin FA, Feizollah A, Anuar NB, Gani A (2016) Evaluation of machine learning classifiers for mobile malware detection. Soft Computing 20(1):343–357
Naway A, Li Y (2018) A review on the use of deep learning in android malware detection. arXiv preprint arXiv:1812.10360
Onwuzurike L, Mariconti E, Andriotis P, Cristofaro ED, Ross G, Stringhini G (2019) Mamadroid: Detecting android malware by building markov chains of behavioral models (extended version). ACM Transactions on Privacy and Security (TOPS) 22(2):14
Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E (2011) Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12:2825–2830
Pendlebury F, Pierazzi F, Jordaney R, Kinder J, Cavallaro L et al (2019) Tesseract: Eliminating experimental bias in malware classification across space and time. In: Proceedings of the 28th USENIX Security Symposium, pp. 729–746. USENIX Association
Rastogi V, Chen Y, Jiang X (2013) Droidchameleon: evaluating android anti-malware against transformation attacks. In: Proceedings of the 8th ACM SIGSAC symposium on Information, computer and communications security, pp. 329–334
Sanz B, Santos I, Laorden C, Ugarte-Pedrero X, Bringas PG, Álvarez G (2013) Puma:Permission usage to detect malware in android. In: International Joint Conference CISIS’12-ICEUTE 12-SOCO 12 Special Sessions, pp. 289–298. Springer
Shahpasand M, Hamey L, Vatsalan D, Xue M (2019) Adversarial attacks on mobile malware detection. In: 2019 IEEE 1st International Workshop on Artificial Intelligence for Mobile (AI4Mobile), pp. 17–20. IEEE
Shar LK, Demissie BF, Ceccato M, Minn W (2020) Experimental comparison of features and classifiers for android malware detection. In: Proceedings of the IEEE/ACM 7th International Conference on Mobile Software Engineering and Systems, pp. 50–60. IEEE/ACM
Sharma A, Dash SK (2014) Mining api calls and permissions for android malware detection. In: International Conference on Cryptology and Network Security, pp. 191–205. Springer
Shen F, Del Vecchio J, Mohaisen A, Ko SY, Ziarek L (2018) Android malware detection using complex-flows. IEEE Transactions on Mobile Computing 18(6):1231–1245
Shi L, Ming J, Fu J, Peng G, Xu D, Gao K, Pan X (2020) Vahunt: Warding off new repackaged android malware in app-virtualization’s clothing. In: Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security, pp. 535–549
Soot (2018) Soot - a java optimization framework, https://github.com/sable/soot
Spreitzenbarth M, Freiling F, Echtler F, Schreck T, Hoffmann J (2013) Mobile-sandbox: having a deeper look into android applications. In: Proceedings of the 28th Annual ACM Symposium on Applied Computing, pp. 1808–1815
Suarez-Tangil G, Dash SK, Ahmadi M, Kinder J, Giacinto G, Cavallaro L (2017) Droidsieve: Fast and accurate classification of obfuscated android malware. In: Proceedings of the Seventh ACM on Conference on Data and Application Security and Privacy, pp.309–320
Symantec (2019) Internet Security Threat Report. https://www.symantec.com/content/dam/symantec/docs/reports/istr-24-2019-en.pdf
Thomé J, Shar LK, Bianculli D, Briand L (2017) Search-driven string constraint solving for vulnerability detection. In: 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE), pp. 198–208. IEEE
Tobiyama S, Yamaguchi Y, Shimada H, Ikuse T, Yagi T (2016) Malware detection with deep neural network using process behavior. In: 2016 IEEE 40th Annual Computer Software and Applications Conference (COMPSAC), vol. 2, pp. 577–582. IEEE
Tobiyama S, Yamaguchi Y, Shimada H, Ikuse T, Yagi T (2016) Malware detection with deep neural network using process behavior. In: 2016 IEEE 40th Annual Computer Software and Applications Conference (COMPSAC), vol. 2, pp. 577–582. IEEE
Wu B, Chen S, Gao C, Fan L, Liu Y, Wen W, Lyu MR (2021) Why an android app is classified as malware: Toward malware classification interpretation. ACM Transactions on Software Engineering and Methodology (TOSEM) 30(2):1–29
Wu DJ, Mao CH, Wei TE, Lee HM, Wu KP (2012) Droidmat: Android malware detection through manifest and api calls tracing. In: 2012 Seventh Asia Joint Conference on Information Security, pp. 62–69. IEEE
Xu B, Shirani A, Lo D, Alipour MA (2018) Prediction of relatedness in stack overflow: deep learning vs. svm: a reproducibility study. In: Proceedings of the 12th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, pp. 1–10
Xu K, Li Y, Deng R, Chen K, Xu J (2019) Droidevolver: Self-evolving android malware detection system. In: 2019 IEEE European Symposium on Security and Privacy (EuroSP), pp. 47–62. https://doi.org/10.1109/EuroSP.2019.00014
Xu K, Li Y, Deng RH, Chen K (2018) Deeprefiner: Multi-layer android malware detection system applying deep neural networks. In: 2018 IEEE European Symposium on Security and Privacy (EuroS &P), pp. 473–487. IEEE
Yang W, Prasad M, Xie T (2018) Enmobile: Entity-based characterization and analysis of mobile malware. In: 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE), pp. 384–394. IEEE
Yang X, Lo D, Li L, Xia X, Bissyandé TF, Klein J (2017) Characterizing malicious android apps by mining topic-specific data flow signatures. Information and Software Technology 90:27–39
Yerima SY, Sezer S, Muttik I (2015) High accuracy android malware detection using ensemble learning. IET Information Security 9(6):313–320
Yuan Z, Lu Y,Wang Z, Xue Y (2014) Droid-sec: deep learning in android malware detection. In: ACMSIGCOMMComputer Communication Review, vol. 44, pp. 371–372. ACM
Zhang M, Duan Y, Yin H, Zhao Z (2014) Semantics-aware android malware classification using weighted contextual api dependency graphs. In: Proceedings of the 2014 ACM SIGSAC conference on computer and communications security, pp. 1105–1116
Zhang X, Zhang Y, Zhong M, Ding D, Cao Y, Zhang Y, Zhang M, Yang M (2020) Enhancing state-of-the-art classifiers with api semantics to detect evolved android malware. In: Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security, pp. 757–770
Zhao Y, Li L, Wang H, Cai H, Bissyandé TF, Klein J, Grundy J (2021) On the impact of sample duplication in machine-learning-based android malware detection. ACM Transactions on Software Engineering and Methodology (TOSEM) 30(3):1–38
Zou D, Wu Y, Yang S, Chauhan A, Yang W, Zhong J, Dou S, Jin H (2021) Intdroid: Android malware detection based on api intimacy analysis. ACM Transactions on Software Engineering and Methodology (TOSEM) 30(3):1–32
Funding
The work of Lwin Khin Shar, Yan Naing Tun, Lingxiao Jiang, and David Lo is supported by the National Research Foundation, Singapore, and Cyber Security Agency of Singapore under its National Cybersecurity R &D Programme, National Satellite of Excellence in Mobile Systems Security and Cloud Security (NRF2018NCR-NSOE004-0001). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not reflect the views of National Research Foundation, Singapore and Cyber Security Agency of Singapore. The work of Mariano Ceccato is partially supported by project MIUR 2018-2022 “Dipartimenti di Eccellenza”.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflicts of interests
The authors declare that they have no conflict of interest.
Additional information
Communicated by: Jacques Klein.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Shar, L.K., Demissie, B.F., Ceccato, M. et al. Experimental comparison of features, analyses, and classifiers for Android malware detection. Empir Software Eng 28, 130 (2023). https://doi.org/10.1007/s10664-023-10375-y
Accepted:
Published:
DOI: https://doi.org/10.1007/s10664-023-10375-y