DOI: 10.1145/3387905.3388596

Experimental comparison of features and classifiers for Android malware detection

Published: 07 October 2020

Abstract

The Android platform has dominated the smartphone market for years and has consequently attracted a lot of attention from attackers. Malicious apps (malware) pose a serious threat to the security and privacy of Android smartphone users. Available machine learning approaches to detecting mobile malware rely on features extracted with static or dynamic analysis techniques. Different types of machine learning classifiers (such as support vector machine and random forest) and deep learning classifiers (based on deep neural networks) are then trained on the extracted features to produce models that can be used to detect mobile malware. Commonly analyzed features include permissions requested or used, frequency of API calls, use of API calls, and sequence of API calls. API calls are analyzed at various granularity levels, such as method, class, package, and family.
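To make the feature types named above concrete, the following minimal Python sketch derives "use", "frequency", and "sequence" features from a toy list of API calls and abstracts each call to the method, class, package, or family level. It is an illustration only, not the authors' extraction pipeline: the function names, the toy call list, the vocabulary, and the reading of "family" as the first package component are all assumptions made for this example.

```python
from collections import Counter


def abstract_call(api_call, level):
    """Map a fully qualified call ('package.Class.method') to a granularity level.

    Assumption: 'family' is taken to be the first package component.
    """
    parts = api_call.split(".")
    if level == "method":
        return api_call
    if level == "class":
        return ".".join(parts[:-1])
    if level == "package":
        return ".".join(parts[:-2])
    if level == "family":
        return parts[0]
    raise ValueError(f"unknown level: {level}")


def use_features(calls, vocabulary, level="class"):
    """Binary vector: 1 if the vocabulary item is used at least once."""
    seen = {abstract_call(c, level) for c in calls}
    return [1 if item in seen else 0 for item in vocabulary]


def frequency_features(calls, vocabulary, level="class"):
    """Count vector: how often each vocabulary item is called."""
    counts = Counter(abstract_call(c, level) for c in calls)
    return [counts[item] for item in vocabulary]


def sequence_features(calls, index, level="method"):
    """Ordered list of integer ids; calls missing from the index map to 0."""
    return [index.get(abstract_call(c, level), 0) for c in calls]


# Hypothetical call trace of one app (illustrative, not taken from the benchmark).
calls = [
    "android.telephony.TelephonyManager.getDeviceId",
    "java.net.URL.openConnection",
    "android.telephony.SmsManager.sendTextMessage",
    "android.telephony.SmsManager.sendTextMessage",
]
class_vocab = [
    "android.telephony.SmsManager",
    "android.telephony.TelephonyManager",
    "java.net.URL",
    "java.io.File",
]
method_index = {
    "android.telephony.TelephonyManager.getDeviceId": 1,
    "java.net.URL.openConnection": 2,
    "android.telephony.SmsManager.sendTextMessage": 3,
}

print(use_features(calls, class_vocab))        # [1, 1, 1, 0]
print(frequency_features(calls, class_vocab))  # [2, 1, 1, 0]
print(sequence_features(calls, method_index))  # [1, 2, 3, 3]
```

The "use" and "frequency" vectors have a fixed length set by the vocabulary, which suits conventional classifiers, while the "sequence" representation keeps call order and varies in length from app to app.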
Given the variety of proposed classifiers, feature types, and underlying analyses used for feature extraction, there is a need for a comprehensive evaluation of the effectiveness of current state-of-the-art malware detection approaches on a common benchmark. In this work, we provide a baseline comparison of several conventional machine learning classifiers and deep learning classifiers, without fine-tuning. We also evaluate different types of features that characterize the use of API calls at class level and the sequence of API calls at method level. Features were extracted from a common benchmark of 4572 benign samples and 2399 malware samples, using both static analysis and dynamic analysis.
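The benchmark sizes above imply a class imbalance that is worth keeping in mind when reading any baseline comparison. The sketch below, which uses only the two sample counts stated in the abstract, computes the accuracy of a trivial predictor that labels every app as benign, and shows why precision, recall, and F1 on the malware class are more informative than accuracy alone. The helper names are hypothetical, and the abstract does not say which metrics the authors report.

```python
BENIGN, MALWARE = 4572, 2399  # sample counts stated in the abstract


def majority_baseline_accuracy(benign, malware):
    """Accuracy of always predicting the larger class."""
    return max(benign, malware) / (benign + malware)


def precision_recall_f1(tp, fp, fn):
    """Metrics for the positive (malware) class from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


print(BENIGN + MALWARE)                                       # 6971
print(round(majority_baseline_accuracy(BENIGN, MALWARE), 4))  # 0.6559

# An "always benign" predictor finds no malware at all:
# no true positives, no false positives, every malware sample missed.
print(precision_recall_f1(tp=0, fp=0, fn=MALWARE))            # (0.0, 0.0, 0.0)
```

A classifier on this benchmark therefore needs to exceed roughly 65.6% accuracy just to beat the trivial baseline, and that baseline scores zero on malware recall.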
Among other interesting findings, we observed that classifiers trained on the use of API calls generally perform better than those trained on the sequence of API calls. Classifiers trained on static analysis-based features perform better than those trained on dynamic analysis-based features. Deep learning classifiers, despite their sophistication, are not necessarily better than conventional classifiers, especially when they are not optimized. However, deep learning classifiers do perform better than conventional classifiers when trained on dynamic analysis-based features.

Published In

MOBILESoft '20: Proceedings of the IEEE/ACM 7th International Conference on Mobile Software Engineering and Systems
July 2020
158 pages
ISBN: 9781450379595
DOI: 10.1145/3387905
  • General Chair: David Lo
  • Program Chairs: Leonardo Mariani, Ali Mesbah
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

In-Cooperation

  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Android
  2. deep learning
  3. machine learning
  4. malware detection

Qualifiers

  • Research-article

Funding Sources

  • National Research Foundation Singapore

Conference

MOBILESoft '20

Article Metrics

  • Downloads (last 12 months): 38
  • Downloads (last 6 weeks): 5

Reflects downloads up to 30 Dec 2024.

Cited By

  • (2024) SeGDroid: An Android malware detection method based on sensitive function call graph learning. Expert Systems with Applications, 235 (121125). DOI: 10.1016/j.eswa.2023.121125. Online publication date: Jan-2024.
  • (2023) Mitigating Malware Attacks using Machine Learning: A Review. 2023 International Conference on Artificial Intelligence and Smart Communication (AISC), pages 1032-1038. DOI: 10.1109/AISC56616.2023.10085630. Online publication date: 27-Jan-2023.
  • (2023) Experimental comparison of features, analyses, and classifiers for Android malware detection. Empirical Software Engineering, 28:6. DOI: 10.1007/s10664-023-10375-y. Online publication date: 26-Sep-2023.
  • (2022) Systematic Review on Various Techniques of Android Malware Detection. Computing Science, Communication and Security, pages 82-99. DOI: 10.1007/978-3-031-10551-7_7. Online publication date: 2-Jul-2022.
  • (2021) Empirical Evaluation of Minority Oversampling Techniques in the Context of Android Malware Detection. 2021 28th Asia-Pacific Software Engineering Conference (APSEC), pages 349-359. DOI: 10.1109/APSEC53868.2021.00042. Online publication date: Dec-2021.
  • (2021) A first look at Android applications in Google Play related to COVID-19. Empirical Software Engineering, 26:4. DOI: 10.1007/s10664-021-09943-x. Online publication date: 21-Apr-2021.
