Abstract
Building effective malware analysis techniques and evaluating new detection tools requires up-to-date datasets that reflect the current Android malware landscape. To be maximally useful, such datasets need to contain reliable and complete information on malware’s behaviors and on the techniques used in its malicious activities, and they should cover a large number of malware types. The Android Malware Genome, created circa 2011, has been the only well-labeled and widely studied dataset to which the research community had easy access. (As of 12/21/2015, the Genome authors have stopped sharing the dataset due to resource limitations.) Not only is it outdated and no longer representative of the current Android malware landscape, it also does not describe malware’s behaviors at the level of detail needed for research. A high-quality Android malware dataset is therefore urgently needed. While existing information sources such as VirusTotal are useful, deep manual analysis is indispensable for obtaining accurate and detailed information on malware behaviors. In this work we present our approach to preparing a large Android malware dataset for the research community. We leverage existing anti-virus scan results and automation techniques to categorize our large dataset (24,650 malware app samples) into 135 varieties (based on malware behavioral semantics), which belong to 71 malware families. For each variety, we select three samples as representatives, for a total of 405 malware samples, and conduct in-depth manual analysis on them. Based on the results of the manual analysis, we generate detailed descriptions of each malware variety’s behaviors and include them in our dataset. We also report our observations on the current landscape of Android malware as depicted in the dataset. Furthermore, we present detailed documentation of the process used in creating the dataset, including the guidelines for the manual analysis.
We make our Android malware dataset available to the research community.
Notes
1. Some malware can get past Google’s vetting system and end up in Google Play.
2. Tool website: http://pag.arguslab.org/argus-saf.
3. Tool website: http://pag.arguslab.org/argus-cit.
4. Table 2 aggregates the behaviors over all malware varieties in a family. The more specific per-variety breakdown can be found at our Android malware website.
Acknowledgment
This research is partially supported by the U.S. National Science Foundation under Grant No. 1622402. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. This research is also partially supported by a David and Amy Fulton Grant received by co-author Roy at Bowling Green State University.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Wei, F., Li, Y., Roy, S., Ou, X., Zhou, W. (2017). Deep Ground Truth Analysis of Current Android Malware. In: Polychronakis, M., Meier, M. (eds) Detection of Intrusions and Malware, and Vulnerability Assessment. DIMVA 2017. Lecture Notes in Computer Science, vol 10327. Springer, Cham. https://doi.org/10.1007/978-3-319-60876-1_12
DOI: https://doi.org/10.1007/978-3-319-60876-1_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-60875-4
Online ISBN: 978-3-319-60876-1
eBook Packages: Computer Science, Computer Science (R0)