Abstract
Context: Obfuscation is a common technique used to protect software against malicious reverse engineering. Obfuscators manipulate the source code to make it harder to analyze and more difficult for an attacker to understand. Although different obfuscation algorithms and implementations are available, they have never been directly compared in a large-scale study.
Aim: This paper aims at evaluating and quantifying the effect of several different obfuscation implementations (both open source and commercial), to help developers and project managers to decide which algorithms to use.
Method: In this study we applied 44 obfuscations to 18 subject applications covering a total of 4 million lines of code. The effectiveness of these source code obfuscations has been measured using 10 code metrics, considering modularity, size and complexity of code.
Results: Results show that some of the considered obfuscations are effective in making code metrics change substantially from original to obfuscated code, although this change (called the potency of the obfuscation) differs across metrics. In the paper we recommend which obfuscations to select, given the security requirements of the software to be protected.
Notes
On the downside, Sandmark is quite old and it cannot handle the newest Java constructs, from Java version 1.5 onwards.
Most of these switches are self-explanatory, but http://www.zelix.com/Klassmaster/docs/obfuscateStatement.html provides a full description.
As provided in the “Recently updated” section of the Java applications, http://sourceforge.net/directory/language:java/os:linux/freshness:recently-updated/.
Available at http://www.spinellis.gr/sw/ckjm/
A detailed analysis, not reported for reasons of space, shows that the majority of them differ from each other.
As suggested by Collberg et al. (2003), we use the potency to measure the magnitude of the difference of a specific metric between clear and obfuscated code.
Available obfuscation tools include ProGuard, yGuard, JODE, JavaGuard, RetroGuard, jarg, etc.
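The potency mentioned in the notes above can be illustrated with a minimal sketch. It assumes the relative-change form E(P')/E(P) - 1 given in the taxonomy of Collberg, Thomborson and Low (1997), where E is a metric, P the clear program and P' the obfuscated one; the exact formula used in the paper is not reproduced in this section. The function name, metric names and values below are hypothetical and serve only as an illustration.

```python
def potency(clear_value: float, obfuscated_value: float) -> float:
    """Relative change of one metric from clear to obfuscated code.

    Assumed form: E(P') / E(P) - 1. A positive result means the metric
    grew after obfuscation; a negative result means it shrank.
    """
    if clear_value == 0:
        raise ValueError("potency is undefined when the clear-code metric is 0")
    return obfuscated_value / clear_value - 1.0


# Hypothetical metric values, invented for illustration (not data from the study).
clear = {"LOC": 1000, "CC": 4.0}
obfuscated = {"LOC": 1500, "CC": 3.0}

# One potency value per metric, computed independently.
potencies = {name: potency(clear[name], obfuscated[name]) for name in clear}
```

Computing potency per metric keeps visible the observation from the abstract that the same obfuscation can change different metrics by different amounts, and even in opposite directions.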
References
Anckaert B, Madou M, De Sutter B, De Bus B, De Bosschere K, Preneel B (2007) Program obfuscation: a quantitative approach. In: Proceedings of the 2007 ACM workshop on quality of protection, QoP ’07, pp 15–20. ACM, New York, NY, USA. doi:10.1145/1314257.1314263
Basili V, Briand L, Melo W (1996) A validation of object-oriented design metrics as quality indicators. IEEE Trans Softw Eng 22(10):751–761
Ceccato M, Capiluppi A, Falcarin P, Boldyreff C (2013) A large study on the effect of code obfuscation on the quality of java code: Detailed analysis of data. Tech. rep., FBK, TR-FBK-SE-2013-3. http://se.fbk.eu/sites/se.fbk.eu/files/TR-FBK-SE-2013-3.pdf
Ceccato M, Di Penta M, Nagra J, Falcarin P, Ricca F, Torchiano M, Tonella P (2008) Towards experimental evaluation of code obfuscation techniques. In: Proceedings of the 4th ACM workshop on quality of protection, QoP ’08, pp 39–46. ACM, New York, NY, USA. doi:10.1145/1456362.1456371
Ceccato M, Di Penta M, Falcarin P, Ricca F, Torchiano M, Tonella P (2013) A family of experiments to assess the effectiveness and efficiency of source code obfuscation techniques. Empir Softw Eng, pp 1–35. doi:10.1007/s10664-013-9248-x
Ceccato M, Di Penta M, Nagra J, Falcarin P, Ricca F, Torchiano M, Tonella P (2009) The effectiveness of source code obfuscation: An experimental assessment. In: ICPC. IEEE Computer Society, pp 178–187
Chidamber SR, Kemerer CF (1994) A metrics suite for object oriented design. IEEE Trans Softw Eng 20:476–493. doi:10.1109/32.295895. http://dl.acm.org/citation.cfm?id=630808.631131
Cohen FB (1993) Operating system protection through program evolution. Comput Secur 12:565–584. doi:10.1016/0167-4048(93)90054-9. http://dl.acm.org/citation.cfm?id=179007.179012
Collberg C, Myles G, Huntwork A (2003) Sandmark–a tool for software protection research. IEEE Secur Priv 1:40–49. doi:10.1109/MSECP.2003.1219058. http://dl.acm.org/citation.cfm?id=939830.939941
Collberg C, Thomborson C, Low D (1997) A taxonomy of obfuscating transformations. Tech Rep 148. http://www.cs.auckland.ac.nz/%7Ecollberg/Research/Publications/CollbergThomborsonLow97a/index.html
Collberg CS, Thomborson C (2002) Watermarking, tamper-proofing, and obfuscation: tools for software protection. IEEE Trans Softw Eng 28:735–746. doi:10.1109/TSE.2002.1027797. http://dl.acm.org/citation.cfm?id=636196.636198
Falcarin P, Collberg C, Atallah M, Jakubowski M (2011) Guest editors’ introduction: software protection. IEEE Softw 28:24–27. doi:10.1109/MS.2011.34
Goto H, Mambo M, Matsumura K, Shizuya H (2000) An approach to the objective and quantitative evaluation of tamper-resistant software. In: Proceedings of the third international workshop on information security, ISW ’00. Springer-Verlag, London, UK, pp 82–96. http://dl.acm.org/citation.cfm?id=648024.744206
Heffner K, Collberg C (2004) The obfuscation executive. In: Information security. Springer, pp 428–440
Hosking AL, Nystrom N, Whitlock D, Cutts Q, Diwan A (2001) Partial redundancy elimination for access path expressions. Software: practice and experience, vol 31. doi:10.1002/spe.371
Jakubowski MH, Saw CW, Venkatesan R (2009) Iterated transformations and quantitative metrics for software protection. In: SECRYPT
Jureczko M, Spinellis D (2010) Using object-oriented design metrics to predict software defects, monographs of system dependability, vol. models and methodology of system dependability, pp. 69–81. Oficyna Wydawnicza Politechniki Wroclawskiej, Wroclaw, Poland
Karnick M, MacBride J, McGinnis S, Tang Y, Ramachandran R (2006) A qualitative analysis of java obfuscation. In: Proceedings of 10th IASTED international conference on software engineering and applications, Dallas TX, USA
Kouznetsov P Jad - the fast JAva Decompiler. http://www.kpdus.com/jad.html
Linn C, Debray S (2003) Obfuscation of executable code to improve resistance to static disassembly. In: Proceedings of the 10th ACM conference on computer and communications security, CCS ’03, pp 290–299. ACM, New York, NY, USA. doi:10.1145/948109.948149
Lv Z, Ri S, Uhvhdufk DE, Dw D, Wkh Y, Ri X, Srsxodu W, Zrun QDS, Vkrzhg ZH (2005) On the relationship between cyclomatic complexity and oo ness, 9th ECOOP workshop on quantitative approaches in ObjectOriented software engineering
McCabe TJ (1976) A complexity measure. IEEE Trans Softw Eng SE-2(4):308–320
Sheskin D (2007) Handbook of parametric and nonparametric statistical procedures (4th Ed.). Chapman & Hall
Simon F, Steinbrückner F, Lewerentz C (2001) Metrics based refactoring. In: Proceedings of the Fifth European Conference on software maintenance and Reengineering, CSMR ’01, pp. 30-. IEEE Computer Society, Washington, DC, USA. http://dl.acm.org/citation.cfm?id=794203.795287
Sutherland I, Kalb GE, Blyth A, Mulley G (2006) An empirical examination of the reverse engineering process for binary files. Comput & Secur 25(3):221–228
Udupa SK, Debray SK, Madou M (2005) Deobfuscation: reverse engineering obfuscated code. In: Proceedings of the 12th Working conference on reverse engineering, pp 45–54. IEEE Computer Society, Washington, DC, USA. http://dl.acm.org/citation.cfm?id=1107841.1108171
Vasa R, Schneider JG (2003) Evolution of cyclomatic complexity in object oriented software. In: Proceedings of 7th ECOOP workshop on quantitative approaches in object-oriented software engineering, QAOOSE, vol 03. http://www.it.swin.edu.au/personal/jschneider/Pub/qaoose03.pdf
Visaggio CA, Pagin GA, Canfora G (2013) An empirical study of metric-based methods to detect obfuscated code. Int J Secur & Appl 7(2):59
Wohlin C, Runeson P, Höst M, Ohlsson M, Regnell B, Wesslén A (2000) Experimentation in software engineering - an introduction, Kluwer Academic Publishers
Wyseur B (2009) White-box cryptography. Ph.D. thesis, Katholieke Universiteit Leuven. http://www.cosic.esat.kuleuven.be/publications/talk-98.pdf
Zeng Y, Liu F, Luo X, Yang C (2011) Software watermarking through obfuscated interpretation: Implementation and analysis. J Multimed 6(4):329–339
Acknowledgments
The authors would like to thank Marco Torchiano for the interesting discussion on the analysis procedure and the Zelix Klassmaster™ developers for the full evaluation copy of their tool and the feedback provided.
Additional information
Communicated by: Andrea De Lucia
About this article
Cite this article
Ceccato, M., Capiluppi, A., Falcarin, P. et al. A large study on the effect of code obfuscation on the quality of java code. Empir Software Eng 20, 1486–1524 (2015). https://doi.org/10.1007/s10664-014-9321-0
DOI: https://doi.org/10.1007/s10664-014-9321-0