GLITCH: Automated Polyglot Security Smell Detection in Infrastructure as Code

Authors:

João F. Ferreira

ASE '22: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering

Article No.: 47, Pages 1 - 12

https://doi.org/10.1145/3551349.3556945

Published: 05 January 2023

Abstract

Infrastructure as Code (IaC) is the process of managing IT infrastructure via programmable configuration files (also called IaC scripts). Like other software artifacts, IaC scripts may contain security smells, which are coding patterns that can result in security weaknesses. Automated analysis tools to detect security smells in IaC scripts exist, but they focus on specific technologies such as Puppet, Ansible, or Chef. This means that when the detection of a new smell is implemented in one of the tools, it is not immediately available for the technologies supported by the other tools; the only option is to duplicate the effort.

This paper presents an approach that enables consistent security smell detection across different IaC technologies. We conduct a large-scale empirical study that analyzes security smells on three large datasets containing 196,755 IaC scripts and 12,281,251 LOC. We show that all categories of security smells are identified across all datasets and we identify some smells that might affect many IaC projects. To conduct this study, we developed GLITCH, a new technology-agnostic framework that enables automated polyglot smell detection by transforming IaC scripts into an intermediate representation, on which different security smell detectors can be defined. GLITCH currently supports the detection of nine different security smells in scripts written in Ansible, Chef, or Puppet. We compare GLITCH with state-of-the-art security smell detectors. The results obtained not only show that GLITCH can reduce the effort of writing security smell analyses for multiple IaC technologies, but also that it has higher precision and recall than the current state-of-the-art tools.
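
The idea of defining detectors once, on an intermediate representation, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not GLITCH's actual API or representation: the `UnitBlock`, `Attribute`, and `Smell` types, the two toy front ends (which only read simple key/value lines and are not real Ansible or Puppet parsers), and the two example rules, named here after smells commonly discussed for IaC scripts (hard-coded secrets and HTTP without TLS).

```python
import re
from dataclasses import dataclass, field


@dataclass
class Attribute:
    """One key/value setting found in a script."""
    name: str
    value: str
    line: int


@dataclass
class UnitBlock:
    """Hypothetical technology-neutral view of one IaC script."""
    path: str
    tech: str  # e.g. "ansible" or "puppet"
    attributes: list[Attribute] = field(default_factory=list)


@dataclass(frozen=True)
class Smell:
    code: str
    path: str
    line: int


def parse_ansible_like(path, text):
    """Toy front end: reads `key: value` lines (not a real YAML parser)."""
    unit = UnitBlock(path, "ansible")
    for n, raw in enumerate(text.splitlines(), start=1):
        m = re.match(r"\s*([\w.]+):\s*(\S.*)$", raw)
        if m:
            unit.attributes.append(Attribute(m.group(1), m.group(2).strip("\"'"), n))
    return unit


def parse_puppet_like(path, text):
    """Toy front end: reads `key => value,` lines (not a real Puppet parser)."""
    unit = UnitBlock(path, "puppet")
    for n, raw in enumerate(text.splitlines(), start=1):
        m = re.match(r"\s*(\w+)\s*=>\s*(.+?),?\s*$", raw)
        if m:
            unit.attributes.append(Attribute(m.group(1), m.group(2).strip("\"'"), n))
    return unit


# Illustrative heuristic only: attribute names that suggest a credential.
SECRET_NAME = re.compile(r"(pass(word|wd)?|secret|token|api_?key)", re.IGNORECASE)


def detect_hardcoded_secret(unit):
    """Flag secret-looking names bound to a literal rather than a variable."""
    for attr in unit.attributes:
        is_reference = attr.value.startswith(("$", "{{", "#{"))
        if SECRET_NAME.search(attr.name) and attr.value and not is_reference:
            yield Smell("hardcoded-secret", unit.path, attr.line)


def detect_http_without_tls(unit):
    """Flag values that are plain http:// URLs."""
    for attr in unit.attributes:
        if attr.value.lower().startswith("http://"):
            yield Smell("http-without-tls", unit.path, attr.line)


# Detectors are written once and run on every technology's IR.
DETECTORS = [detect_hardcoded_secret, detect_http_without_tls]


def analyze(units):
    """Run every detector on every unit and collect the reported smells."""
    return [smell for unit in units for detect in DETECTORS for smell in detect(unit)]
```

In this sketch, supporting another technology means writing one more front end that produces `UnitBlock` objects, and adding a smell means writing one more detector function; neither change touches the other side, which is the effort reduction the abstract describes.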

Index Terms

GLITCH: Automated Polyglot Security Smell Detection in Infrastructure as Code
1. Security and privacy
  1. Software and application security
2. Software and its engineering
  1. Software notations and tools
    1. Software configuration management and version control systems
    2. Software maintenance tools

Information & Contributors

Information

Published In

ASE '22: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering

October 2022

2006 pages

ISBN: 9781450394758

DOI: 10.1145/3551349

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Badges

Artifacts Evaluated & Reusable / v1.1

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

FCT
Feder / FCT
EuroHPC

Conference

ASE '22

ASE '22: 37th IEEE/ACM International Conference on Automated Software Engineering

October 10 - 14, 2022

Rochester, MI, USA

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 9
Total Downloads: 407
Downloads (Last 12 months): 213
Downloads (Last 6 weeks): 19

Reflects downloads up to 01 Jan 2025

Citations

Cited By

Drosos G, Sotiropoulos T, Alexopoulos G, Mitropoulos D, Su Z (2024). When Your Infrastructure Is a Buggy Program: Understanding Faults in Infrastructure as Code Ecosystems. Proceedings of the ACM on Programming Languages 8:OOPSLA2 (2490-2520). DOI: 10.1145/3689799. Online publication date: 8-Oct-2024
https://dl.acm.org/doi/10.1145/3689799
Hassan M, Salvador J, Santu S, Rahman A (2024). State Reconciliation Defects in Infrastructure as Code. Proceedings of the ACM on Software Engineering 1:FSE (1865-1888). DOI: 10.1145/3660790. Online publication date: 12-Jul-2024
https://dl.acm.org/doi/10.1145/3660790
Sokolowski D, Spielmann D, Salvaneschi G (2024). Automated Infrastructure as Code Program Testing. IEEE Transactions on Software Engineering 50:6 (1585-1599). DOI: 10.1109/TSE.2024.3393070. Online publication date: 1-May-2024
https://dl.acm.org/doi/10.1109/TSE.2024.3393070
Zhang Z, Yin S, Wei W, Ma X, Keung J, Li F, Hu W (2024). Practitioners' Expectations on Code Smell Detection. 2024 IEEE 48th Annual Computers, Software, and Applications Conference (COMPSAC) (1324-1333). DOI: 10.1109/COMPSAC61105.2024.00175. Online publication date: 2-Jul-2024
https://doi.org/10.1109/COMPSAC61105.2024.00175
Shimizu R, Nunomura Y, Kanuka H (2024). Test-suite-guided discovery of least privilege for cloud infrastructure as code. Automated Software Engineering 31:1. DOI: 10.1007/s10515-024-00420-5. Online publication date: 5-Mar-2024
https://dl.acm.org/doi/10.1007/s10515-024-00420-5
Nasiri R, Kumara I, Tamburri D, van den Heuvel W (2024). Towards a Taxonomy of Infrastructure as Code Misconfigurations: An Ansible Study. Service-Oriented Computing (83-103). DOI: 10.1007/978-3-031-72578-4_5. Online publication date: 19-Oct-2024
https://doi.org/10.1007/978-3-031-72578-4_5
Opdebeeck R, Zerouali A, De Roover C (2023). Control and Data Flow in Security Smell Detection for Infrastructure as Code: Is It Worth the Effort? 2023 IEEE/ACM 20th International Conference on Mining Software Repositories (MSR) (534-545). DOI: 10.1109/MSR59073.2023.00079. Online publication date: May-2023
https://doi.org/10.1109/MSR59073.2023.00079
Reddy Konala P, Kumar V, Bainbridge D (2023). SoK: Static Configuration Analysis in Infrastructure as Code Scripts. 2023 IEEE International Conference on Cyber Security and Resilience (CSR) (281-288). DOI: 10.1109/CSR57506.2023.10224925. Online publication date: 31-Jul-2023
https://doi.org/10.1109/CSR57506.2023.10224925
Saavedra N, Gonçalves J, Henriques M, Ferreira J, Mendes A (2023). Polyglot Code Smell Detection for Infrastructure as Code with GLITCH. 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE) (2042-2045). DOI: 10.1109/ASE56229.2023.00162. Online publication date: 11-Sep-2023
https://doi.org/10.1109/ASE56229.2023.00162
Rahman A, Bose D, Zhang Y, Pandita R (2023). An empirical study of task infections in Ansible scripts. Empirical Software Engineering 29:1. DOI: 10.1007/s10664-023-10432-6. Online publication date: 29-Dec-2023
https://dl.acm.org/doi/10.1007/s10664-023-10432-6
