Abstract
Code smells are violations of software development principles that make software more prone to errors and changes. Researchers have developed code smell detectors using manual and semi-automatic methods to identify these issues. However, three key challenges have limited the practical use of these detectors: developers’ subjective perceptions of code smells, lack of consensus in the detection process, and difficulty in setting appropriate detection thresholds. While code smell detection using machine learning has progressed significantly, there still appears to be a gap in understanding how to use deep learning (DL) approaches effectively. This paper aims to review and identify current methods for code smell detection using DL techniques. A systematic literature review is conducted on 35 primary studies drawn from a collection of 8739 publications between 2013 and the present. The analysis reveals that the code smells most commonly detected include Feature Envy, God Classes, Long Methods, Complex Classes, and Large Classes. The most popular DL algorithms are Recurrent Neural Networks (RNN) and Convolutional Neural Networks (CNN), often combined with other techniques for better results. Algorithms that train models on large datasets with fewer independent variables demonstrate exemplary performance. The paper also highlights open issues and provides guidelines for future research on metric identification and selection.
Data availability
No data was used for the research described in the article.
References
Bastías, O.A., Díaz, J., López Fenner, J.: Exploring the intersection between software maintenance and machine learning; a systematic mapping study. Appl. Sci. 13, 1710 (2023)
AlOmar, E.A., Ivanov, A., Kurbatova, Z., Golubev, Y., Mkaouer, M.W., Ouni, A., Bryksin, T., Nguyen, L., Kini, A., Thakur, A.: Just-in-time code duplicates extraction. Inform. Softw. Technol. 158, 107169 (2023)
Li, Z., Wang, S., Wang, W., Liang, P., Mo, R., Li, B.: Understanding bugs in multi-language deep learning frameworks, (2023). arXiv:2303.02695
Al Khatib, S.M., Alkharabsheh, K., Alawadi, S.: Selection of human evaluators for design smell detection using dragonfly optimization algorithm: An empirical study. Inform. Softw. Technol. 155, 107120 (2023)
Kim, D.K.: A deep neural network-based approach to finding similar code segments. IEICE Trans. Inform. Syst. 103, 874–878 (2020)
Alkharabsheh, K., Alawadi, S., Kebande, V.R., Crespo, Y., Fernández-Delgado, M., Taboada, J.A.: A comparison of machine learning algorithms on design smell detection using balanced and imbalanced dataset: a study of god class. Inform. Softw. Technol. 143, 106736 (2022)
Brown, W.H., Malveau, R.C., McCormick, H.W.S., Mowbray, T.J.: AntiPatterns: refactoring software, architectures, and projects in crisis. Wiley, Hoboken (1998)
Gesi, J., Liu, S., Li, J., Ahmed, I., Nagappan, N., Lo, D., de Almeida, E.S., Kochhar, P.S., Bao, L.: Code smells in machine learning systems, (2022). arXiv:2203.00803
Muralidhar, N., Muthiah, S., Butler, P., Jain, M., Yu, Y., Burne, K., Li, W., Jones, D., Arunachalam, P., McCormick, H.S., Ramakrishnan, N.: Using antipatterns to avoid mlops mistakes, (2021). arXiv:2107.00079
Kaur, I., Kaur, A.: A novel four-way approach designed with ensemble feature selection for code smell detection. IEEE Access 9, 8695–8707 (2021)
Fowler, M.: Refactoring, Addison-Wesley Professional, (2018)
Spadini, D., Palomba, F., Zaidman, A., Bruntink, M., Bacchelli, A.: On the relation of test smells to software code quality, in: 2018 IEEE International Conference on Software Maintenance and Evolution (ICSME), IEEE, pp. 1–12 (2018)
Barbez, A.: Deep Learning Structural and Historical Features for Anti-Patterns Detection, Master’s thesis, École Polytechnique de Montréal, (2018). https://publications.polymtl.ca/3724/
Kruchten, P., Nord, R.L., Ozkaya, I.: Technical debt: from metaphor to theory and practice. IEEE Softw. 29, 18–21 (2012)
Gupta, A., Suri, B., Misra, S.: A systematic literature review: code bad smells in java source code, in: Computational Science and Its Applications–ICCSA 2017: 17th International Conference, Trieste, Italy, July 3-6, 2017, Proceedings, Part V 17, Springer, pp. 665–682, (2017)
Fowler, M., Beck, K., Brant, J., Opdyke, W., Roberts, D.: Refactoring: Improving the Design of Existing Code. Addison-Wesley Professional, Berkeley, CA, USA (1999)
Mantyla, M.: Bad smells in software-a taxonomy and an empirical study, Ph.D. thesis, Helsinki University of Technology, (2003)
Kessentini, M., Mahaouachi, R., Ghedira, K.: What you like in design use to correct bad-smells. Softw. Qual. J. 21, 551–571 (2013)
Liu, H., Liu, Q., Niu, Z., Liu, Y.: Dynamic and automatic feedback-based threshold adaptation for code smell detection. IEEE Trans. Softw. Eng. 42, 544–558 (2015)
Wake, W.C.: Refactoring workbook, Addison-Wesley Professional, (2004)
Pascarella, L., Palomba, F., Bacchelli, A.: Fine-grained just-in-time defect prediction. J. Syst. Softw. 150, 22–36 (2019)
de Paulo Sobrinho, E.V., De Lucia, A., de Almeida Maia, M.: A systematic literature review on bad smells-5 w’s: which, when, what, who, where. IEEE Trans. Softw. Eng. 47, 17–66 (2018)
Rasool, G., Arshad, Z.: A review of code smell mining techniques. J. Softw.: Evol. Process 27, 867–895 (2015)
Sabir, F., Palma, F., Rasool, G., Guéhéneuc, Y.-G., Moha, N.: A systematic literature review on the detection of smells and their evolution in object-oriented and service-oriented systems. Softw.: Prac. Exp. 49, 3–39 (2019)
Zhang, M., Hall, T., Baddoo, N.: Code bad smells: a review of current knowledge. J. Softw. Maint. Evol.: Res. Prac. 23, 179–202 (2011)
Misbhauddin, M., Alshayeb, M.: Uml model refactoring: a systematic literature review. Empir. Softw. Eng. 20, 206–251 (2015)
Misbhauddin, M., Alshayeb, M.: Uml model refactoring: a systematic literature review. Empir. Softw. Eng. 20, 206–251 (2015)
Azeem, M.I., Palomba, F., Shi, L., Wang, Q.: Machine learning techniques for code smell detection: A systematic literature review and meta-analysis. Inform. Softw. Technol. 108, 115–138 (2019)
Caram, F.L., Rodrigues, B.R.D.O., Campanelli, A.S., Parreiras, F.S.: Machine learning techniques for code smells detection: a systematic mapping study. Int. J. Softw. Eng. Knowl. Eng. 29, 285–316 (2019)
AbuHassan, A., Alshayeb, M., Ghouti, L.: Software smell detection techniques: a systematic literature review. J. Softw.: Evol. Process 33, e2320 (2021)
Singh, S., Kaur, S.: A systematic literature review: Refactoring for disclosing code smells in object oriented software. Ain Shams Eng. J. 9, 2129–2151 (2018)
Agnihotri, M., Chug, A.: A systematic literature survey of software metrics, code smells and refactoring techniques. J. Inform. Process. Syst. 16, 915–934 (2020)
Lewowski, T., Madeyski, L.: How far are we from reproducible research on code smell detection? a systematic literature review. Inform. Softw. Technol. 144, 106783 (2022)
Alazba, A., Aljamaan, H., Alshayeb, M.: Deep learning approaches for bad smell detection: a systematic literature review. Empir. Softw. Eng. 28, 77 (2023)
Lei, M., Li, H., Li, J., Aundhkar, N., Kim, D.-K.: Deep learning application on code clone detection: a review of current knowledge. J. Syst. Softw. 184, 111141 (2022)
Alsolai, H., Roper, M.: A systematic literature review of machine learning techniques for software maintainability prediction. Inform. Softw. Technol. 119, 106214 (2020)
Fernandes, E., Oliveira, J., Vale, G., Paiva, T., Figueiredo, E.: A review-based comparative study of bad smell detection tools, in: Proceedings of the 20th International Conference on Evaluation and Assessment in Software Engineering, pp. 1–12, (2016)
Kitchenham, B.A.: Systematic review in software engineering: where we are and where we should be going, in: Proceedings of the 2nd international workshop on Evidential assessment of software technologies, pp. 1–2 (2012)
Ho, A., Bui, A.M.T., Nguyen, P.T., Di Salle, A.: Fusion of deep convolutional and lstm recurrent neural networks for automated detection of code smells, in: Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering, EASE ’23, Association for Computing Machinery, New York, NY, USA, pp. 229–234. (2023) https://doi.org/10.1145/3593434.3593476
Bhave, A., Sinha, R.: Deep multimodal architecture for detection of long parameter list and switch statements using distilbert, in: 2022 IEEE 22nd International Working Conference on Source Code Analysis and Manipulation (SCAM), pp. 116–120. (2022) https://doi.org/10.1109/SCAM55253.2022.00018
Nandani, H., Saad, M., Sharma, T.: Dacos-a manually annotated dataset of code smells, (2023). arXiv:2303.08729
Kaur, S., Singh, S.: Improving the quality of open source software, Agile Software Development: Trends, Challenges and Applications, 309–323 (2023)
Tummalapalli, S., Kumar, L., Bhanu Murthy, N.L.: Web service anti-patterns detection using cnn with varying sequence padding size, in: Mobile Application Development: Practice and Experience: 12th Industry Symposium in Conjunction with 18th ICDCIT 2022, Springer, pp. 153–165 (2023)
Sepahvand, R., Akbari, R., Jamasb, B., Hashemi, S., Boushehrian, O.: Using word embedding and convolution neural network for bug triaging by considering design flaws. Sci. Comput. Program. 228, 102945 (2023)
Dewangan, S., Rao, R.S., Mishra, A., Gupta, M.: Code smell detection using ensemble machine learning algorithms. Appl. Sci. 12, 10321 (2022)
Tarwani, S., Chug, A.: Application of deep learning models for code smell prediction, in: 2022 10th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), pp. 1–5. (2022) https://doi.org/10.1109/ICRITO56286.2022.9965048
Zhang, Y., Ge, C., Hong, S., Tian, R., Dong, C., Liu, J.: Delesmell: code smell detection based on deep learning and latent semantic analysis. Knowl.-Based Syst. 255, 109737 (2022)
Zhang, M., Jia, J.: Feature envy detection with deep learning and snapshot ensemble, in: 2022 9th International Conference on Dependable Systems and Their Applications (DSA), pp. 215–223. (2022) https://doi.org/10.1109/DSA56465.2022.00037
Yedida, R., Menzies, T.: How to improve deep learning for software analytics (a case study with code smell detection), in: 2022 IEEE/ACM 19th International Conference on Mining Software Repositories (MSR), pp. 156–166. (2022) https://doi.org/10.1145/3524842.3528458
Li, Y., Zhang, X.: Multi-label code smell detection with hybrid model based on deep learning., in: SEKE, pp. 42–47. (2022)
Virmajoki, J., Knutas, A., Kasurinen, J.: Detecting code smells with ai: a prototype study, in: 2022 45th Jubilee International Convention on Information, Communication and Electronic Technology (MIPRO), pp. 1393–1398. (2022) https://doi.org/10.23919/MIPRO55190.2022.9803727
Imam, A.T., Al-Srour, B.R., Alhroob, A.: The automation of the detection of large class bad smell by using genetic algorithm and deep learning. J. King Saud Univ.: Comput. Inform. Sci. 34, 2621–2636 (2022)
Lomio, F., Moreschini, S., Lenarduzzi, V.: A machine and deep learning analysis among sonarqube rules, product, and process metrics for fault prediction. Empir. Softw. Eng. 27, 189 (2022)
Xu, W., Zhang, X.: Multi-granularity code smell detection using deep learning method based on abstract syntax tree, volume 2021-July, Pittsburgh, PA, United States, pp. 503–509. (2021) https://doi.org/10.18293/SEKE2021-014
Gupta, H., Kulkarni, T.G., Kumar, L., Neti, L.B.M., Krishna, A.: An empirical study on predictability of software code smell using deep learning models, in: Advanced Information Networking and Applications, Springer International Publishing, 2021, pp. 120–132. https://doi.org/10.1007/978-3-030-75075-6_10
Jo, Y.-B., Lee, J., Yoo, C.-J.: Two-pass technique for clone detection and type classification using tree-based convolution neural network. Appl. Sci. 11, 6613 (2021)
Hua, W., Sui, Y., Wan, Y., Liu, G., Xu, G.: Fcca: hybrid code representation for functional clone detection using attention networks. IEEE Trans. Reliab. 70, 304–318 (2021)
Zhang, Y., Dong, C.: Mars: detecting brain class/method code smell based on metric-attention mechanism and residual network. J. Softw.: Evol. Process (2021). https://doi.org/10.1002/smr.2403
Sharma, T., Efstathiou, V., Louridas, P., Spinellis, D.: Code smell detection by deep direct-learning and transfer-learning. J. Syst. Softw. 176, 110936 (2021)
Liu, H., Jin, J., Xu, Z., Zou, Y., Bu, Y., Zhang, L.: Deep learning based code smell detection. IEEE Trans. Softw. Eng. 47, 1811–1837 (2021)
Wang, H., Liu, J., Kang, J., Yin, W., Sun, H., Wang, H.: Feature envy detection based on bi-lstm with self-attention mechanism, in: 2020 IEEE Intl Conf on Parallel and Distributed Processing with Applications, Big Data and Cloud Computing, Sustainable Computing and Communications, Social Computing and Networking (ISPA/BDCloud/SocialCom/SustainCom), pp. 448–457. (2020) https://doi.org/10.1109/ISPA-BDCloud-SocialCom-SustainCom51426.2020.00082
Hadj-Kacem, M., Bouassida, N.: Deep representation learning for code smells detection using variational auto-encoder, in: 2019 International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2019). https://doi.org/10.1109/IJCNN.2019.8851854
Barbez, A., Khomh, F., Guéhéneuc, Y.-G.: Deep learning anti-patterns from code metrics history, in: 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME), pp. 114–124 (2019). https://doi.org/10.1109/ICSME.2019.00021
Guo, X., Shi, C., Jiang, H.: Deep semantic-based feature envy identification, in: Proceedings of the 11th Asia-Pacific Symposium on Internetware, Internetware ’19, Association for Computing Machinery, New York, NY, USA, (2019). https://doi.org/10.1145/3361242.3361257
Das, A.K., Yadav, S., Dhal, S.: Detecting code smells using deep learning, in: TENCON 2019 - 2019 IEEE Region 10 Conference (TENCON), pp. 2081–2086. (2019) https://doi.org/10.1109/TENCON.2019.8929628
Sharma, T., Efstathiou, V., Louridas, P., Spinellis, D.: On the feasibility of transfer-learning code smells using deep learning, (2019). arXiv:1904.03031
Liu, H., Xu, Z., Zou, Y.: Deep learning based feature envy detection, in: 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 385–396. (2018) https://doi.org/10.1145/3238147.3238166
Hadj-Kacem, M., Bouassida, N.: A hybrid approach to detect code smells using deep learning., in: ENASE, pp. 137–146 (2018)
Li, L., Feng, H., Zhuang, W., Meng, N., Ryder, B.: Cclearner: A deep learning-based clone detection approach, in: 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME), IEEE, pp. 249–260 (2017)
Kim, D.K.: Finding bad code smells with neural network models. Int. J. Electr. Comput. Eng. 7, 3613 (2017)
White, M., Tufano, M., Vendome, C., Poshyvanyk, D.: Deep learning code fragments for code clone detection, in: Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering, ASE ’16, Association for Computing Machinery, New York, NY, USA, 2016, pp. 87–98. https://doi.org/10.1145/2970276.2970326
Hall, T., Zhang, M., Bowes, D., Sun, Y.: Some code smells have a significant but small effect on faults. ACM Trans. Softw. Eng. Methodol. 23, 1–39 (2014)
Lanza, M., Marinescu, R.: Object-Oriented Metrics in Practice: Using Software Metrics to Characterize, Evaluate, and Improve the Design of Object-Oriented Systems. Springer Verlag (2010)
Saheb-Nassagh, R., Ashtiani, M., Minaei-Bidgoli, B.: A probabilistic-based approach for automatic identification and refactoring of software code smells. Appl. Soft Comput. 130, 109658 (2022)
Li, F., Zou, K., Keung, J.W., Yu, X., Feng, S., Xiao, Y.: On the relative value of imbalanced learning for code smell detection. Softw.: Prac. Exp. (2023). https://doi.org/10.1002/spe.3235
Ali, Y.M.B.: Adversarial attacks on deep learning networks in image classification based on smell bees optimization algorithm. Futur. Gener. Comput. Syst. 140, 185–195 (2023)
Alkharabsheh, K., Alawadi, S., Kebande, V.R., Crespo, Y., Fernández-Delgado, M., Taboada, J.A.: A comparison of machine learning algorithms on design smell detection using balanced and imbalanced dataset: a study of god class. Inform. Softw. Technol. 143, 106736 (2022)
Gupta, R., Kumar Singh, S.: A novel metric based detection of temporary field code smell and its empirical analysis. J. King Saud Univ.: Comput. Inform. Sci. 34, 9478–9500 (2022)
Shoenberger, I., Mkaouer, M.W., Kessentini, M.: On the use of smelly examples to detect code smells in javascript. In: Squillero, G., Sim, K. (eds.) Appl. Evol. Comput., pp. 20–34. Springer International Publishing, Cham (2017)
Palomba, F., Bavota, G., Di Penta, M., Fasano, F., Oliveto, R., De Lucia, A.: On the diffuseness and the impact on maintainability of code smells: a large scale empirical investigation, in: Proceedings of the 40th International Conference on Software Engineering, pp. 482–482 (2018)
Chen, Q., Câmara, R., Campos, J., Souto, A., Ahmed, I.: The smelly eight: An empirical study on the prevalence of code smells in quantum computing, in: 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), IEEE, pp. 358–370 (2023)
Funding
The authors have not disclosed any funding.
Ethics declarations
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Appendices
Appendix A. List of Metrics with their Definitions used in this Study
Acronym | Definition |
---|---|
AMW | Afferent Method and Weighted Method |
ATFD | Access to Foreign Data |
BUR | Buse Readability Score |
CBO | Coupling Between Objects |
CDISP | Class Dependency Metric |
CINT | Coupling Intensity |
CM | Coupling Between Objects |
CYCLO | Cyclomatic Complexity |
DIT | Depth of Inheritance Tree |
FDP | Foreign Data Providers |
LAA | Locality of Attribute Accesses |
LB | Basic Lack of Cohesion in Methods |
LCOM | Lack of Cohesion in Methods |
LOC | Lines Of Code |
MAXNESTING | Maximum Nesting Level |
NOAM | Number of Accessed Members |
NOAV | Number of Accessed Variables |
NOC | Number of Children |
NOM | Number of Methods |
NOP | Number of Parents |
PAR | Number of Parameters |
SDC | Size of Data Context |
SEC | Signal Event Coupling |
TCC | Tight Class Cohesion |
WMC | Weighted Methods Count |
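
To illustrate how metrics such as those in Appendix A are typically consumed by a threshold-based detector, and why threshold selection is a recurring difficulty, the following minimal Python sketch flags a candidate God Class from ATFD, WMC, and TCC. The rule shape, the threshold values, and the names `ClassMetrics` and `looks_like_god_class` are assumptions made for this example only; they are not taken from the reviewed primary studies, and a real detector would calibrate them per project.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
# Changing any of them changes which classes are reported,
# which is the threshold-setting problem in miniature.
ATFD_FEW = 5            # accesses to foreign data considered "few"
WMC_VERY_HIGH = 47      # weighted methods count considered "very high"
TCC_ONE_THIRD = 1 / 3   # cohesion below this is considered "low"


@dataclass
class ClassMetrics:
    """Metric values for one class (acronyms as in Appendix A)."""
    name: str
    atfd: int    # Access to Foreign Data
    wmc: int     # Weighted Methods Count
    tcc: float   # Tight Class Cohesion, in [0, 1]


def looks_like_god_class(m: ClassMetrics) -> bool:
    """Return True when all three threshold conditions hold."""
    return (
        m.atfd > ATFD_FEW
        and m.wmc >= WMC_VERY_HIGH
        and m.tcc < TCC_ONE_THIRD
    )


if __name__ == "__main__":
    # Made-up metric values for two hypothetical classes.
    samples = [
        ClassMetrics("OrderManager", atfd=12, wmc=60, tcc=0.10),
        ClassMetrics("Point", atfd=0, wmc=4, tcc=0.90),
    ]
    for s in samples:
        print(s.name, looks_like_god_class(s))
```

A learned detector replaces the fixed constants with a model trained on labelled examples, using the same metric values as input features.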
Appendix B. List of Primary Studies Selected for this SLR
ID | Title | References |
---|---|---|
S1 | Fusion of deep convolutional and LSTM recurrent neural networks for automated detection of code smells | [39] |
S2 | Deep Multimodal Architecture for Detection of Long Parameter List and Switch Statements using DistilBERT | [40] |
S3 | DACOS-A Manually Annotated Dataset of Code Smells | [41] |
S4 | Improving the Quality of Open Source Software | [42] |
S5 | Web Service Anti-patterns Detection Using CNN with Varying Sequence Padding Size | [43] |
S6 | Using Word Embedding and Convolution Neural Network for Bug Triaging by Considering Design Flaws | [44] |
S7 | Code Smell Detection Using Ensemble Machine Learning Algorithms | [45] |
S8 | Application of Deep Learning models for Code Smell Prediction | [46] |
S9 | DeleSmell: Code smell detection based on deep learning and latent semantic analysis | [47] |
S10 | Feature Envy Detection with Deep Learning and Snapshot Ensemble | [48] |
S11 | How to improve deep learning for software analytics: (a case study with code smell detection) | [49] |
S12 | Multi-Label Code Smell Detection with Hybrid Model based on Deep Learning | [50] |
S13 | Detecting Code Smells with AI: a Prototype Study | [51] |
S14 | The automation of the detection of large class bad smell by using genetic algorithm and deep learning | [52] |
S15 | A machine and deep learning analysis among SonarQube rules, product, and process metrics for fault prediction | [53] |
S16 | Multi-granularity code smell detection using deep learning method based on abstract syntax tree | [54] |
S17 | An Empirical Study on Predictability of Software Code Smell Using Deep Learning Models | [55] |
S18 | Two-Pass Technique for Clone Detection and Type Classification Using Tree-Based Convolution Neural Network | [56] |
S19 | FCCA: Hybrid Code Representation for Functional Clone Detection Using Attention Networks | [57] |
S20 | MARS: Detecting brain class/method code smell based on metric-attention mechanism and residual network | [58] |
S21 | Code smell detection by deep direct-learning and transfer-learning | [59] |
S22 | Deep Learning Based Code Smell Detection | [60] |
S23 | Feature Envy Detection based on Bi-LSTM with Self-Attention Mechanism | [61] |
S24 | Deep Representation Learning for Code Smells Detection using Variational Auto-Encoder | [62] |
S25 | Deep Learning Anti-Patterns from Code Metrics History | [63] |
S26 | Deep semantic-Based Feature Envy Identification | [64] |
S27 | Detecting code smells using deep learning | [65] |
S28 | On the feasibility of transfer-learning code smells using deep learning | [66] |
S29 | Deep learning based feature envy detection | [67] |
S30 | A hybrid approach to detect code smells using deep learning | [68] |
S31 | Cclearner: A deep learning-based clone detection approach | [69] |
S32 | Finding bad code smells with neural network models | [70] |
S33 | Deep learning code fragments for code clone detection | [71] |
S34 | Some code smells have a significant but small effect on faults | [72] |
S35 | What you like in design use to correct bad-smells | [18] |
About this article
Cite this article
Malhotra, R., Jain, B. & Kessentini, M. Examining deep learning’s capability to spot code smells: a systematic literature review. Cluster Comput 26, 3473–3501 (2023). https://doi.org/10.1007/s10586-023-04144-1
DOI: https://doi.org/10.1007/s10586-023-04144-1