DOI: 10.1145/3617555.3617873
research-article
Open access

Comparing Word-Based and AST-Based Models for Design Pattern Recognition

Published: 08 December 2023

Abstract

Design patterns (DPs) provide reusable, general solutions to frequently encountered problems. They are important for maintaining the structure and quality of software products, particularly in large, distributed systems such as automotive software. Modern language models (such as Code2Vec or Word2Vec) exhibit a deep understanding of programs, which has been shown to help in tasks such as program repair and program comprehension, and they therefore show promise for design pattern recognition (DPR) in industrial contexts. The models are trained in a self-supervised manner on a large unlabelled code base, which allows them to quantify abstract concepts such as programming styles, coding guidelines, and, to some extent, the semantics of programs. This study demonstrates how two language models, Code2Vec and Word2Vec, trained on two public automotive repositories, can separate programs containing specific DPs. Code2Vec and Word2Vec achieve average F1-scores of 0.781 and 0.690, respectively, on open-source Java programs, showing promise for DPR in practice.
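To make the "average F1-score" metric concrete, the following is a minimal sketch of how a per-pattern F1-score and its unweighted (macro) average can be computed. It is not the authors' evaluation code: the abstract does not state which averaging scheme was used, so macro averaging is an assumption here, and the pattern labels and predictions are hypothetical toy data.

```python
from collections import defaultdict


def per_class_f1(y_true, y_pred):
    """Return {label: F1} computed one-vs-rest for every label that occurs."""
    labels = sorted(set(y_true) | set(y_pred))
    tp = defaultdict(int)  # true positives per label
    fp = defaultdict(int)  # false positives per label
    fn = defaultdict(int)  # false negatives per label
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong
            fn[truth] += 1  # true label was missed
    scores = {}
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN), defined as 0.0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical toy data: true and predicted design pattern per program.
y_true = ["Singleton", "Singleton", "Observer", "Observer", "Adapter", "Adapter"]
y_pred = ["Singleton", "Observer", "Observer", "Observer", "Adapter", "Singleton"]

if __name__ == "__main__":
    for label, score in per_class_f1(y_true, y_pred).items():
        print(f"{label}: {score:.3f}")
    print(f"macro F1: {macro_f1(y_true, y_pred):.3f}")
```

Macro averaging weights every pattern equally regardless of how many programs contain it; a support-weighted average would generally give a different number when pattern frequencies are unbalanced.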


Cited By

  • (2024) Applying Pattern Language to Enhance IIoT System Design and Integration: From Theory to Practice. Information 15:10 (595). DOI: 10.3390/info15100595. Online publication date: 30-Sep-2024
  • (2024) Smarter Project Selection for Software Engineering Research. Proceedings of the 20th International Conference on Predictive Models and Data Analytics in Software Engineering (12-21). DOI: 10.1145/3663533.3664037. Online publication date: 10-Jul-2024


    Published In

    PROMISE 2023: Proceedings of the 19th International Conference on Predictive Models and Data Analytics in Software Engineering
    December 2023
    68 pages
    ISBN:9798400703751
    DOI:10.1145/3617555

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Design Patterns
    2. NLP
    3. Programming Language Models

    Qualifiers

    • Research-article

    Funding Sources

    • CHAIR (Chalmers AI Research Center) project "T4AI", Vinnova, Software Center, Volvo Cars, AB Volvo, and the National Science Centre of Poland

    Conference

    PROMISE '23

    Acceptance Rates

    Overall Acceptance Rate 98 of 213 submissions, 46%


