
On the effectiveness of clone detection by string matching: Research Articles

Published: 01 January 2006

Abstract

Although duplicated code is known to pose severe problems for software maintenance, it is difficult to identify in large systems. Many different techniques have been developed to detect software clones, some of which are very sophisticated but also expensive to implement and adapt. Lightweight techniques based on simple string matching are easy to implement, but how effective are they? We present a simple string-based approach which we have successfully applied to a number of different languages, such as COBOL, JAVA, C++, PASCAL, PYTHON, SMALLTALK, C and PDP-11 ASSEMBLER. In each case the maximum time to adapt the approach to a new language was less than 45 minutes. In this paper we investigate a number of simple variants of string-based clone detection that normalize differences due to common editing operations, and assess the quality of clone detection for very different case studies. Our results confirm that this inexpensive clone detection technique generally achieves high recall and acceptable precision. Over-zealous normalization of the code before comparison, however, can result in an unacceptable number of false positives. Copyright © 2005 John Wiley & Sons, Ltd.
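
To make the idea of string-based clone detection concrete, the following is a minimal sketch in Python and not the authors' implementation. It assumes a very light normalization (removing a trailing line comment and all whitespace), treats every pair of identical normalized lines as a match, and reports runs of consecutive matching lines. The function names `normalize` and `find_clones`, the `min_length` threshold, and the `comment_prefix` parameter are hypothetical choices made for illustration only.

```python
from collections import defaultdict


def normalize(line, comment_prefix="#"):
    """Drop a trailing line comment and all whitespace.

    Assumption: this naive split also cuts at a comment prefix that
    appears inside a string literal; a real tool would need more care.
    """
    code = line.split(comment_prefix, 1)[0]
    return "".join(code.split())


def find_clones(lines, min_length=3, comment_prefix="#"):
    """Return (start_a, start_b, length) for runs of identical normalized lines.

    start_a and start_b are 0-based indices into `lines`; length counts
    non-blank normalized lines. Lines that are empty after normalization
    are ignored.
    """
    # Keep only non-empty normalized lines, remembering original positions.
    kept = [(i, normalize(text, comment_prefix)) for i, text in enumerate(lines)]
    kept = [(i, s) for i, s in kept if s]

    # Bucket positions in the kept sequence by their normalized content.
    buckets = defaultdict(list)
    for pos, (_, s) in enumerate(kept):
        buckets[s].append(pos)

    # Every pair of equal lines is a match point (a, b) with a < b.
    matches = set()
    for positions in buckets.values():
        for x in range(len(positions)):
            for y in range(x + 1, len(positions)):
                matches.add((positions[x], positions[y]))

    clones = []
    for a, b in sorted(matches):
        if (a - 1, b - 1) in matches:
            continue  # this match continues an earlier run
        length = 0
        while (a + length, b + length) in matches:
            length += 1
        if length >= min_length:
            clones.append((kept[a][0], kept[b][0], length))
    return clones


if __name__ == "__main__":
    sample = [
        "total = 0",
        "for x in items:",
        "    total += x",
        "print(total)",
        "",
        "total = 0  # reset",
        "for x in items:",
        "  total += x",
        "print(total)",
    ]
    print(find_clones(sample))  # prints [(0, 5, 4)]
```

In this sketch, adapting to another language amounts to changing `comment_prefix`. A more aggressive `normalize` (for example, one that also replaced every identifier with a placeholder) would match more line pairs, which illustrates how heavier normalization can raise the number of false positives.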

Published In

Journal of Software Maintenance and Evolution: Research and Practice, Volume 18, Issue 1
January 2006
57 pages

Publisher

John Wiley & Sons, Inc.

United States

Author Tags

  1. clone detection
  2. duplicated code
  3. software maintenance
  4. string matching

Qualifiers

  • Article
