Martinez-Gil, 2024 - Google Patents
Source code clone detection using unsupervised similarity measuresMartinez-Gil, 2024
View PDF- Document ID
- 2373420482146541305
- Author
- Martinez-Gil J
- Publication year
- Publication venue
- International Conference on Software Quality
External Links
Snippet
Assessing similarity in source code has gained significant attention in recent years due to its importance in software engineering tasks such as clone detection and code search and recommendation. This work presents a comparative analysis of unsupervised similarity …
- 238000001514 detection method 0 title abstract description 29
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30587—Details of specialised database models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/70—Software maintenance or management
- G06F8/75—Structural analysis for program understanding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3668—Software testing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/57—Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
- G06F21/577—Assessing vulnerabilities and evaluating computer system security
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3604—Software analysis for verifying properties of programs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G06F19/14—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for phylogeny or evolution, e.g. evolutionarily conserved regions determination or phylogenetic tree construction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
- G06N5/025—Extracting rules from data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6279—Classification techniques relating to the number of classes
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Yu et al. | A comprehensive approach to the recovery of design pattern instances based on sub-patterns and method signatures | |
Alrubaye et al. | Learning to recommend third-party library migration opportunities at the API level | |
Allamanis et al. | Mining semantic loop idioms | |
Majumdar et al. | Automated evaluation of comments to aid software maintenance | |
Nichols et al. | Syntax-based improvements to plagiarism detectors and their evaluations | |
Martinez-Gil | Source code clone detection using unsupervised similarity measures | |
Sheneamer | An automatic advisor for refactoring software clones based on machine learning | |
Hora et al. | Characteristics of method extractions in Java: A large scale empirical study | |
Kaur et al. | A systematic literature review on the use of machine learning in code clone research | |
Yuan et al. | Java code clone detection by exploiting semantic and syntax information from intermediate code-based graph | |
Nguyen et al. | GPTSniffer: A CodeBERT-based classifier to detect source code written by ChatGPT | |
Tinnes et al. | Mining domain-specific edit operations from model repositories with applications to semantic lifting of model differences and change profiling | |
Ragkhitwetsagul | Code similarity and clone search in large-scale source code data | |
Dramko et al. | DIRE and its data: Neural decompiled variable renamings with respect to software class | |
Kammer et al. | Plagiarism detection in Haskell programs using call graph matching | |
Setoodeh et al. | A proposed model for source code reuse detection in computer programs | |
Watson | Deep learning in software engineering | |
Martinez-Gil | Advanced Detection of Source Code Clones via an Ensemble of Unsupervised Similarity Measures | |
Elgendy et al. | A Survey of the Metrics, Uses, and Subjects of Diversity-Based Techniques in Software Testing | |
Mehsen et al. | Detecting Source Code Plagiarism in Student Assignment Submissions Using Clustering Techniques | |
Mokaddem et al. | A generic approach to detect design patterns in model transformations using a string-matching algorithm | |
Mirza | Style analysis for source code plagiarism detection | |
Kuttal et al. | Source code comments: Overlooked in the realm of code clone detection | |
Allamanis et al. | Mining semantic loop idioms from Big Code | |
Wang et al. | Detecting copy directions among programs using extreme learning machines |