GitHub - l-schulte/cdbs_feature_location

Tool for the evaluation of topic modelling techniques in the context of feature location in codebases of complex software systems - created as part of my master’s thesis at Karlstad University.

Abstract

Software maintenance and understanding where in the source code features are implemented are two strongly coupled tasks that make up a large portion of the effort spent on developing applications. Feature location, the concept investigated in this thesis, supports both tasks by automating otherwise manual searches for source code artifacts. Challenges in this area include aggregating and composing a training corpus from historical codebase data and integrating and optimizing suitable topic modeling techniques. Building on previous research, this thesis compares two techniques and introduces a toolkit that can be used to reproduce and extend the results discussed. Specifically, it pursues a changeset-based approach to feature location and applies it to a large open-source Java project. The project is used to optimize and evaluate Latent Dirichlet Allocation and Pachinko Allocation models and to compare their accuracy. As discussed at the end of the thesis, the results do not indicate a clear favorite between the models. Instead, the outcome of the comparison depends on the metric and viewpoint from which it is assessed.

Keywords

feature location, topic modeling, changesets, latent Dirichlet allocation, pachinko allocation, mining software repositories, source code comprehension

Publication download:

Karlstad University Library
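
Illustrative sketch

The changeset-based approach summarized in the abstract can be illustrated with a small, self-contained example. This is not the code in this repository: it is a minimal sketch that assumes a plain collapsed Gibbs sampler for Latent Dirichlet Allocation in place of a topic modeling library, a simple fold-in step to infer topics for unseen text, and the Hellinger distance for ranking. The function names (`train_lda`, `infer_topics`, `hellinger`, `rank_files`), the hyperparameters, and the toy changesets and files are all hypothetical.

```python
import math
import random
from collections import Counter


def train_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling on tokenized documents (here: changesets)."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    topic_word = [Counter() for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                       # token counts per topic
    doc_topic = [[0] * num_topics for _ in docs]         # topic counts per document
    assignments = [[rng.randrange(num_topics) for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for w, z in zip(doc, assignments[d]):
            topic_word[z][w] += 1
            topic_total[z] += 1
            doc_topic[d][z] += 1
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment, then resample its topic.
                z = assignments[d][i]
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                doc_topic[d][z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                topic_word[z][w] += 1
                topic_total[z] += 1
                doc_topic[d][z] += 1
    return topic_word, topic_total, vocab_size


def infer_topics(tokens, model, alpha=0.1, beta=0.01, steps=20):
    """Estimate a topic distribution for unseen text with the topic-word counts held fixed."""
    topic_word, topic_total, vocab_size = model
    k = len(topic_total)
    theta = [1.0 / k] * k
    for _ in range(steps):
        counts = [alpha] * k
        for w in tokens:
            resp = [
                theta[t] * (topic_word[t][w] + beta) / (topic_total[t] + vocab_size * beta)
                for t in range(k)
            ]
            norm = sum(resp)
            for t in range(k):
                counts[t] += resp[t] / norm
        total = sum(counts)
        theta = [c / total for c in counts]
    return theta


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 means identical)."""
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)) / 2)


def rank_files(query, files, model):
    """Rank source files by the distance between their topics and the query's topics."""
    q = infer_topics(query, model)
    scored = [(name, hellinger(q, infer_topics(tokens, model))) for name, tokens in files.items()]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical toy data: each changeset is the bag of words taken from one commit.
changesets = [
    ["parser", "token", "syntax", "parse", "token"],
    ["parse", "syntax", "grammar", "token"],
    ["socket", "connect", "timeout", "retry"],
    ["connect", "socket", "retry", "timeout", "socket"],
]
files = {
    "Parser.java": ["parse", "token", "grammar"],
    "Connection.java": ["socket", "connect", "timeout"],
}

model = train_lda(changesets, num_topics=2)
for name, distance in rank_files(["parse", "syntax"], files, model):
    print(f"{name}: {distance:.3f}")
```

The model is trained on changesets rather than on the source files themselves; files and queries are only passed through inference afterwards. Swapping in a different topic model, such as a Pachinko Allocation model, would only require replacing `train_lda` and `infer_topics`.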
