DOI: 10.1109/ICPC.2010.33
Article

My Repository Runneth Over: An Empirical Study on Diversifying Data Sources to Improve Feature Search

Published: 30 June 2010

Abstract

Research on feature location that applies information retrieval techniques has experimented with the kinds of inputs to the corpus and the algorithms that could be used. At first, only source code was used. Later, extraction techniques were improved, and data from other software tools and analyses were used to expand or augment the repository. But does having more diverse data in the repository always produce better results? In this paper, we report on an empirical study that examines the effect of increasing data diversity on feature location through search. In particular, we looked at the effect of including: i) change sets from a revision control system, ii) tickets from issue trackers, and iii) elements from a Static Dependency Graph (SDG). We searched for three features of Jajuk, an open source Java jukebox, and two features of jEdit, an open source Java text editor. We used four different corpora, each built from a combination of the above data. We used Eclipse’s code search and an index built with source code as baseline conditions. We found that it is not always better to have more diverse data. Adding SDG data to change sets increased recall, but drove down precision. Adding data from issue trackers had little effect and in one case lowered recall. We also found that large-scale refactoring of the code decreases the effectiveness of using change sets for feature location.
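
The precision and recall trade-off reported in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function name `precision_recall`, the element names, and the result sets are hypothetical and are not data from the study. It only shows how adding more sources to a corpus can retrieve more of the relevant elements (higher recall) while also returning more irrelevant ones (lower precision).

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search result against a gold set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    # Guard against empty sets so the ratios are always defined.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical gold set: program elements that implement one feature.
relevant = {"Player.play", "Player.stop", "Queue.next", "Track.load"}

# Hypothetical search results from two corpus configurations.
change_sets_only = ["Player.play", "Queue.next", "Ui.refresh"]
change_sets_plus_sdg = change_sets_only + [
    "Track.load", "Logger.log", "Config.read", "Ui.paint",
]

for name, result in [("change sets", change_sets_only),
                     ("change sets + SDG", change_sets_plus_sdg)]:
    p, r = precision_recall(result, relevant)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

In this made-up example the larger corpus finds one more relevant element but also three more irrelevant ones, so recall rises while precision falls.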

Cited By

  • (2021) Opportunities and Challenges in Code Search Tools. ACM Computing Surveys 54:9 (1-40). DOI: 10.1145/3480027. Online publication date: 8-Oct-2021
  • (2011) Investigating how to effectively combine static concern location techniques. Proceedings of the 3rd International Workshop on Search-Driven Development: Users, Infrastructure, Tools, and Evaluation (37-40). DOI: 10.1145/1985429.1985439. Online publication date: 28-May-2011

Published In

ICPC '10: Proceedings of the 2010 IEEE 18th International Conference on Program Comprehension
June 2010
227 pages
ISBN: 9780769541136

Publisher

IEEE Computer Society

United States

Author Tags

  1. change sets
  2. code search
  3. component
  4. feature location
  5. program comprehension
