research-article

Concept location using program dependencies and information retrieval (DepIR)

Published: 01 April 2013

Abstract

Context: The functionality of a software system is most often expressed in terms of concepts from its problem or solution domains. The process of finding where these concepts are implemented in the source code is known as concept location, and it is a prerequisite of software change.

Objective: We investigate a static approach to concept location named DepIR that combines program dependency search (DepS) with information retrieval-based search (IR). In this approach, programmers explore the static program dependencies of the source code components retrieved by the IR search engine.

Method: The paper presents an empirical study that compares DepIR with its constituent techniques. The evaluation is based on an empirical method of reenactment that emulates the steps of concept location for 50 past changes mined from the software repositories of five software systems.

Results: The results of the study indicate that DepIR significantly outperforms both DepS and IR.

Conclusion: DepIR allows developers to perform concept location efficiently. It allows finding concepts even with queries that do not rank the relevant software components highly. Since formulating a good query is not always easy, this tolerance of lower-quality queries significantly broadens the usability of DepIR compared to traditional IR.
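The idea of combining the two searches can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper: the cosine ranking over raw term counts, the breadth-first walk over an undirected dependency graph, and every name below (rank_by_query, dependency_distance, locate, the toy components) are assumptions made for illustration only. The sketch shows how a component that shares no terms with the query can still be reached in one dependency step from the top IR result.

```python
import math
from collections import Counter, deque


def rank_by_query(query, docs):
    """Rank component names by cosine similarity of raw term counts (assumed IR model)."""
    q = Counter(query.lower().split())
    q_norm = math.sqrt(sum(v * v for v in q.values()))
    scores = {}
    for name, text in docs.items():
        d = Counter(text.lower().split())
        d_norm = math.sqrt(sum(v * v for v in d.values()))
        dot = sum(q[t] * d[t] for t in q)
        scores[name] = dot / (q_norm * d_norm) if q_norm and d_norm else 0.0
    # Highest score first; ties broken alphabetically so the order is deterministic.
    return sorted(docs, key=lambda n: (-scores[n], n))


def dependency_distance(graph, start, target):
    """Return the number of dependency hops from start to target, or None if unreachable."""
    neighbors = {}
    for src, dsts in graph.items():
        for dst in dsts:
            # Dependencies are followed in both directions (an assumption of this sketch).
            neighbors.setdefault(src, set()).add(dst)
            neighbors.setdefault(dst, set()).add(src)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if node == target:
            return hops
        for nxt in sorted(neighbors.get(node, ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None


def locate(query, docs, graph, target, top_k=1):
    """Among the top_k IR results, return (seed, hops) for the seed closest to target."""
    best = None
    for seed in rank_by_query(query, docs)[:top_k]:
        hops = dependency_distance(graph, seed, target)
        if hops is not None and (best is None or hops < best[1]):
            best = (seed, hops)
    return best


# Hypothetical toy system: the target shares no terms with the query.
DOCS = {
    "Parser.parse": "parse input tokens syntax",
    "Printer.print": "print page layout format",
    "Spooler.enqueue": "queue job spool",
    "Layout.paginate": "page break layout margin",
}
GRAPH = {"Printer.print": ["Spooler.enqueue", "Layout.paginate"]}
QUERY = "print page"
TARGET = "Spooler.enqueue"
```

In this toy data the query ranks the target last under the IR model alone, yet exploring the dependencies of the single top-ranked component reaches it in one step, which mirrors the abstract's point that the combined approach tolerates queries that do not rank the relevant component highly.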


    Published In

    Information and Software Technology, Volume 55, Issue 4
    April, 2013
    126 pages

    Publisher

    Butterworth-Heinemann

    United States


    Author Tags

    1. Concept location
    2. Dependency search
    3. IR


    Cited By

    • (2022) Automated assertion generation via information retrieval and its integration with deep learning. Proceedings of the 44th International Conference on Software Engineering, pp. 163-174. DOI: 10.1145/3510003.3510149. Online publication date: 21-May-2022
    • (2018) Locating bugs without looking back. Automated Software Engineering 25:3, pp. 383-434. DOI: 10.1007/s10515-017-0226-1. Online publication date: 1-Sep-2018
    • (2017) A historical, textual analysis approach to feature location. Information and Software Technology 88:C, pp. 110-126. DOI: 10.1016/j.infsof.2017.04.003. Online publication date: 1-Aug-2017
    • (2016) FLINTS. Proceedings of the 10th European Conference on Software Architecture Workshops, pp. 1-7. DOI: 10.1145/2993412.3003390. Online publication date: 28-Nov-2016
    • (2015) Exploring the use of concern element role information in feature location evaluation. Proceedings of the 2015 IEEE 23rd International Conference on Program Comprehension, pp. 140-150. DOI: 10.5555/2820282.2820303. Online publication date: 16-May-2015
    • (2015) Dual Execution for On the Fly Fine Grained Execution Comparison. ACM SIGARCH Computer Architecture News 43:1, pp. 325-338. DOI: 10.1145/2786763.2694394. Online publication date: 14-Mar-2015
    • (2015) Dual Execution for On the Fly Fine Grained Execution Comparison. ACM SIGPLAN Notices 50:4, pp. 325-338. DOI: 10.1145/2775054.2694394. Online publication date: 14-Mar-2015
    • (2015) Dual Execution for On the Fly Fine Grained Execution Comparison. Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 325-338. DOI: 10.1145/2694344.2694394. Online publication date: 14-Mar-2015
    • (2015) Link analysis algorithms for static concept location. Empirical Software Engineering 20:6, pp. 1666-1720. DOI: 10.1007/s10664-014-9327-7. Online publication date: 1-Dec-2015
    • (2014) Software evolution and maintenance. Future of Software Engineering Proceedings, pp. 133-144. DOI: 10.1145/2593882.2593893. Online publication date: 31-May-2014
