Research article. DOI: 10.1145/1806799.1806868

A search engine for finding highly relevant applications

Published: 01 May 2010

Abstract

A fundamental problem in finding applications that are highly relevant to development tasks is the mismatch between the high-level intent reflected in the descriptions of these tasks and the low-level implementation details of applications. To reduce this mismatch, we created an approach called Exemplar (EXEcutable exaMPLes ARchive) for finding highly relevant software projects in large archives of applications. After a programmer enters a natural-language query that contains high-level concepts (e.g., MIME, data sets), Exemplar uses information retrieval and program analysis techniques to retrieve applications that implement these concepts. Our case study with 39 professional Java programmers shows that Exemplar is more effective than SourceForge in helping programmers quickly find highly relevant applications.
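
The idea of combining information retrieval with program analysis can be illustrated with a toy sketch. It is not Exemplar's implementation: the abstract does not specify the algorithm, so the sketch assumes one plausible pipeline in which query words are matched against short textual descriptions of API calls (TF-IDF with cosine similarity), and each application is scored by the API calls it is known to invoke. All names and data here (`API_DOCS`, `APP_CALLS`, `rank_applications`) are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical API-call descriptions, standing in for help documentation.
API_DOCS = {
    "MimeMessage.setContent": "set the content of a MIME message body",
    "DataSet.load": "load a data set from a file",
    "Socket.connect": "connect a network socket to a remote host",
}

# Hypothetical output of program analysis: API calls each application invokes.
APP_CALLS = {
    "mail-client": ["MimeMessage.setContent", "Socket.connect"],
    "stats-tool": ["DataSet.load"],
    "chat-server": ["Socket.connect"],
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Build a TF-IDF vector per document and return them with the IDF table."""
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokenized)
    doc_freq = Counter(t for toks in tokenized.values() for t in set(toks))
    idf = {t: math.log(n_docs / df) + 1.0 for t, df in doc_freq.items()}
    vectors = {
        name: {t: count * idf[t] for t, count in Counter(toks).items()}
        for name, toks in tokenized.items()
    }
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_applications(query):
    """Score each application by summing the query similarity of its API calls."""
    vectors, idf = tfidf_vectors(API_DOCS)
    # Query terms that never occur in the API descriptions get weight 0.
    query_vec = {t: c * idf.get(t, 0.0) for t, c in Counter(tokenize(query)).items()}
    api_scores = {api: cosine(query_vec, vec) for api, vec in vectors.items()}
    app_scores = {
        app: sum(api_scores[call] for call in calls)
        for app, calls in APP_CALLS.items()
    }
    return sorted(app_scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for app, score in rank_applications("MIME message"):
        print(f"{app}: {score:.3f}")
```

With this toy data, the query "MIME message" ranks `mail-client` first because it is the only application that calls an API whose description mentions those terms. Matching against API descriptions rather than application descriptions is the design choice that lets a high-level query reach low-level implementation details.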

Published In

ICSE '10: Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1
May 2010
627 pages
ISBN: 9781605587196
DOI: 10.1145/1806799
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ICSE '10
Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%
