Efficient storage and fast querying of source code

Published: 01 July 2011

Abstract

Enabling fast, detailed insight into large bodies of source code is an important task in a global development ecosystem. Numerous data structures have been developed to store source code and to support structural queries that help with navigation, evaluation, and analysis. Many of these data structures work with tree-based or graph-based representations of source code. The goal of this project is to develop a data store that enables efficient storage and fast querying of structural information. The naive adjacency-list method is enhanced with recent data compression techniques from column-oriented databases to allow lossless yet compact storage of fine-grained structural data. Graph indexing enables the proposed data model to answer fine-grained structural queries quickly. This paper describes the basics of the proposed approach and illustrates its technical feasibility.
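
To make the idea in the abstract concrete, the following is a minimal sketch of an adjacency list stored column-wise with two common column-store compression techniques: run-length encoding of the sorted source column and dictionary encoding of the edge-type column. It is an illustration under stated assumptions, not the paper's actual data model: the class name `ColumnarEdgeStore`, the edge types ("contains", "calls", "declares"), the integer node ids, and the binary-search lookup standing in for a graph index are all hypothetical.

```python
from bisect import bisect_left
from itertools import groupby


def rle_encode(values):
    """Run-length encode a sequence into (value, run_length) pairs."""
    return [(v, sum(1 for _ in group)) for v, group in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [v for v, n in runs for _ in range(n)]


class ColumnarEdgeStore:
    """Hypothetical column-wise adjacency list (not the paper's design)."""

    def __init__(self, edges):
        # Sorting by source makes the source column highly repetitive,
        # which is what run-length encoding exploits.
        rows = sorted(edges)
        # Dictionary encoding: each distinct edge type gets a small integer.
        self.type_dict = sorted({t for _, t, _ in rows})
        code = {t: i for i, t in enumerate(self.type_dict)}
        self.src_runs = rle_encode([s for s, _, _ in rows])
        self.type_codes = [code[t] for _, t, _ in rows]
        self.targets = [d for _, _, d in rows]
        # Simple index over the runs: sorted run keys plus start offsets.
        self.run_keys = [v for v, _ in self.src_runs]
        self.run_starts = []
        pos = 0
        for _, n in self.src_runs:
            self.run_starts.append(pos)
            pos += n

    def neighbors(self, source, edge_type=None):
        """Return (edge_type, target) pairs leaving `source`."""
        i = bisect_left(self.run_keys, source)
        if i == len(self.run_keys) or self.run_keys[i] != source:
            return []
        start = self.run_starts[i]
        end = start + self.src_runs[i][1]
        result = []
        for j in range(start, end):
            t = self.type_dict[self.type_codes[j]]
            if edge_type is None or t == edge_type:
                result.append((t, self.targets[j]))
        return result

    def decode(self):
        """Reconstruct all edges, showing that the encoding is lossless."""
        sources = rle_decode(self.src_runs)
        return [
            (s, self.type_dict[c], d)
            for s, c, d in zip(sources, self.type_codes, self.targets)
        ]


# Illustrative edges: (source node id, edge type, target node id).
EDGES = [
    (1, "contains", 2),
    (1, "contains", 3),
    (2, "calls", 3),
    (3, "calls", 2),
    (1, "declares", 4),
]
store = ColumnarEdgeStore(EDGES)
print(store.neighbors(1, "contains"))
print(store.decode() == sorted(EDGES))
```

Because the rows are sorted by source, each node's outgoing edges form one contiguous run, so a lookup needs only a binary search over the run keys followed by a scan of that run. Decoding the columns reproduces the sorted edge list exactly, which is the lossless property the abstract refers to.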

Cited By

  • (2018) Information systems frontiers. Information Systems Frontiers, 17:1 (217-237). DOI: 10.1007/s10796-014-9544-z. Online publication date: 24-Dec-2018
  • (2010) Towards query formulation and visualization of structural search results. Proceedings of 2010 ICSE Workshop on Search-driven Development: Users, Infrastructure, Tools and Evaluation (33-36). DOI: 10.1145/1809175.1809184. Online publication date: 1-May-2010

Published In

Information Systems Frontiers, Volume 13, Issue 3
July 2011
145 pages

Publisher

Kluwer Academic Publishers

United States

Author Tags

  1. Global code repository
  2. Source code analysis
  3. Source code search
  4. Structural information

Qualifiers

  • Article
