[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1007/11785477_2guideproceedingsArticle/Chapter ViewAbstractPublication PagesConference Proceedingsacm-pubtype
Article

CodeQuest: scalable source code queries with datalog

Published: 03 July 2006 Publication History

Abstract

Source code querying tools allow programmers to explore relations between different parts of the code base. This paper describes such a tool, named codeQuest. It combines two previous proposals, namely the use of logic programming and database systems.
As the query language we use safe Datalog, which was originally introduced in the theory of databases. That provides just the right level of expressiveness; in particular recursion is indispensable for source code queries. Safe Datalog is like Prolog, but all queries are guaranteed to terminate, and there is no need for extra-logical annotations.
Our implementation of Datalog maps queries to a relational database system. We are thus able to capitalise on the query optimiser provided by such a system. For recursive queries we implement our own optimisations in the translation from Datalog to SQL. Experiments confirm that this strategy yields an efficient, scalable code querying system.

References

[1]
Eclipse. http://www.eclipse.org.
[2]
JQuery. http://www.cs.ubc.ca/labs/spl/projects/jquery/.
[3]
XSB. http://xsb.sourceforge.net/.
[4]
The TyRuBa metaprogramming system. http://tyruba.sourceforge.net/.
[5]
CodeQuest. http://progtools.comlab.ox.ac.uk/projects/codequest/.
[6]
Serge Abiteboul, Peter Buneman, and Dan Suciu. Data on the Web: From Relations to Semistructured Data and XML. Morgan Kaufmann Publishers, 2000.
[7]
Serge Abiteboul, Richard Hull, and Victor Vianu. Foundations of Databases. Addison-Wesley, 1995.
[8]
Leonor Abraido-Fandino. An overview of Refine 2.0. In Procs. of the Second International Symposium on Knowledge Engineering and Software Engineering, 1987.
[9]
Krzysztof R. Apt and Roland N. Bol. Logic programming and negation: A survey. Journal of Logic Programming, 19/20:9-71, 1994.
[10]
Pavel Avgustinov, Aske Simon Christensen, Laurie Hendren, Sascha Kuzins, Jennifer Lhoták, Ondřej Lhoták, Oege de Moor, Damien Sereni, Ganesh Sittampalam, and Julian Tibble. abc: An extensible AspectJ compiler. In Aspect-Oriented Software Development (AOSD), pages 87-98. ACM Press, 2005.
[11]
Roland Backhouse and Paul Hoogendijk. Elements of a relational theory of datatypes. In Bernhard Möller, Helmut Partsch, and Stephen Schuman, editors, Formal Program Development, volume 755 of Lecture Notes in Computer Science, pages 7-42. Springer Verlag, 1993.
[12]
François Bancilhon, David Maier, Yehoshua Sagiv, and Jeffrey D. Ullman. Magic sets and other strange ways to implement logic programs. In Proceedings of the Fifth ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, March 24-26, 1986, Cambridge, Massachusetts, pages 1-16. ACM, 1986.
[13]
Cast. Company website at: http://www.castsoftware.com.
[14]
Yih Chen, Michael Nishimoto, and C. V. Ramamoorthy. The C information abstraction system. IEEE Transactions on Software Engineering, 16(3):325-334, 1990.
[15]
Mariano Consens, Alberto Mendelzon, and Arthur Ryman. Visualizing and querying software structures. In ICSE '92: Proceedings of the 14th international conference on Software engineering, pages 138-156, New York, NY, USA, 1992. ACM Press.
[16]
Roger F. Crew. ASTLOG: A language for examining abstract syntax trees. In USENIX Conference on Domain-Specific Languages, pages 229-242, 1997.
[17]
Stephen Dawson, C. R. Ramakrishnan, and David ScottWarren. Practical program analysis using general purpose logic programming systems. In ACM Symposium on Programming Language Design and Implementation, pages 117-126. ACM Press, 1996.
[18]
Henk Doornbos, Roland Carl Backhouse, and Jaap van der Woude. A calculational approach to mathematical induction. Theoretical Computer Science, 179(1-2):103- 135, 1997.
[19]
Michael Eichberg, Michael Haupt, Mira Mezini, and Thorsten Schäfer. Comprehensive software understanding with sextant. In ICSM '05: Proceedings of the 21st IEEE International Conference on Software Maintenance (ICSM'05), pages 315-324, Washington, DC, USA, September 2005. IEEE Computer Society.
[20]
Hervé Gallaire and Jack Minker. Logic and Databases. Plenum Press, New York, 1978.
[21]
Stefan Hanenberg Günter Kniesel, Tobias Rho. Evolvable pattern implementations need generic aspects. In Proc. of ECOOP 2004 Workshop on Reflection, AOP and Meta-Data for Software Evolution, pages 116-126. June 2004.
[22]
Kris Gybels and Johan Brichau. Arranging language features for more robust pattern-based crosscuts. In 2nd International Conference on Aspect-oriented Software Development, pages 60-69. ACM Press, 2003.
[23]
Elnar Hajiyev. CodeQuest: Source Code Querying with Datalog. MSc Thesis, Oxford University Computing Laboratory, September 2005. Available at http://progtools.comlab.ox.ac.uk/projects/codequest/.
[24]
Doug Janzen and Kris de Volder. Navigating and querying code without getting lost. In 2nd International Conference on Aspect-Oriented Software Development, pages 178-187, 2003.
[25]
Stan Jarzabek. Design of flexible static program analyzers with PQL. IEEE Transactions on Software Engineering, 24(3):197-215, 1998.
[26]
Shahram Javey, Kin'ichi Mitsui, Hiroaki Nakamura, Tsuyoshi Ohira, Kazu Yasuda, Kazushi Kuse, Tsutomu Kamimura, and Richard Helm. Architecture of the XL C++ browser. In CASCON '92: Proceedings of the 1992 conference of the Centre for Advanced Studies on Collaborative research, pages 369-379. IBM Press, 1992.
[27]
Karel Ježek and Vladimír Toncar. Experimental deductive database. In Workshop on Information Systems Modelling, pages 83-90, 1998.
[28]
Gregor Kiczales, Erik Hilsdale, Jim Hugunin, Mik Kersten, Jeffrey Palm, and William G. Griswold. An overview of AspectJ. In J. Lindskov Knudsen, editor, European Conference on Object-oriented Programming, volume 2072 of Lecture Notes in Computer Science, pages 327-353. Springer, 2001.
[29]
Bronislaw Knaster. Un théorème sur les fonctions d'ensembles. Annales de la Societé Polonaise de Mathematique, 6:133-134, 1928.
[30]
Kemal Koymen. A datalog interface for SQL (abstract). In CSC '90: Proceedings of the 1990 ACM annual conference on Cooperation, page 422, New York, NY, USA, 1990. ACM Press.
[31]
Monica S. Lam, John Whaley, V. Benjamin Livshits, Michael C. Martin, Dzintars Avots, Michael Carbin, and Christopher Unkel. Context-sensitive program analysis as database queries. In PODS '05: Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages 1-12, New York, NY, USA, 2005. ACM Press.
[32]
Mark A. Linton. Implementing relational views of programs. In Peter B. Henderson, editor, Software Development Environments (SDE), pages 132-140, 1984.
[33]
Michael Martin, Benjamin Livshits, and Monica S. Lam. Finding application errors using PQL: a program query language. In Proceedings of the 20th annual ACM SIGPLAN OOPSLA Conference, pages 365-383, 2005.
[34]
Edward McCormick and Kris De Volder. JQuery: finding your way through tangled code. In OOPSLA '04: Companion to the 19th annual ACM SIGPLAN OOPSLA conference, pages 9-10, New York, NY, USA, 2004. ACM Press.
[35]
Nathaniel Nystrom, Michael R. Clarkson, and Andrew C. Myers. Polyglot: An extensible compiler framework for Java. In 12th International Conference on Compiler Construction, volume 2622 of Lecture Notes in Computer Science, pages 138- 152, 2003.
[36]
Santanu Paul and Atul Prakash. Querying source code using an algebraic query language. IEEE Transactions on Software Engineering, 22(3):202-217, 1996.
[37]
Magellan Project. Web page at: http://www.st.informatik.tu-darmstadt.de/ static/pages/projects/Magellan/XIRC.html. 2005.
[38]
Thomas W. Reps. Demand interprocedural program analysis using logic databases. In Workshop on Programming with Logic Databases, ILPS, pages 163-196, 1993.
[39]
Konstantinos Sagonas, Terrance Swift, and David S. Warren. XSB as an efficient deductive database engine. In SIGMOD '94: Proceedings of the 1994 ACM SIGMOD international conference on Management of data, pages 442-453, New York, NY, USA, 1994. ACM Press.
[40]
Eric Sword. Create a root combinedplot interface. JFreeChart feature request: http://sourceforge.net/tracker/index.php?func=detail&aid=1234995& group_id=15494&atid=365494, 2005.
[41]
Peri Tarr, William Harrison, and Harold Ossher. Pervasive query support in the concern manipulation environment. Technical Report RC23343, IBM Research Division, Thomas J. Watson Research Center, 2004.
[42]
Michael Thompson. Bluephoenix: Application modernization technology audit. Available at: http://www.bitpipe.com/detail/RES/1080665824_99.html., 2004.
[43]
John Whaley, Dzintars Avots, Michael Carbin, and Monica S. Lam. Using datalog and binary decision diagrams for program analysis. In Kwangkeun Yi, editor, Proceedings of the 3rd Asian Symposium on Programming Languages and Systems, volume 3780, pages 97-118. Springer-Verlag, November 2005.

Cited By

View all
  • (2024)A Typed Multi-level Datalog IR and Its Compiler FrameworkProceedings of the ACM on Programming Languages10.1145/36897678:OOPSLA2(1586-1614)Online publication date: 8-Oct-2024
  • (2024)Finding Cross-Rule Optimization Bugs in Datalog EnginesProceedings of the ACM on Programming Languages10.1145/36498158:OOPSLA1(110-136)Online publication date: 29-Apr-2024
  • (2024)Optimizing Nested Recursive QueriesProceedings of the ACM on Management of Data10.1145/36392712:1(1-27)Online publication date: 26-Mar-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Guide Proceedings
ECOOP'06: Proceedings of the 20th European conference on Object-Oriented Programming
July 2006
525 pages
ISBN:3540357262
  • Editor:
  • Dave Thomas

Sponsors

  • ILOG
  • Google Inc.
  • Sun Microsystems
  • Microsoft Research: Microsoft Research
  • IBM: IBM

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 03 July 2006

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 11 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)A Typed Multi-level Datalog IR and Its Compiler FrameworkProceedings of the ACM on Programming Languages10.1145/36897678:OOPSLA2(1586-1614)Online publication date: 8-Oct-2024
  • (2024)Finding Cross-Rule Optimization Bugs in Datalog EnginesProceedings of the ACM on Programming Languages10.1145/36498158:OOPSLA1(110-136)Online publication date: 29-Apr-2024
  • (2024)Optimizing Nested Recursive QueriesProceedings of the ACM on Management of Data10.1145/36392712:1(1-27)Online publication date: 26-Mar-2024
  • (2023)Code Search: A Survey of Techniques for Finding CodeACM Computing Surveys10.1145/356597155:11(1-31)Online publication date: 9-Feb-2023
  • (2021)Programming and execution models for next generation code intelligence systems (keynote)Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3468264.3478688(1-2)Online publication date: 20-Aug-2021
  • (2021)DIFFBASE: a differential factbase for effective software evolution managementProceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3468264.3468605(503-515)Online publication date: 20-Aug-2021
  • (2020)Fixpoints for the masses: programming with first-class Datalog constraintsProceedings of the ACM on Programming Languages10.1145/34281934:OOPSLA(1-28)Online publication date: 13-Nov-2020
  • (2020)Modular collaborative program analysis in OPALProceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3368089.3409765(184-196)Online publication date: 8-Nov-2020
  • (2019)A framework for writing trigger-action todo comments in executable formatProceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3338906.3338965(385-396)Online publication date: 12-Aug-2019
  • (2019)MetaDL: analysing Datalog in DatalogProceedings of the 8th ACM SIGPLAN International Workshop on State Of the Art in Program Analysis10.1145/3315568.3329970(38-43)Online publication date: 22-Jun-2019
  • Show More Cited By

View Options

View options

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media