Automatic generation of library bindings using static analysis

Published: 15 June 2009

Abstract

High-level languages are growing in popularity. However, decades of C software development have produced large libraries of fast, time-tested, meritorious code that are impractical to recreate from scratch. Cross-language bindings can expose low-level C code to high-level languages. Unfortunately, writing bindings by hand is tedious and error-prone, while mainstream binding generators require extensive manual annotation or fail to offer the language features that users of modern languages have come to expect.
We present an improved binding-generation strategy based on static analysis of unannotated library source code. We characterize three high-level idioms that are not uniquely expressible in C's low-level type system: array parameters, resource managers, and multiple return values. We describe a suite of interprocedural analyses that recover this high-level information, and we show how the results can be used in a binding generator for the Python programming language. In experiments with four large C libraries, we find that our approach avoids the mistakes characteristic of hand-written bindings while offering a level of Python integration unmatched by prior automated approaches. Among the thousands of functions in the public interfaces of these libraries, roughly 40% exhibit the behaviors detected by our static analyses.

Published In

ACM SIGPLAN Notices, Volume 44, Issue 6
PLDI '09
June 2009
478 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/1543135

PLDI '09: Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation
June 2009
492 pages
ISBN: 9781605583921
DOI: 10.1145/1542476

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. bindings
  2. dataflow analysis
  3. ffi
  4. foreign function interfaces
  5. multi-language code reuse
  6. static library analysis

Qualifiers

  • Research-article
