[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/3324884.3416628acmconferencesArticle/Chapter ViewAbstractPublication PagesaseConference Proceedingsconference-collections
research-article

Generating concept based API element comparison using a knowledge graph

Published: 27 January 2021 Publication History

Abstract

Developers are concerned with the comparison of similar APIs in terms of their commonalities and (often subtle) differences. Our empirical study of Stack Overflow questions and API documentation confirms that API comparison questions are common and can often be answered by knowledge contained in API reference documentation. Our study also identifies eight types of API statements that are useful for API comparison. Based on these findings, we propose a knowledge graph based approach APIComp that automatically extracts API knowledge from API reference documentation to support the comparison of a pair of API classes or methods from different aspects. Our approach includes an offline phase for constructing an API knowledge graph, and an online phase for generating an API comparison result for a given pair of API elements. Our evaluation shows that the quality of different kinds of extracted knowledge in the API knowledge graph is generally high. Furthermore, the comparison results generated by APIComp are significantly better than those generated by a baseline approach based on heuristic rules and text similarity, and our generated API comparison results are useful for helping developers in API selection tasks.

References

[1]
2020. Replication Package. Retrieved August 31, 2020 from https://fudanselab.github.io/Research-ASE2020-APIComp/
[2]
Barthélémy Dagenais and Martin P. Robillard. 2012. Recovering traceability links between an API and its learning resources. In 34th International Conference on Software Engineering, ICSE 2012, June 2--9, 2012, Zurich, Switzerland. 47--57.
[3]
Davide Fucci, Alireza Mollaalizadehbahnemiri, and Walid Maalej. 2019. On using machine learning to identify knowledge in API reference documentation. In Proceedings of the ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/SIGSOFT FSE 2019, Tallinn, Estonia, August 26--30, 2019, Marlon Dumas, Dietmar Pfahl, Sven Apel, and Alessandra Russo (Eds.). ACM, 109--119.
[4]
Hideaki Hata, Christoph Treude, Raula Gaikovina Kula, and Takashi Ishio. 2019. 9.6 million links in source code comments: purpose, evolution, and decay. In Proceedings of the 41st International Conference on Software Engineering, ICSE 2019, Montreal, QC, Canada, May 25--31, 2019. 1211--1221.
[5]
Xiaojiang Huang, Xiaojun Wan, and Jianguo Xiao. 2011. Comparative News Summarization Using Linear Programming. In The 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference, 19--24 June, 2011, Portland, Oregon, USA - Short Papers. The Association for Computer Linguistics, 648--653.
[6]
He Jiang, Jingxuan Zhang, Zhilei Ren, and Tao Zhang. 2017. An unsupervised approach for discovering relevant tutorial fragments for APIs. In Proceedings of the 39th International Conference on Software Engineering, ICSE 2017, Buenos Aires, Argentina, May 20--28, 2017. 38--48.
[7]
Hongwei Li, Sirui Li, Jiamou Sun, Zhenchang Xing, Xin Peng, Mingwei Liu, and Xuejiao Zhao. 2018. Improving API Caveats Accessibility by Mining API Caveats Knowledge Graph. In 2018 IEEE International Conference on Software Maintenance and Evolution, ICSME 2018, Madrid, Spain, September 23--29, 2018. 183--193.
[8]
Mingwei Liu, Xin Peng, Andrian Marcus, Zhenchang Xing, Wenkai Xie, Shuangshuang Xing, and Yang Liu. 2019. Generating query-specific class API summaries. In Proceedings of the ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/SIGSOFT FSE 2019, Tallinn, Estonia, August 26--30, 2019. 120--130.
[9]
Walid Maalej and Martin P. Robillard. 2013. Patterns of Knowledge in API Reference Documentation. IEEE Trans. Software Eng. 39, 9 (2013), 1264--1282.
[10]
Mary L McHugh. 2012. Interrater reliability: the kappa statistic. Biochemia medica: Biochemia medica 22, 3 (2012), 276--282.
[11]
Tomas Mikolov, Ilya Sutskever, Kai Chen, Gregory S. Corrado, and Jeffrey Dean. 2013. Distributed Representations of Words and Phrases and their Compositionality. In Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5--8, 2013, Lake Tahoe, Nevada, United States. 3111--3119.
[12]
George A. Miller. 1995. WordNet: A Lexical Database for English. Commun. ACM 38, 11 (1995), 39--41.
[13]
Martin Monperrus, Michael Eichberg, Elif Tekes, and Mira Mezini. 2012. What should developers be aware of? An empirical study on the directives of API documentation. Empirical Software Engineering 17, 6 (2012), 703--737.
[14]
Laura Moreno, Andrian Marcus, Lori L. Pollock, and K. Vijay-Shanker. 2013. JSummarizer: An automatic generator of natural language summaries for Java classes. In IEEE 21st International Conference on Program Comprehension, ICPC 2013, San Francisco, CA, USA, 20--21 May, 2013. 230--232.
[15]
Rahul Pandita, Xusheng Xiao, Hao Zhong, Tao Xie, Stephen Oney, and Amit M. Paradkar. 2012. Inferring method specifications from natural language API descriptions. In 34th International Conference on Software Engineering, ICSE 2012, June 2--9, 2012, Zurich, Switzerland. 815--825.
[16]
Xiang Ren, Yuanhua Lv, Kuansan Wang, and Jiawei Han. 2017. Comparative Document Analysis for Large Text Corpora. In Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, WSDM 2017, Cambridge, United Kingdom, February 6--10, 2017, Maarten de Rijke, Milad Shokouhi, Andrew Tomkins, and Min Zhang (Eds.). ACM, 325--334.
[17]
Lin Shi, Hao Zhong, Tao Xie, and Mingshu Li. 2011. An Empirical Study on Evolution of API Documentation. In Fundamental Approaches to Software Engineering - 14th International Conference, FASE 2011, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2011, Saarbrücken, Germany, March 26-April 3, 2011. Proceedings. 416--431.
[18]
Ravindra Singh and Naurang Singh Mangat. 2013. Elements of survey sampling. Vol. 15. Springer Science & Business Media.
[19]
Giriprasad Sridhara, Emily Hill, Divya Muppaneni, Lori L. Pollock, and K. Vijay-Shanker. 2010. Towards automatically generating summary comments for Java methods. In ASE 2010, 25th IEEE/ACM International Conference on Automated Software Engineering, Antwerp, Belgium, September 20--24, 2010, Charles Pecheur, Jamie Andrews, and Elisabetta Di Nitto (Eds.). ACM, 43--52.
[20]
StackOverflow. 2019. Stack Overflow data dump version from March 3, 2019. https://archive.org/download/stackexchange/.
[21]
Siddharth Subramanian, Laura Inozemtseva, and Reid Holmes. 2014. Live API documentation. In 36th International Conference on Software Engineering, ICSE '14, Hyderabad, India - May 31 - June 07, 2014. 643--652.
[22]
Jiamou Sun, Zhenchang Xing, Rui Chu, Heilai Bai, Jinshui Wang, and Xin Peng. 2019. Know-How in Programming Tasks: From Textual Tutorials to Task-Oriented Knowledge Graph. In 2019 IEEE International Conference on Software Maintenance and Evolution, ICSME 2019, Cleveland, OH, USA, September 29 - October 4, 2019. IEEE, 257--268.
[23]
Maksim Tkachenko and Hady W. Lauw. 2019. CompareLDA: A Topic Model for Document Comparison. In The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27 - February 1, 2019. 7112--7119.
[24]
Christoph Treude and Martin P. Robillard. 2016. Augmenting API documentation with insights from stack overflow. In Proceedings of the 38th International Conference on Software Engineering, ICSE 2016, Austin, TX, USA, May 14--22, 2016. 392--403.
[25]
Denny Vrandecic. 2013. The Rise of Wikidata. IEEE Intelligent Systems 28, 4 (2013), 90--95.
[26]
Chong Wang, Xin Peng, Mingwei Liu, Zhenchang Xing, Xuefang Bai, Bing Xie, and Tuo Wang. 2019. A Learning-Based Approach for Automatic Construction of Domain Glossary from Source Code and Documentation. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. ACM, 97--108.
[27]
Bernard L Welch. 1947. The generalization of Student's problem when several different population variances are involved. Biometrika 34, 1/2 (1947), 28--35.
[28]
wikidata. 2019. String. https://www.wikidata.org/wiki/Q37484380.
[29]
wikidata. 2019. string. https://www.wikidata.org/wiki/Q326426.
[30]
wikidata. 2019. string. https://www.wikidata.org/wiki/Q1376436.
[31]
Hao Zhong, Lu Zhang, Tao Xie, and Hong Mei. 2011. Inferring specifications for resources from natural language API documentation. Autom. Softw. Eng. 18, 3--4 (2011), 227--261.
[32]
Yu Zhou, Ruihang Gu, Taolue Chen, Zhiqiu Huang, Sebastiano Panichella, and Harald C. Gall. 2017. Analyzing APIs documentation and code to detect directive defects. In Proceedings of the 39th International Conference on Software Engineering, ICSE 2017, Buenos Aires, Argentina, May 20--28, 2017. 27--37.

Cited By

View all
  • (2024)Let’s Discover More API Relations: A Large Language Model-Based AI Chain for Unsupervised API Relation InferenceACM Transactions on Software Engineering and Methodology10.1145/368046933:8(1-34)Online publication date: 23-Jul-2024
  • (2024)Revealing the Unseen: AI Chain on LLMs for Predicting Implicit Dataflows to Generate Dataflow Graphs in Dynamically Typed CodeACM Transactions on Software Engineering and Methodology10.1145/367245833:7(1-29)Online publication date: 12-Jun-2024
  • (2024)Practical, Automated Scenario-Based Mobile App TestingIEEE Transactions on Software Engineering10.1109/TSE.2024.341467250:7(1949-1966)Online publication date: 1-Jul-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
ASE '20: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering
December 2020
1449 pages
ISBN:9781450367684
DOI:10.1145/3324884
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

In-Cooperation

  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 January 2021

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. API
  2. documentation
  3. knowledge extraction
  4. knowledge graph

Qualifiers

  • Research-article

Funding Sources

  • National Natural Science Foundation of China

Conference

ASE '20
Sponsor:

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)47
  • Downloads (Last 6 weeks)4
Reflects downloads up to 28 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Let’s Discover More API Relations: A Large Language Model-Based AI Chain for Unsupervised API Relation InferenceACM Transactions on Software Engineering and Methodology10.1145/368046933:8(1-34)Online publication date: 23-Jul-2024
  • (2024)Revealing the Unseen: AI Chain on LLMs for Predicting Implicit Dataflows to Generate Dataflow Graphs in Dynamically Typed CodeACM Transactions on Software Engineering and Methodology10.1145/367245833:7(1-29)Online publication date: 12-Jun-2024
  • (2024)Practical, Automated Scenario-Based Mobile App TestingIEEE Transactions on Software Engineering10.1109/TSE.2024.341467250:7(1949-1966)Online publication date: 1-Jul-2024
  • (2024)Neural Library Recommendation by Embedding Project-Library Knowledge GraphIEEE Transactions on Software Engineering10.1109/TSE.2024.339350450:6(1620-1638)Online publication date: 24-Apr-2024
  • (2024)Answering Uncertain, Under-Specified API Queries Assisted by Knowledge-Aware Human-AI DialogueIEEE Transactions on Software Engineering10.1109/TSE.2023.334695450:2(280-295)Online publication date: 1-Feb-2024
  • (2024)ROS package search for robot software development: a knowledge graph-based approachFrontiers of Computer Science10.1007/s11704-024-3660-919:6Online publication date: 12-Dec-2024
  • (2023)KG4CraSolver: Recommending Crash Solutions via Knowledge GraphProceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3611643.3616317(1242-1254)Online publication date: 30-Nov-2023
  • (2023)Recommending Analogical APIs via Knowledge Graph EmbeddingProceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3611643.3616305(1496-1508)Online publication date: 30-Nov-2023
  • (2023)API Entity and Relation Joint Extraction from Text via Dynamic Prompt-tuned Language ModelACM Transactions on Software Engineering and Methodology10.1145/360718833:1(1-25)Online publication date: 23-Nov-2023
  • (2023)Task-Oriented ML/DL Library Recommendation based on a Knowledge GraphIEEE Transactions on Software Engineering10.1109/TSE.2023.3285280(1-16)Online publication date: 2023
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media