DOI: 10.1145/2961111.2962588
research-article

Mining Technology Landscape from Stack Overflow

Published: 08 September 2016

Abstract

The sheer number of available technologies and the complex relationships among them make it challenging to choose the right technologies for software projects. Developers often turn to online resources (e.g., expert articles and community answers) to understand the technology landscape. Such resources are primarily opinion-based and often out of date. Furthermore, the information is scattered across many online resources and has to be aggregated to form a big picture of the technology landscape. In this paper, we exploit the fact that Stack Overflow users tag their questions with the main technologies that the questions revolve around, and develop association rule mining and community detection techniques to mine the technology landscape from Stack Overflow question tags. The mined technology landscape is represented as a graphical Technology Associative Network (TAN). Our empirical study shows that the mined TAN captures a wide range of technologies, the complex relationships among them, and the trends of the technologies in developers' discussions on Stack Overflow. We develop a website (https://graphofknowledge.appspot.com/) for the community to access and evaluate the mined technology landscape. The website visit statistics from Google Analytics show developers' general interest in our technology landscape service. We also report a small-scale user study that evaluates the potential usefulness of our tool.
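To make the mining step concrete, the following is a minimal sketch of pairwise association rule mining over question tags, not the authors' implementation. The function names (`mine_tag_pairs`, `build_network`), the toy questions, and the support and confidence thresholds are all assumptions chosen for illustration. It treats each question's tag set as a transaction, keeps tag pairs whose co-occurrence support and rule confidence pass the thresholds, and links the surviving tags into an undirected network.

```python
from collections import Counter
from itertools import combinations


def mine_tag_pairs(questions, min_support=0.3, min_confidence=0.6):
    """Return {(a, b): (support, confidence)} for rules a -> b over tag pairs.

    Thresholds are illustrative assumptions, not values from the paper.
    """
    n = len(questions)
    tag_count = Counter()
    pair_count = Counter()
    for tags in questions:
        unique = sorted(set(tags))  # each question is one transaction
        tag_count.update(unique)
        pair_count.update(combinations(unique, 2))

    rules = {}
    for (a, b), count in pair_count.items():
        support = count / n  # fraction of questions carrying both tags
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / tag_count[x]  # estimate of P(y | x)
            if confidence >= min_confidence:
                rules[(x, y)] = (support, confidence)
    return rules


def build_network(rules):
    """Link tags joined by at least one surviving rule (undirected adjacency)."""
    graph = {}
    for a, b in rules:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


# Hypothetical tag sets standing in for Stack Overflow questions.
questions = [
    ["java", "spring", "hibernate"],
    ["java", "spring"],
    ["python", "django"],
    ["python", "django", "orm"],
    ["java", "hibernate"],
]
rules = mine_tag_pairs(questions)
network = build_network(rules)
```

In this toy data the Java-related and Python-related tags end up in separate parts of the network. A community detection algorithm from a graph library could then be run over such a network to group related technologies; that step is left out of the sketch.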

Published In

ESEM '16: Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement
September 2016
457 pages
ISBN:9781450344272
DOI:10.1145/2961111

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Association Rule Mining
  2. Community Detection
  3. Technology Associative Network
  4. Technology Landscape

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ESEM '16

Acceptance Rates

ESEM '16 Paper Acceptance Rate 27 of 122 submissions, 22%;
Overall Acceptance Rate 130 of 594 submissions, 22%
