research-article
Open access

A Four-Dimension Gold Standard Dataset for Opinion Mining in Software Engineering

Published: 02 July 2024

Abstract

We present the first four-dimension gold standard dataset to advance opinion mining focused on the software engineering domain. Through a well-defined sampling and annotation strategy leveraging multiple coders, we construct a corpus of 2,000 Stack Overflow posts labeled with four dimensions/tuples, including sentiments, polar facts, aspects, and named entities. This multidimensional ground truth dataset opens up new research opportunities for opinion mining in domain-adapted NLP tools for software engineering by capturing existing relationships between extracted elements at a more granular level. It also facilitates investigating the effects of sentiments in the developers' social forums.

    Information & Contributors

    Information

    Published In

    MSR '24: Proceedings of the 21st International Conference on Mining Software Repositories
    April 2024
    788 pages
    ISBN:9798400705878
    DOI:10.1145/3643991
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. sentiment analysis
    2. opinion mining
    3. aspects
    4. named entity
    5. software engineering
    6. natural language processing

    Qualifiers

    • Research-article

    Conference

    MSR '24

    Article Metrics

    • Total Citations: 0
    • Total Downloads: 179
    • Downloads (Last 12 months): 179
    • Downloads (Last 6 weeks): 29
    Reflects downloads up to 09 Jan 2025
