Development emails content analyzer: intention mining in developer discussions

Published: 09 November 2015


Written development communication (e.g., mailing lists, issue trackers) constitutes a precious source of information for building recommenders for software engineers, for example to suggest experts or to re-document existing source code. In this paper we propose a novel, semi-supervised approach named DECA (Development Emails Content Analyzer) that uses Natural Language Parsing to classify the content of development emails according to their purpose (e.g., feature request, opinion asking, problem discovery, solution proposal, information giving), identifying email elements that can be used for specific tasks. A study based on data from Qt and Ubuntu highlights the high precision (90%) and recall (70%) of DECA in classifying email content, outperforming traditional machine learning strategies. Moreover, we successfully used DECA to re-document source code of Eclipse and Lucene, improving the recall, while keeping high precision, of a previous approach based on ad-hoc heuristics.
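To make the idea of classifying email sentences by purpose concrete, the following is a minimal sketch and not DECA itself. It assumes plain regular-expression cues in place of the natural language parsing that the abstract describes. The category names come from the abstract; the patterns, the `DEFAULT` fallback, and the function names `split_sentences`, `classify_sentence`, and `classify_email` are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical surface cues, one list per category named in the abstract.
# Assumption: regexes stand in for DECA's natural language parsing heuristics.
PATTERNS = {
    "feature request": [
        r"\bwould be (nice|great|useful)\b",
        r"\bplease add\b",
        r"\bshould support\b",
    ],
    "opinion asking": [
        r"\bwhat do you think\b",
        r"\bany (thoughts|opinions)\b",
    ],
    "problem discovery": [
        r"\b(crash(es|ed)?|fails?|does not work|doesn't work)\b",
        r"\b(bug|error|exception)\b",
    ],
    "solution proposal": [
        r"\b(you|we) (could|can|should) (try|use|fix)\b",
        r"\b(a|one) (fix|workaround) (is|would be)\b",
    ],
}

# Assumption: sentences that match no cue are treated as information giving.
DEFAULT = "information giving"


def split_sentences(text):
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def classify_sentence(sentence):
    """Return the first category whose cue matches, else the default."""
    lowered = sentence.lower()
    for category, patterns in PATTERNS.items():
        if any(re.search(p, lowered) for p in patterns):
            return category
    return DEFAULT


def classify_email(text):
    """Label every sentence of an email body with a purpose category."""
    return [(s, classify_sentence(s)) for s in split_sentences(text)]
```

Calling `classify_email` on an email body returns one `(sentence, category)` pair per sentence, which is the kind of element-level labeling a downstream recommender could filter on. Categories are tried in dictionary order, so a sentence matching several cues receives the first matching label; a real system would need a more principled way to resolve such overlaps.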





    ASE '15: Proceedings of the 30th IEEE/ACM International Conference on Automated Software Engineering
    November 2015
    935 pages




    empirical study
    natural language processing
    unstructured data mining


    Overall Acceptance Rate 82 of 337 submissions, 24%


