Abstract
Dialogue act recognition (DAR) and sentiment classification (SC) are crucial tasks in dialogue systems; they aim to uncover speakers’ implicit intentions and sentiments by analyzing contextual information. Recent approaches have sought to improve accuracy by modeling dialogue acts and sentiments jointly, taking into account their complex relationships and latent structures. However, these methods often neglect two critical challenges. First, real-world dialogues unfold in chronological order, with interlocutors discussing one or more topics. Second, the joint task of dialogue act recognition and sentiment classification operates at the sentence level, so fine-grained word-level information from utterances must be used effectively. To tackle these challenges, we propose a multi-perspective global–local interaction framework. It captures overall contextual information and simulates the flow of dialogue acts and sentiments for each speaker. We explore explicit intra-task interactions, cross-task collaborations, and token-level information reuse from three perspectives. We also incorporate a time span to accommodate real-world scenarios with chronological and multi-topic dialogues. Experimental results on widely used benchmark datasets show that our framework outperforms mainstream approaches. Comprehensive analysis validates the effectiveness of each component and indicates the framework's potential for improving DAR and SC.
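The abstract does not spell out how the time span is realized. As a purely illustrative sketch, and not the authors' formulation, one plausible reading is a chronological window that lets each utterance draw context only from itself and a fixed number of preceding utterances. The function name `time_span_mask` and the parameter `span` are hypothetical.

```python
from typing import List


def time_span_mask(num_utterances: int, span: int) -> List[List[bool]]:
    """Build a chronological context mask (illustrative assumption only).

    mask[i][j] is True when utterance i may use utterance j as context,
    i.e. when j is utterance i itself or one of the `span` utterances
    immediately before it. Later utterances are never visible.
    """
    if num_utterances < 0 or span < 0:
        raise ValueError("num_utterances and span must be non-negative")
    mask = []
    for i in range(num_utterances):
        # 0 <= i - j keeps only the past and present; i - j <= span
        # limits how far back the context window reaches.
        row = [0 <= i - j <= span for j in range(num_utterances)]
        mask.append(row)
    return mask


if __name__ == "__main__":
    # Print the mask for a four-utterance dialogue with a window of one
    # preceding utterance.
    for row in time_span_mask(4, 1):
        print(["x" if allowed else "." for allowed in row])
```

Under this assumption, a small `span` keeps context local to the current topic, while a larger one approaches full dialogue history; the paper itself should be consulted for the actual mechanism.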
Availability of data and materials
Mastodon [7]: https://github.com/cerisara/DialogSentimentMastodon
Dailydialog [13]: http://yanran.li/dailydialog
References
Chen H, Liu X, Yin D, Tang J (2017) A survey on dialogue systems: Recent advances and new frontiers. ACM SIGKDD Explorations Newsl 19(2):25–35
Ni J, Young T, Pandelea V, Xue F, Cambria E (2023) Recent advances in deep learning based dialogue systems: A systematic survey. Artif Intell Rev 56(4):3055–3155
Liu B (2012) Sentiment analysis and opinion mining. Synthesis lectures on human language technologies 5(1):1–167
Fung P, Dey A, Siddique FB, Lin R, Yang Y, Bertero D, Wan Y, Chan RHY, Wu C-S (2016) Zara: A virtual interactive dialogue system incorporating emotion, sentiment and personality recognition. In: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations, pp. 278–281
Kim M, Kim H (2018) Integrated neural network model for identifying speech acts, predicators, and sentiments of dialogue utterances. Pattern Recogn Lett 101:1–5
Ma Y, Nguyen KL, Xing FZ, Cambria E (2020) A survey on empathetic dialogue systems. Information Fusion 64:50–70
Cerisara C, Jafaritazehjani S, Oluokun A, Le HT (2018) Multi-task dialog act and sentiment recognition on mastodon. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 745–754
Qin L, Che W, Li Y, Ni M, Liu T (2020) Dcr-net: A deep co-interactive relation network for joint dialog act recognition and sentiment classification. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 8665–8672
Li J, Fei H, Ji D (2020) Modeling local contexts for joint dialogue act recognition and sentiment classification with bi-channel dynamic convolutions. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 616–626
Qin L, Li Z, Che W, Ni M, Liu T (2021) Co-gat: A co-interactive graph attention network for joint dialog act recognition and sentiment classification. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 13709–13717
Xing B, Tsang I (2022) Darer: Dual-task temporal relational recurrent reasoning network for joint dialog sentiment classification and act recognition. In: Findings of the Association for Computational Linguistics: ACL 2022, pp. 3611–3621
Sordoni A, Galley M, Auli M, Brockett C, Ji Y, Mitchell M, Nie J-Y, Gao J, Dolan WB (2015) A neural network approach to context-sensitive generation of conversational responses. In: Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 196–205
Li Y, Su H, Shen X, Li W, Cao Z, Niu S (2017) Dailydialog: A manually labelled multi-turn dialogue dataset. In: Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 986–995
Chen Z, Yang R, Zhao Z, Cai D, He X (2018) Dialogue act recognition via crf-attentive structured network. In: The 41st International Acm Sigir Conference on Research & Development in Information Retrieval, pp. 225–234
Kumar H, Agarwal A, Dasgupta R, Joshi S (2018) Dialogue act sequence labeling using hierarchical encoder with crf. In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence, pp. 3440–3447
Raheja V, Tetreault J (2019) Dialogue act classification with context-aware self-attention. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 3727–3733
Li R, Lin C, Collinson M, Li X, Chen G (2019) A dual-attention hierarchical recurrent neural network for dialogue act classification. In: 23rd Conference on Computational Natural Language Learning, CoNLL 2019, pp. 383–392. Association for Computational Linguistics
Colombo P, Chapuis E, Manica M, Vignon E, Varni G, Clavel C (2020) Guiding attention in sequence-to-sequence models for dialogue act prediction. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 7594–7601
He Z, Tavabi L, Lerman K, Soleymani M (2021) Speaker turn modeling for dialogue act classification. In: Findings of the Association for Computational Linguistics: EMNLP 2021, pp. 2150–2157
Wu T-W, Su R, Juang B-H (2021) A context-aware hierarchical bert fusion network for multi-turn dialog act detection. arXiv preprint arXiv:2109.01267
Pengfei G, Yinglong M (2022) A universality-individuality integration model for dialog act classification. arXiv preprint arXiv:2204.06185
Gella S, Padmakumar A, Lange PL, Hakkani-Tur D (2022) Dialog acts for task driven embodied agents. In: Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, pp. 111–123
Chang F-J, Muniyappa T, Sathyendra KM, Wei K, Strimel GP, McGowan R (2023) Dialog act guided contextual adapter for personalized speech recognition. In: ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5. IEEE
Zhang L, Wang S, Liu B (2018) Deep learning for sentiment analysis: A survey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 8(4):1253
Qiu M, Huang X, Chen C, Ji F, Qu C, Wei W, Huang J, Zhang Y (2021) Reinforced history backtracking for conversational question answering. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 13718–13726
Musto C, de Gemmis M, Semeraro G, Lops P (2017) A multi-criteria recommender system exploiting aspect-based sentiment analysis of users’ reviews. In: Proceedings of the Eleventh ACM Conference on Recommender Systems, pp. 321–325
Liu P, Zhang L, Gulla JA (2021) Multilingual review-aware deep recommender system via aspect-based sentiment analysis. ACM Transactions on Information Systems (TOIS) 39(2):1–33
Ghosal D, Majumder N, Poria S, Chhaya N, Gelbukh A (2019) Dialoguegcn: A graph convolutional neural network for emotion recognition in conversation. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 154–164
Majumder N, Poria S, Hazarika D, Mihalcea R, Gelbukh A, Cambria E (2019) Dialoguernn: An attentive rnn for emotion detection in conversations. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 6818–6825
Zhang C, Li Q, Song D (2019) Aspect-based sentiment classification with aspect-specific graph convolutional networks. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 4568–4578
Bai X, Liu P, Zhang Y (2020) Investigating typed syntactic dependencies for targeted sentiment classification using graph attention neural network. IEEE/ACM Transactions on Audio, Speech, and Language Processing 29:503–514
Shen W, Wu S, Yang Y, Quan X (2021) Directed acyclic graph network for conversational emotion recognition. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 1551–1560
Augustine E, Jandaghi P, Albalak A, Pryor C, Dickens C, Wang W, Getoor L (2022) Emotion recognition in conversation using probabilistic soft logic. arXiv preprint arXiv:2207.07238
Ghazarian S, Hedayatnia B, Papangelis A, Liu Y, Hakkani-Tur D (2022) What is wrong with you?: Leveraging user sentiment for automatic dialog evaluation. In: Findings of the Association for Computational Linguistics: ACL 2022, pp. 4194–4204
Cheng Z, Zhou J, Wu W, Chen Q, He L (2023) Tell model where to attend: Improving interpretability of aspect-based sentiment classification via small explanation annotations. In: ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5. IEEE
Kumar S, Mondal I, Akhtar MS, Chakraborty T (2023) Explaining (sarcastic) utterances to enhance affect understanding in multimodal dialogues. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, pp. 12986–12994
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. Advances in neural information processing systems 30
Veličković P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y (2017) Graph attention networks. arXiv preprint arXiv:1710.10903
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778
Kingma DP, Ba J (2014) Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980
Devlin J, Chang M-W, Lee K, Toutanova K (2019) Bert: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4171–4186
Liu Y, Ott M, Goyal N, Du J, Joshi M, Chen D, Levy O, Lewis M, Zettlemoyer L, Stoyanov V (2019) Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692
Yang Z, Dai Z, Yang Y, Carbonell J, Salakhutdinov R, Le QV (2019) Xlnet: generalized autoregressive pretraining for language understanding. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems, pp. 5753–5763
About this article
Cite this article
Zhang, Q., Li, J. A multi-perspective global–local interaction framework for identifying dialogue acts and sentiments of dialogue utterances jointly. Int. J. Mach. Learn. & Cyber. 15, 1995–2011 (2024). https://doi.org/10.1007/s13042-023-02010-5
DOI: https://doi.org/10.1007/s13042-023-02010-5