DOI: 10.1145/3488560.3498509

Speaker and Time-aware Joint Contextual Learning for Dialogue-act Classification in Counselling Conversations

Published: 15 February 2022


The onset of the COVID-19 pandemic has put people's mental health at risk, and social counselling has gained remarkable significance in this environment. Unlike general goal-oriented dialogues, a conversation between a patient and a therapist is considerably implicit, even though the objective of the conversation is quite apparent. In such a case, understanding the intent of the patient is imperative for providing effective counselling in therapy sessions, and the same applies to a dialogue system. In this work, we take a small but important step towards the development of an automated dialogue system for mental-health counselling. We develop a novel dataset, named HOPE, to provide a platform for dialogue-act classification in counselling conversations. We identify the requirements of such conversations and propose twelve domain-specific dialogue-act (DAC) labels. We collect ~12.9K utterances from publicly available counselling session videos on YouTube, extract their transcripts, clean them, and annotate them with DAC labels. Further, we propose SPARTA, a transformer-based architecture with novel speaker- and time-aware contextual learning for dialogue-act classification. Our evaluation shows convincing performance over several baselines, achieving state-of-the-art results on HOPE. We also supplement our experiments with extensive empirical and qualitative analyses of SPARTA.

Supplementary Material

MP4 File (wsdmfp745.mp4)
Presentation video for "Speaker and Time-aware Joint Contextual Learning for Dialogue-act Classification in Counselling Conversations." The video explains the dataset collected and annotated for mental health counseling conversations. Further slides explain the use of the SPARTA model for the dialogue-act classification task.







    Published In

    WSDM '22: Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining
    February 2022
    1690 pages
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]



    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. dialogue-act classification
    2. mental-health counselling


    • Research-article


    WSDM '22

    Acceptance Rates

    Overall Acceptance Rate 498 of 2,863 submissions, 17%


    Article Metrics

    • Downloads (Last 12 months): 396
    • Downloads (Last 6 weeks): 16
    Reflects downloads up to 13 Jan 2025



    Cited By

    • (2024) Research on discourse role recognition in task-oriented collaborative dialogue. Journal of Intelligent & Fuzzy Systems, 46:3 (5709-5721). DOI: 10.3233/JIFS-235263. Online publication date: 5-Mar-2024
    • (2024) Towards Human-centered Proactive Conversational Agents. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (807-818). DOI: 10.1145/3626772.3657843. Online publication date: 10-Jul-2024
    • (2024) Toward Connecting Speech Acts and Search Actions in Conversational Search Tasks. Proceedings of the 2023 ACM/IEEE Joint Conference on Digital Libraries (119-131). DOI: 10.1109/JCDL57899.2023.00027. Online publication date: 26-Jun-2024
    • (2024) Modeling Multimodal Social Interactions: New Challenges and Baselines with Densely Aligned Representations. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (14585-14595). DOI: 10.1109/CVPR52733.2024.01382. Online publication date: 16-Jun-2024
    • (2024) Dynamically retrieving knowledge via query generation for informative dialogue generation. Neurocomputing, 569 (127036). DOI: 10.1016/j.neucom.2023.127036. Online publication date: Feb-2024
    • (2024) HAM-GNN: A hierarchical attention-based multi-dimensional edge graph neural network for dialogue act classification. Expert Systems with Applications (125459). DOI: 10.1016/j.eswa.2024.125459. Online publication date: Sep-2024
    • (2023) A Primer on Seq2Seq Models for Generative Chatbots. ACM Computing Surveys, 56:3 (1-58). DOI: 10.1145/3604281. Online publication date: 6-Oct-2023
    • (2023) A multi-task learning framework for politeness and emotion detection in dialogues for mental health counselling and legal aid. Expert Systems with Applications: An International Journal, 224:C. DOI: 10.1016/j.eswa.2023.120025. Online publication date: 15-Aug-2023
    • (2022) A computational approach to measure the linguistic characteristics of psychotherapy timing, responsiveness, and consistency. npj Mental Health Research, 1:1. DOI: 10.1038/s44184-022-00020-9. Online publication date: 2-Dec-2022
