Multi-representation fusion network for multi-turn response selection in retrieval-based chatbots

C Tao, W Wu, C Xu, W Hu, D Zhao, R Yan - Proceedings of the twelfth ACM international conference on web search and …, 2019 - dl.acm.org
We consider context-response matching with multiple types of representations for multi-turn response selection in retrieval-based chatbots. The representations encode semantics of contexts and responses on words, n-grams, and sub-sequences of utterances, and capture both short-term and long-term dependencies among words. With such a number of representations in hand, we study how to fuse them in a deep neural architecture for matching and how each of them contributes to matching. To this end, we propose a multi-representation fusion network where the representations can be fused into matching at an early stage, at an intermediate stage, or at the last stage. We empirically compare different representations and fusing strategies on two benchmark data sets. Evaluation results indicate that late fusion is always better than early fusion, and by fusing the representations at the last stage, our model significantly outperforms the existing methods, and achieves new state-of-the-art performance on both data sets. Through a thorough ablation study, we demonstrate the effect of each representation on matching, which sheds light on how to select them in practical systems.
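The difference between fusing representations early and fusing them at the last stage can be illustrated with a toy sketch. This is not the network from the paper: the matcher (`match_score`, a best-cosine average), the fusion rules (concatenation for early fusion, a plain average of per-representation scores for late fusion), and the example vectors are all assumptions chosen to keep the example small. Intermediate fusion is not shown.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return dot(u, v) / (nu * nv) if nu and nv else 0.0

def match_score(utt, resp):
    # Hypothetical toy matcher: each utterance token takes its best cosine
    # similarity against the response tokens; the results are averaged.
    return sum(max(cosine(u, r) for r in resp) for u in utt) / len(utt)

def concat_per_token(reps):
    # reps: one entry per representation type, each a list of token vectors.
    # Returns one list of token vectors with the types concatenated.
    return [[x for vec in vecs for x in vec] for vecs in zip(*reps)]

def early_fusion(utt_reps, resp_reps):
    # Fuse first (concatenate representations), then match once.
    return match_score(concat_per_token(utt_reps), concat_per_token(resp_reps))

def late_fusion(utt_reps, resp_reps):
    # Match once per representation type, then fuse the scores.
    # Averaging is an assumption made for this sketch only.
    scores = [match_score(u, r) for u, r in zip(utt_reps, resp_reps)]
    return sum(scores) / len(scores)

# Two made-up representation types (say, word-level and n-gram-level),
# each giving one 2-d vector per token.
utt_reps = [
    [[1.0, 0.0], [0.0, 1.0]],   # type A, utterance tokens
    [[0.5, 0.5], [1.0, 0.0]],   # type B, utterance tokens
]
resp_reps = [
    [[1.0, 0.0], [1.0, 1.0]],   # type A, response tokens
    [[0.0, 1.0], [1.0, 0.0]],   # type B, response tokens
]

print("early fusion score:", early_fusion(utt_reps, resp_reps))
print("late fusion score: ", late_fusion(utt_reps, resp_reps))
```

In the early variant a single matching signal is computed from the merged vectors, while in the late variant each representation produces its own matching signal before they are combined, which is the distinction the abstract draws between the fusion stages.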