Interactive Topic Modeling for Exploring Asynchronous Online Conversations: Design and Evaluation of ConVisIT

Published: 22 February 2016

Abstract

Since the mid-2000s, there has been exponential growth of asynchronous online conversations, thanks to the rise of social media. Analyzing and gaining insights from such conversations can be quite challenging for a user, especially when the discussion becomes very long. A promising solution to this problem is topic modeling, since it may help the user to understand quickly what was discussed in a long conversation and to explore the comments of interest. However, the results of topic modeling can be noisy, and they may not match the user’s current information needs. To address this problem, we propose a novel topic modeling system for asynchronous conversations that revises the model on the fly on the basis of users’ feedback. We then integrate this system with interactive visualization techniques to support the user in exploring long conversations, as well as in revising the topic model when the current results are not adequate to fulfill the user’s information needs. Finally, we report on an evaluation with real users that compared the resulting system with both a traditional interface and an interactive visual interface that does not support human-in-the-loop topic modeling. Both the quantitative results and the subjective feedback from the participants illustrate the potential benefits of our interactive topic modeling approach for exploring conversations, relative to its counterparts.

Supplementary Material

hoque (hoque.zip)
Supplemental movie, appendix, image, and software files for "Interactive Topic Modeling for Exploring Asynchronous Online Conversations: Design and Evaluation of ConVisIT".

Information & Contributors

Information

Published In

ACM Transactions on Interactive Intelligent Systems, Volume 6, Issue 1
Special Issue on New Directions in Eye Gaze for Interactive Intelligent Systems (Part 2 of 2), Regular Articles and Special Issue on Highlights of IUI 2015 (Part 1 of 2)
May 2016
219 pages
ISSN: 2160-6455
EISSN: 2160-6463
DOI: 10.1145/2896319
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 February 2016
Accepted: 01 November 2015
Revised: 01 November 2015
Received: 01 July 2015
Published in TIIS Volume 6, Issue 1

Author Tags

  1. Interactive topic modeling
  2. asynchronous conversation
  3. computer mediated communication
  4. text visualization

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Natural Sciences and Engineering Research Council (NSERC)

Bibliometrics & Citations

Article Metrics

  • Downloads (last 12 months): 29
  • Downloads (last 6 weeks): 3

Reflects downloads up to 14 Dec 2024.

Cited By

  • (2024) Evaluating Interactive Topic Models in Applied Settings. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 1-8. DOI: 10.1145/3613905.3637133. Online publication date: 11-May-2024.
  • (2022) ClioQuery: Interactive Query-oriented Text Analytics for Comprehensive Investigation of Historical News Archives. ACM Transactions on Interactive Intelligent Systems 12:3, 1-49. DOI: 10.1145/3524025. Online publication date: 26-Jul-2022.
  • (2022) KnAC: an approach for enhancing cluster analysis with background knowledge and explanations. Applied Intelligence 53:12, 15537-15560. DOI: 10.1007/s10489-022-04310-9. Online publication date: 23-Nov-2022.
  • (2022) What the Graph. The Humanities in the Digital: Beyond Critical Digital Humanities, 107-135. DOI: 10.1007/978-3-031-16950-2_5. Online publication date: 29-Sep-2022.
  • (2021) An Examination of Grouping and Spatial Organization Tasks for High-Dimensional Data Exploration. IEEE Transactions on Visualization and Computer Graphics 27:2, 1742-1752. DOI: 10.1109/TVCG.2020.3028890. Online publication date: Feb-2021.
  • (2020) ClaimViz: Visual Analytics for Identifying and Verifying Factual Claims. 2020 IEEE Visualization Conference (VIS), 246-250. DOI: 10.1109/VIS47514.2020.00056. Online publication date: Oct-2020.
  • (2020) Analysis of online posts to discover student learning challenges and inform targeted curriculum improvement actions. 2020 IEEE International Conference on Teaching, Assessment, and Learning for Engineering (TALE), 774-779. DOI: 10.1109/TALE48869.2020.9368343. Online publication date: 8-Dec-2020.
  • (2020) ConVisQA: A Natural Language Interface for Visually Exploring Online Conversations. 2020 24th International Conference Information Visualisation (IV), 440-447. DOI: 10.1109/IV51561.2020.00077. Online publication date: Sep-2020.
  • (2019) A Visual Analytics Approach for Interactive Document Clustering. ACM Transactions on Interactive Intelligent Systems 10:1, 1-33. DOI: 10.1145/3241380. Online publication date: 9-Aug-2019.
  • (2018) Enhancing situation awareness of public safety events by visualizing topic evolution using social media. Proceedings of the 19th Annual International Conference on Digital Government Research: Governance in the Data Age, 1-10. DOI: 10.1145/3209281.3209378. Online publication date: 30-May-2018.
