Acta Informatica Pragensia: Review of Latent Dirichlet Allocation Methods Usable in Voice of Customer Analysis

Acta Informatica Pragensia 2018, 7(2), 152-165 | DOI: 10.18267/j.aip.120

Review of Latent Dirichlet Allocation Methods Usable in Voice of Customer Analysis

Lucie Sperkova: Department of Information Technologies, Faculty of Informatics and Statistics, University of Economics, Prague, W. Churchill Sq. 1938/4, 130 67 Prague, Czech Republic

The aim of the article is to identify and review existing Latent Dirichlet Allocation (LDA) topic modelling methods and their modifications usable in Voice of Customer (VoC) analysis. Voice of Customer is expressed mainly through textual comments, which often focus on evaluating the products or services the customer consumes. The most studied data source is customer reviews, which contain ratings on scales alongside the textual comments. The aim of topic models is to mine the topics and aspects that customers evaluate in their reviews and to assign a particular sentiment or emotion to them. The author completed a systematic literature review of peer-reviewed journal articles indexed in the leading databases Scopus and Web of Science that concern the current use of LDA model variants in Voice of Customer textual analysis for the tasks of aspect detection, emotion detection, personality detection and sentiment assignment. In total, 38 modifications of the LDA model were identified, each with a reference to its first application in text analytics research. The review is intended for researchers in customer analytics working on sentiment or emotion detection and, as follows from the results of the review, also for studies in personality recognition based on textual data. The review offers a basic overview and comparison of LDA modifications, which can serve as a knowledge baseline for selecting a model for a specific application. The scope of the literature examination is limited to the years 2003–2018 and to applications relevant to the analysis of subjective Voice of Customer textual data, which is closely connected to marketing or customer relationship management.

Keywords: Aspect detection, LDA, Sentiment, Text analytics, Topic models, VoC

Received: August 27, 2018; Accepted: December 3, 2018; Prepublished online: December 8, 2018; Published: December 31, 2018

Sperkova, L. (2018). Review of Latent Dirichlet Allocation Methods Usable in Voice of Customer Analysis. Acta Informatica Pragensia, 7(2), 152-165. doi: 10.18267/j.aip.120



This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, distribution, and reproduction in any medium, provided the original publication is properly cited. No use, distribution or reproduction is permitted which does not comply with these terms.
