DOI: 10.1145/3583780.3615205
Short paper

Latent Aspect Detection via Backtranslation Augmentation

Published: 21 October 2023

Abstract

Within the context of review analytics, aspects are the features of products and services at which customers target their opinions and sentiments. Aspect detection helps product owners and service providers identify shortcomings and prioritize customers' needs. Existing methods focus on detecting the surface form of an aspect and fall short when aspects are latent in reviews, especially in informal contexts such as social posts. In this paper, we propose data augmentation via natural language backtranslation to extract latent occurrences of aspects. We presume that backtranslation (1) can reveal latent aspects because they may not be commonly known in the target language and can be generated through backtranslation; (2) augments context-aware synonymous aspects from a target language to the original language, hence addressing the out-of-vocabulary issue; and (3) helps with the semantic disambiguation of polysemous words and collocations. Through our experiments on well-known aspect detection methods across SemEval datasets of restaurant and laptop reviews, we demonstrate that review augmentation via backtranslation yields a steady performance boost in baselines. We further contribute LADy at https://github.com/fani-lab/LADy, a benchmark library to support the reproducibility of our research.
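As a rough illustration of the augmentation idea described in the abstract, the sketch below translates each review into a pivot language and back, then appends the backtranslated text to the original set. It is not the paper's or LADy's implementation: the names `make_toy_translator`, `backtranslate`, `augment`, and the word-by-word German lexicons are hypothetical stand-ins for a real machine translation model, chosen only so the example runs without external dependencies.

```python
from typing import Callable, Dict, List

# A translator is any function from text to text. In practice this would wrap
# a neural machine translation model; here it is a hypothetical toy lookup.
Translator = Callable[[str], str]


def make_toy_translator(lexicon: Dict[str, str]) -> Translator:
    """Build a word-by-word translator from a lexicon (illustration only)."""
    def translate(text: str) -> str:
        # Unknown tokens pass through unchanged.
        return " ".join(lexicon.get(tok, tok) for tok in text.lower().split())
    return translate


def backtranslate(review: str, forward: Translator, backward: Translator) -> str:
    """Translate to the pivot language, then back to the original language."""
    return backward(forward(review))


def augment(reviews: List[str], forward: Translator, backward: Translator) -> List[str]:
    """Return the original reviews plus any backtranslations that differ from them."""
    augmented = list(reviews)
    seen = {r.lower() for r in reviews}
    for review in reviews:
        candidate = backtranslate(review, forward, backward)
        if candidate not in seen:  # skip backtranslations that add nothing new
            augmented.append(candidate)
            seen.add(candidate)
    return augmented


# Hypothetical toy lexicons (English -> German and German -> English).
EN_DE = {"the": "das", "grub": "essen", "was": "war", "tasty": "lecker"}
DE_EN = {"das": "the", "essen": "food", "war": "was", "lecker": "delicious"}

en_to_de = make_toy_translator(EN_DE)
de_to_en = make_toy_translator(DE_EN)

if __name__ == "__main__":
    for text in augment(["The grub was tasty"], en_to_de, de_to_en):
        print(text)
```

In this toy run the informal word "grub" comes back as the more common "food", which mirrors the abstract's second presumption about synonymous aspects and out-of-vocabulary terms. A real pipeline would replace the lexicon lookup with an actual translation model and would likely filter backtranslations by their similarity to the original review.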

Supplementary Material

MP4 File (2684-video.mp4)
This video highlights our paper, "Latent Aspect Detection via Backtranslation Augmentation," accepted at CIKM 2023. In the context of review analysis, aspects refer to the specific attributes of products or services that customers focus on when expressing their opinions. The aspect within a review serves as a main component, providing a comprehensive understanding of what customers like or dislike about a product or service. Existing methods focus on detecting aspects with a surface form and cannot properly detect their latent occurrences. The paper introduces an approach that enables existing aspect detection methods to extract latent aspects. It employs backtranslation as an augmentation method, revealing latent aspects, addressing out-of-vocabulary challenges, and disambiguating word meanings. Experimental results on the restaurant and laptop domains of benchmark datasets demonstrate that backtranslation improves aspect detection methods when aspects are latent.


Cited By

  • (2024) No Query Left Behind: Query Refinement via Backtranslation. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 1961-1972. DOI: 10.1145/3627673.3679729. Online publication date: 21-Oct-2024
  • (2024) LADy 💃: A Benchmark Toolkit for Latent Aspect Detection Enriched with Backtranslation Augmentation. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1172-1178. DOI: 10.1145/3626772.3657894. Online publication date: 10-Jul-2024
  • (2024) Enhancing RAG's Retrieval via Query Backtranslations. Web Information Systems Engineering – WISE 2024, 270-285. DOI: 10.1007/978-981-96-0579-8_20. Online publication date: 29-Nov-2024

Published In

CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
October 2023
5508 pages
ISBN: 9798400701245
DOI: 10.1145/3583780
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. aspect detection
  2. backtranslation augmentation
  3. review analysis

Qualifiers

  • Short-paper

Funding Sources

  • Natural Sciences and Engineering Research Council of Canada (NSERC)

Conference

CIKM '23

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
