DOI: 10.1145/2623330.2623721
research-article

Detecting anomalies in dynamic rating data: a robust probabilistic model for rating evolution

Published: 24 August 2014

Abstract

Rating data is ubiquitous on websites such as Amazon, TripAdvisor, or Yelp. Since ratings are not static but given at various points in time, a temporal analysis of rating data provides deeper insights into the evolution of a product's quality. In this work, we tackle the following question: Given the time-stamped rating data for a product or service, how can we detect the general rating behavior of users as well as the time intervals in which the ratings behave anomalously? We propose a Bayesian model that represents the rating data as a sequence of categorical mixture models. In contrast to existing methods, our method does not require any aggregation of the input; it operates directly on the original time-stamped data. To capture the dynamic effects of the ratings, the categorical mixtures are temporally constrained: anomalies can occur only in specific time intervals, and the general rating behavior should evolve smoothly over time. Our method automatically determines the intervals where anomalies occur, and it captures the temporal effects of the general behavior by using a state space model on the natural parameters of the categorical distributions. To learn the model, we propose an efficient algorithm that combines principles from variational inference and dynamic programming. In our experimental study, we show the effectiveness of our method and present interesting discoveries on multiple real-world datasets.
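The two temporal constraints described in the abstract can be illustrated with a toy simulation. The following is a minimal sketch under stated assumptions, not the authors' algorithm: the base behavior is taken to be the softmax of natural parameters that follow a Gaussian random walk (one simple state space model), anomalies are mixed in only inside a single known interval, and the interval is then recovered with a maximum-sum segment scan over per-rating log-likelihood ratios. All function names, the mixture weight, the step size, and the anomaly distribution are hypothetical choices for illustration. The scan also uses the true distributions as an oracle, whereas the paper learns them with variational inference.

```python
import math
import random

def softmax(eta):
    """Map natural parameters to a categorical distribution."""
    m = max(eta)
    exps = [math.exp(e - m) for e in eta]
    total = sum(exps)
    return [e / total for e in exps]

def simulate_ratings(n, anomaly_interval, anomaly_probs, mix_weight=0.8,
                     step_std=0.05, n_stars=5, seed=0):
    """Sample n time-ordered ratings in 1..n_stars (illustrative assumptions only).

    Base behavior: softmax of natural parameters following a Gaussian random walk,
    so it evolves smoothly. Inside the half-open anomaly_interval (start, end),
    each rating is drawn from anomaly_probs with probability mix_weight.
    """
    rng = random.Random(seed)
    eta = [0.0] * n_stars
    start, end = anomaly_interval
    ratings, base_probs = [], []
    for t in range(n):
        eta = [e + rng.gauss(0.0, step_std) for e in eta]
        base = softmax(eta)
        base_probs.append(base)
        probs = base
        if start <= t < end and rng.random() < mix_weight:
            probs = anomaly_probs
        ratings.append(rng.choices(range(1, n_stars + 1), weights=probs)[0])
    return ratings, base_probs

def best_interval(scores):
    """Maximum-sum contiguous segment via a linear scan.

    Returns (start, end, total) with a half-open range; (0, 0, 0.0) if no
    segment has a positive sum.
    """
    best = (0, 0, 0.0)
    cur_start, cur_sum = 0, 0.0
    for i, s in enumerate(scores):
        if cur_sum <= 0.0:
            # A non-positive prefix can never help; restart the segment here.
            cur_start, cur_sum = i, 0.0
        cur_sum += s
        if cur_sum > best[2]:
            best = (cur_start, i + 1, cur_sum)
    return best

# Hypothetical anomaly: a burst of one-star ratings in positions 120..159.
anomaly = [0.9, 0.025, 0.025, 0.025, 0.025]
ratings, base_probs = simulate_ratings(300, (120, 160), anomaly)

# Oracle score: how much better the anomaly distribution explains each rating.
scores = [math.log(anomaly[r - 1] / p[r - 1]) for r, p in zip(ratings, base_probs)]
start, end, gain = best_interval(scores)
print(f"highest-scoring interval: [{start}, {end}), log-likelihood gain {gain:.2f}")
```

Because the scan works on individual ratings in time order, no binning or aggregation of the input is needed, which mirrors the property the abstract emphasizes. A full implementation would have to handle several anomalous intervals and estimate both distributions jointly, which this sketch does not attempt.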

Supplementary Material

MP4 File (p841-sidebyside.mp4)


      Published In

      KDD '14: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining
      August 2014
      2028 pages
      ISBN:9781450329569
      DOI:10.1145/2623330
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. anomaly detection
      2. categorical mixtures
      3. robust mining

      Qualifiers

      • Research-article

      Acceptance Rates

      KDD '14 Paper Acceptance Rate 151 of 1,036 submissions, 15%;
      Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

      Cited By

      • (2024) Detection of Fake Reviews in Yelp Dataset Using Machine Learning and Chain Classifier Approach. Micro-Electronics and Telecommunication Engineering, pp. 331-346. DOI: 10.1007/978-981-99-9562-2_27. Online publication date: 22-Mar-2024.
      • (2023) Application of Business Big Data Management and Decision Making. E-Commerce Big Data Mining and Analytics, pp. 181-203. DOI: 10.1007/978-981-99-3588-8_9. Online publication date: 30-Jul-2023.
      • (2021) RacketStore. Proceedings of the 21st ACM Internet Measurement Conference, pp. 639-657. DOI: 10.1145/3487552.3487837. Online publication date: 2-Nov-2021.
      • (2021) A Combined Approach For Collaborative Filtering Based Recommender Systems with Matrix Factorisation and Outlier Detection. Journal of Business Analytics, pp. 1-14. DOI: 10.1080/2573234X.2021.1947752. Online publication date: 12-Jul-2021.
      • (2020) Probabilistic Inference and Trustworthiness Evaluation of Associative Links toward Malicious Attack Detection for Online Recommendations. IEEE Transactions on Dependable and Secure Computing, pp. 1-1. DOI: 10.1109/TDSC.2020.3023114. Online publication date: 2020.
      • (2020) Identifying ground truth in opinion spam: an empirical survey based on review psychology. Applied Intelligence. DOI: 10.1007/s10489-020-01764-7. Online publication date: 15-Jun-2020.
      • (2019) The Art and Craft of Fraudulent App Promotion in Google Play. Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, pp. 2437-2454. DOI: 10.1145/3319535.3345658. Online publication date: 6-Nov-2019.
      • (2019) GhostLink: Latent Network Inference for Influence-aware Recommendation. The World Wide Web Conference, pp. 1310-1320. DOI: 10.1145/3308558.3313449. Online publication date: 13-May-2019.
      • (2019) A Contrast Metric for Fraud Detection in Rich Graphs. IEEE Transactions on Knowledge and Data Engineering, 31:12, pp. 2235-2248. DOI: 10.1109/TKDE.2018.2876531. Online publication date: 1-Dec-2019.
      • (2018) Combating Crowdsourced Review Manipulators. Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, pp. 306-314. DOI: 10.1145/3159652.3159726. Online publication date: 2-Feb-2018.
