Combating Crowdsourced Review Manipulators: A Neighborhood-Based Approach

Authors: Parisa Kaghazgaran, James Caverlee, Anna Squicciarini

WSDM '18: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining

Pages 306 - 314

https://doi.org/10.1145/3159652.3159726

Published: 02 February 2018

Abstract

We propose a system called TwoFace to uncover crowdsourced review manipulators who target online review systems. A unique feature of TwoFace is its three-phase framework: (i) in the first phase, we intelligently sample actual evidence of manipulation (e.g., review manipulators) by exploiting low moderation crowdsourcing platforms that reveal evidence of strategic manipulation; (ii) we then propagate the suspiciousness of these seed users to identify similar users through a random walk over a "suspiciousness" graph; and (iii) finally, we uncover (hidden) distant users who serve structurally similar roles by mapping users into a low-dimensional embedding space that captures community structure. Altogether, the TwoFace system recovers 83% to 93% of all manipulators in a sample from Amazon of 38,590 reviewers, even when the system is seeded with only a few samples from malicious crowdsourcing sites.
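
Phase (ii) of the abstract can be pictured with a small sketch. The Python below is an illustrative assumption, not the TwoFace implementation: it runs a generic random walk with restart (in the spirit of TrustRank-style propagation) over a hypothetical toy graph. The reviewer IDs, the edges, the damping value, and the function name `propagate_suspiciousness` are all invented for illustration.

```python
def propagate_suspiciousness(neighbors, seeds, damping=0.85, iterations=50):
    """Random walk with restart that keeps returning to the seed users.

    neighbors: dict mapping each user to a list of adjacent users (hypothetical graph).
    seeds: set of users already known to be manipulators.
    Returns a dict of suspiciousness scores that sum to 1 when no user is isolated.
    """
    # Restart distribution: uniform over the seed users, zero elsewhere.
    restart = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in neighbors}
    scores = dict(restart)
    for _ in range(iterations):
        # Each step: with probability (1 - damping) jump back to a seed ...
        updated = {u: (1.0 - damping) * restart[u] for u in neighbors}
        # ... otherwise pass the current score evenly to adjacent users.
        for user, adjacent in neighbors.items():
            if not adjacent:
                continue
            share = damping * scores[user] / len(adjacent)
            for other in adjacent:
                updated[other] += share
        scores = updated
    return scores


# Hypothetical undirected graph: an edge could mean "reviewed a common product".
toy_graph = {
    "r0": ["r1", "r2"],
    "r1": ["r0", "r2"],
    "r2": ["r0", "r1", "r3"],
    "r3": ["r2", "r4"],
    "r4": ["r3"],
}
scores = propagate_suspiciousness(toy_graph, seeds={"r0"})
for user in sorted(scores, key=scores.get, reverse=True):
    print(user, round(scores[user], 3))
```

Under these assumptions, reviewers close to the seed accumulate more score than distant ones. How TwoFace actually constructs and weights its graph is specified in the paper itself, not in this sketch.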

Information & Contributors

Information

Published In

WSDM '18: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining

February 2018

821 pages

ISBN:9781450355810

DOI:10.1145/3159652

General Chairs: Yi Chang (Jilin University, Huawei Inc.), Chengxiang Zhai (University of Illinois Urbana-Champaign)

Program Chairs: Yan Liu (University of Southern California), Yoelle Maarek (Amazon)

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

AFOSR

Conference

WSDM 2018

WSDM 2018: The Eleventh ACM International Conference on Web Search and Data Mining

February 5 - 9, 2018

Marina Del Rey, CA, USA

Acceptance Rates

WSDM '18 Paper Acceptance Rate 81 of 514 submissions, 16%;

Overall Acceptance Rate 498 of 2,863 submissions, 17%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 44

Total Downloads: 724

Downloads (Last 12 months): 102

Downloads (Last 6 weeks): 16

Reflects downloads up to 12 Jan 2025

Citations

Cited By

Zhang Z, Su X, Wu J, Tessone C, Liao H. (2025). Heterogeneous graph representation learning via mutual information estimation for fraud detection. Journal of Network and Computer Applications 234 (104046). Online publication date: Feb-2025.
https://doi.org/10.1016/j.jnca.2024.104046
Zhang Z, Ao X, Tessone C, Liu G, Zhou M, Mao R, Liao H. (2025). Multiplex graph fusion network with reinforcement structure learning for fraud detection in online e-commerce platforms. Expert Systems with Applications 262 (125598). Online publication date: Mar-2025.
https://doi.org/10.1016/j.eswa.2024.125598
Song Y, Wei Y, Yuan H, Sun Q, Fu X, Wang L, Li X. (2024). CausalFD: causal invariance-based fraud detection against camouflaged preference. International Journal of Machine Learning and Cybernetics 15:11 (5053-5070). Online publication date: 27-May-2024.
https://doi.org/10.1007/s13042-024-02209-0
Zou Y, Xiang S, Miao Q, Cheng D, Jiang C. (2024). Subgraph Patterns Enhanced Graph Neural Network for Fraud Detection. Database Systems for Advanced Applications (375-384). Online publication date: 31-Aug-2024.
https://doi.org/10.1007/978-981-97-5572-1_26
Fang X, Tang Y, Sun G, Shen C, Chen H. (2024). Truth Discovery Against Disguised Attack Mechanism in Crowdsourcing. Web and Big Data (64-79). Online publication date: 28-Apr-2024.
https://doi.org/10.1007/978-981-97-2387-4_5
Saklani A, Tiwari S, Pannu H. (2024). Fake News Detection Using Heterogeneous Information from Multimedia Content. The Future of Artificial Intelligence and Robotics (427-437). Online publication date: 20-Aug-2024.
https://doi.org/10.1007/978-3-031-60935-0_39
Recabarren R, Carbunar B, Hernandez N, Shafin A, Calandrino J, Troncoso C. (2023). Strategies and vulnerabilities of participants in Venezuelan influence operations. Proceedings of the 32nd USENIX Conference on Security Symposium (6683-6700). Online publication date: 9-Aug-2023.
https://dl.acm.org/doi/10.5555/3620237.3620611
Hu Q, Zhang X, Li F, Tang Z, Wang S. (2023). Measuring and Understanding Crowdturfing in the App Store. Information 14:7 (393). Online publication date: 11-Jul-2023.
https://doi.org/10.3390/info14070393
Yan B, Yang C, Shi C, Fang Y, Li Q, Ye Y, Du J. (2023). Graph Mining for Cybersecurity: A Survey. ACM Transactions on Knowledge Discovery from Data 18:2 (1-52). Online publication date: 13-Nov-2023.
https://dl.acm.org/doi/10.1145/3610228
Ren J, Xia F, Lee I, Noori Hoshyar A, Aggarwal C. (2023). Graph Learning for Anomaly Analytics: Algorithms, Applications, and Challenges. ACM Transactions on Intelligent Systems and Technology 14:2 (1-29). Online publication date: 16-Feb-2023.
https://dl.acm.org/doi/10.1145/3570906
