
Using spam farm to boost PageRank

Published: 08 May 2007

Abstract

Web spamming has emerged as a way to exploit the economic advantage of high search rankings, and it threatens the accuracy and fairness of those rankings. Understanding spamming techniques is essential for evaluating the strengths and weaknesses of a ranking algorithm and for fighting web spam. In this paper, we identify the optimal spam farm structure under some realistic assumptions in the single-target spam farm model. Our result extends the optimal spam farm claimed by Gyöngyi and Garcia-Molina by dropping the assumption that leakage is constant. We also characterize the optimal spam farms under additional constraints, which the spammer may adopt to disguise the spam farm by deviating from the unconstrained optimal structure.
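To make the notion of leakage concrete, the following is a minimal sketch and not the paper's construction. It runs a plain power-iteration PageRank on a toy web and compares two single-target farms: one in which the target links back to its boosting pages, and one in which the target links out to an ordinary page. The graph, the page names (`a`, `b`, `c`, `t`, `s0`...), the damping factor of 0.85, and the helper names `pagerank` and `build_web` are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank. `links` maps each page to its out-links."""
    nodes = sorted(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Every page receives the uniform random-jump share first.
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            out = links[v]
            # A page without out-links spreads its rank over all pages.
            targets = out if out else nodes
            share = damping * rank[v] / len(targets)
            for w in targets:
                new[w] += share
        rank = new
    return rank


def build_web(k, link_back):
    """Toy web (hypothetical): three ordinary pages in a cycle, a target
    page 't', and k boosting pages that each link only to 't'."""
    links = {"a": ["b"], "b": ["c"], "c": ["a"]}
    boosters = [f"s{i}" for i in range(k)]
    for s in boosters:
        links[s] = ["t"]
    # link_back=True keeps the target's rank inside the farm;
    # link_back=False leaks it to the ordinary page 'a'.
    links["t"] = boosters if link_back else ["a"]
    return links


closed = pagerank(build_web(k=5, link_back=True))
leaky = pagerank(build_web(k=5, link_back=False))
print("target rank, links back to boosters:", round(closed["t"], 4))
print("target rank, links out to page 'a': ", round(leaky["t"], 4))
```

In this toy graph the target scores higher when its rank is recycled through the boosting pages than when it leaks to an outside page. The sketch says nothing about which structure is optimal once leakage is allowed to vary, which is the question the paper addresses.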

References

[1] A. Benczúr, K. Csalogány, T. Sarlós, and M. Uher. SpamRank - fully automatic link spam detection. In Proc. Int'l Workshop on Adversarial Information Retrieval on the Web, 2005.
[2] David Aldous and James Allen Fill. Reversible Markov chains and random walks on graphs. www.stat.berkeley.edu/aldous/RWG/book.html.
[3] Monica Bianchini, Marco Gori, and Franco Scarselli. Inside PageRank. ACM Trans. Inter. Tech., 5(1):92–128, 2005.
[4] Sergey Brin and Lawrence Page. The anatomy of a large-scale hypertextual web search engine. In WWW7: Proceedings of the seventh international conference on World Wide Web 7, pages 107–117, Brisbane, Australia, 1998.
[5] Alice Cheng and Eric Friedman. Manipulability of PageRank under sybil strategies. In First Workshop on the Economics of Networked Systems, 2006.
[6] Steve Chien, Cynthia Dwork, Ravi Kumar, Daniel R. Simon, and D. Sivakumar. Link evolution: Analysis and algorithms. Internet Mathematics, 1(3):277–304, 2003.
[7] Dennis Fetterly, Mark Manasse, and Marc Najork. Spam, damn spam, and statistics: using statistical analysis to locate spam web pages. In WebDB '04: Proceedings of the 7th International Workshop on the Web and Databases, pages 1–6, New York, NY, USA, 2004. ACM Press.
[8] Zoltán Gyöngyi and Hector Garcia-Molina. Link spam alliances. In VLDB '05: Proceedings of the 31st international conference on Very large data bases, pages 517–528. VLDB, 2005.
[9] Zoltán Gyöngyi and Hector Garcia-Molina. Spam: It's not just for inboxes anymore. IEEE Computer Magazine, 38(10):28–34, October 2005.
[10] Zoltán Gyöngyi and Hector Garcia-Molina. Web spam taxonomy. In First International Workshop on Adversarial Information Retrieval on the Web, 2005.
[11] Zoltán Gyöngyi, Hector Garcia-Molina, and Jan Pedersen. Combating web spam with TrustRank. In Proceedings of the 30th International Conference on Very Large Databases, pages 576–587. Morgan Kaufmann, 2004.
[12] Monika R. Henzinger, Rajeev Motwani, and Craig Silverstein. Challenges in web search engines. SIGIR Forum, 36(2):11–22, 2002.
[13] John G. Kemeny and J. Laurie Snell. Finite Markov chains. D. Van Nostrand Company, 1960.
[14] A. Langville and C. Meyer. Deeper inside PageRank. Internet Mathematics, 1(3):335–380, 2005.
[15] Andrew Y. Ng, Alice X. Zheng, and Michael I. Jordan. Link analysis, eigenvectors and stability. In IJCAI, pages 903–910, 2001.
[16] Ignacio Palacios-Huerta and Oscar Volij. The measurement of intellectual influence. Econometrica, 72(3):963–977, 2004.
[17] Sibel Adali, Tina Liu, and Malik Magdon-Ismail. Optimal link bombs are uncoordinated. In Proceedings of AIRWeb, 2005.
[18] Baoning Wu and Brian D. Davison. Identifying link farm spam pages. In WWW '05: Special interest tracks and posters of the 14th international conference on World Wide Web, pages 820–829, New York, NY, USA, 2005. ACM Press.



    Published In

    AIRWeb '07: Proceedings of the 3rd international workshop on Adversarial information retrieval on the web
    May 2007
    98 pages
    ISBN: 9781595937322
    DOI: 10.1145/1244408

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Markov chain
    2. PageRank algorithm
    3. link spamming

    Qualifiers

    • Article

    Conference

    AIRWeb'07


    Cited By

    • (2021) Prioritizing Original News on Facebook. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, pages 4046–4054. DOI: 10.1145/3459637.3481943. Online publication date: 26-Oct-2021
    • (2019) Trust Mends Blockchains: Living up to Expectations. 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS), pages 1358–1368. DOI: 10.1109/ICDCS.2019.00136. Online publication date: Jul-2019
    • (2018) When Trust Saves Energy. Companion Proceedings of the The Web Conference 2018, pages 1165–1169. DOI: 10.1145/3184558.3191553. Online publication date: 23-Apr-2018
    • (2017) A PageRank-improved ranking algorithm based on cheating similarity and cheating relevance. 2017 IEEE/ACIS 16th International Conference on Computer and Information Science (ICIS), pages 257–263. DOI: 10.1109/ICIS.2017.7960003. Online publication date: May-2017
    • (2016) Web directories: selected features and their impact on directory quality. Program, 50:3, pages 340–352. DOI: 10.1108/PROG-08-2014-0060. Online publication date: 4-Jul-2016
    • (2015) On characterizing and computing the diversity of hyperlinks for anti-spamming page ranking. Knowledge-Based Systems, 77:C, pages 56–67. DOI: 10.1016/j.knosys.2014.12.028. Online publication date: 1-Mar-2015
    • (2015) Evaluation of Spam Impact on Arabic Websites Popularity. Journal of King Saud University - Computer and Information Sciences, 27:2, pages 222–229. DOI: 10.1016/j.jksuci.2014.04.005. Online publication date: 1-Apr-2015
    • (2012) An approach of two-way spam detection based on boosting pages analysis. 2012 9th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology, pages 1–4. DOI: 10.1109/ECTICon.2012.6254182. Online publication date: May-2012
    • (2012) An analysis of optimal link bombs. Theoretical Computer Science, 437, pages 1–20. DOI: 10.1016/j.tcs.2012.02.019. Online publication date: 1-Jun-2012
    • (2012) Link Analysis and Web Search. Computational Complexity, pages 1746–1766. DOI: 10.1007/978-1-4614-1800-9_112. Online publication date: 2012
