
research-article

Whither Social Networks for Web Search?

Authors:

Rakesh Agrawal,

Behzad Golshan,

Evangelos Papalexakis

KDD '15: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Pages 1661 - 1670

https://doi.org/10.1145/2783258.2788571

Published: 10 August 2015

Abstract

Access to diverse perspectives nurtures an informed citizenry. Google and Bing have emerged as the duopoly that largely arbitrates which English language documents are seen by web searchers. A recent study shows that there is now a large overlap in the top organic search results produced by them. Thus, citizens may no longer be able to gain different perspectives by using different search engines.

We present the results of our empirical study that indicates that by mining Twitter data one can obtain search results that are quite distinct from those produced by Google and Bing. Additionally, our user study found that these results were quite informative. The gauntlet is now on search engines to test whether our findings hold in their infrastructure for different social networks and whether enabling diversity has sufficient business imperative for them.

Supplementary Material

MP4 File (p1661.mp4), 286.75 MB

Index Terms

Whither Social Networks for Web Search?
1. Information systems
  1. Information retrieval
    1. Information retrieval query processing
  2. Information systems applications
    1. Data mining

Information & Contributors

Information

Published In

KDD '15: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

August 2015

2378 pages

ISBN:9781450336642

DOI:10.1145/2783258

General Chairs: Longbing Cao (University of Technology, Sydney), Chengqi Zhang (University of Technology, Sydney)

Program Chairs: Thorsten Joachims (Cornell University), Geoff Webb (Monash University), Dragos D. Margineantu (Boeing Research), Graham Williams (Australian Taxation Office)

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

National Science Foundation

Conference

KDD '15

KDD '15: The 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

August 10 - 13, 2015

Sydney, NSW, Australia

Acceptance Rates

KDD '15 Paper Acceptance Rate 160 of 819 submissions, 20%;

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Bibliometrics

Article Metrics

Total Citations: 11
Total Downloads: 492
Downloads (Last 12 months): 4
Downloads (Last 6 weeks): 1

Reflects downloads up to 05 Mar 2025

Citations

Cited By

Badache I. (2020). 2SRM: Learning social signals for predicting relevant search results. Web Intelligence, 18:1, pages 15-33. Online publication date: 4-Mar-2020.
https://doi.org/10.3233/WEB-200426
Dritsa K., Sotiropoulos T., Skarpetis H., Louridas P. (2020). Search Engine Similarity Analysis: A Combined Content and Rankings Approach. Web Information Systems Engineering – WISE 2020, pages 21-37. Online publication date: 21-Oct-2020.
https://doi.org/10.1007/978-3-030-62008-0_2
Brusilovsky P., Smyth B., Shapira B. (2018). Social Search. Social Information Access, pages 213-276. Online publication date: 3-May-2018.
https://doi.org/10.1007/978-3-319-90092-6_7
Alonso O., Kandylas V., Tremblay S., Hofman J., Sen S., Fox P., McGuinness D., Poirer L., Boldi P., Kinder-Kurlanda K. (2017). What's Happening and What Happened. Proceedings of the 2017 ACM on Web Science Conference, pages 191-200. Online publication date: 25-Jun-2017.
https://dl.acm.org/doi/10.1145/3091478.3091484
Agrawal R., Golshan B., Papalexakis E. (2017). Homogeneity in Web Search Results. ACM Transactions on Intelligent Systems and Technology, 8:5, pages 1-35. Online publication date: 12-Jul-2017.
https://dl.acm.org/doi/10.1145/3057731
Yang L., Papalexakis E. (2017). Exploration of Social and Web Image Search Results Using Tensor Decomposition. 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 1915-1920. Online publication date: Jul-2017.
https://doi.org/10.1109/CVPRW.2017.239
Aldhaheri A., Jeongkyu Lee (2017). Event detection on large social media using temporal analysis. 2017 IEEE 7th Annual Computing and Communication Workshop and Conference (CCWC), pages 1-6. Online publication date: Jan-2017.
https://doi.org/10.1109/CCWC.2017.7868467
Papalexakis E., Faloutsos C., Sidiropoulos N. (2016). Tensors for Data Mining and Data Fusion. ACM Transactions on Intelligent Systems and Technology, 8:2, pages 1-44. Online publication date: 3-Oct-2016.
https://dl.acm.org/doi/10.1145/2915921
Zhao F., Zhu Y., Jin H., Yang L. (2016). A personalized hashtag recommendation approach using LDA-based topic model in microblog environment. Future Generation Computer Systems, 65:C, pages 196-206. Online publication date: 1-Dec-2016.
https://dl.acm.org/doi/10.1016/j.future.2015.10.012
Agrawal R., Golshan B., Papalexakis E., Sharma A., Agrawal R., Grossglauser M. (2015). Overlap Between Google and Bing Web Search Results! Proceedings of the 2015 ACM on Conference on Online Social Networks, page 95. Online publication date: 2-Nov-2015.
https://dl.acm.org/doi/10.1145/2817946.2820604
