DOI: 10.1145/3394486.3403305
research-article
Open access

Embedding-based Retrieval in Facebook Search

Published: 20 August 2020

Abstract

Search in social networks such as Facebook poses different challenges than classical web search: besides the query text, it is important to take the searcher's context into account to provide relevant results. The searcher's social graph is an integral part of this context and is a unique aspect of Facebook search. While embedding-based retrieval (EBR) has been applied in web search engines for years, Facebook search was still mainly based on a Boolean matching model. In this paper, we discuss the techniques for applying EBR to a Facebook Search system. We introduce the unified embedding framework developed to model semantic embeddings for personalized search, and the system that serves embedding-based retrieval in a typical search system based on an inverted index. We discuss various tricks and experiences in end-to-end optimization of the whole system, including approximate nearest neighbor (ANN) parameter tuning and full-stack optimization. Finally, we present our progress on two selected advanced modeling topics. We evaluated EBR on verticals of Facebook Search and observed significant metric gains in online A/B experiments. We believe this paper will provide useful insights and experiences to help people develop embedding-based retrieval systems in search engines.
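
As a rough illustration of the core idea behind embedding-based retrieval, the sketch below scores documents by cosine similarity to a query embedding and returns the top k. It is a brute-force toy under stated assumptions, not the system described in the paper: the document IDs, the three-dimensional vectors, and the function names `cosine` and `retrieve` are all hypothetical, and a production system would replace the exhaustive scan with an ANN index.

```python
import math
from typing import Dict, List, Tuple


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def retrieve(
    query_vec: List[float],
    doc_vecs: Dict[str, List[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Exhaustively score every document and return the k best matches."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; real ones would come from a trained model.
DOCS = {
    "page:hiking_club": [0.9, 0.1, 0.0],
    "group:trail_runners": [0.8, 0.3, 0.1],
    "page:pizza_place": [0.0, 0.2, 0.9],
}
QUERY = [1.0, 0.2, 0.0]

top = retrieve(QUERY, DOCS, k=2)
for doc_id, score in top:
    print(f"{doc_id}\t{score:.3f}")
```

The exhaustive scan is linear in the number of documents, which is why the abstract's mention of ANN parameter tuning matters at scale: approximate indexes trade a small amount of recall for much lower latency.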



Published In

KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
August 2020
3664 pages
ISBN: 9781450379984
DOI: 10.1145/3394486
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. deep learning
  2. embedding
  3. information retrieval
  4. search

Qualifiers

  • Research-article

Conference

KDD '20

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
