DOI: 10.1145/3534678.3539194
research-article

Learning Backward Compatible Embeddings

Published: 14 August 2022

Abstract

Embeddings, low-dimensional vector representations of objects, are fundamental in building modern machine learning systems. In industrial settings, there is usually an embedding team that trains an embedding model to solve intended tasks (e.g., product recommendation). The produced embeddings are then widely consumed by consumer teams to solve their unintended tasks (e.g., fraud detection). However, as the embedding model gets updated and retrained to improve performance on the intended task, the newly generated embeddings are no longer compatible with the existing consumer models. This means that either historical versions of the embeddings can never be retired, or all consumer teams have to retrain their models to make them compatible with the latest version of the embeddings. Both options are extremely costly in practice.
Here we study the problem of embedding version updates and their backward compatibility. We formalize the problem where the goal is for the embedding team to keep updating the embedding version, while the consumer teams do not have to retrain their models. We develop a solution based on learning backward compatible embeddings, which allows the embedding model version to be updated frequently, while also allowing the latest version of the embedding to be quickly transformed into any backward compatible historical version of it, so that consumer teams do not have to retrain their models. Our key idea is that whenever a new embedding model is trained, we learn it together with a lightweight backward compatibility transformation that aligns the new embedding to the previous version of it. Our learned backward transformations can then be composed to produce any historical version of the embedding. Under our framework, we explore six methods and systematically evaluate them on a real-world recommender system application. We show that the best method, which we call BC-Aligner, maintains backward compatibility with existing unintended tasks even after multiple model version updates. Simultaneously, BC-Aligner achieves intended-task performance similar to that of an embedding model that is solely optimized for the intended task. Code is publicly available at https://github.com/snap-stanford/bc-emb.
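
The composition idea in the abstract can be illustrated with a small sketch. It is not the paper's BC-Aligner and does not use the bc-emb code base: it assumes each backward transformation is a plain linear map, and it fits that map after the fact by least squares, whereas the abstract describes learning the transformation jointly with the new embedding model. The function names `fit_backward_map` and `to_version`, and the synthetic data, are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_backward_map(new_emb, old_emb):
    """Least-squares linear map B such that new_emb @ B approximates old_emb.

    Assumption: a linear map fitted post hoc stands in for the learned
    backward compatibility transformation described in the abstract.
    """
    B, *_ = np.linalg.lstsq(new_emb, old_emb, rcond=None)
    return B


def to_version(latest_emb, backward_maps, target_version):
    """Map the latest embeddings back to `target_version` by composition.

    backward_maps[k] maps version k+1 embeddings to version k, so the
    latest version number equals len(backward_maps).
    """
    emb = latest_emb
    for k in range(len(backward_maps) - 1, target_version - 1, -1):
        emb = emb @ backward_maps[k]
    return emb


# Synthetic example: three embedding versions related by rotations.
rng = np.random.default_rng(0)
dim = 8
v0 = rng.normal(size=(50, dim))
rot1 = np.linalg.qr(rng.normal(size=(dim, dim)))[0]
rot2 = np.linalg.qr(rng.normal(size=(dim, dim)))[0]
v1 = v0 @ rot1
v2 = v1 @ rot2

# One backward map per version update: v1 -> v0 and v2 -> v1.
backward_maps = [fit_backward_map(v1, v0), fit_backward_map(v2, v1)]

# A consumer model trained on version 0 can be fed the recovered embeddings.
recovered_v0 = to_version(v2, backward_maps, target_version=0)
assert np.allclose(recovered_v0, v0, atol=1e-8)
```

In this sketch only the latest embeddings and the chain of small maps need to be stored; any older version is recomputed on demand by applying the maps in reverse order of the updates.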

Information & Contributors

Information

Published In

KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2022
5033 pages
ISBN: 9781450393850
DOI: 10.1145/3534678
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. backward compatibility
  2. embeddings
  3. graph neural networks
  4. recommender systems

Qualifiers

  • Research-article

Conference

KDD '22

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 46
  • Downloads (last 6 weeks): 2
Reflects downloads up to 15 Jan 2025.

Citations

Cited By

  • (2024) Stationary Representations: Optimally Approximating Compatibility and Implications for Improved Model Replacements. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). DOI: 10.1109/CVPR52733.2024.02720. Pages 28793-28804. Online publication date: 16-Jun-2024
  • (2023) L2R: Lifelong Learning for First-stage Retrieval with Backward-Compatible Representations. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management. DOI: 10.1145/3583780.3614947. Pages 183-192. Online publication date: 21-Oct-2023
  • (2023) AutoML in The Wild: Obstacles, Workarounds, and Expectations. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems. DOI: 10.1145/3544548.3581082. Pages 1-15. Online publication date: 19-Apr-2023
  • (2023) Large-to-small Image Resolution Asymmetry in Deep Metric Learning. 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). DOI: 10.1109/WACV56688.2023.00150. Pages 1451-1460. Online publication date: Jan-2023
  • (2023) BT²: Backward-compatible Training with Basis Transformation. 2023 IEEE/CVF International Conference on Computer Vision (ICCV). DOI: 10.1109/ICCV51070.2023.01031. Pages 11195-11204. Online publication date: 1-Oct-2023
  • (2023) Efficient Sequence Embedding for SARS-CoV-2 Variants Classification. Bioinformatics Research and Applications. DOI: 10.1007/978-981-99-7074-2_2. Pages 16-30. Online publication date: 9-Oct-2023
  • (2023) RAFEN – Regularized Alignment Framework for Embeddings of Nodes. Computational Science – ICCS 2023. DOI: 10.1007/978-3-031-35995-8_25. Pages 352-364. Online publication date: 3-Jul-2023
  • (2023) BioSequence2Vec: Efficient Embedding Generation for Biological Sequences. Advances in Knowledge Discovery and Data Mining. DOI: 10.1007/978-3-031-33377-4_14. Pages 173-185. Online publication date: 25-May-2023
