Research article · DOI: 10.1145/3219819.3219857 · KDD Conference Proceedings

RapidScorer: Fast Tree Ensemble Evaluation by Maximizing Compactness in Data Level Parallelization

Published: 19 July 2018

Abstract

Relevance ranking models based on additive ensembles of regression trees have proven quite effective in web search engines. In the era of big data, tree ensemble models grow in both tree depth and ensemble size to deliver even better search relevance and user experience. However, the computational cost of their scoring process is high, making it challenging to deploy large tree ensemble models in a search engine that must answer thousands of queries per second. Although several approaches have been proposed to accelerate the scoring process, the challenge remains great, especially as model size grows. In this paper, we present RapidScorer, a novel framework for speeding up the scoring process of industry-scale tree ensemble models without hurting the quality of scoring results. RapidScorer introduces a modified run-length encoding, called epitome, to the bitvector representation of the tree nodes. The epitome greatly reduces the computational cost of traversing the tree ensemble and works together with several other proposed strategies to maximize the compactness of data units in memory. The achieved compactness makes it possible to fully exploit data parallelization and improve model scalability. Experiments on two web search benchmarks show that RapidScorer achieves significant speed-ups over the state-of-the-art methods: V-QuickScorer (1.3x to 3.5x), QuickScorer (2.1x to 25.0x), VPred (2.3x to 18.3x), and XGBoost (2.6x to 42.5x).
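The bitvector traversal that RapidScorer inherits from QuickScorer, and the reason a run-length scheme compacts those bitvectors, can be sketched in plain Python. This is a minimal illustration, not the paper's actual epitome encoding or data layout: the toy 4-leaf tree, and the names `score_tree` and `rle`, are ours.

```python
# Illustrative sketch only: a QuickScorer-style scorer, not the paper's code.

def score_tree(x, false_nodes, leaf_values):
    """Score one tree. Each entry of false_nodes is (feature, threshold,
    bitvector); bit i of the bitvector is 0 iff leaf i (bit 0 = leftmost
    leaf) becomes unreachable when the node's test x[feature] < threshold
    is false. ANDing the bitvectors of all false nodes leaves the exit
    leaf as the lowest set bit of the result."""
    reachable = (1 << len(leaf_values)) - 1    # all leaves start reachable
    for feature, threshold, bv in false_nodes:
        if not (x[feature] < threshold):       # test failed: prune left subtree
            reachable &= bv
    exit_leaf = (reachable & -reachable).bit_length() - 1  # lowest set bit
    return leaf_values[exit_leaf]

def rle(bitvector, n_bits):
    """Toy run-length encoding of a node bitvector, to show why a
    run-length scheme (in the spirit of the paper's epitome) compacts
    these vectors well: they are long runs of 0s and 1s.
    Returns a list of (bit, run_length) pairs, lowest bit first."""
    runs, i = [], 0
    while i < n_bits:
        bit = (bitvector >> i) & 1
        j = i
        while j < n_bits and ((bitvector >> j) & 1) == bit:
            j += 1
        runs.append((bit, j - i))
        i = j
    return runs

# A 4-leaf tree: the root tests x[0] < 5; its children test x[1] < 3 and
# x[1] < 7. Each node's bitvector zeroes out the leaves of its left subtree.
false_nodes = [(0, 5, 0b1100), (1, 3, 0b1110), (1, 7, 0b1011)]
leaf_values = [1.0, 2.0, 3.0, 4.0]
```

With `x = {0: 10, 1: 2}` the root test fails, its bitvector prunes the two left leaves, and `score_tree` returns `3.0`, matching an ordinary root-to-leaf traversal; `rle(0b1100, 4)` yields `[(0, 2), (1, 2)]`, two runs instead of four bits. The actual epitome works at byte granularity and interleaves vectors across trees so that SIMD lanes stay full, which this scalar sketch does not attempt.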

References

[1]
Intel Corporation. 2011. Intel® 64 and IA-32 Architectures Software Developer's Manual. (2011).
[2]
Nima Asadi, Jimmy Lin, and Arjen P. De Vries. 2014. Runtime optimizations for tree-based machine learning models. IEEE Transactions on Knowledge and Data Engineering 26, 9 (2014), 2281--2292.
[3]
L. Breiman. 2001. Random forests. Machine learning 45, 1 (2001), 5--32.
[4]
C. J. C. Burges. 2010. From RankNet to LambdaRank to LambdaMART: An Overview. Technical report, Microsoft Research.
[5]
B. Barla Cambazoglu, Hugo Zaragoza, Olivier Chapelle, Jiang Chen, Ciya Liao, Zhaohui Zheng, and Jon Degenhardt. 2010. Early exit optimizations for additive machine learned ranking systems. In Proceedings of the third ACM international conference on Web search and data mining. ACM, 411--420.
[6]
Olivier Chapelle and Yi Chang. 2011. Yahoo! Learning to Rank Challenge Overview. In Proceedings of the Yahoo! Learning to Rank Challenge. 1--24.
[7]
Tianqi Chen and Carlos Guestrin. 2016. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. ACM, 785--794.
[8]
T. Chen, L. Tang, Q. Liu, D. Yang, S. Xie, X. Cao, C. Wu, E. Yao, Z. Liu, Z. Jiang, et al. 2012. Combining factorization model and additive forest for collaborative followee recommendation. KDD CUP (2012).
[9]
Gianni Conte, Stefano Tommesani, and Francesco Zanichelli. 2000. The long and winding road to high-performance image processing with MMX/SSE. In Computer Architectures for Machine Perception, 2000. Proceedings. Fifth IEEE International Workshop on. IEEE, 302--310.
[10]
Domenico Dato, Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Nicola Tonellotto, and Rossano Venturini. 2016. Fast ranking with additive ensembles of oblivious and non-oblivious regression trees. ACM Transactions on Information Systems (TOIS) 35, 2 (2016), 15.
[11]
Jeffrey Dean. 2014. Large Scale Deep Learning. (2014). https://research.google.com/people/jeff/CIKM-keynote-Nov2014.pdf
[12]
W. Dong, J. Li, R. Yao, C. Li, T. Yuan, and L. Wang. 2016. Characterizing Driving Styles with Deep Learning. arXiv:1607.03611 (2016).
[13]
N. Firasta, M. Buxton, P. Jinbo, K. Nasri, and S. Kuo. 2008. Intel AVX: New frontiers in performance improvements and energy efficiency. White paper (2008).
[14]
Jerome H. Friedman. 2001. Greedy function approximation: a gradient boosting machine. Annals of statistics (2001), 1189--1232.
[15]
Xinran He, Junfeng Pan, Ou Jin, Tianbing Xu, Bo Liu, Tao Xu, Yanxin Shi, Antoine Atallah, Ralf Herbrich, Stuart Bowers, et al. 2014. Practical lessons from predicting clicks on ads at facebook. In Proceedings of the Eighth International Workshop on Data Mining for Online Advertising. ACM, 1--9.
[16]
Peter Hilton and Jean Pedersen. 1991. Catalan numbers, their generalization, and their uses. The Mathematical Intelligencer 13, 2 (1991), 64--75.
[17]
Xin Jin, Tao Yang, and Xun Tang. 2016. A comparison of cache blocking methods for fast execution of ensemble-based score computation. In Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval. ACM, 629--638.
[18]
Rie Johnson and Tong Zhang. 2014. Learning nonlinear functions using regularized greedy forest. IEEE transactions on pattern analysis and machine intelligence 36, 5 (2014), 942--954.
[19]
Pat Langley and Stephanie Sage. 1994. Oblivious decision trees and abstract cases. In Working notes of the AAAI-94 workshop on case-based reasoning. Seattle, WA, 113--117.
[20]
Xiaoliang Ling, Weiwei Deng, Chen Gu, Hucheng Zhou, Cui Li, and Feng Sun. 2017. Model Ensemble for Click Prediction in Bing Search Ads. In Proceedings of the 26th International Conference on World Wide Web Companion. 689--698.
[21]
Yin Lou and Mikhail Obukhov. 2017. BDT: Gradient Boosted Decision Tables for High Accuracy and Scoring Efficiency. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1893--1901.
[22]
Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Fabrizio Silvestri, and Salvatore Trani. 2016. Post-learning optimization of tree ensembles for efficient ranking. In Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval. ACM, 949--952.
[23]
Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Nicola Tonellotto, and Rossano Venturini. 2015. Quickscorer: A fast algorithm to rank documents with additive ensembles of regression trees. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 73--82.
[24]
Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Nicola Tonellotto, and Rossano Venturini. 2016. Exploiting CPU SIMD extensions to speed-up document scoring with tree ensembles. In Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval. ACM, 833--836.
[25]
B. Panda, J. S. Herbach, S. Basu, and R. J. Bayardo. 2009. PLANET: Massively Parallel Learning of Tree Ensembles with MapReduce. In VLDB.
[26]
Liudmila Prokhorenkova, Gleb Gusev, Aleksandr Vorobev, Anna Veronika Dorogush, and Andrey Gulin. 2017. CatBoost: unbiased boosting with categorical features. arXiv preprint arXiv:1706.09516 (2017).
[27]
Tao Qin and Tie-Yan Liu. 2013. Introducing LETOR 4.0 Datasets. CoRR abs/1306.2597 (2013). http://arxiv.org/abs/1306.2597
[28]
A. H. Robinson and Colin Cherry. 1967. Results of a prototype television bandwidth compression scheme. Proc. IEEE 55, 3 (1967), 356--364.
[29]
Dan Romik. 2000. Stirling's approximation for n!: The ultimate short proof? The American Mathematical Monthly 107, 6 (2000), 556.
[30]
Shigehiko Schamoni. 2012. Ranking with Boosted Decision Trees. (2012). http://www.ccs.neu.edu/home/vip/teach/MLcourse/4_boosting/materials/Schamoni_boosteddecisiontrees.pdf
[31]
Xun Tang, Xin Jin, and Tao Yang. 2014. Cache-conscious runtime optimization for ranking ensembles. In Proceedings of the 37th international ACM SIGIR conference on Research &development in information retrieval. ACM, 1123--1126.
[32]
S. Thakkur and T. Huff. 1999. Internet Streaming SIMD Extensions. Computer 32, 12 (1999).
[33]
Stephen Tyree, Kilian Q. Weinberger, Kunal Agrawal, and Jennifer Paykin. 2011. Parallel Boosted Regression Trees for Web Search Ranking. In Proceedings of the 20th International Conference on World Wide Web (WWW). 387--396.
[34]
F. Yan, Y. He, O. Ruwase, and E. Smirni. 2016. SERF: Efficient Scheduling for Fast Deep Neural Network Serving via Judicious Parallelism. Supercomputing (2016).
[35]
Jie Zhu, Ying Shan, J. C. Mao, Dong Yu, Holakou Rahmanian, and Yi Zhang. 2017. Deep embedding forest: Forest-based serving with deep embedding features. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1703--1711.


Published In

KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
July 2018
2925 pages
ISBN: 978-1-4503-5552-0
DOI: 10.1145/3219819
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. data parallelization
  2. efficiency
  3. industry-scale models
  4. quickscorer
  5. run length encoding
  6. simd
  7. tree ensemble traversal
  8. vpred
  9. xgboost

Qualifiers

  • Research-article

Conference

KDD '18

Acceptance Rates

KDD '18 Paper Acceptance Rate 107 of 983 submissions, 11%;
Overall Acceptance Rate 1,133 of 8,635 submissions, 13%


Article Metrics

  • Downloads (last 12 months): 47
  • Downloads (last 6 weeks): 8
Reflects downloads up to 20 Dec 2024

Cited By
  • (2023) Decision trees: from efficient prediction to responsible AI. Frontiers in Artificial Intelligence. DOI: 10.3389/frai.2023.1124553. Online publication date: 26-Jul-2023.
  • (2023) A Comparison of End-to-End Decision Forest Inference Pipelines. Proceedings of the 2023 ACM Symposium on Cloud Computing, 200-215. DOI: 10.1145/3620678.3624656. Online publication date: 30-Oct-2023.
  • (2023) Accelerating Decision-Tree-Based Inference Through Adaptive Parallelization. 2023 32nd International Conference on Parallel Architectures and Compilation Techniques (PACT), 176-186. DOI: 10.1109/PACT58117.2023.00023. Online publication date: 21-Oct-2023.
  • (2023) Early Exit Strategies for Learning-to-Rank Cascades. IEEE Access 11, 126691-126704. DOI: 10.1109/ACCESS.2023.3331088. Online publication date: 2023.
  • (2022) Efficient Realization of Decision Trees for Real-Time Inference. ACM Transactions on Embedded Computing Systems 21, 6, 1-26. DOI: 10.1145/3508019. Online publication date: 18-Oct-2022.
  • (2022) The Istella22 Dataset. Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, 3099-3107. DOI: 10.1145/3477495.3531740. Online publication date: 6-Jul-2022.
  • (2022) Distilled Neural Networks for Efficient Learning to Rank. IEEE Transactions on Knowledge and Data Engineering. DOI: 10.1109/TKDE.2022.3152585. Online publication date: 2022.
  • (2022) Immediate Split Trees: Immediate Encoding of Floating Point Split Values in Random Forests. Machine Learning and Knowledge Discovery in Databases, 531-546. DOI: 10.1007/978-3-031-26419-1_32. Online publication date: 19-Sep-2022.
  • (2022) Ensemble Model Compression for Fast and Energy-Efficient Ranking on FPGAs. Advances in Information Retrieval, 260-273. DOI: 10.1007/978-3-030-99736-6_18. Online publication date: 5-Apr-2022.
  • (2021) GeCo. Proceedings of the VLDB Endowment 14, 9, 1681-1693. DOI: 10.14778/3461535.3461555. Online publication date: 22-Oct-2021.
