DOI: 10.1145/3077136.3080725

X-DART: Blending Dropout and Pruning for Efficient Learning to Rank

Published: 07 August 2017

Abstract

In this paper we propose X-DART, a new learning-to-rank algorithm focused on training robust and compact ranking models. Motivated by the observation that the last trees of MART models affect the predictions of only a few instances of the training set, we borrow from the DART algorithm the dropout strategy, which temporarily drops some of the trees from the ensemble while new weak learners are trained. Unlike DART, however, we drop these trees permanently, on the basis of choices driven by the accuracy measured on the validation set. Experiments conducted on publicly available datasets show that X-DART outperforms DART, training models that provide the same effectiveness with up to 40% fewer trees.
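At a high level, the training loop described above is standard gradient boosting with a DART-style dropout step before each new tree is fitted, followed by a permanent, validation-driven pruning step. The following is a minimal sketch of that idea, not the authors' implementation: scikit-learn regression trees stand in for the MART weak learners, validation mean squared error stands in for the rank-based quality metric (e.g. NDCG) used in the paper, DART's weight normalization after dropout is omitted, and names such as x_dart_fit are hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def x_dart_fit(X_train, y_train, X_valid, y_valid,
               n_rounds=100, learning_rate=0.1, dropout_rate=0.1, seed=0):
    """Gradient boosting with DART-style dropout and permanent pruning (sketch)."""
    rng = np.random.default_rng(seed)
    trees, weights = [], []

    def predict(X, keep_mask=None):
        # Sum the (weighted) predictions of the trees kept by keep_mask.
        pred = np.zeros(X.shape[0])
        for i, (tree, w) in enumerate(zip(trees, weights)):
            if keep_mask is None or keep_mask[i]:
                pred += w * tree.predict(X)
        return pred

    for _ in range(n_rounds):
        # 1. Dropout (as in DART): temporarily ignore a random subset of the
        #    trees and compute residuals against the reduced ensemble.
        keep = rng.random(len(trees)) >= dropout_rate
        residuals = y_train - predict(X_train, keep)

        # 2. Fit a new weak learner on those residuals and add it.
        tree = DecisionTreeRegressor(max_depth=4, random_state=seed)
        tree.fit(X_train, residuals)
        trees.append(tree)
        weights.append(learning_rate)

        # 3. Pruning (the X-DART twist): permanently drop a tree whenever its
        #    removal does not hurt the score on the validation set.
        valid_err = np.mean((y_valid - predict(X_valid)) ** 2)
        for i in range(len(trees)):
            mask = np.ones(len(trees), dtype=bool)
            mask[i] = False
            err_without_i = np.mean((y_valid - predict(X_valid, mask)) ** 2)
            if err_without_i <= valid_err and len(trees) > 1:
                del trees[i], weights[i]
                break

    return trees, weights
```

On real learning-to-rank data the validation check would use a per-query metric such as NDCG rather than MSE.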




Published In

SIGIR '17: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval
August 2017
1476 pages
ISBN: 9781450350228
DOI: 10.1145/3077136
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. dropout
  2. multiple additive regression trees
  3. pruning

Qualifiers

  • Short paper

Funding Sources

  • European Commission

Conference

SIGIR '17

Acceptance Rates

SIGIR '17 paper acceptance rate: 78 of 362 submissions, 22%
Overall acceptance rate: 792 of 3,983 submissions, 20%

