
A Visual Analytics Approach to Understanding Gradient Boosting Tree via Click Prediction on Ads

  • Conference paper
Cooperative Design, Visualization, and Engineering (CDVE 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13492)


Abstract

The gradient boosting decision tree (GBDT) is an iterative algorithm composed of multiple decision trees and is widely used for problems such as classification and regression. Its ensemble of decision trees gains predictive power by automatically filtering and combining features into new feature vectors, which helps to discover effective feature combinations. However, the gradient boosting tree (GBT) is a complex model, especially in its boosting procedure. Because each tree in the model carries its own weight and has its own structure, the principle behind the model is difficult to interpret, which is a challenge in fields that demand high interpretability, such as financial risk control. In this paper, we design an interactive visual analytics system that addresses this problem: it explains the structure and prediction process of the gradient boosting tree model and helps domain experts analyze the model efficiently. We design a graphical representation of the feature information and a visual model of the boosting tree to show the basic mechanism of the GBT algorithm comprehensively. A case study on a dataset from a Kaggle competition demonstrates the effectiveness of the system.
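As a rough illustration of the iterative, weighted-tree structure the abstract refers to, the following is a minimal sketch of gradient boosting for squared loss, using depth-1 trees (stumps) on a single feature. It is not the system or the implementation described in the paper; the function names (`fit_stump`, `fit_gbt`, `predict_gbt`, `mse`), the toy data, and the choice of a constant learning rate as the per-tree weight are all assumptions made for illustration.

```python
# Minimal gradient boosting sketch (squared loss, one feature, depth-1 trees).
# Hypothetical names and toy data; not the paper's implementation.

def fit_stump(xs, residuals):
    """Find the threshold split that best fits the residuals in squared error."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue  # a split must leave samples on both sides
        lv = sum(left) / len(left)
        rv = sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    if best is None:  # all x values identical: fall back to a constant leaf
        mean = sum(residuals) / len(residuals)
        return (xs[0], mean, mean)
    return best[1:]

def stump_predict(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value

def fit_gbt(xs, ys, n_trees=20, learning_rate=0.5):
    """Each round fits a stump to the current residuals and adds it, scaled."""
    base = sum(ys) / len(ys)          # initial constant prediction
    preds = [base] * len(ys)
    trees = []
    for _ in range(n_trees):
        # For squared loss, the negative gradient is the residual y - prediction.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        trees.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, trees

def predict_gbt(model, x):
    """The ensemble output is the base value plus the weighted sum of all trees."""
    base, learning_rate, trees = model
    return base + sum(learning_rate * stump_predict(s, x) for s in trees)

def mse(model, xs, ys):
    return sum((y - predict_gbt(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)

# Toy data: a three-level step function.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0.0, 0.0, 1.0, 1.0, 3.0, 3.0]
```

Calling `fit_gbt(xs, ys)` and then `predict_gbt(model, x)` shows why interpretation is hard even in this small setting: the prediction for one input is a sum over every tree, each with its own split and leaf values, so no single tree explains the output.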



Author information


Corresponding author

Correspondence to Jiansu Pu.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Cheng, Z., Cheng, K., Xia, Y., Pu, J., Rao, Y. (2022). A Visual Analytics Approach to Understanding Gradient Boosting Tree via Click Prediction on Ads. In: Luo, Y. (eds) Cooperative Design, Visualization, and Engineering. CDVE 2022. Lecture Notes in Computer Science, vol 13492. Springer, Cham. https://doi.org/10.1007/978-3-031-16538-2_3


  • DOI: https://doi.org/10.1007/978-3-031-16538-2_3


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16537-5

  • Online ISBN: 978-3-031-16538-2

  • eBook Packages: Computer Science, Computer Science (R0)
