Abstract
The central focus of this paper is weight trimming, a technique known for speeding up boosted learning procedures; the loss of accuracy it introduces is typically negligible. Recently, Appel et al. proposed an elegant algorithm: it applies weight trimming under AdaBoost and prunes some features using a special error bound, while simultaneously guaranteeing the same outcome (an ensemble of trees with exactly the same parameters) as training without trimming. Thus, no loss of training accuracy occurs. In this paper, we supplement Appel's idea with a suitable extension for real boosting. We prove that this approach gives the same outcome guarantees, both for stumps and for trees. Additionally, we analyze the complexity of Appel's idea and show that in some cases it may lead to computational losses.
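For illustration, the following is a minimal sketch of one real-boosting round with plain weight trimming in the spirit of Friedman et al. [4]; it is not the same-outcome procedure analyzed in this paper. It assumes NumPy and a hypothetical helper fit_stump(X, y, w) that returns a confidence-rated weak classifier [7], i.e. a callable h with real-valued outputs h(X).

import numpy as np

def trim_indices(w, keep_mass=0.99):
    # Indices of the heaviest examples whose weights jointly cover
    # `keep_mass` of the total weight (plain weight trimming).
    order = np.argsort(w)[::-1]              # heaviest first
    cum = np.cumsum(w[order]) / w.sum()
    k = np.searchsorted(cum, keep_mass) + 1  # smallest prefix covering the mass
    return order[:k]

def real_boost_round(X, y, w, fit_stump, keep_mass=0.99):
    # One round of real AdaBoost with weight trimming;
    # fit_stump is a hypothetical user-supplied weak learner.
    idx = trim_indices(w, keep_mass)         # train only on the trimmed subset
    h = fit_stump(X[idx], y[idx], w[idx])
    w = w * np.exp(-y * h(X))                # update weights of ALL examples
    return h, w / w.sum()                    # renormalize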
Notes
1. For simplicity, assume that the data indices are mapped to successive integers, so that there are no gaps in indexing and the differences \(m_j-m_{j-1}\) reflect the sizes of the portions.
References
Appel, R., et al.: Quickly boosting decision trees – pruning underachieving features early. In: Proceedings of the 30th International Conference on Machine Learning (ICML 2013), vol. 28, pp. 594–602. JMLR Workshop and Conference Proceedings (2013)
Freund, Y., Schapire, R.E.: Experiments with a new boosting algorithm. In: Machine Learning: Proceedings of the Thirteenth International Conference, pp. 148–156. Morgan Kaufmann, San Francisco (1996)
Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 55, 119–139 (1997)
Friedman, J., Hastie, T., Tibshirani, R.: Additive logistic regression: a statistical view of boosting. Ann. Stat. 28(2), 337–407 (2000)
Graham, R.L., Knuth, D.E., Patashnik, O.: Concrete Mathematics: A Foundation for Computer Science, 2nd edn. Addison-Wesley Longman Publishing Co., Inc., Boston (1994)
Schapire, R.E.: The strength of weak learnability. Mach. Learn. 5, 197–227 (1990)
Schapire, R.E., Singer, Y.: Improved boosting using confidence-rated predictions. Mach. Learn. 37(3), 297–336 (1999)
A Proof of Outcome Guarantee for ‘quick’ Tree Growing Procedure with Exponential Impurity
Proof
(Theorem 2). First, let us define the Gini error (for the n-subset):
where \(Z_{\rho _l}^{+}=\sum _{i\in \rho _l}w_i[y_i=+1]\) and \(Z_{\rho _l}^{-}=\sum _{i\in \rho _l}w_i[y_i=-1]\) are the probability masses of the two classes. Let us write down two representations, (22) and (23), of the products of mass and Gini error that will be useful later on:
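The display equations themselves are not reproduced above; a standard weighted form consistent with these definitions (an assumption, not necessarily the exact notation of (22), (23)) is, for a leaf with total mass \(Z_{\rho _l}=Z_{\rho _l}^{+}+Z_{\rho _l}^{-}\):
\[ \mathrm{Gini}_{\rho _l}=\frac{2\,Z_{\rho _l}^{+}Z_{\rho _l}^{-}}{Z_{\rho _l}^{2}}, \qquad Z_{\rho _l}\,\mathrm{Gini}_{\rho _l}=\frac{2\,Z_{\rho _l}^{+}Z_{\rho _l}^{-}}{Z_{\rho _l}}. \]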
We now write out the exponential error and show its connection to Gini error.
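As a worked illustration of this connection (again only an assumption about the exact form used here), the exponential error of a leaf in the confidence-rated setting [7] can be taken as
\[ \epsilon ^{\mathrm{exp}}_{\rho _l}=\frac{2\sqrt{Z_{\rho _l}^{+}Z_{\rho _l}^{-}}}{Z_{\rho _l}}, \qquad \bigl(Z_{\rho _l}\,\epsilon ^{\mathrm{exp}}_{\rho _l}\bigr)^{2}=4\,Z_{\rho _l}^{+}Z_{\rho _l}^{-}=2\,Z_{\rho _l}\cdot \bigl(Z_{\rho _l}\,\mathrm{Gini}_{\rho _l}\bigr), \]
i.e. the squared product of mass and exponential error equals the product of mass and Gini error scaled by twice the leaf mass.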
We aim to show that the mass times error product computed over the m-subset does not exceed the one computed over the n-subset, which means that if one removes from a leaf the examples that are not in the m-subset, but keeps the tree parameters fixed, then the mass times error product must decrease or stay unchanged. Recall that \(\rho _l=u_l\cup \bar{u_l}\) and \(n>m\); see definitions (14). Let us examine the square of this product, using the representation from (25) for the leaf error.
Note the similarity of the last expression to the Gini representation (22).
We shall now expand (26), taking advantage of the following lemma (for a straightforward algebraic proof, see [1]), which holds for either class label \(y\in \{-1,+1\}\):
Therefore, we have:
The equality step comes from grouping odd and even terms using the representations (22). Hence, the claimed inequality follows. \(\quad \Box \)