Abstract
We introduce an algorithm for incrementally constructing a hybrid network of radial and perceptron hidden units. The algorithm determines whether a radial or a perceptron unit is required at a given region of input space. Given an error target, the algorithm also determines the number of hidden units. This results in a final architecture which is often much smaller than an RBF network or an MLP. A benchmark on four classification problems and three regression problems is given. The most striking performance improvement is achieved on the vowel data set [4].
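The greedy growth loop described above can be illustrated with a minimal sketch. This is not the authors' algorithm, only a toy reconstruction of the idea: at each step, fit both a candidate radial (Gaussian) unit and a candidate perceptron (tanh) unit to the current residual, keep whichever reduces the error more, and stop once the error target is met. All function names, the random-search fitting, and the stopping rule are illustrative assumptions.

```python
import numpy as np

def fit_candidate(X, r, kind, rng, n_trials=20):
    """Fit one hidden unit of the given kind to the residual r by random
    search over unit parameters (a stand-in for gradient-based fitting)."""
    best = None
    for _ in range(n_trials):
        if kind == "rbf":
            c = X[rng.integers(len(X))]                 # centre at a data point
            w = np.exp(-np.sum((X - c) ** 2, axis=1))   # unit-width Gaussian
        else:                                           # perceptron unit
            v = rng.normal(size=X.shape[1])
            b = rng.normal()
            w = np.tanh(X @ v + b)
        # least-squares output weight for this unit against the residual;
        # this choice guarantees the residual error never increases
        a = (w @ r) / (w @ w + 1e-12)
        err = np.mean((r - a * w) ** 2)
        if best is None or err < best[0]:
            best = (err, a * w)
    return best

def grow_hybrid(X, y, err_target=0.05, max_units=30, seed=0):
    """Greedily add RBF or perceptron units until the MSE target is met."""
    rng = np.random.default_rng(seed)
    pred = np.zeros_like(y)
    kinds = []
    for _ in range(max_units):
        r = y - pred
        if np.mean(r ** 2) <= err_target:
            break
        cands = {k: fit_candidate(X, r, k, rng) for k in ("rbf", "perceptron")}
        kind = min(cands, key=lambda k: cands[k][0])    # lower residual wins
        kinds.append(kind)
        pred = pred + cands[kind][1]
    return pred, kinds

# toy regression: a bump (favours an RBF) plus a ramp (favours a perceptron)
rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(200, 1))
y = np.exp(-4 * X[:, 0] ** 2) + 0.5 * np.tanh(2 * X[:, 0])
pred, kinds = grow_hybrid(X, y, err_target=0.01)
print(len(kinds), "units:", kinds)
```

Because each unit's output weight is the least-squares solution against the current residual, the training error is non-increasing, so the loop either reaches the target or exhausts the unit budget; the mix of unit kinds chosen reflects the local shape of the target function.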
References
L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone. Classification and Regression Trees. The Wadsworth Statistics/Probability Series, Belmont, CA, 1984.
J. Buckheit and D. L. Donoho. Improved linear discrimination using time-frequency dictionaries. Technical Report, Stanford University, 1995.
S. Cohen and N. Intrator. A hybrid projection based and radial basis function architecture. In J. Kittler and F. Roli, editors, Proc. Int. Workshop on Multiple Classifier Systems (LNCS 1857), pages 147–156, Sardinia, June 2000. Springer.
D. H. Deterding. Speaker Normalisation for Automatic Speech Recognition. PhD thesis, University of Cambridge, 1989.
D. L. Donoho and I. M. Johnstone. Projection-based approximation and a duality with kernel methods. Annals of Statistics, 17:58–106, 1989.
H. Drucker, R. Schapire, and P. Simard. Improving performance in neural networks using a boosting algorithm. In Steven J. Hanson, Jack D. Cowan, and C. Lee Giles, editors, Advances in Neural Information Processing Systems, volume 5, pages 42–49. Morgan Kaufmann, 1993.
S. E. Fahlman and C. Lebiere. The cascade-correlation learning architecture. Technical Report CMU-CS-90-100, Carnegie Mellon University, 1990.
G. W. Flake. Square unit augmented, radially extended, multilayer perceptrons. In G. B. Orr and K. Müller, editors, Neural Networks: Tricks of the Trade, pages 145–163. Springer, 1998.
J. H. Friedman. Multivariate adaptive regression splines. The Annals of Statistics, 19:1–141, 1991.
J. H. Friedman and W. Stuetzle. Projection pursuit regression. Journal of the American Statistical Association, 76:817–823, 1981.
T. Hastie and R. Tibshirani. Generalized additive models. Statistical Science, 1:297–318, 1986.
T. Hastie and R. Tibshirani. Generalized Additive Models. Chapman and Hall, London, 1990.
T. Hastie, R. Tibshirani, and A. Buja. Flexible discriminant analysis by optimal scoring. Journal of the American Statistical Association, 89:1255–1270, 1994.
R. A. Jacobs, M. I. Jordan, S. J. Nowlan, and G. E. Hinton. Adaptive mixtures of local experts. Neural Computation, 3(1):79–87, 1991.
M. I. Jordan and R. A. Jacobs. Hierarchies of adaptive experts. In J. E. Moody, S. J. Hanson, and R. P. Lippmann, editors, Advances in Neural Information Processing Systems, volume 4, pages 985–992. Morgan Kaufmann, San Mateo, CA, 1992.
R. E. Kass and A. E. Raftery. Bayes factors. Journal of The American Statistical Association, 90:773–795, 1995.
Y. C. Lee, G. Doolen, H. H. Chen, G. Z. Sun, T. Maxwell, H.Y. Lee, and C. L. Giles. Machine learning using higher order correlation networks. Physica D, 22:276–306, 1986.
D. J. C. MacKay. Bayesian interpolation. Neural Computation, 4(3):415–447, 1992.
John Moody. Prediction risk and architecture selection for neural networks. In V. Cherkassky, J. H. Friedman, and H. Wechsler, editors, From Statistics to Neural Networks: Theory and Pattern Recognition Applications. Springer, NATO ASI Series F, 1994.
S. J. Nowlan. Soft competitive adaptation: Neural network learning algorithms based on fitting statistical mixtures. Ph.D. dissertation, Carnegie Mellon University, 1991.
M. J. Orr, J. Hallman, K. Takezawa, A. Murray, S. Ninomiya, M. Oide, and T. Leonard. Combining regression trees and radial basis functions. Technical Report, Division of Informatics, Edinburgh University, 1999. Submitted to IJNS.
R. P. Gorman and T. J. Sejnowski. Analysis of hidden units in a layered network trained to classify sonar targets. Neural Networks, 1:75–89, 1988.
A. J. Robinson. Dynamic Error Propagation Networks. PhD thesis, University of Cambridge, 1989.
D. E. Rumelhart, G. E. Hinton, and R. J. Williams. Learning internal representations by error propagation. In D. E. Rumelhart and J. L. McClelland, editors, Parallel Distributed Processing, volume 1, pages 318–362. MIT Press, Cambridge, MA, 1986.
C. J. Stone. The dimensionality reduction principle for generalized additive models. The Annals of Statistics, 14:590–606, 1986.
© 2001 Springer-Verlag Berlin Heidelberg
Cite this paper
Cohen, S., Intrator, N. (2001). Automatic Model Selection in a Hybrid Perceptron/Radial Network. In: Kittler, J., Roli, F. (eds) Multiple Classifier Systems. MCS 2001. Lecture Notes in Computer Science, vol 2096. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48219-9_44
DOI: https://doi.org/10.1007/3-540-48219-9_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42284-6
Online ISBN: 978-3-540-48219-2