Abstract
Designing a feed-forward neural network whose topology is optimal in terms of complexity (hidden-layer nodes and the connections between nodes) and training performance has been a concern since the earliest days of neural network research. Typically, the issue is addressed by pruning: starting from a fully interconnected network with “many” nodes in the hidden layers and eliminating “superfluous” connections and nodes. However, the problem remains unsolved and seems even more relevant today in the context of deep learning networks. In this paper we present a method of direct zero-norm minimization for pruning a Multi-Layer Perceptron while training it. The method employs a cooperative scheme with two swarms of particles, and its purpose is to minimize an aggregate function corresponding to the total risk functional. Our discussion highlights relevant computational and methodological issues of the approach that are not apparent or well defined in the literature.
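As a rough illustration of what an aggregate objective of this kind can look like, the following Python sketch adds a zero-norm penalty (the count of nonzero weights) to the mean-squared training error of a one-hidden-layer network. The function names, the tolerance `tol`, the trade-off weight `lam`, the tanh activation and the bias-free architecture are all assumptions made for illustration. The abstract does not specify the exact form of the risk functional, and the two-swarm search itself is not shown.

```python
import numpy as np


def zero_norm(weights, tol=1e-8):
    """Count the weights whose magnitude exceeds tol (the zero-norm)."""
    return int(np.sum(np.abs(weights) > tol))


def mlp_forward(x, w_hidden, w_out):
    """One-hidden-layer MLP: tanh hidden units, linear output, no biases (assumed)."""
    return np.tanh(x @ w_hidden) @ w_out


def aggregate_risk(x, y, w_hidden, w_out, lam=0.01):
    """Mean-squared error plus lam times the number of nonzero weights.

    This is a hypothetical stand-in for the total risk functional
    mentioned in the abstract, not the authors' definition.
    """
    mse = float(np.mean((mlp_forward(x, w_hidden, w_out) - y) ** 2))
    complexity = zero_norm(w_hidden) + zero_norm(w_out)
    return mse + lam * complexity


# Small example: a 2-2-1 network in which only two weights are nonzero.
x = np.array([[1.0, 0.0], [0.0, 1.0]])
w_hidden = np.array([[1.0, 0.0], [0.0, 0.0]])
w_out = np.array([[1.0], [0.0]])
y = mlp_forward(x, w_hidden, w_out)  # targets the network fits exactly
risk = aggregate_risk(x, y, w_hidden, w_out, lam=0.01)
```

A derivative-free optimizer such as particle swarm optimization could treat `aggregate_risk` as the fitness to minimize. Because the penalty is a count, it is piecewise constant in the weights and provides no useful gradient, which is one reason such optimizers are a natural fit for this kind of objective.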
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Adam, S.P., Magoulas, G.D., Vrahatis, M.N. (2012). Direct Zero-Norm Minimization for Neural Network Pruning and Training. In: Jayne, C., Yue, S., Iliadis, L. (eds) Engineering Applications of Neural Networks. EANN 2012. Communications in Computer and Information Science, vol 311. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32909-8_30
DOI: https://doi.org/10.1007/978-3-642-32909-8_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32908-1
Online ISBN: 978-3-642-32909-8
eBook Packages: Computer Science (R0)