Abstract
This paper discusses the generalization behavior of an artificial neural network (ANN) classifier as the training-sample size grows without bound, namely its asymptotic optimization in probability. As an improved ANN model, the pre-edited ANN classifier shows better practical performance than the standard one; however, it has not been widely applied because the corresponding theoretical support has been lacking. To further promote its application in practice, this paper studies the asymptotic optimization of the pre-edited ANN classifier. As groundwork, we review previous research on the asymptotic optimization in probability of nonparametric classifiers and group the main proof techniques into four classes: the two-step method, the one-step method, the generalization method, and the hypothesis method. We then adopt a mixed generalization/hypothesis method to prove that the pre-edited ANN classifier is asymptotically optimal in probability. Finally, a simulation is presented to provide experimental support for the theoretical results.
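The abstract does not spell out the pre-editing procedure itself. As a minimal illustrative sketch, the code below assumes Wilson-style k-NN editing (discard training points whose label disagrees with a leave-one-out k-nearest-neighbour vote) followed by ordinary MLP training; the function name `wilson_edit` and the use of scikit-learn's `KNeighborsClassifier` and `MLPClassifier` are illustrative choices, not the authors' implementation.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

def wilson_edit(X, y, k=3):
    """Keep only the points whose label agrees with a leave-one-out
    k-NN vote over the training set itself (Wilson-style editing).
    NOTE: an assumed editing rule, not the paper's exact procedure."""
    knn = KNeighborsClassifier(n_neighbors=k + 1).fit(X, y)
    # Each training point is its own nearest neighbour, so request
    # k + 1 neighbours and drop the first column (the point itself).
    idx = knn.kneighbors(X, return_distance=False)[:, 1:]
    votes = np.array([np.bincount(y[nb]).argmax() for nb in idx])
    keep = votes == y
    return X[keep], y[keep]

# Toy usage: noisy two-class data, edited before training the ANN.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = (X[:, 0] + 0.5 * rng.normal(size=500) > 0).astype(int)
X_edit, y_edit = wilson_edit(X, y, k=3)
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000).fit(X_edit, y_edit)
```

The intuition behind editing is that removing points likely mislabeled by noise cleans the class boundary, so the subsequent ANN fits the Bayes-optimal boundary rather than the noise; the paper's contribution is showing this combination remains asymptotically optimal in probability.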
Cite this article
Wang, K., Yang, J., Shi, G. et al. The asymptotic optimization of pre-edited ANN classifier. Soft Comput 13, 1153–1161 (2009). https://doi.org/10.1007/s00500-009-0422-4