Abstract
In this paper, expansion of the training set with synthetic handwritten uppercase letters, generated by deforming natural images, is tested in combination with an approximate k-Nearest Neighbor (k-NN) classifier. Previous work [11] [10] has shown that approximate nearest neighbor search in large databases can be used successfully in an OCR task, and that significant performance improvements can be obtained consistently simply by increasing the size of the training set. In this work, extensive experiments are performed in which distorted characters are added to the training set, and the results are compared with those obtained by directly adding new natural samples to the set of prototypes.
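The idea of training set expansion can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the two deformations (a horizontal slant and a small translation), their parameter ranges, the toy 5x5 bitmaps `LETTER_I` and `LETTER_L`, and the helper names. It also uses an exact brute-force k-NN search as a stand-in for the approximate search the paper relies on.

```python
import random
from collections import Counter

def shift(img, dx, dy):
    """Translate a binary image by (dx, dy); uncovered pixels become 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = img[sy][sx]
    return out

def slant(img, k):
    """Horizontal shear: row y moves by round(k * (y - centre row)) pixels."""
    h, w = len(img), len(img[0])
    cy = (h - 1) / 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        dx = round(k * (y - cy))
        for x in range(w):
            sx = x - dx
            if 0 <= sx < w:
                out[y][x] = img[y][sx]
    return out

def expand(samples, per_sample, rng):
    """Return the natural samples plus per_sample deformed copies of each."""
    out = list(samples)
    for img, label in samples:
        for _ in range(per_sample):
            # Hypothetical parameter ranges, chosen only for this toy example.
            deformed = slant(img, rng.uniform(-0.3, 0.3))
            deformed = shift(deformed, rng.randint(-1, 1), rng.randint(-1, 1))
            out.append((deformed, label))
    return out

def flatten(img):
    return [p for row in img for p in row]

def knn_classify(train, query, k=3):
    """Exact brute-force k-NN (not approximate) with squared Euclidean distance."""
    q = flatten(query)
    ranked = sorted(
        (sum((a - b) ** 2 for a, b in zip(flatten(img), q)), label)
        for img, label in train
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy 5x5 prototypes standing in for natural handwritten samples.
LETTER_I = [[0, 0, 1, 0, 0] for _ in range(5)]
LETTER_L = [[0, 1, 0, 0, 0] for _ in range(4)] + [[0, 1, 1, 1, 0]]

natural = [(LETTER_I, "I"), (LETTER_L, "L")]
expanded = expand(natural, per_sample=10, rng=random.Random(0))
print(len(natural), "natural samples expanded to", len(expanded), "prototypes")
```

In a real experiment the expanded set would replace the natural set as the prototype database, and accuracy on held-out natural samples would be compared against a database enlarged with additional natural samples instead.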
Work partially supported by the Spanish CICYT under grant TIC2000-1703-CO3-01
References
[1] S. Arya, D. M. Mount, N. S. Netanyahu, R. Silverman, and A. Wu. An optimal algorithm for approximate nearest neighbor searching. Journal of the ACM, 45:891–923, 1998.
[2] J. L. Bentley, B. W. Weide, and A. C. Yao. Optimal expected time algorithms for closest point problems. ACM Trans. on Math. Software, 6:563–580, 1980.
[3] P. A. Devijver and J. Kittler. On the edited nearest neighbour rule. In Proceedings of the 5th International Conference on Pattern Recognition, pages 72–80. IEEE Computer Society Press, Los Alamitos, CA, 1980.
[4] J. H. Friedman, J. L. Bentley, and R. A. Finkel. An algorithm for finding best matches in logarithmic expected time. ACM Trans. on Math. Software, 3:209–226, 1977.
[5] T. M. Ha and H. Bunke. Off-line, handwritten numeral recognition by perturbation method. IEEE Trans. on PAMI, 19(5):535–539, May 1997.
[6] P. E. Hart. The condensed nearest neighbor rule. IEEE Trans. on Information Theory, 14:515–516, 1968.
[7] A. Jain. Object matching using deformable templates. IEEE Trans. on PAMI, 18:268–273, 1996.
[8] B. S. Kim and S. B. Park. A fast k nearest neighbor finding algorithm based on the ordered partition. IEEE Trans. on PAMI, 8:761–766, 1986.
[9] J. Mao. Improving OCR performance using character degradation models and boosting algorithm. Pattern Recognition Letters, 18:1415–1419, 1997.
[10] J. C. Perez-Cortes, J. Arlandis, and R. Llobet. Fast and accurate handwritten character recognition using approximate nearest neighbours search on large databases. In Workshop on Statistical Pattern Recognition SPR-2000, Alicante (Spain), 2000.
[11] S. J. Smith. Handwritten character classification using nearest neighbor in large databases. IEEE Trans. on PAMI, 16(9):915–919, September 1994.
[12] D. L. Wilson. Asymptotic properties of nearest neighbor rules using edited data. IEEE Trans. on Systems, Man and Cybernetics, 2:408–420, 1972.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cano, J., Perez-Cortes, JC., Arlandis, J., Llobet, R. (2002). Training Set Expansion in Handwritten Character Recognition. In: Caelli, T., Amin, A., Duin, R.P.W., de Ridder, D., Kamel, M. (eds) Structural, Syntactic, and Statistical Pattern Recognition. SSPR/SPR 2002. Lecture Notes in Computer Science, vol 2396. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-70659-3_57
DOI: https://doi.org/10.1007/3-540-70659-3_57
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44011-6
Online ISBN: 978-3-540-70659-5
eBook Packages: Springer Book Archive