Training Set Expansion in Handwritten Character Recognition

Javier Cano, Juan-Carlos Perez-Cortes, Joaquim Arlandis & Rafael Llobet

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2396)

Included in the following conference series:

Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR)

Abstract

In this paper, a process of expanding the training set by synthetic generation of handwritten uppercase letters via deformations of natural images is tested in combination with an approximate k-Nearest Neighbor (k-NN) classifier. It has been previously shown [10, 11] that approximate nearest neighbor search in large databases can be successfully used in an OCR task, and that significant performance improvements can be consistently obtained by simply increasing the size of the training set. In this work, extensive experiments adding distorted characters to the training set are performed, and the results are compared to directly adding new natural samples to the set of prototypes.
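
The idea of expanding a prototype set with deformed copies can be illustrated with a minimal sketch. The abstract does not specify which deformations the authors apply or how the approximate search is implemented, so everything below is an assumption made for illustration: the only deformation is a horizontal slant, the glyphs are toy 5x5 bitmaps, the helper names (`shear`, `expand`, `knn_classify`) are hypothetical, and an exact brute-force k-NN stands in for the approximate search used in the paper.

```python
from collections import Counter

def shear(image, slant):
    """Slant deformation: shift each row sideways in proportion to its
    distance from the vertical center. Pixels pushed off the grid are lost."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y, row in enumerate(image):
        dx = round(slant * (y - (h - 1) / 2))
        for x, value in enumerate(row):
            nx = x + dx
            if 0 <= nx < w:
                out[y][nx] = value
    return out

def expand(samples, slants=(-0.5, 0.5)):
    """Return the natural samples plus one slanted copy per slant value.
    The slant values are illustrative, not taken from the paper."""
    expanded = list(samples)
    for image, label in samples:
        for s in slants:
            expanded.append((shear(image, s), label))
    return expanded

def knn_classify(train, image, k=3):
    """Exact brute-force k-NN on raw pixels (squared Euclidean distance).
    This replaces, for brevity, the approximate search the paper relies on."""
    query = [v for row in image for v in row]

    def dist(other):
        flat = [v for row in other for v in row]
        return sum((a - b) ** 2 for a, b in zip(query, flat))

    nearest = sorted(train, key=lambda sample: dist(sample[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Two toy "natural" prototypes: an upright I and an L.
I_GLYPH = [[0, 0, 1, 0, 0] for _ in range(5)]
L_GLYPH = [[1, 0, 0, 0, 0] for _ in range(4)] + [[1, 1, 1, 1, 1]]

samples = [(I_GLYPH, "I"), (L_GLYPH, "L")]
training = expand(samples)          # 2 natural + 4 synthetic prototypes
print(len(training))
print(knn_classify(training, shear(I_GLYPH, 0.5), k=3))
```

In a realistic setting the prototypes would be feature vectors rather than raw pixels, and the linear scan would be replaced by a tree-based approximate search so that the cost of a query does not grow linearly with the expanded set.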

Work partially supported by the Spanish CICYT under grant TIC2000-1703-CO3-01

References

1. S. Arya, D. M. Mount, N. S. Netanyahu, R. Silverman, and A. Wu. An optimal algorithm for approximate nearest neighbor searching. Journal of the ACM, 45:891–923, 1998.
2. J. L. Bentley, B. W. Weide, and A. C. Yao. Optimal expected time algorithms for closest point problems. ACM Trans. on Math. Software, 6:563–580, 1980.
3. P. A. Devijver and J. Kittler. On the edited nearest neighbour rule. In Proceedings of the 5th International Conference on Pattern Recognition, pages 72–80. IEEE Computer Society Press, Los Alamitos, CA, 1980.
4. J. H. Friedman, J. L. Bentley, and R. A. Finkel. An algorithm finding best matches in logarithmic expected time. ACM Trans. on Math. Software, 3:209–226, 1977.
5. T. M. Ha and H. Bunke. Off-line, handwritten numeral recognition by perturbation method. IEEE Trans. on PAMI, 19(5):535–539, May 1997.
6. P. E. Hart. The condensed nearest neighbor rule. IEEE Trans. on Information Theory, 125:515–516, 1968.
7. A. Jain. Object matching using deformable templates. IEEE Trans. on PAMI, 18:268–273, 1996.
8. B. S. Kim and S. B. Park. A fast k nearest neighbor finding algorithm based on the ordered partition. IEEE Trans. on PAMI, 8:761–766, 1986.
9. J. Mao. Improving OCR performance using character degradation models and boosting algorithm. Pattern Recognition Letters, 18:1415–1419, 1997.
10. J. C. Perez-Cortes, J. Arlandis Arlandis, and R. Llobet. Fast and accurate handwritten character recognition using approximate nearest neighbours search on large databases. In Workshop on Statistical Pattern Recognition SPR-2000, Alicante (Spain), 2000.
11. S. J. Smith. Handwritten character classification using nearest neighbor in large databases. IEEE Trans. on PAMI, 16(9):915–919, September 1994.
12. D. L. Wilson. Asymptotic properties of nearest neighbor rules using edited data. IEEE Trans. on Systems, Man and Cybernetics, 2:408–420, 1972.

Author information

Authors and Affiliations

Instituto Tecnologico de Informatica, Universidad Politecnica de Valencia, Camino de Vera, s/n, 46071, Valencia, Spain
Javier Cano, Juan-Carlos Perez-Cortes, Joaquim Arlandis & Rafael Llobet

Editor information

Editors and Affiliations

Dept. of Computing Science, University of Alberta, Athabasca Hall, Room 409, Edmonton, Alberta, Canada, T6G 2H1
Terry Caelli
School of Computer Science and Engineering, University of New South Wales, Sydney, 2052, NSW, Australia
Adnan Amin
Dept. of Applied Physics Pattern Recognition Group, Delft University of Technology, Lorentzweg 1, 2628 CJ, Delft, The Netherlands
Robert P. W. Duin & Dick de Ridder
Dept. of Systems Design Engineering, University of Waterloo, Waterloo, Ontario, Canada, N2L 3G1
Mohamed Kamel

About this paper

Cite this paper

Cano, J., Perez-Cortes, JC., Arlandis, J., Llobet, R. (2002). Training Set Expansion in Handwritten Character Recognition. In: Caelli, T., Amin, A., Duin, R.P.W., de Ridder, D., Kamel, M. (eds) Structural, Syntactic, and Statistical Pattern Recognition. SSPR/SPR 2002. Lecture Notes in Computer Science, vol 2396. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-70659-3_57

DOI: https://doi.org/10.1007/3-540-70659-3_57
Published: 21 August 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44011-6
Online ISBN: 978-3-540-70659-5
eBook Packages: Springer Book Archive
