The Impact of Large Training Sets on the Recognition Rate of Off-line Japanese Kanji Character Classifiers

Ondrej Velek & Masaki Nakagawa

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2423)

Included in the following conference series:

International Workshop on Document Analysis Systems

Abstract

Although it is commonly agreed that increasing the training set size improves recognition rates, the scarcity of publicly available Japanese character pattern databases has prevented this assumption from being verified empirically for large data sets. Until now, the typical number of training samples has been between 100 and 200 patterns per category; newly collected databases and increased computing power now allow us to experiment with a much higher number of samples per category. In this paper, we experiment with off-line classifiers trained with up to 1550 patterns for each of 3036 categories. We show that this larger training set indeed leads to improved recognition rates compared to the smaller training sets normally used.

Author information

Authors and Affiliations

Tokyo University of Agr. & Tech, 2-24-16 Naka-cho, Koganei-shi, 184-8588, Tokyo, Japan
Ondrej Velek & Masaki Nakagawa

Editor information

Editors and Affiliations

Bell Labs, Lucent Technologies, 600 Mountain Avenue, 07974, Murray Hill, NJ, USA
Daniel Lopresti
Avaya Labs Research, 233 Mount Airy Road, 07920, Basking Ridge, NJ, USA
Jianying Hu & Ramanujan Kashi

About this paper

Cite this paper

Velek, O., Nakagawa, M. (2002). The Impact of Large Training Sets on the Recognition Rate of Off-line Japanese Kanji Character Classifiers. In: Lopresti, D., Hu, J., Kashi, R. (eds) Document Analysis Systems V. DAS 2002. Lecture Notes in Computer Science, vol 2423. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45869-7_13

DOI: https://doi.org/10.1007/3-540-45869-7_13
Published: 09 August 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44068-0
Online ISBN: 978-3-540-45869-2
eBook Packages: Springer Book Archive
