research-article

Supervised Template Estimation for Document Image Decoding

Authors:

Mauricio Lomelin

IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 19, Issue 12

Pages 1313 - 1324

https://doi.org/10.1109/34.643891

Published: 01 December 1997

Abstract

An approach to supervised training of character templates from page images and unaligned transcriptions is proposed. The template training problem is formulated as one of constrained maximum likelihood parameter estimation within the document image decoding framework. This leads to a three-phase iterative training algorithm consisting of transcription alignment, aligned template estimation (ATE), and channel estimation steps. The maximum likelihood ATE problem is shown to be NP-complete and, thus, an approximate solution approach is developed. An evaluation of the training procedure in a document-specific decoding task, using the University of Washington UW-II database of scanned technical journal articles, is described.
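As a rough illustration of how the three phases in the abstract can alternate, here is a minimal toy sketch. It is not the authors' algorithm. It assumes fixed-width binary glyphs on a single text line, uses a greedy local shift search in place of a real transcription alignment, uses a per-pixel majority vote as the template estimate (ignoring whatever constraints make the exact ATE problem NP-complete), and assumes an asymmetric bit-flip noise model for the channel, which the abstract does not spell out. All function names, parameters, and the test image are hypothetical.

```python
def window(image, x, width):
    """Flat tuple of pixels in the glyph-sized crop starting at column x."""
    return tuple(p for row in image for p in row[x:x + width])


def estimate_templates(image, labels, positions, width):
    """Toy template step: per-pixel majority vote over each label's crops."""
    templates = {}
    for lab in set(labels):
        crops = [window(image, x, width)
                 for l, x in zip(labels, positions) if l == lab]
        templates[lab] = tuple(int(2 * sum(px) > len(crops))
                               for px in zip(*crops))
    return templates


def estimate_channel(image, labels, positions, templates, width):
    """Toy channel step: empirical flip rates (assumed bit-flip noise)."""
    flips = {0: 0, 1: 0}
    totals = {0: 0, 1: 0}
    for lab, x in zip(labels, positions):
        for ideal, seen in zip(templates[lab], window(image, x, width)):
            totals[ideal] += 1
            flips[ideal] += int(ideal != seen)

    def rate(v):
        return flips[v] / totals[v] if totals[v] else 0.0

    return {"p_1_to_0": rate(1), "p_0_to_1": rate(0)}


def align(image, labels, templates, width, slack=1):
    """Toy alignment step: shift each glyph within +/- slack of its slot."""
    max_x = len(image[0]) - width
    positions = []
    for i, lab in enumerate(labels):
        nominal = i * width
        candidates = [x for x in range(nominal - slack, nominal + slack + 1)
                      if 0 <= x <= max_x]

        def cost(x):
            matches = sum(s == t for s, t in
                          zip(window(image, x, width), templates[lab]))
            # Prefer more matching pixels, then the smallest shift.
            return (-matches, abs(x - nominal))

        positions.append(min(candidates, key=cost))
    return positions


def train(image, labels, width, iterations=3):
    """Alternate the three toy steps, starting from evenly spaced glyphs."""
    positions = [i * width for i in range(len(labels))]
    for _ in range(iterations):
        templates = estimate_templates(image, labels, positions, width)
        channel = estimate_channel(image, labels, positions, templates, width)
        positions = align(image, labels, templates, width)
    return templates, channel, positions


# Hypothetical 3x2 glyphs and a five-character line with one flipped pixel.
A = ((1, 0), (1, 1), (1, 0))
B = ((0, 1), (0, 1), (1, 1))
glyphs = {"a": A, "b": B}
labels = list("aabab")
image = [[p for lab in labels for p in glyphs[lab][r]] for r in range(3)]
image[0][0] = 0  # simulate a single 1 -> 0 flip in the first glyph
templates, channel, positions = train(image, labels, width=2)
```

In this toy setting the majority vote absorbs the single flipped pixel, so the estimated templates match the clean glyphs and the flip shows up only in the channel estimate. A realistic system would replace each step with a proper model-based procedure.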

Information

Published In

IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 19, Issue 12

December 1997

109 pages

ISSN: 0162-8828

Editor: Rangachar Kasturi, Pennsylvania State Univ., University Park

Copyright © 1997 IEEE. All Rights Reserved.

Publisher

IEEE Computer Society

United States

Cited By

Vlachou-Efstathiou M, Siglidis I, Stutzmann D, Aubry M (2024). An Interpretable Deep Learning Approach for Morphological Script Type Analysis. Document Analysis and Recognition – ICDAR 2024 Workshops, pp. 3-21. DOI: 10.1007/978-3-031-70642-4_1. Online publication date: 30-Aug-2024.
https://dl.acm.org/doi/10.1007/978-3-031-70642-4_1

Shaus A, Turkel E (2017). Towards Letter Shape Prior and Paleographic Tables Estimation in Hebrew First Temple Period Ostraca. Proceedings of the 4th International Workshop on Historical Document Imaging and Processing, pp. 13-18. DOI: 10.1145/3151509.3151511. Online publication date: 10-Nov-2017.
https://dl.acm.org/doi/10.1145/3151509.3151511

Mori M (2003). Video text recognition using feature compensation as category-dependent feature extraction. Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 2. DOI: 10.5555/938980.939589. Online publication date: 3-Aug-2003.
https://dl.acm.org/doi/10.5555/938980.939589

Sarkar P, Baird H, Zhang X (2003). Training on Severely Degraded Text-Line Images. Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 1. DOI: 10.5555/938979.939276. Online publication date: 3-Aug-2003.
https://dl.acm.org/doi/10.5555/938979.939276

Nagy G (2000). Twenty Years of Document Image Analysis in PAMI. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22:1, pp. 38-62. DOI: 10.1109/34.824820. Online publication date: 1-Jan-2000.
https://dl.acm.org/doi/10.1109/34.824820
