Abstract
Classification is one of the most important steps in any recognition system. It depends heavily on the quality and efficiency of the extracted features, which in turn determine which classifier is efficient and appropriate for each system. This study investigates the use of K-Nearest Neighbor (KNN) and Random Forest Tree (RFT) classifiers with previously tested statistical features that are independent of the font and size of the characters. First, the input character images are binarized, and then the main features are extracted. The features used in this paper are statistical features calculated on the shapes of the characters. The KNN and RFT classifiers are compared, and RFT is found to outperform KNN by more than 11% in recognition rate. The effect of different classifier parameters and of noisy characters is also tested.
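The pipeline outlined in the abstract (binarize, extract statistical shape features, classify) can be illustrated with a minimal sketch. The abstract does not specify the features or parameters, so everything here is an assumption made for illustration: the fixed threshold of 128, the zone-density feature, the toy 4x4 glyphs, and the function names are all hypothetical. Only the KNN side is shown; a random forest is not sketched.

```python
import math
from collections import Counter

def binarize(image, threshold=128):
    """Map grayscale rows (0-255 ints) to 0/1; dark pixels count as ink (1).
    The fixed threshold is an assumption, not the paper's procedure."""
    return [[1 if px < threshold else 0 for px in row] for row in image]

def zone_density_features(binary, zones=2):
    """Fraction of ink pixels in each cell of a zones x zones grid.
    An illustrative statistical feature; ratios are roughly size-independent."""
    h, w = len(binary), len(binary[0])
    feats = []
    for zr in range(zones):
        for zc in range(zones):
            r0, r1 = zr * h // zones, (zr + 1) * h // zones
            c0, c1 = zc * w // zones, (zc + 1) * w // zones
            cell = [binary[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            feats.append(sum(cell) / len(cell) if cell else 0.0)
    return feats

def knn_predict(train, query, k=3):
    """train is a list of (feature_vector, label) pairs.
    Returns the majority label among the k nearest by Euclidean distance."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical toy glyphs: ink on the left half vs. ink on the top half.
LEFT_BAR = [[0, 0, 255, 255] for _ in range(4)]
TOP_BAR = [[0] * 4, [0] * 4, [255] * 4, [255] * 4]
TRAIN = [
    (zone_density_features(binarize(LEFT_BAR)), "left"),
    (zone_density_features(binarize(TOP_BAR)), "top"),
]
```

With these toy glyphs, a query is classified by calling `knn_predict(TRAIN, zone_density_features(binarize(img)), k=1)`; the value of `k` is one of the classifier parameters whose effect a study like this would vary.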
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Rashad, M., Semary, N.A. (2014). Isolated Printed Arabic Character Recognition Using KNN and Random Forest Tree Classifiers. In: Hassanien, A.E., Tolba, M.F., Taher Azar, A. (eds) Advanced Machine Learning Technologies and Applications. AMLTA 2014. Communications in Computer and Information Science, vol 488. Springer, Cham. https://doi.org/10.1007/978-3-319-13461-1_2
DOI: https://doi.org/10.1007/978-3-319-13461-1_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13460-4
Online ISBN: 978-3-319-13461-1
eBook Packages: Computer Science (R0)