Abstract
Sign language is a powerful form of communication for humans, and advances in computer vision are driving significant progress in sign language recognition. For Indian sign language (ISL), early research focused on differentiating a limited set of distinct hand signs and often relied on specialized hardware such as sensors and gloves; moreover, most of that work was evaluated on datasets captured in controlled environments. This research aims to enhance communication for the speech- and hearing-impaired community by recognizing static images of ISL digits and alphabets in both offline and real-time scenarios. To achieve this, two publicly available datasets were used, containing 42,000 sign images and 36,000 static sign images, respectively. Dataset1 consists of sign images taken in controlled environments, whereas dataset2 consists of sign images taken in different environments with varying backgrounds and lighting conditions. Both datasets were evaluated with and without preprocessing. We employed both machine learning (ML) and deep learning (DL) with a convolutional neural network (CNN) to classify the ISL alphabets and numbers. In the ML approach, image preprocessing techniques (HSV conversion, skin mask generation, skin portion extraction, and Gabor filtering) were used to segment the region of interest, which was then fed to five ML models for sign prediction. The DL approach, in contrast, used a CNN model. In addition, probability ensemble testing was performed on both datasets to compare accuracies. Real-time recognition was also conducted on a custom dataset using the YOLO-NAS-S model. This study contributes to the advancement of ISL recognition through a comparative analysis of ML algorithms and CNNs, examining their performance with and without preprocessing.
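The abstract does not spell out how the probability ensemble is computed. One common reading is soft voting: average the class-probability vectors produced by the individual models and predict the class with the highest mean probability. The sketch below illustrates that reading only; the function names, the three toy models, and the probability values are hypothetical and are not taken from the study.

```python
from typing import List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Average class probabilities across models and pick the argmax per sample.

    prob_sets[m][i][c] is model m's probability that sample i belongs to class c.
    (Assumed layout; the paper does not describe its data structures.)
    """
    n_models = len(prob_sets)
    n_samples = len(prob_sets[0])
    predictions = []
    for i in range(n_samples):
        n_classes = len(prob_sets[0][i])
        # Mean probability of each class over all models for sample i.
        mean_probs = [
            sum(prob_sets[m][i][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        # Index of the class with the highest mean probability.
        predictions.append(max(range(n_classes), key=mean_probs.__getitem__))
    return predictions


def accuracy(predicted: Sequence[int], expected: Sequence[int]) -> float:
    """Fraction of samples whose predicted class matches the expected class."""
    matches = sum(p == e for p, e in zip(predicted, expected))
    return matches / len(expected)


# Toy example: three hypothetical models, two samples, three classes.
model_a = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
model_b = [[0.4, 0.5, 0.1], [0.1, 0.3, 0.6]]
model_c = [[0.7, 0.2, 0.1], [0.2, 0.2, 0.6]]
labels = [0, 2]

ensemble_pred = soft_vote([model_a, model_b, model_c])
print(ensemble_pred, accuracy(ensemble_pred, labels))
```

In this toy case each of the first two models misclassifies one sample on its own, while the averaged probabilities recover both labels, which is the kind of effect an ensemble comparison is meant to measure.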
Data Availability
The data that support the findings of this study are openly available at the following links: dataset1: https://www.kaggle.com/datasets/vaishnaviasonawane/indian-sign-language-dataset, dataset2: https://www.kaggle.com/datasets/atharvadumbre/indian-sign-language-islrtc-referred, Roboflow ISL: https://universe.roboflow.com/isr1/indian-sign-language-3e2qh/dataset/4/images.
Acknowledgements
The authors gratefully acknowledge PES University, Bangalore, Karnataka, India, for providing the facilities needed to conduct the research.
Funding
No funding was received for this research.
Author information
Contributions
PK selected the research issues, carried out the study, wrote the article, and examined the simulation findings with the supervision and support of SBJ.
Ethics declarations
Conflict of Interest
No conflict of interest exists.
Additional information
This article is part of the topical collection “Advances in Computational Approaches for Image Processing, Wireless Networks, Cloud Applications and Network Security” guest edited by P. Raviraj, Maode Ma and Roopashree H R.
About this article
Cite this article
Priya, K., Sandesh, B.J. Developing an Offline and Real-Time Indian Sign Language Recognition System with Machine Learning and Deep Learning. SN COMPUT. SCI. 5, 273 (2024). https://doi.org/10.1007/s42979-023-02482-w
DOI: https://doi.org/10.1007/s42979-023-02482-w