A project to convert sign language to its equivalent text.
- Abstract
- Introduction
- Overall Description
- LSTM
- Data-set Generation Process
- Data-set Training Process
- Data-set Testing Process
- Future Scope
Sign language is one of the oldest and most natural forms of communication. However, most people do not know sign language, and interpreters are difficult to come by, so we have built a real-time, neural-network-based method for finger-spelled Indian Sign Language (ISL).
The purpose of the project is to build an AI assistant, designed specifically for speech-impaired individuals, that converts sign language to text through a video-call interface.
- The app provides an inexpensive alternative to sign language interpreters, who are costly to hire.
- It works with nothing more than a webcam, making it more convenient than an interpreter.
- This project provides ease of access for speech-impaired individuals, as no additional equipment is needed.
- Accepting input from the webcam
- Detecting the signs
- Displaying the correct output (the translated sign)
- Interfaces/Tools:
- Jupyter
- Webcam
- Software dependencies:
- OpenCV
- Mediapipe
- Tensorflow
- scikit-learn
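All of these can be installed with pip; assuming the standard PyPI package names, a typical setup is:

```bash
pip install opencv-python mediapipe tensorflow scikit-learn
```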
Our project uses the LSTM network. Conceptually, an LSTM recurrent unit tries to "remember" all the past knowledge that the network has seen so far and to "forget" irrelevant data. This is done by introducing different activation function layers called "gates" for different purposes.
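For reference, these gates follow the standard LSTM update equations, where $\sigma$ is the sigmoid function and $\odot$ denotes element-wise multiplication:

$$
\begin{aligned}
f_t &= \sigma(W_f [h_{t-1}, x_t] + b_f) && \text{forget gate} \\
i_t &= \sigma(W_i [h_{t-1}, x_t] + b_i) && \text{input gate} \\
o_t &= \sigma(W_o [h_{t-1}, x_t] + b_o) && \text{output gate} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tanh(W_c [h_{t-1}, x_t] + b_c) && \text{cell state} \\
h_t &= o_t \odot \tanh(c_t) && \text{hidden state}
\end{aligned}
$$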
We use Python's OpenCV library to capture images and short clips of training data, which are stored in a folder "MP Data" with a separate sub-folder for each sign (a minimal sketch of the capture loop appears after the list below). As of now, the following signs have been included:
- The alphabet (A-Z)
- The first 10 numbers (1-10)
- 12 common phrases (hello, how are you, my name is, welcome, thank you, please, sorry, bye, good morning, good afternoon, good evening, good night)
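The sketch below shows what such a capture loop can look like, using OpenCV for the webcam and Mediapipe Holistic for keypoint extraction. The sign list, clip counts, and the `extract_keypoints` helper are illustrative assumptions, not the exact repository code:

```python
import os
import cv2
import numpy as np
import mediapipe as mp

mp_holistic = mp.solutions.holistic

DATA_PATH = "MP Data"            # one sub-folder per sign, as described above
SIGNS = ["hello", "thank you"]   # illustrative subset of the full sign list
NUM_SEQUENCES = 30               # assumed number of clips per sign
SEQUENCE_LENGTH = 30             # assumed frames per clip

def extract_keypoints(results):
    """Flatten pose and hand landmarks into one feature vector (length 258)."""
    pose = (np.array([[lm.x, lm.y, lm.z, lm.visibility]
                      for lm in results.pose_landmarks.landmark]).flatten()
            if results.pose_landmarks else np.zeros(33 * 4))
    lh = (np.array([[lm.x, lm.y, lm.z]
                    for lm in results.left_hand_landmarks.landmark]).flatten()
          if results.left_hand_landmarks else np.zeros(21 * 3))
    rh = (np.array([[lm.x, lm.y, lm.z]
                    for lm in results.right_hand_landmarks.landmark]).flatten()
          if results.right_hand_landmarks else np.zeros(21 * 3))
    return np.concatenate([pose, lh, rh])

cap = cv2.VideoCapture(0)
with mp_holistic.Holistic(min_detection_confidence=0.5,
                          min_tracking_confidence=0.5) as holistic:
    for sign in SIGNS:
        for seq in range(NUM_SEQUENCES):
            os.makedirs(os.path.join(DATA_PATH, sign, str(seq)), exist_ok=True)
            for frame_num in range(SEQUENCE_LENGTH):
                ret, frame = cap.read()
                if not ret:
                    continue
                # Mediapipe expects RGB input; OpenCV captures BGR
                results = holistic.process(
                    cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
                np.save(os.path.join(DATA_PATH, sign, str(seq), str(frame_num)),
                        extract_keypoints(results))
cap.release()
```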
The model is trained using three LSTM layers and three Dense layers, with the ReLU activation function throughout and Softmax on the last layer. The optimizer used is Adam. The data set has been trained for 300 epochs, giving an accuracy of 84.60% when the model is evaluated against the training data itself. Finally, this model is saved and used for the testing phase.
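A sketch of such an architecture in Keras is shown below. The layer widths and the saved file name are assumptions; the feature size of 258 follows from the keypoint vector in the previous sketch, and the 48 classes from the sign list above (26 letters + 10 numbers + 12 phrases):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

NUM_CLASSES = 48        # 26 letters + 10 numbers + 12 phrases
SEQUENCE_LENGTH = 30    # assumed frames per clip
NUM_FEATURES = 258      # pose (33 x 4) + two hands (21 x 3 each)

# Three LSTM layers and three Dense layers, ReLU throughout,
# Softmax on the last layer, Adam optimizer (as described above)
model = Sequential([
    LSTM(64, return_sequences=True, activation='relu',
         input_shape=(SEQUENCE_LENGTH, NUM_FEATURES)),
    LSTM(128, return_sequences=True, activation='relu'),
    LSTM(64, return_sequences=False, activation='relu'),
    Dense(64, activation='relu'),
    Dense(32, activation='relu'),
    Dense(NUM_CLASSES, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['categorical_accuracy'])

# X_train: (num_samples, 30, 258) keypoint sequences, y_train: one-hot labels
# model.fit(X_train, y_train, epochs=300)
# model.save('action.h5')
```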
Testing is done once we have the saved, trained model. So far we have tested it on a subset of signs such as 6, O, Q, V, and 8.
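A minimal sketch of this step, assuming the model was saved as in the training sketch and that test clips go through the same keypoint extraction:

```python
import numpy as np
from tensorflow.keras.models import load_model

model = load_model('action.h5')  # the model saved after training

def predict_sign(sequence, signs):
    """Predict the most likely sign for one (30, 258) keypoint sequence."""
    probs = model.predict(np.expand_dims(sequence, axis=0))[0]
    return signs[int(np.argmax(probs))]

# Example: run a held-out clip recorded for the sign "6"
# print(predict_sign(test_sequence, signs))
```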
This project is still in its initial stages; it needs improvement in accuracy and additional signs in the data set in order to predict a wider range of signs. Other directions include:
- Front-end/UI for the application
- Conversion into a browser extension
Contributions to this repository are welcome via issues. Feel free to contact the authors in case of any problems :)