gokulkarthik/Indian-Scene-Text-Detection

Indian-Scene-Text-Detection

The Indian scene text detection model was developed as part of the Indian Signboard Translation Project by AI4Bharat. I worked on this project under the mentorship of Mitesh Khapra and Pratyush Kumar from IIT Madras.

Indian Signboard Translation involves 4 modular tasks:

  1. T1: Detection: Detecting bounding boxes containing text in the images
  2. T2: Classification: Classifying the language of the text in the bounding box identified by T1
  3. T3: Recognition: Recognizing the text in the crop detected by T1, using the recognition model for the language classified by T2
  4. T4: Translation: Translating the text from T3 from one Indian language to another

Pipeline for sign board translation

Note: T2: Classification is not updated in the above picture

Dataset

The Indian Scene Text Detection Dataset (D1-Big + D1-English) is used for training and evaluating the detection model. Text boxes are represented as axis-aligned bounding boxes.

Labels

The score map for an image marks the region within the shrunken bounding boxes. The geometry map at a point inside a bounding box holds the distances from that point to the left, top, right and bottom boundaries, respectively.
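The label construction described above can be sketched in plain Python for a single axis-aligned box. This is a minimal illustration, not the repository's code: the function name `make_label_maps`, the `shrink_ratio` value of 0.3, and the choice to shrink by a fraction of the shorter box edge are all assumptions.

```python
def make_label_maps(height, width, box, shrink_ratio=0.3):
    """Build a score map and a geometry map for one axis-aligned text box.

    box is (x_min, y_min, x_max, y_max) in pixels, with the max edges exclusive.
    shrink_ratio is a hypothetical setting, not a value taken from config.yaml.
    """
    x_min, y_min, x_max, y_max = box
    # Assumption: shrink every side by a fraction of the shorter box edge.
    margin = int(shrink_ratio * min(x_max - x_min, y_max - y_min))
    sx_min, sy_min = x_min + margin, y_min + margin
    sx_max, sy_max = x_max - margin, y_max - margin

    score = [[0] * width for _ in range(height)]
    geometry = [[(0, 0, 0, 0)] * width for _ in range(height)]
    for y in range(max(y_min, 0), min(y_max, height)):
        for x in range(max(x_min, 0), min(x_max, width)):
            # Distances to the left, top, right and bottom boundaries.
            geometry[y][x] = (x - x_min, y - y_min, x_max - x, y_max - y)
            # The score is positive only inside the shrunken box.
            if sx_min <= x < sx_max and sy_min <= y < sy_max:
                score[y][x] = 1
    return score, geometry


score, geometry = make_label_maps(10, 10, (2, 2, 8, 6))
print(geometry[3][4])  # distances (left, top, right, bottom) at x=4, y=3
```

With several boxes in one image, the same loop would run once per box and write into shared maps.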

Sample-X-Y

Model

The fully convolutional neural network proposed in the paper titled "An Efficient and Accurate Scene Text Detector" (EAST) is used to predict the word instance regions and their geometries. Two variants of the model were tried:

  1. M1: Pretrained VGG-16 net as the feature extractor. Its output maps are smaller than the input by a factor of 4 in each spatial dimension.
     • Input Image Shape: [320, 320, 3]
     • Output Score Map Shape: [80, 80, 1]
     • Output Geometry Map Shape: [80, 80, 4]
  2. M2: U-Net for feature extraction and merging. It produces per-pixel predictions of text regions and geometries.
     • Input Image Shape: [320, 320, 3]
     • Output Score Map Shape: [320, 320, 1]
     • Output Geometry Map Shape: [320, 320, 4]
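To relate the two output resolutions to image coordinates, the sketch below decodes a score map and a geometry map into candidate boxes. It assumes that output cell (x, y) corresponds to input pixel (x * scale, y * scale) and that the geometry distances are expressed in input pixels. Both are assumptions, as is the function name `decode_boxes`. The default score threshold of 0.85 is the value used at evaluation time.

```python
def decode_boxes(score_map, geometry_map, score_threshold=0.85, scale=4):
    """Turn per-cell predictions into (x_min, y_min, x_max, y_max, score) tuples.

    scale is the downsampling factor of the output maps:
    4 for the M1 variant, 1 for the per-pixel M2 variant.
    """
    boxes = []
    for y, row in enumerate(score_map):
        for x, score in enumerate(row):
            if score < score_threshold:
                continue  # skip cells that are unlikely to be text
            left, top, right, bottom = geometry_map[y][x]
            # Assumption: map the cell back to input-image coordinates.
            cx, cy = x * scale, y * scale
            boxes.append((cx - left, cy - top, cx + right, cy + bottom, score))
    return boxes


# Tiny 2x2 example with a single confident cell.
score_map = [[0.1, 0.9], [0.2, 0.3]]
geometry_map = [[(0, 0, 0, 0), (2, 1, 3, 4)], [(0, 0, 0, 0), (0, 0, 0, 0)]]
print(decode_boxes(score_map, geometry_map))
```

Every confident cell yields its own box, so neighbouring cells inside one word produce many near-duplicate boxes that still have to be merged or suppressed.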

Non-Maximum Suppression (NMS) is performed to remove overlapping bounding boxes, with a maximum permitted IoU threshold of 0.1.
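As an illustration of the suppression step, here is a minimal greedy NMS over axis-aligned boxes. It is the standard textbook variant, not necessarily the one used in this repository (the EAST paper itself describes a locality-aware variant), and the function names `iou` and `non_max_suppression` are chosen for this sketch.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.1):
    """Return indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it overlaps every kept box by at most the threshold.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.95]
print(non_max_suppression(boxes, scores))
```

A low threshold such as 0.1 is aggressive: any two boxes that overlap more than slightly are treated as the same word and only the higher-scoring one survives.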

For the detailed model architecture, see the file model.py

Sample Input-Output

Sample-X-Y-Pred

Training

M1 and M2 converged to similar score and geometry losses after training for a specific number of epochs. As M1 is significantly more efficient in memory and computation, it was selected over M2. The detection model was trained for 30 epochs. The model weights are saved every 3 epochs and you can find them in the Models directory.

The final hyperparameters can be accessed in config.yaml

Training Loss

Performance

The lowest validation loss is observed at epoch 12. Hence, the model Models/EAST-Detector-e12.pth is used to evaluate the detection performance. In the NMS stage, the minimum score threshold is set to 0.85 and the maximum permitted IoU threshold is set to 0.2.

A predicted bounding box is considered correct if its IoU with a ground-truth box is at least 0.70.
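One way to compute precision, recall and F1-score under this IoU criterion is sketched below. The greedy one-to-one matching, where each ground-truth box can be claimed by at most one prediction, is an assumption and not necessarily the repository's evaluation protocol. The names `box_iou` and `evaluate_detections` are chosen for this sketch.

```python
def box_iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def evaluate_detections(predicted, ground_truth, min_iou=0.70):
    """Return (precision, recall, f1) for one image or one pooled box list."""
    matched = set()  # indices of ground-truth boxes already claimed
    true_positives = 0
    for pred in predicted:
        best_j, best_iou = None, min_iou
        for j, gt in enumerate(ground_truth):
            if j in matched:
                continue
            value = box_iou(pred, gt)
            if value >= best_iou:
                best_j, best_iou = j, value
        if best_j is not None:
            matched.add(best_j)
            true_positives += 1
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


predicted = [(0, 0, 10, 10), (50, 50, 60, 60)]
ground_truth = [(0, 0, 10, 9), (20, 20, 30, 30)]
print(evaluate_detections(predicted, ground_truth))
```

In this example the first prediction matches the first ground-truth box with an IoU of 0.9, while the second prediction and the second ground-truth box remain unmatched.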

Metric      Precision   Recall      F1-Score
Trainset    0.311847    0.360114    0.426558
Valset      0.331797    0.384548    0.446315
Testset     0.267891    0.343183    0.343183

Sample Detections:

Sample Detection 1 Sample Detection 2 Sample Detection 3 Sample Detection 4

Code

Related Links:

  1. Indian Signboard Translation Project
  2. Indian Scene Text Dataset
  3. Indian Scene Text Detection
  4. Indian Scene Text Classification
  5. Indian Scene Text Recognition

References:

  1. https://openaccess.thecvf.com/content_cvpr_2017/papers/Zhou_EAST_An_Efficient_CVPR_2017_paper.pdf
  2. https://arxiv.org/pdf/1505.04597.pdf
  3. https://github.com/liushuchun/EAST.pytorch
  4. https://github.com/GokulKarthik/EAST.pytorch
  5. https://www.pyimagesearch.com/2018/08/20/opencv-text-detection-east-text-detector/
