DigitsRecognition

Goal

Recognize a sequence of digits of any length, where every digit comes from the MNIST dataset.

the sequence length is in [5, 20] (this range can be changed as needed)
each digit is resized by a factor in [0.5, 1.0) and rotated by an angle in [0, 90] degrees

A sample image (3463700204839487.jpg in this repository) looks like this:

which reads: 3463700204839487
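The sampling ranges above can be sketched as follows. This is a minimal illustration that only draws the random parameters for one sample; it does not load MNIST or render an image, and it is not taken from gen.py. The names `DigitSpec`, `sample_sequence`, and `label_string` are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DigitSpec:
    """Parameters for one digit in a sample (hypothetical helper type)."""
    label: int    # digit class, 0-9
    scale: float  # resize factor, drawn from [0.5, 1.0)
    angle: float  # rotation in degrees, drawn from [0, 90]


def sample_sequence(min_len: int = 5, max_len: int = 20,
                    rng: Optional[random.Random] = None) -> List[DigitSpec]:
    """Draw the length and the per-digit transform parameters for one sample."""
    rng = rng or random.Random()
    length = rng.randint(min_len, max_len)  # randint is inclusive on both ends
    return [
        DigitSpec(
            label=rng.randrange(10),
            # uniform() is half-open only up to floating-point rounding
            # at the upper end.
            scale=rng.uniform(0.5, 1.0),
            angle=rng.uniform(0.0, 90.0),
        )
        for _ in range(length)
    ]


def label_string(specs: List[DigitSpec]) -> str:
    """Ground-truth label of a sample, e.g. a string like '3463700204839487'."""
    return "".join(str(s.label) for s in specs)


if __name__ == "__main__":
    specs = sample_sequence(rng=random.Random(0))
    print(len(specs), label_string(specs))
```

Passing a seeded `random.Random` makes a sample reproducible, which helps when comparing models on the same generated data.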

Ideas

An end-to-end neural network sounds promising, but in practice it is hard to make efficient: optimizing the parameters takes a long time and requires a lot of training data. A pipeline approach might therefore be more practical.

First idea: sliding window and classification

This is loosely inspired by object detectors such as YOLO, but here it is split into a two-step pipeline:

1. find the regions of interest (for example, as bounding boxes)
2. classify the object in each region

The difficult part is step 1.
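One way to picture step 1 is a fixed-size window that slides over the image and keeps the positions that score high. The sketch below assumes a square window, a plain list-of-lists grayscale image, and a stand-in scoring function (`ink_fraction`) in place of a trained model. All names are hypothetical, and a real system would also need multiple window sizes and some way to merge overlapping hits.

```python
from typing import Callable, Iterator, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive
Image = List[List[float]]        # row-major grayscale, values in [0, 1]


def sliding_windows(width: int, height: int, win: int, stride: int) -> Iterator[Box]:
    """Yield every win x win box (stepping by stride) that fits in the image."""
    for top in range(0, height - win + 1, stride):
        for left in range(0, width - win + 1, stride):
            yield (left, top, left + win, top + win)


def crop(image: Image, box: Box) -> Image:
    """Cut the pixels covered by box out of the image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def ink_fraction(patch: Image) -> float:
    """Stand-in score: share of pixels brighter than 0.5 (not a real classifier)."""
    pixels = [p for row in patch for p in row]
    return sum(p > 0.5 for p in pixels) / len(pixels)


def propose_boxes(image: Image, score_fn: Callable[[Image], float],
                  win: int, stride: int, threshold: float) -> List[Tuple[Box, float]]:
    """Step 1: keep the windows whose score reaches the threshold."""
    height, width = len(image), len(image[0])
    hits = []
    for box in sliding_windows(width, height, win, stride):
        score = score_fn(crop(image, box))
        if score >= threshold:
            hits.append((box, score))
    return hits


if __name__ == "__main__":
    # Toy 4 x 8 image with one bright 2 x 2 blob.
    img = [[0.0] * 8 for _ in range(4)]
    for r in (2, 3):
        for c in (4, 5):
            img[r][c] = 1.0
    print(propose_boxes(img, ink_fraction, win=2, stride=2, threshold=0.5))
```

Each box returned here would then be cropped and handed to the digit classifier of step 2.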

Second idea: end-to-end approach

When we classify an ImageNet image, the network does not output a single class but a vector of class probabilities. Naively, given a threshold, we only need to output the classes whose probability is greater than that threshold.

train a model on all single digits (with some variations, e.g. resizing and rotation)
output the classes whose probability is greater than the threshold T
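The thresholding step can be sketched in a few lines. The probability vector below is made up for illustration and the function name is hypothetical; in practice the vector would come from the trained model.

```python
from typing import List, Sequence


def classes_above_threshold(probs: Sequence[float], threshold: float) -> List[int]:
    """Return the class indices whose probability is strictly greater than threshold."""
    return [cls for cls, p in enumerate(probs) if p > threshold]


if __name__ == "__main__":
    # Made-up output for an image that contains a 3 and a 4.
    probs = [0.02, 0.01, 0.03, 0.41, 0.38, 0.02, 0.05, 0.04, 0.02, 0.02]
    print(classes_above_threshold(probs, 0.3))  # [3, 4]
```

Note that the result is a list of class indices in ascending order: on its own it says which digits are present, not where they appear in the image or how often.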

