Support for detecting special symbols like bullet points, checkboxes, etc.

🚀 The feature

Currently, none of the text detection models in docTR seem to detect special characters like bullet points, checkboxes, etc., likely because the training data does not include them. I am attaching sample outputs (clipped screenshots of document pages). We should be able to detect these symbols so that detection covers all text present on a page.
Bullet points:

Checkboxes:

Motivation, pitch

Symbols like bullet points, checkboxes, etc. form an integral part of the text content of many types of documents. OCR should be able to detect as well as recognize these symbols so that coverage extends to all text present on the page.
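
For the recognition side, one rough way to gauge the gap is to check whether a model's character vocabulary contains these symbols at all, since a character missing from the vocabulary can never be predicted. The sketch below is only an illustration: `SPECIAL_SYMBOLS`, `missing_symbols`, and `sample_vocab` are hypothetical names, and the vocabulary string is made up rather than taken from docTR.

```python
# Hypothetical set of symbols of interest; extend it for your own documents.
SPECIAL_SYMBOLS = {
    "•": "bullet",
    "◦": "white bullet",
    "☐": "empty checkbox",
    "☑": "checked checkbox",
    "☒": "crossed checkbox",
}


def missing_symbols(vocab: str, symbols: dict = SPECIAL_SYMBOLS) -> dict:
    """Return the symbols (with their names) that are absent from a vocabulary string."""
    present = set(vocab)
    return {ch: name for ch, name in symbols.items() if ch not in present}


# Illustrative vocabulary only; this is NOT docTR's actual character set.
sample_vocab = "abcdefghijklmnopqrstuvwxyz0123456789.,;:!?-•"

for ch, name in missing_symbols(sample_vocab).items():
    print(f"not covered: {ch} ({name})")
```

Any symbol reported here would need to be added to the vocabulary and represented in the training data before a recognition model could output it. Detection is a separate question, since it depends on whether the detection model was trained on boxes around such marks.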

Alternatives

No response

Additional context

No response
