Support for detecting special symbols like bullet points, check boxes, etc. · Issue #1570 · mindee/doctr · GitHub
Support for detecting special symbols like bullet points, check boxes, etc. #1570
Closed
@parthpatel002

Description

🚀 The feature

Currently, none of the text detection models in docTR seem to detect special symbols such as bullet points and checkboxes (likely because the training data does not cover them). Sample outputs are attached below as clipped screenshots of document pages. The models should be able to detect these symbols so that detection covers all text present on a page.
Bullet points:
image
Checkboxes:
image

Motivation, pitch

Symbols such as bullet points and checkboxes are an integral part of the text content of many types of documents. OCR should be able to both detect and recognize them so that coverage extends to all text present on the page.

Alternatives

No response

Additional context

No response
