It's a very good question and, honestly, I don't know. When training, I only used augmentation techniques like resizing and flipping, so there might be some gains from adding further augmentation techniques powered by OpenCV or similar. And because I did not train the vision models with these techniques, I did not add any pre-processing step either.

So, if you want to try some pre-processing, you could do something like this (I haven't tried it, though):

    import deepdoctection as dd

    def my_pre_proc_func(np_image):
        # your implementation that pre-processes and returns the transformed image
        return np_image  # placeholder: returns the image unchanged

    class PreProcessing(dd.ImageTransformer):
        # a predictor that runs in a SimpleTransformService

        def __init__(self):
            self.name = "preproc"
            self.model_id = self.get_model_id()

        def transform(self, np_img, specification):
            return my_pre_proc_func(np_img)

        def predict(self, np_img):
            return dd.DetectionResult(document_type="my_pre_proc_func")

        def clone(self):
            return self.__class__()

        @staticmethod
        def possible_category():
            return dd.PageType.document_type

    preproc_component = dd.SimpleTransformService(PreProcessing())
    analyzer = dd.get_dd_analyzer()

    # inject the pre-processing step at the beginning of the pipeline
    # (insert rather than assign, so the existing first component is kept)
    analyzer.pipe_component_list.insert(0, preproc_component)

    df = analyzer.analyze(path=...)  # the usual stuff

It would be interesting to hear whether you observe improvements!
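To make the placeholder a bit more concrete, here is a minimal sketch of what `my_pre_proc_func` could look like. It uses plain NumPy instead of OpenCV and does a simple percentile-based contrast stretch. The `low_pct` and `high_pct` parameters and their defaults are assumptions chosen for illustration, and nothing here has been tested against deepdoctection's models, so treat it as a starting point rather than a recommendation.

```python
import numpy as np


def my_pre_proc_func(np_image, low_pct=2.0, high_pct=98.0):
    """Stretch contrast so the low/high percentiles map to 0 and 255.

    Hypothetical example: the percentile defaults are illustrative, not tuned.
    Expects a uint8 array of shape (H, W) or (H, W, C) and returns an array
    with the same shape and dtype.
    """
    img = np_image.astype(np.float32)
    lo, hi = np.percentile(img, [low_pct, high_pct])
    if hi <= lo:
        # Flat image: nothing to stretch, return an unchanged copy.
        return np_image.copy()
    stretched = (img - lo) / (hi - lo) * 255.0
    return np.clip(stretched, 0, 255).astype(np.uint8)


# A synthetic low-contrast "page": pixel values squeezed into 100..150.
page = np.linspace(100, 150, 64 * 64).reshape(64, 64).astype(np.uint8)
page = np.stack([page] * 3, axis=-1)  # fake 3-channel image
result = my_pre_proc_func(page)
```

Whether a step like this helps layout detection or OCR depends on the documents, so it is worth comparing results with and without it on the PDF that fails.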
I wonder whether pre-processing PDFs/images is necessary to improve layout segmentation and OCR accuracy. I have tried Table Recognition + DocTR on two different PDFs. One worked perfectly; for the other, segmentation failed.