Windows compatibility of Deepdoctection #354

krupeshp · 2024-07-24T15:13:41Z

While looking into the deepdoctection dependency list, I found that Poppler and Detectron2 are what stop deepdoctection from running on Windows without Docker.

PyMuPDF does not depend on Poppler and can be used to convert PDFs into images (an img2pdf alternative).

Detectron2 can be installed on Windows; I am attaching references here: https://medium.com/@yogeshkumarpilli/how-to-install-detectron2-on-windows-10-or-11-2021-aug-with-the-latest-build-v0-5-c7333909676f , https://dev.to/reckon762/how-to-install-detectron2-on-windows-3hil

Is it possible to do it? @JaMe76 I am open to contributing.

JaMe76 · 2024-07-25T08:52:29Z

On the one hand, supporting Windows would be nice.
On the other hand, the current number of dependencies is at a critical point, so just adding another dependency is not an option. In fact, I am trying to reduce the number of required dependencies.

Another problem with PyMuPDF is its license: AGPL would require changing this project's license and would prevent building anything that goes beyond open source. The current trend is that most parsers have a restrictive license, and I do not want to follow this trend.

Poppler is mainly used for converting PDF bytes into numpy arrays. One alternative that seems to provide this is pypdfium2. To keep the number of dependencies low, users should install pypdfium2 themselves. One could then extend

deepdoctection/deepdoctection/utils/pdf_utils.py

Line 212 in 6a518e9

    
    def pdf_to_np_array(pdf_bytes: bytes, size: Optional[tuple[int, int]] = None, dpi: int = 200) -> PixelValues:

such that, if pypdfium2 is installed, its PDF-to-numpy conversion is used, with a fallback to Poppler otherwise.
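A minimal sketch of the fallback described above, not the project's actual implementation. The names `pypdfium2_available`, `_render_with_pypdfium2`, `pdf_to_np_array_with_fallback` and the `poppler_fallback` callable are hypothetical; the callable stands in for the existing Poppler-based conversion. The sketch renders only the first page and leaves out the `size` parameter of the real function.

```python
import importlib.util
from typing import Any, Callable, Optional


def pypdfium2_available() -> bool:
    """Return True if the optional pypdfium2 package can be imported."""
    return importlib.util.find_spec("pypdfium2") is not None


def _render_with_pypdfium2(pdf_bytes: bytes, dpi: int = 200) -> Any:
    """Render the first page of a PDF to a numpy array with pypdfium2."""
    # Imported lazily so that pypdfium2 stays an optional dependency.
    import pypdfium2 as pdfium

    pdf = pdfium.PdfDocument(pdf_bytes)
    page = pdf[0]
    # PDF user space has 72 units per inch, so the scale factor is dpi / 72.
    bitmap = page.render(scale=dpi / 72)
    # Copy so the array does not depend on the bitmap's buffer.
    # The channel order depends on the render options and should be checked.
    return bitmap.to_numpy().copy()


def pdf_to_np_array_with_fallback(
    pdf_bytes: bytes,
    poppler_fallback: Callable[[bytes, int], Any],
    dpi: int = 200,
    use_pdfium: Optional[bool] = None,
) -> Any:
    """Use pypdfium2 if it is installed, otherwise call the Poppler path.

    `poppler_fallback` is a hypothetical stand-in for the current
    Poppler-based conversion; `use_pdfium` overrides auto-detection.
    """
    if use_pdfium is None:
        use_pdfium = pypdfium2_available()
    if use_pdfium:
        return _render_with_pypdfium2(pdf_bytes, dpi=dpi)
    return poppler_fallback(pdf_bytes, dpi)
```

Checking for the package with `importlib.util.find_spec` avoids importing pypdfium2 at module load time, so users who rely on Poppler would not need it installed.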

krupeshp · 2024-07-25T13:35:30Z

Okay, got it. Should I open a PR for this, or would you like to assign the task to an existing contributor?

Another possible enhancement is classifying, detecting and recognizing handwritten text in images, using TrOCR or a better lightweight model.

I looked into docTR; they are also trying to work on handwritten text but have not made any progress since 2022. Can we do this by integrating third-party models?

JaMe76 · 2024-07-26T13:16:37Z

You can work on a PR.

With respect to TrOCR, I do not want to add the model (yet), as it would require a handwriting text detector, and I am not yet aware of a satisfying model for that.

I am much more in favor of trying Kosmos 2.5 once it has been added to the transformers library.

krupeshp · 2024-07-27T12:31:23Z

> You can work on a PR.
>
> With respect to TrOCR, I do not want to add the model (yet), as it would require a handwriting text detector, and I am not yet aware of a satisfying model for that.
>
> I am much more in favor of trying Kosmos 2.5 once it has been added to the transformers library.

Kosmos 2.5 seems to be a very large model to run on lower-specification servers or workstations: around 5.5 GB.

JaMe76 added the "enhancement" (New feature or request) label · Aug 13, 2024
