On the pptx type, Docwire is much worse than DocToText. What is the reason for this? · Issue #128 · docwire/docwire · GitHub
Thank you for the analysis. Could you please recheck with the latest code? A lot of optimizations were introduced in this release: https://github.com/docwire/docwire/releases/tag/2024.06.19
In addition, please let us know what the source of data for DocWire is: a CLI run or your own C++ application? If it is an application, then what is the source type for the processing chain: a file path or a std::istream?
Format detection is faster if a file name with an extension is passed. Starting from release https://github.com/docwire/docwire/releases/tag/2024.06.24 it is also possible to pass a memory buffer or stream together with a file extension, as a hint about the file format for the DocWire SDK. In this scenario the parser matching the file extension is tried first, and detection based on binary data is performed only if that parser fails.
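To make the "hint first, binary detection as fallback" idea concrete, here is a minimal self-contained sketch. It does not use the DocWire SDK API: `matches_signature`, `detect`, and `Detection` are hypothetical names invented for this illustration, and the real SDK's detection is certainly more involved. It only shows why a correct extension hint reduces the number of probes.

```cpp
#include <array>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical illustration only: none of these names come from the DocWire SDK.

// Cheap signature check: does the buffer start with the magic bytes
// expected for the given extension?
inline bool matches_signature(std::string_view ext, std::string_view data)
{
    if (ext == "pptx")
        return data.substr(0, 4) == std::string_view("PK\x03\x04", 4); // ZIP container
    if (ext == "pdf")
        return data.substr(0, 5) == "%PDF-";
    if (ext == "rtf")
        return data.substr(0, 5) == "{\\rtf";
    return false;
}

struct Detection
{
    std::string format; // extension of the format that matched
    int probes;         // how many signature checks were needed
};

// Try the hinted format first; fall back to probing every known format
// only if the hint is missing or does not match the data.
inline std::optional<Detection> detect(std::string_view data, std::string_view hint_ext = {})
{
    static constexpr std::array<std::string_view, 3> known{"pdf", "rtf", "pptx"};
    int probes = 0;
    if (!hint_ext.empty())
    {
        ++probes;
        if (matches_signature(hint_ext, data))
            return Detection{std::string(hint_ext), probes};
    }
    for (std::string_view ext : known)
    {
        if (ext == hint_ext)
            continue; // already tried as the hint
        ++probes;
        if (matches_signature(ext, data))
            return Detection{std::string(ext), probes};
    }
    return std::nullopt; // no known format matched
}
```

In this toy model a ZIP-based buffer with the hint `"pptx"` is recognized after a single probe, while the same buffer without a hint goes through every earlier entry in the list first. A wrong hint costs one extra probe but still ends in the fallback path.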
Were you able to retest using the newest version of DocWire? In addition: are the test files confidential? If not, could you please attach them (especially the largest file) here so we can analyze the issue in our environment?
Docwire: docwire-2024.04.04/arm64-osx-dynamic
DocToText: the latest version