Add Azure OpenAI service support to marker package #675
base: dev
Conversation
- Create AzureOpenAIService class that implements the BaseService interface
- Fix duplicate function name in test_service_init.py
- Add test case for the Azure OpenAI service
- Update README.md to document the Azure OpenAI service option
- Add sample script for converting PDFs with Azure OpenAI

This implementation allows marker to use Azure OpenAI for LLM-enhanced processing and image descriptions by configuring the azure_endpoint, azure_api_key and deployment_name parameters.
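For orientation, here is a minimal sketch of what such a service might look like, assuming the official `openai` SDK's `AzureOpenAI` client. Only the `azure_endpoint`, `azure_api_key` and `deployment_name` parameters come from the PR description; the method names, import path of `BaseService`, and `api_version` value are illustrative assumptions, not the PR's actual code.

```python
# Illustrative sketch only -- not the PR's implementation. Assumes the
# official `openai` SDK (>= 1.0); api_version and method names are guesses.
from openai import AzureOpenAI


class AzureOpenAIService:  # in the PR this implements marker's BaseService
    def __init__(self, azure_endpoint: str, azure_api_key: str, deployment_name: str):
        self.deployment_name = deployment_name
        self.client = AzureOpenAI(
            azure_endpoint=azure_endpoint,
            api_key=azure_api_key,
            api_version="2024-02-01",  # assumed; use the version your resource supports
        )

    def generate(self, prompt: str) -> str:
        # Azure routes requests by deployment name rather than model name
        response = self.client.chat.completions.create(
            model=self.deployment_name,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content
```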
CLA Assistant Lite bot: All contributors have signed the CLA ✍️ ✅
I have read the CLA document and I hereby sign the CLA
This is because it yielded cut-off responses, which lead to invalid JSON
… be compatible with other LLM processing units
@VikParuchuri Can you please review this & merge it?
…ions The inference_blocks method in LLMImageDescriptionProcessor had a logic error where it would return an empty list when extract_images=True, effectively disabling all image description processing. This is counterintuitive and contradicts the documented behavior, where extract_images should control whether to keep images in the output, not whether to generate descriptions. This fix ensures the processor always processes image blocks and generates descriptions regardless of the extract_images setting, aligning with the expected behavior described in the CLI documentation.
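A before/after sketch of the change, reconstructed from the diff hunk and review thread below; the `return []` body of the removed guard is inferred from the discussion ("inference_blocks() returns []") rather than shown in the hunk, and the import path is an assumption.

```python
# Reconstructed sketch, not the verbatim diff. The base-class import path
# and the `return []` in the removed guard are inferred from the thread.
from typing import List

from marker.processors.llm import BaseLLMSimpleBlockProcessor  # assumed path


class LLMImageDescriptionProcessorBefore(BaseLLMSimpleBlockProcessor):
    def inference_blocks(self, document) -> List:
        blocks = super().inference_blocks(document)
        if self.extract_images:  # True by default, so this short-circuits
            return []            # and no image descriptions are ever generated
        return blocks


class LLMImageDescriptionProcessorAfter(BaseLLMSimpleBlockProcessor):
    def inference_blocks(self, document) -> List:
        # Always process image blocks; extract_images only controls whether
        # images are kept in the output, per the CLI documentation.
        return super().inference_blocks(document)
```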
Agreed. I would be interested in getting this merged, please. Thanks!
@VikParuchuri: could you have a look at this PR please? Thanks a lot :)
Thanks for the contribution! A couple of questions in the comments
```diff
@@ -41,8 +41,6 @@ class LLMImageDescriptionProcessor(BaseLLMSimpleBlockProcessor):

     def inference_blocks(self, document: Document) -> List[BlockData]:
         blocks = super().inference_blocks(document)
-        if self.extract_images:
```
Why was this removed?
I think self.extract_images defaults to True, so inference_blocks() returns []. We should remove these lines.
Indeed - I found this one during debugging: with extract_images=True (the default), the processor would never process any images, making it completely non-functional. Removing the check ensures the processor actually does its job of generating image descriptions.
We can set extract_images=False in the config, so we don't need to remove these lines.
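A hypothetical sketch of that alternative: keep the guard and override the setting per run. Only the extract_images key comes from this thread; the config-dict constructor shown here is an assumption about marker's processor API, not confirmed by the PR.

```python
# Hypothetical alternative to removing the guard: disable image extraction
# per run so inference_blocks() is not short-circuited. The config-dict
# mechanism is an assumption; only the extract_images key is from the thread.
config = {"extract_images": False}
processor = LLMImageDescriptionProcessor(config=config)
```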
@tuantran23012000 @VikParuchuri I think we're now ready to merge this? I've removed the langchain dependency.
Force-pushed from fe180a0 to 06ad1e6