Insights: datalab-to/marker
Overview
5 Releases published by 2 people
-
v1.7.0 New OCR model; inline math; beta structured extraction
published
May 19, 2025 -
v1.7.1 Misc bugfixes
published
May 19, 2025 -
v1.7.2 Fix config parsing
published
May 20, 2025 -
v1.7.3 Misc bugfixes
published
May 22, 2025 -
v1.7.4 Misc bugfixes and improvements
published
Jun 2, 2025
11 Pull requests merged by 3 people
-
Enable dropping repeated text
#724 merged
Jun 2, 2025 -
Fix annotations
#705 merged
Jun 2, 2025 -
Vik remove chars
#720 merged
Jun 2, 2025 -
Format Lines: Fix line merging
#712 merged
Jun 2, 2025 -
Add "no corrections needed" check to llm_equation, standardize in other files
#721 merged
Jun 2, 2025 -
Fix misc issues
#704 merged
May 22, 2025 -
Fix config passing
#697 merged
May 20, 2025 -
Fix click 8.2.0 issue
#694 merged
May 19, 2025 -
New OCR model, structured extraction beta
#693 merged
May 19, 2025 -
WIP: Foundation Model Integration
#616 merged
May 19, 2025 -
Structured extraction
#687 merged
May 19, 2025
3 Pull requests opened by 3 people
-
Bugfixes PR
#707 opened
May 23, 2025 -
Dev
#733 opened
Jun 6, 2025 -
Maker with mol
#738 opened
Jun 9, 2025
17 Issues closed by 5 people
-
bump anthropic to 0.49 or higher?
#740 closed
Jun 9, 2025 -
MCP Implementation
#735 closed
Jun 6, 2025 -
truncated image
#715 closed
Jun 6, 2025 -
Trying marker in offline mode
#725 closed
Jun 4, 2025 -
how is the performance on arbic language?
#723 closed
Jun 2, 2025 -
Add support for vLLM as an alternative OpenAI‐compatible backend
#670 closed
May 30, 2025 -
Segfault with local PDF conversion using `--use_llm --format_lines`
#700 closed
May 21, 2025 -
KeyError: 'encoder' raised from Surya model config during PdfConverter initialization
#698 closed
May 21, 2025 -
TypeError: can only concatenate list (not "NoneType") to list
#644 closed
May 20, 2025 -
Incompatibility with click 8.2.0 - CLI errors and non-working parameters
#690 closed
May 19, 2025 -
I want to store the models in different location and not in C drive
#678 closed
May 19, 2025 -
Memory Leak w/ LLM option enabled
#679 closed
May 19, 2025 -
Facing KeyError: 'encoder'.
#688 closed
May 19, 2025 -
Installing with additional dependencies (full) isn't working on MacOS
#692 closed
May 19, 2025 -
model weights dir
#691 closed
May 19, 2025
25 Issues opened by 23 people
-
How to get text without html tag from RecognitionPredictor?
#739 opened
Jun 9, 2025 -
CLI marker_single error: pypdfium2/_helpers/document.py->Invalid input type 'PdfDocument'
#736 opened
Jun 7, 2025 -
Request to add another LLM into marker (Mistral)
#732 opened
Jun 6, 2025 -
Duplicate content
#731 opened
Jun 6, 2025 -
Headings in Markdown Output Are mostly all Flat (No consistent Hierarchy)
#730 opened
Jun 6, 2025 -
Summarize the pages
#729 opened
Jun 5, 2025 -
'PdfConverter' object has no attribute 'artifact_dict'
#728 opened
Jun 5, 2025 -
How to use gpu on windows 11
#718 opened
May 30, 2025 -
Regression in PDF-to-Markdown Conversion compare to v1.6.3
#717 opened
May 29, 2025 -
marker_server should support --use_llm and related arguments
#714 opened
May 28, 2025 -
Unrelated text inserted into output
#713 opened
May 27, 2025 -
CUDA Out-of-Memory Error on Linux with LLM Integration - VM Environment
#711 opened
May 27, 2025 -
CUDA Out-of-Memory in Linux
#710 opened
May 27, 2025 -
No module named `cgi`
#709 opened
May 25, 2025 -
Table completely misinterpreted
#708 opened
May 24, 2025 -
--languages deleted in 1.7.0
#706 opened
May 23, 2025 -
error after longtime Recognizing Text:
#703 opened
May 22, 2025 -
Seg-fault running marker_single, likely caused by unstable Torch version
#701 opened
May 21, 2025 -
RuntimeError: operator torchvision::nms does not exist
#699 opened
May 21, 2025 -
Feature, extract tables as images
#696 opened
May 20, 2025 -
Does the marker support installation on the NPU?
#695 opened
May 20, 2025 -
Does the LLM option not help to actually parse the text?
#686 opened
May 13, 2025 -
Many errors with DeepSeek v3 USE_LLM
#685 opened
May 12, 2025
20 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Add Azure OpenAI service support to marker package
#675 commented on
Jun 6, 2025 • 8 new comments -
Fix issue #556 and expand table benchmarks
#582 commented on
Jun 6, 2025 • 0 new comments -
WIP feat: accept binary PDF instead of just path to PDF
#416 commented on
May 22, 2025 • 0 new comments -
Feature/docker container
#413 commented on
Jun 6, 2025 • 0 new comments -
Move `benchmark.py` to `marker_benchmark.py`
#74 commented on
Jun 6, 2025 • 0 new comments -
OCR_ENGINE=None Doesn't work
#256 commented on
Jun 6, 2025 • 0 new comments -
Rate limit also occurs for local LLM?
#642 commented on
May 31, 2025 • 0 new comments -
'gbk' codec can't decode byte 0xb3 in position 1470: illegal multibyte sequence
#576 commented on
May 26, 2025 • 0 new comments -
Some text missing when extracting tables
#471 commented on
May 22, 2025 • 0 new comments -
Ollama LLM extracted image information incorrectly, with the answer stating that there is no visible content in the image.
#634 commented on
May 21, 2025 • 0 new comments -
when install, I got error : AttributeError: 'dict' object has no attribute 'to_dict'
#654 commented on
May 21, 2025 • 0 new comments -
Strange characters in OCR of table numbers
#643 commented on
May 20, 2025 • 0 new comments -
OCR cannot recognize the image.
#658 commented on
May 20, 2025 • 0 new comments -
Add option to provide API key when using Ollama LLM
#683 commented on
May 19, 2025 • 0 new comments -
Bug Report: Marker Crashes with Bus Error (SIGBUS) After Model Loading in Docker on Rocky Linux
#684 commented on
May 19, 2025 • 0 new comments -
AttributeError: 'ConfigParser' object has no attribute 'get_llm_service'
#568 commented on
May 19, 2025 • 0 new comments -
Marker is not working
#628 commented on
May 17, 2025 • 0 new comments -
How to download converted markdown in Interactive App
#653 commented on
May 14, 2025 • 0 new comments -
Gemini API exhausted, need some pause mechanism or something
#490 commented on
May 13, 2025 • 0 new comments -
Best LLMs for use_llm mode in hybrid Marker pipeline?
#680 commented on
May 13, 2025 • 0 new comments