Open
Description
Currently it seems that (document) images (JPG, PNG, WebP, AVIF, etc.) are not supported. Supporting them would need either a "traditional" OCR step in between or, probably better, a vision LLM for feature extraction. On that topic I highly recommend looking at https://github.com/illuin-tech/colpali, which extracts features directly from image embeddings instead of running OCR first. That also preserves the visual information for later retrieval.
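To make the idea a bit more concrete, here is a rough sketch of the late-interaction ("MaxSim") scoring that ColPali-style retrieval uses: each page image is represented by several patch embeddings, and a query is scored against a page by matching every query vector to its best patch. This is not the colpali API. The function names, file names, and the tiny 2-d vectors are all made up for illustration, and real embeddings would come from a vision model.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: Sequence[Vector], page_vecs: Sequence[Vector]) -> float:
    """Late-interaction score: for each query vector take its best-matching
    page patch vector, then sum those maxima."""
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


def rank_pages(
    query_vecs: Sequence[Vector], pages: Dict[str, List[Vector]]
) -> List[Tuple[str, float]]:
    """Return (page name, score) pairs sorted from best to worst match."""
    scored = [(name, maxsim_score(query_vecs, vecs)) for name, vecs in pages.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): two query token vectors and two "pages",
# each with two patch vectors standing in for image embeddings.
query = [[1.0, 0.0], [0.0, 1.0]]
pages = {
    "invoice.png": [[0.9, 0.1], [0.1, 0.8]],
    "photo.jpg": [[0.2, 0.1], [0.1, 0.3]],
}

ranking = rank_pages(query, pages)
print(ranking[0][0])  # name of the best-matching page
```

The point is that no text is ever extracted from the image: retrieval works on the patch vectors directly, so layout, tables, and figures can still contribute to the match.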