Starred repositories
An easy way to apply LoRA to CLIP. Implementation of the paper "Low-Rank Few-Shot Adaptation of Vision-Language Models" (CLIP-LoRA) [CVPRW 2024].
An open-source AI agent that brings the power of Gemini directly into your terminal.
Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.
A PyTorch Lightning solution to training OpenAI's CLIP from scratch.
Papers about Explainable AI (Deep Learning-based)
GRIT: Faster and Better Image-captioning Transformer (ECCV 2022)
An Arena-style Automated Evaluation Benchmark for Detailed Captioning
[CVPR 2023] OneFormer: One Transformer to Rule Universal Image Segmentation
Access large language models from the command-line
A final sanity checklist to help your CS paper get accepted, not desk rejected.
Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models (CVPR 2024 Highlight)
An LLM-free Multi-dimensional Benchmark for Multi-modal Hallucination Evaluation
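A one-stop danmaku anime platform for finding, following, and watching anime, with cloud collection sync (Bangumi), offline caching, BitTorrent, and cloud-based danmaku filtering. 100% Kotlin/Compose Multiplatform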
Minimal and annotated implementations of key ideas from modern deep learning research.
Chinese translation of Hands-On-Large-Language-Models (hands-on-llms), for learning large language models hands-on
Image Captioning Evaluation in the Age of Multimodal LLMs: Challenges and Future Perspectives
11 Lessons to Get Started Building AI Agents
12 Weeks, 24 Lessons, AI for All!
[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
Brain tumor image classification with ResNet, EfficientNet, EfficientNet_V2 and Compact Convolutional Transformers architectures with PyTorch
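Chrome Multi-Window Manager is a tool for managing multiple Chrome browser windows. It helps users easily manage multiple Chrome windows, supporting batch opening, arrangement, and synchronized operations across windows, greatly improving interaction efficiency.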
Training A Small Emotional Vision Language Model for Visual Art Comprehension