Starred repositories
PyTorch implementation of "Perceive, Understand and Restore: Real-World Image Super-Resolution with Autoregressive Multimodal Generative Models"
High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.
A high-throughput and memory-efficient inference and serving engine for LLMs
[CVPR 2025] Official implementation for "Empowering LLMs to Understand and Generate Complex Vector Graphics" https://arxiv.org/abs/2412.11102
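From a nobody to a large language model (LLM) hero~ Stay tuned for more!!!
🚀 [Large models] Train a 26M-parameter visual multimodal VLM from scratch in just 1 hour! 🌏
A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are cheaper to train, including base models, domain-specific fine-tuning and applications, datasets, and tutorials.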
Solve Visual Understanding with Reinforced VLMs
Omegance: A Single Parameter for Various Granularities in Diffusion-Based Synthesis (arXiv, 2024)
[ICLR 2025] Animate-X: Universal Character Image Animation with Enhanced Motion Representation
[CVPR 2025] MatAnyone: Stable Video Matting with Consistent Memory Propagation
Exploiting unlabeled data with vision and language models for object detection, ECCV 2022
🚀🚀 [Large models] Train a small 26M-parameter GPT entirely from scratch in just 2 hours! 🌏
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
[NeurIPS 2022 Spotlight] This is the official PyTorch implementation of "Fast Vision Transformers with HiLo Attention"
[CVPR 2025] Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer
Janus-Series: Unified Multimodal Understanding and Generation Models
[CVPR 2024] Official implementation of "VRP-SAM: SAM with Visual Reference Prompt"
[ICLR'25] AdaIR: Adaptive All-in-One Image Restoration via Frequency Mining and Modulation
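🤩 IM bots platform for the LLM era / An easy-to-use instant-messaging bot platform for large models ⚡️ Works with QQ / WeChat (WeCom, personal WeChat) / Feishu / DingTalk / Discord / Telegram / Slack and other platforms | Supports ChatGPT, DeepSeek, Dify, Claude, Google Gemini, xAI, PPIO, Ollama, Alibaba Cloud Bailian, Silic…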
[CVPR'23] A Simple Framework for Text-Supervised Semantic Segmentation
Official implementation of "SM3Det: A Unified Model for Multi-Modal Remote Sensing Object Detection"
Bringing Old Photos Back to Life (CVPR 2020 oral)
[ECCV 2024] A Comparative Study of Image Restoration Networks for General Backbone Network Design
[CVPR 2024] Official implementation of "Beyond Image Super-Resolution for Image Recognition with Task-Driven Perceptual Loss"
Open-source and strong foundation image recognition models.