Starred repositories
OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning
New generation of CLIP with fine-grained discrimination capability (ICML 2025)
Using vision-language models to decode natural image perception from non-invasive brain recordings.
Official PyTorch implementation of FlowMo.
Official Implementation of "LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis"
Towards a Unified Copernicus Foundation Model for Earth Vision
Chinese-language reinforcement learning tutorial (the "Mushroom Book" 🍄). Read online at https://datawhalechina.github.io/easy-rl/
Simulating the Real World: Survey & Resources, which contains our survey "Simulating the Real World: A Unified Survey of Multimodal Generative Models" and Awesome-Text2X-Resources. Watch this repos…
Official implementation of the paper "SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training"
[CVPR 2025 Highlight] TinyFusion: Diffusion Transformers Learned Shallow
VARGPT: Unified Understanding and Generation in a Visual Autoregressive Multimodal Large Language Model
🚀🚀 "Large models": train a small 26M-parameter GPT entirely from scratch in just 2 hours! 🌏
A collection of papers on world models for autonomous driving (and robotics).
Official PyTorch implementation for "Large Language Diffusion Models"
A collection of papers on RAG for AIGC. We propose a taxonomy of RAG foundations, enhancements, and applications in the paper "Retrieval-Augmented Generation for AI-Generated Content: A Survey".
[ACM Computing Surveys 2025] This repository collects surveys, resources, and papers on Lifelong Learning with Large Language Models. (Updated regularly)
Lumina-Image 2.0: A Unified and Efficient Image Generative Framework
[CSUR 2025] Continual Learning of Large Language Models: A Comprehensive Survey
Large Concept Models: Language modeling in a sentence representation space
Open reproduction of MUSE for fast text2image generation.
Implementation of papers in 100 lines of code.
[Survey] Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838