- Peking University
- Beijing
- https://yuanzhang.cc/
- @YuanZhang_PKU
Stars
Skywork-R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning
[CVPR 2025] RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete. Official Repository.
DesignEdit: Unify Spatial-Aware Image Editing via Training-free Inpainting with a Multi-Layered Latent Diffusion Framework
[CVPR 2025] Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation
[arXiv 2025] MoLe-VLA: Dynamic Layer-skipping Vision Language Action Model via Mixture-of-Layers for Efficient Robot Manipulation
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
[CVPR'25] Official implementation of paper "MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders".
A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related webs…
[NeurIPS'24] Official PyTorch Implementation of Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment
[ICML'25] Official implementation of paper "SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference".
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
This repository collects papers for "A Survey on Knowledge Distillation of Large Language Models". We break down KD into Knowledge Elicitation and Distillation Algorithms, and explore the Skill & V…
OpenMMLab Model Compression Toolbox and Benchmark.
Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.
[NeurIPS'24] Official implementation of paper "Unveiling the Tapestry of Consistency in Large Vision-Language Models".
DAMO-YOLO: a fast and accurate object detection method with some new techs, including NAS backbones, efficient RepGFPN, ZeroHead, AlignedOTA, and distillation enhancement.
[CVPR'24] Official implementation of paper "FreeKD: Knowledge Distillation via Semantic Frequency Prompt".
Code for paper LocalMamba: Visual State Space Model with Windowed Selective Scan
Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
CVPR and NeurIPS poster examples and templates. May we have in-person poster session soon!
[ACM MM'23] Official implementation of paper "Avatar Knowledge Distillation: Self-ensemble Teacher Paradigm with Uncertainty".
Official implementation for paper "Knowledge Diffusion for Distillation", NeurIPS 2023
An advanced guide to learn English which might benefit you a lot 🎉. An outrageous English learning guide / English learning tutorial.
A curated list of neural network pruning resources.
Gumpest / MasKD
Forked from hunto/MasKD
Official implementation of paper "Masked Distillation with Receptive Tokens", ICLR 2023.
⏰ Collaboratively track deadlines of conferences recommended by CCF (Website, Python Cli, Wechat Applet) / If you find it useful, please star this project, thanks~
OpenMMLab Pre-training Toolbox and Benchmark