ifzhang (Yifu Zhang) / Starred · GitHub
🐶
Focusing

Organizations

@hustvl

Official Repository of Paper "ROSA: Harnessing Robot States for Vision-Language and Action Alignment"

10 Updated Jun 17, 2025

MAGI-1: Autoregressive Video Generation at Scale

Python 3,300 193 Updated Jun 17, 2025

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Python 1,923 112 Updated Jun 16, 2025

[CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Python 947 28 Updated Jun 12, 2025

ReNeg: Learning Negative Embedding with Reward Guidance

Python 32 Updated Jan 2, 2025

Liquid: Language Models are Scalable and Unified Multi-modal Generators

Python 592 34 Updated Apr 8, 2025

[CVPR 2025] StreetCrafter: Street View Synthesis with Controllable Video Diffusion Models

Python 214 20 Updated May 13, 2025

[ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think

Python 1,123 52 Updated Mar 16, 2025

The official implementation of "[MASK] is All You Need"

Jupyter Notebook 121 6 Updated Feb 28, 2025

A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.

Python 2,846 162 Updated May 28, 2025

[CVPR 2025 Oral] Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis

Python 1,337 73 Updated Apr 24, 2025

HunyuanVideo: A Systematic Framework For Large Video Generation Model

Python 10,406 940 Updated Jun 3, 2025

[ECCV 2024] Street Gaussians: Modeling Dynamic Urban Scenes with Gaussian Splatting

Python 1,114 72 Updated Dec 31, 2024

[CVPR 2025 Highlight] Truncated Diffusion Model for Real-Time End-to-End Autonomous Driving

Python 784 52 Updated Jun 17, 2025

[ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation

Python 799 57 Updated Oct 1, 2024

Bridging Large Vision-Language Models and End-to-End Autonomous Driving

Python 398 26 Updated Dec 26, 2024

Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 11,578 1,121 Updated Jun 17, 2025

Multimodal Models in Real World

Jupyter Notebook 513 21 Updated Feb 24, 2025

[NeurIPS 2024] OmniTokenizer: one model and one weight for image-video joint tokenization.

Python 297 7 Updated Jul 9, 2024

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Python 1,780 81 Updated Aug 15, 2024

[NeurIPS 2024] A Generalizable World Model for Autonomous Driving

Python 753 50 Updated Dec 12, 2024

[AAAI 2025] Linear-complexity Visual Sequence Learning with Gated Linear Attention

Python 111 1 Updated Jun 17, 2024

[CVPR 2025] DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention

Python 167 8 Updated Mar 1, 2025

[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…

Jupyter Notebook 8,272 511 Updated May 18, 2025

[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Python 3,460 243 Updated Feb 13, 2025

[CVPR 2024] LMDrive: Closed-Loop End-to-End Driving with Large Language Models

Jupyter Notebook 771 64 Updated Apr 14, 2025

A method that can match the 3D point cloud sub-map generated by the robot during the SLAM process with the 2D map.

Python 19 2 Updated Oct 4, 2022
HTML 7 1 Updated Aug 26, 2023

[CVPR2024] Official Repository of Paper "Panacea: Panoramic and Controllable Video Generation for Autonomous Driving"

Python 231 12 Updated Aug 15, 2024