sjtuplayer / Starred · GitHub

Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.

Jupyter Notebook 3,213 247 Updated Jun 12, 2025
Python 21 Updated Mar 14, 2025

SEED-Voken: A Series of Powerful Visual Tokenizers

Python 899 35 Updated Jun 27, 2025
Python 39 2 Updated Jun 17, 2025

HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation

Python 1,092 93 Updated Jun 13, 2025

Wan: Open and Advanced Large-Scale Video Generative Models

Python 12,515 1,525 Updated Jun 13, 2025

The official code implementation of Generalized Category Discovery in Semantic Segmentation

Jupyter Notebook 17 1 Updated Dec 20, 2023

Taming Transformers for High-Resolution Image Synthesis

Jupyter Notebook 6,229 1,194 Updated Jul 30, 2024

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Jupyter Notebook 16,521 1,509 Updated Sep 5, 2024

Next-Token Prediction is All You Need

Python 2,156 81 Updated Mar 17, 2025

SAM with text prompt

Python 2,257 260 Updated May 10, 2025

Project Page for "LISA: Reasoning Segmentation via Large Language Model"

Python 2,271 162 Updated Feb 16, 2025

[CVPR25] IAR

Python 11 1 Updated Jun 13, 2025

[ICCV 2023] Official PyTorch implementation of "Rethinking Mobile Block for Efficient Attention-based Models"

Jupyter Notebook 241 18 Updated Oct 24, 2023

Using large AI models to automatically narrate and edit videos with a single click.

Python 5,669 666 Updated May 22, 2025

[ICLR 2025] Autoregressive Video Generation without Vector Quantization

Python 537 14 Updated May 22, 2025
HTML 64 4 Updated Oct 31, 2024

EMOv2: Pushing 5M Vision Model Frontier

Python 46 1 Updated Dec 30, 2024
Python 9 2 Updated Dec 4, 2024

[CVPR 2025 Highlight🔥] Identity-Preserving Text-to-Video Generation by Frequency Decomposition

Python 719 35 Updated Jun 8, 2025

HunyuanVideo: A Systematic Framework For Large Video Generation Model

Python 10,482 953 Updated Jun 3, 2025

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Python 1,785 81 Updated Aug 15, 2024
Python 80 8 Updated Nov 8, 2024

[ICLR 2025] Official Implementation of Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis

Python 317 10 Updated May 30, 2025

SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation

Python 114 7 Updated Oct 18, 2024
Python 37 2 Updated Jun 24, 2025

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.

Python 11,989 1,056 Updated Jun 19, 2025
Jupyter Notebook 1,034 398 Updated May 5, 2023

[CVPR 2024] SuperSVG: Superpixel-based Scalable Vector Graphics Synthesis

C++ 50 3 Updated Dec 18, 2024