Stars
Qwen2.5-Omni is an end-to-end multimodal model by the Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, and video, and of performing real-time speech generation.
SEED-Voken: A Series of Powerful Visual Tokenizers
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation
Wan: Open and Advanced Large-Scale Video Generative Models
The official code implementation of Generalized Category Discovery in Semantic Segmentation
Taming Transformers for High-Resolution Image Synthesis
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment, and Generate Anything
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
[ICCV 2023] Official PyTorch implementation of "Rethinking Mobile Block for Efficient Attention-based Models"
Using AI models to automatically provide commentary and edit videos with a single click.
[ICLR 2025] Autoregressive Video Generation without Vector Quantization
[CVPR 2025 Highlight🔥] Identity-Preserving Text-to-Video Generation by Frequency Decomposition
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
[ICLR 2025] Official Implementation of Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
[CVPR 2024] SuperSVG: Superpixel-based Scalable Vector Graphics Synthesis