Z-Zheng (Zhuo Zheng) / Starred · GitHub
😎
Focusing

Organizations

@RSIDEA-EarthInsight


Interactive visualizations of the geometric intuition behind diffusion models.

Svelte 502 21 Updated May 19, 2025

MAGI-1: Autoregressive Video Generation at Scale

Python 3,076 168 Updated May 14, 2025

State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!

Jupyter Notebook 1,112 51 Updated May 9, 2025

[CVPR 2025] Test-Time Visual In-Context Tuning

23 1 Updated Mar 28, 2025

📖 This is a repository for organizing papers, codes and other resources related to unified multimodal models.

547 26 Updated Apr 9, 2025

Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.

Jupyter Notebook 2,971 228 Updated May 19, 2025

[IEEE TIP 2025] Multi-Axis Feature Diversity Enhancement for Remote Sensing Video Super-Resolution

Python 18 Updated Mar 23, 2025

Cosmos-Reason1 models understand the physical common sense and generate appropriate embodied decisions in natural language through long chain-of-thought reasoning processes.

Python 340 10 Updated May 19, 2025

Official implementation of OneDiffusion paper (CVPR 2025)

Python 626 19 Updated Dec 14, 2024

Skywork-R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning (Best open-source multimodal reasoning model)

Python 2,492 247 Updated May 9, 2025

Radar Simulator built with Python and C++

Python 373 76 Updated May 16, 2025

Code release for DynamicTanh (DyT)

Python 931 79 Updated Mar 30, 2025

Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing"

Jupyter Notebook 242 9 Updated Apr 30, 2025

This repository contains the "superproject" wrapper for the "Classic" configuration of the GEOS-Chem model of atmospheric chemistry and composition.

CMake 20 42 Updated May 19, 2025

Official Implementation of Rectified Flow (ICLR2023 Spotlight)

Python 1,244 70 Updated Jul 20, 2024

Differentiable ODE solvers with full GPU support and O(1)-memory backpropagation.

Python 5,956 958 Updated Apr 4, 2025

Diffusion Model-Based Image Editing: A Survey (TPAMI 2025)

618 40 Updated Mar 23, 2025

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 17,272 2,233 Updated Feb 1, 2025

Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution"

Python 521 39 Updated Mar 16, 2025

Wan: Open and Advanced Large-Scale Video Generative Models

Python 11,497 1,307 Updated May 17, 2025

Official Repo for Open-Reasoner-Zero

Python 1,921 99 Updated Apr 8, 2025

[CVPR2025 Highlight] Video Generation Foundation Models: https://saiyan-world.github.io/goku/

Python 2,840 300 Updated Feb 19, 2025

Orthogonalize polygon in python by making all its angles 90 or 180 deg

Python 76 7 Updated Feb 9, 2021

Fully open reproduction of DeepSeek-R1

Python 24,478 2,253 Updated May 20, 2025

GeoPixel: A Pixel Grounding Large Multimodal Model for Remote Sensing is specifically developed for high-resolution remote sensing image analysis, offering advanced multi-target pixel grounding cap…

Python 82 7 Updated May 1, 2025

Official implementation of the WACV 2025 paper "Pix2Poly: A Sequence Prediction Method for End-to-end Polygonal Building Footprint Extraction from Remote Sensing Imagery".

Python 74 7 Updated Feb 20, 2025

[IEEE GRSS DFC 2025 Track II] BRIGHT: A globally distributed multimodal VHR dataset for all-weather disaster response

Python 135 18 Updated May 12, 2025

[ICLR 2025] Official Implementation of Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis

Python 308 9 Updated Mar 20, 2025

A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.

Python 2,596 137 Updated Jan 2, 2025