zhangzw12319 / Starred · GitHub
🎯
Focusing
  • CSE Dept, SJTU
  • Shanghai, China

Highlights

  • Pro


This repository provides a Python script to fetch and summarize research papers from arXiv using the free Gemini API

Python 195 25 Updated Mar 5, 2025

[RSS 2025] Learning to Act Anywhere with Task-centric Latent Actions

Python 246 5 Updated May 16, 2025

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 14,060 986 Updated May 17, 2025

[CVPR 2025 Highlight] Video Depth Anything: Consistent Depth Estimation for Super-Long Videos

Python 957 67 Updated Apr 25, 2025

The Simulation Framework from AgiBot

Python 138 9 Updated May 7, 2025

Tracking Any Point (TAP)

Jupyter Notebook 1,519 141 Updated May 17, 2025

3D Occupancy Prediction Benchmark in Autonomous Driving

Python 356 22 Updated May 27, 2024

Nexus: Decoupled Diffusion Sparks Adaptive Scene Generation

Python 37 4 Updated May 15, 2025

LeetCode 101: A LeetCode Problem-Solving Guide

9,411 1,231 Updated Dec 8, 2024
Python 5 1 Updated May 8, 2025

OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving

Python 174 9 Updated May 31, 2024

[ICLR24] Official implementation of the paper “MagicDrive: Street View Generation with Diverse 3D Geometry Control”

Python 965 48 Updated Apr 21, 2025

Official implementation of the paper “MagicDrive3D: Controllable 3D Generation for Any-View Rendering in Street Scenes”

284 4 Updated Jun 8, 2024

🖱️ A pure Windows right-click context menu manager

C# 14,560 691 Updated Aug 17, 2024

STI-Bench: Are MLLMs Ready for Precise Spatial-Temporal World Understanding?

Python 20 1 Updated May 9, 2025
Python 16 1 Updated Jun 1, 2022

Video-R1: Reinforcing Video Reasoning in MLLMs [🔥the first paper to explore R1 for video]

Python 520 25 Updated May 16, 2025

[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts

Python 321 23 Updated Jul 17, 2024
Python 56 1 Updated Apr 17, 2025

GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors

Python 277 8 Updated Apr 28, 2025

[CVPR'25 Oral] MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision

Python 960 55 Updated Mar 23, 2025

[CVPR 2025] UniScene: Unified Occupancy-centric Driving Scene Generation

Python 381 10 Updated Apr 24, 2025
Python 987 51 Updated May 14, 2025

RoboDual: Dual-System for Robotic Manipulation

Python 76 2 Updated Apr 28, 2025

Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 10,422 749 Updated May 15, 2025

[CVPR 2025] Test-Time Visual In-Context Tuning

23 1 Updated Mar 28, 2025

Modified 3D Gaussian rasterizer for latentSplat: Autoencoding Variational Gaussians for Fast Generalizable 3D Reconstruction

Cuda 52 6 Updated Apr 10, 2024