LiheYoung (Lihe Yang) / Starred · GitHub

State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!

Jupyter Notebook 1,057 46 Updated May 9, 2025

[CVPR'25 Highlight] Official repository of Sonata: Self-Supervised Learning of Reliable Point Representations

Python 300 7 Updated May 7, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 2,343 167 Updated May 13, 2025

A curated list of awesome papers on visual reconstructions from brain activity.

24 Updated Dec 31, 2024

Zero-Shot Monocular Depth Completion with Guided Diffusion

Python 153 6 Updated Dec 20, 2024

[CVPR 2025] Prompt Depth Anything

Python 779 43 Updated Mar 4, 2025

[CVPR 2025] Video Depth without Video Models

Python 521 17 Updated Mar 18, 2025

[NeurIPS 2024] official code release for our paper "Revisiting the Integration of Convolution and Attention for Vision Backbone".

Python 39 3 Updated Jan 21, 2025

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,904 132 Updated Oct 30, 2024

[TPAMI 2025] UniMatch V2: Pushing the Limit of Semi-Supervised Semantic Segmentation

Python 132 10 Updated Jan 8, 2025

[CVPR 2024 Extension] 160K volumes (42M slices) datasets, new segmentation datasets, 31M-1.2B pre-trained models, various pre-training recipes, 50+ downstream tasks implementation

Python 159 11 Updated Mar 17, 2025

[CVPR 2025 Highlight] DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos

Python 1,300 74 Updated Apr 18, 2025

Code for robust monocular depth estimation described in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, TPAMI 2022"

Python 4,890 669 Updated Aug 23, 2024

Eagle Family: Exploring Model Designs, Data Recipes and Training Strategies for Frontier-Class Multimodal LLMs

Python 771 41 Updated Apr 27, 2025

Official repository for "AM-RADIO: Reduce All Domains Into One"

Python 1,150 44 Updated Apr 24, 2025

An open source implementation of CLIP.

Python 11,726 1,102 Updated Apr 23, 2025

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Python 3,826 565 Updated Apr 24, 2024

Upgraded repo includes more capabilities, converted the cmd .py scripts to function more intuitively, added 147 different depth output colour map methods, introduced batch image as well as video pr…

Python 104 7 Updated Apr 8, 2025

VLM Evaluation: Benchmark for VLMs, spanning text generation tasks from VQA to Captioning

Python 110 12 Updated Sep 17, 2024

A flexible and efficient codebase for training visually-conditioned language models (VLMs)

Python 676 474 Updated Jul 4, 2024

[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation

Python 5,471 498 Updated Jan 22, 2025

Awesome_Multimodel is a curated GitHub repository that provides a comprehensive collection of resources for Multimodal Large Language Models (MLLM). It covers datasets, tuning techniques, in-contex…

323 22 Updated Mar 19, 2025

🌊 Images to → 3D Parallax effect video. A free and open source ImmersityAI alternative

Python 752 58 Updated May 10, 2025

[CVPR 2024] Probing the 3D Awareness of Visual Foundation Models

Python 308 15 Updated Jul 9, 2024

Muggled DPT: Depth estimation without the magic

Python 94 5 Updated Apr 25, 2025

[CVPR 2024] VoCo: A Simple-yet-Effective Volume Contrastive Learning Framework for 3D Medical Image Analysis

Python 183 16 Updated Dec 15, 2024

[CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

Python 602 25 Updated Oct 25, 2024

[IEEE TKDE] Open-Domain Semi-Supervised Learning via Glocal Cluster Structure Exploitation

Python 8 Updated Feb 26, 2024

(ECCV 2024) Code for V-IRL: Grounding Virtual Intelligence in Real Life

Python 348 15 Updated Dec 2, 2024