The University of Hong Kong
- liheyoung.github.io
Stars
State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!
[CVPR'25 Highlight] Official repository of Sonata: Self-Supervised Learning of Reliable Point Representations
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
A curated list of awesome papers on visual reconstructions from brain activity.
Zero-Shot Monocular Depth Completion with Guided Diffusion
[CVPR 2025] Video Depth without Video Models
[NeurIPS 2024] official code release for our paper "Revisiting the Integration of Convolution and Attention for Vision Backbone".
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
[TPAMI 2025] UniMatch V2: Pushing the Limit of Semi-Supervised Semantic Segmentation
[CVPR 2024 Extension] 160K volumes (42M slices) datasets, new segmentation datasets, 31M-1.2B pre-trained models, various pre-training recipes, 50+ downstream tasks implementation
[CVPR 2025 Highlight] DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos
Code for robust monocular depth estimation described in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, TPAMI 2022"
Eagle Family: Exploring Model Designs, Data Recipes and Training Strategies for Frontier-Class Multimodal LLMs
Official repository for "AM-RADIO: Reduce All Domains Into One"
An open source implementation of CLIP.
DeepSeek-VL: Towards Real-World Vision-Language Understanding
Upgraded repo with more capabilities: the cmd .py scripts converted to function more intuitively, 147 different depth output colour map methods added, batch image as well as video pr…
VLM Evaluation: Benchmark for VLMs, spanning text generation tasks from VQA to Captioning
A flexible and efficient codebase for training visually-conditioned language models (VLMs)
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
Awesome_Multimodel is a curated GitHub repository that provides a comprehensive collection of resources for Multimodal Large Language Models (MLLM). It covers datasets, tuning techniques, in-contex…
🌊 Images to → 3D Parallax effect video. A free and open source ImmersityAI alternative
[CVPR 2024] Probing the 3D Awareness of Visual Foundation Models
Muggled DPT: Depth estimation without the magic
[CVPR 2024] VoCo: A Simple-yet-Effective Volume Contrastive Learning Framework for 3D Medical Image Analysis
[CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
[IEEE TKDE] Open-Domain Semi-Supervised Learning via Glocal Cluster Structure Exploitation
(ECCV 2024) Code for V-IRL: Grounding Virtual Intelligence in Real Life