Starred repositories
Recipes for scaling the inference-time compute of open models
Efficient Triton Kernels for LLM Training
Image transformations designed for Scene Text Recognition (STR) data augmentation. Published at ICCV 2021 Workshop on Interactive Labeling and Data Augmentation for Vision.
Image augmentation for machine learning experiments.
[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
[ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models
📖 A curated list of resources dedicated to hallucination in multimodal large language models (MLLMs).
Accelerating the development of large multimodal models (LMMs) with lmms-eval, a one-click evaluation module.
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
[ICLR 2024 🔥] Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
LLaVA-UHD v2: an MLLM Integrating High-Resolution Semantic Pyramid via Hierarchical Window Transformer
Recent LLM-based computer vision and related works. Comments and contributions welcome!
A general representation model across vision, audio, and language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
🔥🔥🔥 Latest papers, code, and datasets on Vid-LLMs.
Mora: More like Sora for Generalist Video Generation
Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
OpenMMLab YOLO series toolbox and benchmark. Implements RTMDet, RTMDet-Rotated, YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOX, PPYOLOE, etc.
YOLOX is a high-performance anchor-free YOLO detector that outperforms YOLOv3–v5, with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO support. Documentation: https://yolox.readthedocs.io/
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
[CVPR 2024 Highlight] Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models
[CVPR 2024] Generative Region-Language Pretraining for Open-Ended Object Detection
This project shares the technical principles behind large models, along with hands-on experience (LLM engineering and real-world LLM application deployment).
[ECCV 2024 Oral] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models
[ECCV 2024] Official GitHub repository for the paper "LingoQA: Visual Question Answering for Autonomous Driving"