Starred repositories
Official implementation of "Egocentric zone-aware action recognition across environments" (Pattern Recognition Letters 2025)
Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
🔥 Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos
Code Repository for Liquid Time-Constant Networks (LTCs)
Project for "HyperSeg: Towards Universal Visual Segmentation with Large Language Model".
[CVPR 2023] OneFormer: One Transformer to Rule Universal Image Segmentation
Combining Segment Anything (SAM) with Grounding DINO for zero-shot object detection and CLIPSeg for zero-shot segmentation
Code release for paper "You Only Segment Once: Towards Real-Time Panoptic Segmentation" [CVPR 2023]
CAVIS: Context-Aware Video Instance Segmentation
[ICLR 2025 oral] RMP-SAM: Towards Real-Time Multi-Purpose Segment Anything
FinMem: A Performance-Enhanced LLM Trading Agent with Layered Memory and Character Design
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models
A unified framework for 3D content generation.
[SGP 2025] OctFusion: Octree-based Diffusion Models for 3D Shape Generation
[NeurIPS 2024] Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image
Official implementation of "MeshDiffusion: Score-based Generative 3D Mesh Modeling" (ICLR 2023 Spotlight)
Text2Room generates textured 3D meshes from a given text prompt using 2D text-to-image models (ICCV 2023).
[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥
Official implementation of the paper "MotionAGFormer: Enhancing 3D Pose Estimation with a Transformer-GCNFormer Network" (WACV 2024).
[ECCV 2024] HandDGP: Camera-Space Hand Mesh Prediction with Differentiable Global Positioning
HandTailor: Towards High-Precision Monocular 3D Hand Recovery
(ICONIP 2020) MobileHand: Real-time 3D Hand Shape and Pose Estimation from Color Image
PyTorch reimplementation of minimal-hand (CVPR 2020)