Stars
A generative speech model for daily dialogue.
Real-time and accurate open-vocabulary end-to-end object detection
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
YOLOv6: a single-stage object detection framework dedicated to industrial applications.
Chinese version of CLIP, which supports Chinese cross-modal retrieval and representation generation.
BoxMOT: pluggable SOTA tracking modules for segmentation, object detection and pose estimation models
Fine-tune Segment-Anything Model with Lightning Fabric.
Simple Finetuning Starter Code for Segment Anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
Pytorch-Image-Classification
(TPAMI 2021) iOD: Incremental Object Detection via Meta-Learning
Open-source code for Generic Grouping Network (GGN, CVPR 2022)
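《代码随想录》 LeetCode practice guide: a recommended solving order for 200 classic problems, 600,000 characters of detailed illustrated explanations, video walkthroughs of the difficult points, 50+ mind maps, and solutions in C++, Java, Python, Go, JavaScript, and other languages, so that learning algorithms is no longer confusing! 🔥🔥 Take a look and you will wish you had found it sooner! 🚀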
BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training
SimMatch: Semi-supervised Learning with Similarity Matching
GroupSoftmax cross entropy loss function for training with multiple different benchmark datasets
[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.
Source Code of our CVPR2021 paper "Rethinking BiSeNet For Real-time Semantic Segmentation"
TensorFlow Implementation for Computing a Semantically Segmented Bird's Eye View (BEV) Image Given the Images of Multiple Vehicle-Mounted Cameras.
Lift, Splat, Shoot: Encoding Images from Arbitrary Camera Rigs by Implicitly Unprojecting to 3D (ECCV 2020)
E2EC: An End-to-End Contour-based Method for High-Quality High-Speed Instance Segmentation
[ECCV 2020] Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution
Ranked 1st on the SemanticKITTI semantic segmentation leaderboard (both single-scan and multi-scan) (Nov. 2020) (CVPR 2021 Oral)