Starred repositories
Amodal-Instance-Segmentation-through-KINS-Dataset
A protocol for real-time transfer and visualization of autonomy data
[ECCV 2024 Oral] DriveLM: Driving with Graph Visual Question Answering
This repository offers a comprehensive collection of tutorials on state-of-the-art computer vision models and techniques. Explore everything from foundational architectures like ResNet to cutting-e…
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
Deformable DETR: Deformable Transformers for End-to-End Object Detection.
End-to-End Object Detection with Transformers
[T-PAMI-2024] Transformer-Based Visual Segmentation: A Survey
A simple, fully convolutional model for real-time instance segmentation.
Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation, NeurIPS 2021 Spotlight
[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
[CVPR 2023] OneFormer: One Transformer to Rule Universal Image Segmentation
Implementation of "TrackFormer: Multi-Object Tracking with Transformers". [Conference on Computer Vision and Pattern Recognition (CVPR), 2022]
YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/
Pointcept: a codebase for point cloud perception research. Latest works: Sonata (CVPR'25 Highlight), PTv3 (CVPR'24 Oral), PPT (CVPR'24), MSC (CVPR'23)
TrackR-CNN baseline method for Multi-Object Tracking and Segmentation (MOTS)
Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
[CVPR 2024 Oral] Official repository of FMA-Net
Vitis AI is Xilinx’s development stack for AI inference on Xilinx hardware platforms, including both edge devices and Alveo cards.
StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation
Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes
Methods for creating saliency maps for computer vision models.
Lime: Explaining the predictions of any machine learning classifier
A game theoretic approach to explain the output of any machine learning model.
OpenVINO™ is an open source toolkit for optimizing and deploying AI inference
The LRP Toolbox provides simple and accessible stand-alone implementations of LRP for artificial neural networks supporting Matlab and Python. The Toolbox realizes LRP functionality for the Caffe D…