Stars
The official implementation of [CVPR 2025] "5%>100%: Breaking Performance Shackles of Full Fine-Tuning on Visual Recognition Tasks".
The last open-source version of Foxglove Studio
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
GaussianOcc: Fully Self-supervised and Efficient 3D Occupancy Estimation with Gaussian Splatting
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
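Introduction to Deep Learning 2: Build Your Own Framework - companion code for the book
Build a large language model from scratch with only basic Python; build GLM4, Llama3, and RWKV6 step by step from scratch to understand in depth how large models work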
Quantization of convolutional neural networks.
PyTorch implementation of https://arxiv.org/abs/1705.07115 loss weighting
PyTorch implementation of "Supervised Contrastive Learning" (and SimCLR incidentally)
[TPAMI 2024] Probabilistic Contrastive Learning for Long-Tailed Visual Recognition
An object-oriented algebraic modeling language in Python for structured optimization problems.
VarifocalNet: An IoU-aware Dense Object Detector
Code for CVPR-W 2020 paper "Hierarchical Image Classification using Entailment Cone Embeddings" https://arxiv.org/abs/2004.03459
An open-source RAG-based tool for chatting with your documents.
KITTI Object Visualization (bird's-eye view, volumetric LiDAR point cloud)
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
TOOD: Task-aligned One-stage Object Detection, ICCV2021 Oral
[ICML2021] What Makes for End-to-End Object Detection
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
OCR & Document Extraction using vision models
The Linux Kernel Module Programming Guide (updated for 5.0+ kernels)