Stars
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
PyTorch implementation of DoReMi, a method for optimizing the data mixture weights in language modeling datasets
Measuring Massive Multitask Language Understanding | ICLR 2021
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
Instruct-tune LLaMA on consumer hardware
LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training
A more efficient YOLOv5 with OneFlow backend 🎉🎉🎉
detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.
PyTorch implementation of MAE https://arxiv.org/abs/2111.06377
The implementation of “Gradient Harmonized Single-stage Detector” published at AAAI 2019.
You Only Look One-level Feature (YOLOF), CVPR2021, Detectron2
Boosting your Web Services of Deep Learning Applications.
[ICML2021] What Makes for End-to-End Object Detection
The official PyTorch implementation of paper BBN: Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Recognition
ICCV19: Official code of Human Mesh Recovery from Monocular Images via a Skeleton-disentangled Representation
This repo is implemented based on detectron2 and centernet
A PyTorch codebase for human parsing and vehicle parsing
PingoLH / CenterNet-HarDNet
Forked from xingyizhou/CenterNet
Object detection achieving 44.3 mAP / 45 fps on COCO dataset
Monocular, One-stage, Regression of Multiple 3D People and their 3D positions & trajectories in camera & global coordinates. ROMP [ICCV21], BEV [CVPR22], TRACE [CVPR2023]
The best way to write secure and reliable applications. Write nothing; deploy nowhere.
Supports DeepSORT and ByteTrack MOT (multi-object tracking) using YOLOv5 with C++
Converts MMDetection models to TensorRT; supports FP16, INT8, batch input, dynamic shape, etc.