Stars
A set of examples around PyTorch in Vision, Text, Reinforcement Learning, etc.
Code for the paper Hybrid Spectrogram and Waveform Source Separation
SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, cross-DC ac…
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
libass is a portable subtitle renderer for the ASS/SSA (Advanced Substation Alpha/Substation Alpha) subtitle format.
Font files available from Google Fonts, and a public issue tracker for all things Google Fonts
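Fengshenbang-LM (封神榜大模型) is an open-source large-model ecosystem led by the Cognitive Computing and Natural Language Research Center of the IDEA Research Institute, serving as infrastructure for Chinese AIGC and cognitive intelligence.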
TensorFlow implementation of "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"
I decided to sync this repo with self-critical.pytorch. (The old master is archived in the old master branch.)
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
LAVIS - A One-stop Library for Language-Vision Intelligence
Official implementation code of the paper <AnyText: Multilingual Visual Text Generation And Editing>
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Janus-Series: Unified Multimodal Understanding and Generation Models
Docker container for the WPS Office Suite
Text Normalization & Inverse Text Normalization
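YOLOv5 DeepSort inference: vehicle and pedestrian tracking and counting with YOLOv5 + DeepSort; the code is wrapped in a Detector class so it is easier to embed in your own projects.
Object detection and tracking with the latest version of YOLOv5 + DeepSort; displays object classes and supports version 5.0, which can be trained on your own dataset.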
Object tracking implemented with YOLOv4, DeepSort, and TensorFlow.
Multi-object tracking system based on YOLOv8 + UCMCTrack/DeepSort + an attention mechanism, v1.0.3
Open-Unmix - Music Source Separation for PyTorch
[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
A generative speech model for daily dialogue.