Stars
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Get your documents ready for gen AI
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
Robust Speech Recognition via Large-Scale Weak Supervision
A high-throughput and memory-efficient inference and serving engine for LLMs
Collective communications library with various primitives for multi-machine training.
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
✨✨Latest Advances on Multimodal Large Language Models
The official Python library for the OpenAI API
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
C++ IPC Library: High-performance inter-process communication using shared memory on Linux/Windows.
a clean C library for processing UTF-8 Unicode data
📊 A simple command-line utility for querying and monitoring GPU status
Unsupervised text tokenizer for Neural Network-based text generation.
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
RE2 is a fast, safe, thread-friendly alternative to backtracking regular expression engines like those used in PCRE, Perl, and Python. It is a C++ library.
An implementation of C++17 std::filesystem for C++11/C++14/C++17/C++20 on Windows, macOS, Linux and FreeBSD.
Noise suppression using deep filtering
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
Data manipulation and transformation for audio signal processing, powered by PyTorch
Deezer source separation library including pretrained models.
kaldi-asr/kaldi is the official location of the Kaldi project.
Accessible large language models via k-bit quantization for PyTorch.
A library for efficient similarity search and clustering of dense vectors.
pytextclassifier is a toolkit for text classification. It implements LR, Xgboost, TextCNN, FastText, TextRNN, BERT, and other classification models, ready to use out of the box.