Stars
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
A streamlined and customizable framework for efficient large model evaluation and performance benchmarking
A curated list of awesome papers on dataset distillation and related applications.
A programming framework for agentic AI 🤖 | PyPI: autogen-agentchat | Discord: https://aka.ms/autogen-discord | Office Hour: https://aka.ms/autogen-officehour
ZeroSearch: Incentivize the Search Capability of LLMs without Searching
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
AI Inference Operator for Kubernetes. The easiest way to serve ML models in production. Supports VLMs, LLMs, embeddings, and speech-to-text.
Official PyTorch implementation of DistiLLM: Towards Streamlined Distillation for Large Language Models (ICML 2024)
LangChain for Go, the easiest way to write LLM-based programs in Go
Holistic Evaluation of Language Models (HELM) is an open source Python framework created by the Center for Research on Foundation Models (CRFM) at Stanford for holistic, reproducible and transparen…
AutoMQ is a stateless/diskless Kafka on S3. 10x Cost-Effective. No Cross-AZ Traffic Cost. Autoscale in seconds. Single-digit ms latency. Multi-AZ Availability.
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
A generative AI extension for JupyterLab
NVIDIA GPU Operator creates, configures, and manages GPUs in Kubernetes
An open protocol enabling communication and interoperability between opaque agentic applications.
Democratizing Reinforcement Learning for LLMs
Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
Development repository for the Triton language and compiler
A Datacenter Scale Distributed Inference Serving Framework
AI fundamentals: GPU architecture, CUDA programming, and large model basics
Hyperlight is a lightweight Virtual Machine Manager (VMM) designed to be embedded within applications. It enables safe execution of untrusted code within micro virtual machines with very low latenc…
A Go implementation of the Model Context Protocol (MCP), enabling seamless integration between LLM applications and external data sources and tools.
Heterogeneous AI Computing Virtualization Middleware (Project under CNCF)
WasmEdge is a lightweight, high-performance, and extensible WebAssembly runtime for cloud native, edge, and decentralized applications. It powers serverless apps, embedded functions, microservices,…