peakji (Yichao 'Peak' Ji) / Starred · GitHub
🔜
Making progress

Highlights

  • Pro

Organizations

@Level @hyperonym


How to optimize some algorithms in CUDA.

Cuda 1,919 167 Updated Feb 23, 2025

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python 1,015 87 Updated Feb 25, 2025

Header-only C++/python library for fast approximate nearest neighbors

C++ 4,553 679 Updated Aug 11, 2024

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,444 243 Updated Feb 20, 2025

OLMoE: Open Mixture-of-Experts Language Models

Jupyter Notebook 624 50 Updated Dec 16, 2024

Developer-friendly, serverless vector database for AI applications. Easily add long-term memory to your LLM apps!

Python 5,702 398 Updated Feb 24, 2025

Everything we actually know about the Apple Neural Engine (ANE)

2,159 77 Updated Sep 23, 2024

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.

Python 1,964 245 Updated Jan 20, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 41,885 5,130 Updated Feb 25, 2025

Welcome to the Llama Cookbook! This is your go to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We also show you how to solve end to end problems using Llama mode…

Jupyter Notebook 16,301 2,348 Updated Feb 25, 2025

Minimalist ML framework for Rust

Rust 16,660 1,039 Updated Feb 22, 2025

Blazingly fast LLM inference.

Rust 5,087 358 Updated Feb 25, 2025

A natural language interface for computers

Python 58,447 5,002 Updated Jan 24, 2025

Build real-time multimodal AI applications 🤖🎙️📹

Python 5,156 634 Updated Feb 25, 2025

A framework for serving and evaluating LLM routers - save LLM costs without compromising quality

Python 3,655 271 Updated Aug 10, 2024

A generative speech model for daily dialogue.

Python 34,698 3,739 Updated Feb 18, 2025

Minimal container for Chrome's headless shell, useful for automating / driving the web

Shell 522 65 Updated Jun 25, 2024

A collective list of free APIs

Python 328,388 34,817 Updated Oct 31, 2024

OpenGFW is a flexible, easy-to-use, open source implementation of GFW (Great Firewall of China) on Linux

Go 10,091 751 Updated Oct 28, 2024

📖 100 Go Mistakes and How to Avoid Them

Go 7,228 453 Updated Feb 4, 2025

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Python 9,421 896 Updated Jul 1, 2024

Detect file content types with deep learning

Rust 8,434 436 Updated Feb 24, 2025

Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 15,645 1,085 Updated Feb 20, 2025

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 40,714 5,461 Updated Feb 23, 2025

Make images smaller using best-in-class codecs, right in the browser.

TypeScript 22,510 1,604 Updated Nov 26, 2024

leaked prompts of GPTs

29,323 3,978 Updated Sep 27, 2024

A blazing fast inference solution for text embeddings models

Rust 3,202 218 Updated Feb 25, 2025

Retrieval and Retrieval-augmented LLMs

Python 8,656 627 Updated Feb 13, 2025

A tokenizer based on Unicode text segmentation (UAX #29), for Go. Split words, sentences and graphemes.

Go 54 3 Updated Sep 2, 2024