Stars
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)
Making large AI models cheaper, faster and more accessible
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs and parameter-efficient methods (e.g., LoRA, P-tuning) together for easy use. We welcome open-source enthusiasts…
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
High-speed download of LLaMA, Facebook's 65B parameter GPT model
AI agent stdlib that works with any LLM and TypeScript AI SDK.
This repo includes ChatGPT prompt curation to use ChatGPT and other LLM tools better.
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
GLIDE: a diffusion-based text-conditional image synthesis model
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in PyTorch
Guide to using pre-trained large language models of source code
Full description can be found here: https://discuss.huggingface.co/t/pretrain-gpt-neo-for-open-source-github-copilot-model/7678?u=ncoop57
Easily compute clip embeddings and build a clip retrieval system with them
The Audio Set Ontology aims to provide a comprehensive set of categories to describe sound events.
billjie1 / Chinese-CLIP
Forked from OFA-Sys/Chinese-CLIP
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Implementation of Perceiver, General Perception with Iterative Attention, in PyTorch
Code for SIMMC 2.0: A Task-oriented Dialog Dataset for Immersive Multimodal Conversations
DALL·E Mini - Generate images from a text prompt
Examples of how to create colorful, annotated equations in LaTeX using TikZ.