Stars
[ICLR 2025] Understanding and Enhancing Safety Mechanisms of LLMs via Safety-Specific Neuron
(ICLR 2023 Spotlight) MPCFormer: fast, performant, and private transformer inference with MPC
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
A reading list for large model safety, security, and privacy (including Awesome LLM Security, Safety, etc.).
Code and data for the ACL 2024 paper 'Cross-Modal Projection in Multimodal LLMs Doesn't Really Project Visual Attributes to Textual Space'
A curated list of resources dedicated to the safety of Large Vision-Language Models. This repository aligns with our survey titled A Survey of Safety on Large Vision-Language Models: Attacks, Defen…
[CVPR 2025] Official implementation for "Steering Away from Harm: An Adaptive Approach to Defending Vision Language Model Against Jailbreaks"
Efficient Multimodal Large Language Models: A Survey
Research and materials on hardware implementations of the Transformer model
The official implementation of the paper "RobustKV: Defending Large Language Models against Jailbreak Attacks via KV Eviction"
InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management (OSDI'24)
Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models (a rough routing sketch follows this list)
🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"
[ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference
Official implementation of "When Adversarial Training Meets Vision Transformers: Recipes from Training to Architecture" published at NeurIPS 2022.
MoBA: Mixture of Block Attention for Long-Context LLMs
GMoE could be the next backbone model for many kinds of generalization tasks.
Machine Learning Engineering Open Book
A family of efficient edge language models in the 100M to 1B parameter range.
OLMoE: Open Mixture-of-Experts Language Models
[ICLR 2025] Mixture Compressor for Mixture-of-Experts LLMs Gains More
Fira: Can We Achieve Full-rank Training of LLMs Under Low-rank Constraint?
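As referenced from the Chain of Experts entry above, here is a minimal sketch of top-k Mixture-of-Experts routing with an optional chained second pass, so that experts in later passes see earlier experts' outputs. All names here (SimpleMoE, chain_steps, etc.) are placeholders of mine; this is an illustration of the general MoE idea, not the CoE repository's implementation.

```python
# Minimal top-k MoE routing sketch with chained passes (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleMoE(nn.Module):
    def __init__(self, d_model=64, num_experts=4, top_k=2, chain_steps=2):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(num_experts)
        )
        self.top_k = top_k
        self.chain_steps = chain_steps  # >1 lets later passes condition on earlier experts' outputs

    def _route_once(self, x):
        # x: (tokens, d_model). Pick top-k experts per token and mix their outputs.
        logits = self.router(x)                          # (tokens, num_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)   # (tokens, top_k)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

    def forward(self, x):
        # Each pass re-routes the residual-updated hidden state, a crude stand-in
        # for "communication between experts" across sequential expert steps.
        for _ in range(self.chain_steps):
            x = x + self._route_once(x)
        return x

tokens = torch.randn(8, 64)
print(SimpleMoE()(tokens).shape)  # torch.Size([8, 64])
```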