Stars
[NeurIPS 2024] The official implementation of "Kangaroo: Lossless Self-Speculative Decoding for Accelerating LLMs via Double Early Exiting"
A high-throughput and memory-efficient inference and serving engine for LLMs
Quantized Attention achieves speedups of 2-3x and 3-5x compared to FlashAttention and xformers, respectively, without losing end-to-end metrics across language, image, and video models.
kevinz8866 / MobileFormer
Forked from AAboys/MobileFormer
Code and models for mobile-former
A PyTorch implementation for exploring deep and shallow knowledge distillation (KD) experiments with flexibility
A playbook for systematically maximizing the performance of deep learning models.
Use a pretrained network to classify birds on a nature came
solo-learn: a library of self-supervised methods for visual representation learning powered by PyTorch Lightning
A PyTorch implementation of SimCLR based on ICML 2020 paper "A Simple Framework for Contrastive Learning of Visual Representations"
Efficient Spiking Neural Network framework, built on top of PyTorch for GPU acceleration
This repository contains implementations and illustrative code to accompany DeepMind publications
All Algorithms implemented in Python
Learning both Weights and Connections for Efficient Neural Networks https://arxiv.org/abs/1506.02626
Kalman Filter book using Jupyter Notebook. Focuses on building intuition and experience, not formal proofs. Includes Kalman filters, extended Kalman filters, unscented Kalman filters, particle filte…
Some Python Implementations of the Kalman Filter
Kalman Filter, Smoother, and EM Algorithm for Python