- San Francisco Bay Area
- mehmet.aktukmak@intel.com
Stars
List of papers related to neural network quantization in recent AI conferences and journals.
mlpack: a fast, header-only C++ machine learning library
💻📚💡 DoctorGPT provides advanced LLM prompting for PDFs and webpages.
Software to implement GoT with a Weaviate vectorized database
This repository contains the experimental PyTorch native float8 training UX
Turning float tensors to binary tensors according to IEEE-754 standard.
A framework for few-shot evaluation of language models.
A PyTorch quantization backend for optimum
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
Unofficial pytorch implementation of 'Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization' [Huang+, ICCV2017]
A high-throughput and memory-efficient inference and serving engine for LLMs