godfather991 (Tongxin Xie) / Starred · GitHub

[MLSys'24] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

Cuda 308 26 Updated Jul 2, 2024

Code for the NeurIPS 2024 paper: QuaRot, end-to-end 4-bit inference of large language models.

Python 387 38 Updated Nov 26, 2024

[TMLR 2024] Efficient Large Language Models: A Survey

1,149 94 Updated Apr 1, 2025
Python 66 6 Updated Jun 24, 2024

Artifact for "Multi-Dimensional Vector ISA Extension for Mobile In-Cache Computing (HPCA 2025)" paper

HTML 2 1 Updated Apr 2, 2025
Python 3 Updated Apr 7, 2024

A PIM instrumentation, compilation, execution, simulation, and evaluation repository for BLIMP-style architectures.

C 18 5 Updated May 12, 2022
Rust 11 2 Updated May 13, 2025

Message passing ISA compiler for general GNN, and architecture simulation for graph tensor accelerator

Python 2 2 Updated Oct 17, 2024

Export WeChat chat history and generate a WeChat annual report! Record your 2023!

Python 151 15 Updated Jan 16, 2024

Run Mixtral-8x7B models in Colab or consumer desktops

Python 2,311 232 Updated Apr 8, 2024

UPMEM LLM Framework allows profiling PyTorch layers and functions and simulating those layers/functions with a given hardware profile.

Python 28 8 Updated Apr 29, 2025

Hands-on model tuning with TVM, with profiling on a Mac M1, an x86 CPU, and a GTX-1080 GPU.

Jupyter Notebook 48 9 Updated Jun 15, 2023

MLIR For Beginners tutorial

C++ 970 86 Updated Feb 7, 2025

DAMOV is a benchmark suite and a methodical framework targeting the study of data movement bottlenecks in modern applications. It is intended to study new architectures, such as near-data processin…

C++ 81 18 Updated Jul 27, 2023

Translates Python documentation into Chinese for convenient reference. In short, this is a place to collect Python docs and do our best to translate them into Chinese.

1,949 675 Updated May 17, 2024

Tips for Writing a Research Paper using LaTeX

TeX 3,466 388 Updated May 4, 2023

Inference code for Llama models

Python 58,229 9,767 Updated Jan 26, 2025

NeuPIMs: NPU-PIM Heterogeneous Acceleration for Batched LLM Inferencing

Jupyter Notebook 80 23 Updated Jun 19, 2024

MambaOut: Do We Really Need Mamba for Vision? (CVPR 2025)

Python 2,368 42 Updated Mar 9, 2025

Ramulator 2.0 is a modern, modular, extensible, and fast cycle-accurate DRAM simulator. It provides support for agile implementation and evaluation of new memory system designs (e.g., new DRAM stan…

C++ 332 80 Updated May 7, 2025

A Fast and Extensible DRAM Simulator, with built-in support for modeling many different DRAM technologies including DDRx, LPDDRx, GDDRx, WIOx, HBMx, and various academic proposals. Described in the…

C++ 633 214 Updated Aug 29, 2023

Training and serving large-scale neural networks with auto parallelization.

Python 3,131 359 Updated Dec 9, 2023

Processing-In-Memory (PIM) Simulator

C++ 163 57 Updated Dec 12, 2024

DRAMsim3: a Cycle-accurate, Thermal-Capable DRAM Simulator

C++ 368 156 Updated Aug 3, 2024
C++ 17 12 Updated Jun 1, 2023

Open-source Framework for HPCA2024 paper: Gemini: Mapping and Architecture Co-exploration for Large-scale DNN Chiplet Accelerators

C++ 80 14 Updated Apr 28, 2025