-
lectures Public
Forked from gpu-mode/lectures
Material for gpu-mode lectures
Jupyter Notebook Apache License 2.0 Updated Dec 3, 2024
-
mteb Public
Forked from embeddings-benchmark/mteb
MTEB: Massive Text Embedding Benchmark
Jupyter Notebook Apache License 2.0 Updated Sep 20, 2024
-
FlagEmbedding Public
Forked from FlagOpen/FlagEmbedding
Retrieval and Retrieval-augmented LLMs
Python MIT License Updated Sep 10, 2024
-
composable_kernel Public
Forked from ROCm/composable_kernel
Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
C++ Other Updated Sep 9, 2024
-
rocprofiler Public
Forked from ROCm/rocprofiler
ROC profiler library. Profiling with perf-counters and derived metrics.
C MIT License Updated Sep 6, 2024
-
llm-awq Public
Forked from mit-han-lab/llm-awq
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Python MIT License Updated Jul 16, 2024
-
How_to_optimize_in_GPU Public
Forked from Liu-xiandong/How_to_optimize_in_GPU
This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, s…
Cuda Apache License 2.0 Updated Jul 29, 2023
-
YHs_Sample Public
Forked from Yinghan-Li/YHs_Sample
Yinghan's Code Sample
Cuda GNU General Public License v3.0 Updated Jul 25, 2022