- bigcode-evaluation-harness (Public)
  Forked from bigcode-project/bigcode-evaluation-harness
  A framework for the evaluation of autoregressive code generation language models.
  Python · Apache License 2.0 · Updated Sep 12, 2023
- codellama (Public)
  Forked from meta-llama/codellama
  Inference code for CodeLlama models.
  Python · Other · Updated Sep 12, 2023
- evalplus (Public)
  Forked from evalplus/evalplus
  EvalPlus for rigorous evaluation of LLM-synthesized code.
  Python · Apache License 2.0 · Updated Sep 2, 2023
- WizardLM (Public)
  Forked from nlpxucan/WizardLM
  Family of instruction-following LLMs powered by Evol-Instruct: WizardLM, WizardCoder, and WizardMath.
  Python · Updated Sep 1, 2023
- human-eval (Public)
  Forked from openai/human-eval
  Code for the paper "Evaluating Large Language Models Trained on Code".
  Python · MIT License · Updated Aug 21, 2023
- FastChat (Public)
  Forked from lm-sys/FastChat
  An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
  Python · Apache License 2.0 · Updated Aug 20, 2023
- LLMDrift (Public)
  Forked from lchen001/LLMDrift
  Materials for checking LLM drift.
  Jupyter Notebook · Apache License 2.0 · Updated Jul 19, 2023
- llm-humaneval-benchmarks (Public)
  Forked from my-other-github-account/llm-humaneval-benchmarks
  Jupyter Notebook · MIT License · Updated Jun 13, 2023
- coder_reviewer_reranking (Public)
  Forked from facebookresearch/coder_reviewer_reranking
  Official code release for the paper "Coder Reviewer Reranking for Code Generation".
  Python · Other · Updated Feb 14, 2023