Stars
📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).
Anthropic's Interactive Prompt Engineering Tutorial
The official repo of the paper "MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly"
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…
Links to conference/journal publications in automated fact-checking (resources for the TACL22/EMNLP23 paper).
Reasoning in LLMs: Papers and Resources, including Chain-of-Thought, OpenAI o1, and DeepSeek-R1 🍓
🤗 smolagents: a barebones library for agents that think in code.
Tools for understanding how transformer predictions are built layer-by-layer
We collect papers about "large language models (LLMs) for table-related tasks", e.g., using LLMs for Table QA. A collection of "Table + LLM" papers.
A comprehensive paper list of Reasoning over Tables.
Robust recipes to align language models with human and AI preferences
WikiChat is an improved RAG. It stops the hallucination of large language models by retrieving data from a corpus.
A library for mechanistic interpretability of GPT-style language models
Stanford NLP Python library for understanding and improving PyTorch models via interventions
This repository collects all relevant resources about interpretability in LLMs
TextGrad: Automatic "Differentiation" via Text -- using large language models to backpropagate textual gradients.
A beautiful, simple, clean, and responsive Jekyll theme for academics
Attribute (or cite) statements generated by LLMs back to in-context information.
A curated list of Large Language Model (LLM) Interpretability resources.
🔥 Highlighting the top ML papers every week.
A high-throughput and memory-efficient inference and serving engine for LLMs
LLM Transparency Tool (LLM-TT), an open-source interactive toolkit for analyzing internal workings of Transformer-based language models. *Check out demo at* https://huggingface.co/spaces/facebook/l…
The papers are organized according to our survey: Evaluating Large Language Models: A Comprehensive Survey.
Stanford NLP Python library for Representation Finetuning (ReFT)
The official PyTorch implementation of Google's Gemma models
Must-read Papers on Knowledge Editing for Large Language Models.