Stars
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications made by Transformer-based networks.
The official repo for "Where do Large Vision-Language Models Look at when Answering Questions?"
[NAACL 2025 Oral] 🎉 From redundancy to relevance: Enhancing explainability in multimodal large language models
A toolbox for benchmarking trustworthiness of multimodal large language models (MultiTrust, NeurIPS 2024 Track Datasets and Benchmarks)
[ICCV 2021- Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-…
[ICLR 2025] See What You Are Told: Visual Attention Sink in Large Multimodal Models
Stanford NLP Python library for understanding and improving PyTorch models via interventions
Strong, open-source foundation models for image recognition.
Code for Reducing Hallucinations in Vision-Language Models via Latent Space Steering
[ECCV 2024] Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs
VIP cheatsheets for Stanford's CS 229 Machine Learning
A curated list of safety-related papers, articles, and resources focused on Large Language Models (LLMs). This repository aims to provide researchers, practitioners, and enthusiasts with insights i…
A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity.
✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models
ECSO (makes MLLMs safe with neither training nor any external models!) (https://arxiv.org/abs/2403.09572)
An up-to-date curated list of state-of-the-art research, papers, and resources on hallucinations in large vision-language models
An LLM-free Multi-dimensional Benchmark for Multi-modal Hallucination Evaluation
Recommends new arXiv papers of interest daily, based on your Zotero library.
Use PEFT or full-parameter training to run CPT/SFT/DPO/GRPO on 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, InternVL3, Llava, GLM4…
An RLHF Infrastructure for Vision-Language Models
pix2tex: Using a ViT to convert images of equations into LaTeX code (a minimal usage sketch follows this list).
Code for paper: Nullu: Mitigating Object Hallucinations in Large Vision-Language Models via HalluSpace Projection
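For the pix2tex entry above, here is a minimal usage sketch following the package's documented Python API; it assumes pix2tex is installed (`pip install pix2tex`), and the filename `equation.png` is a hypothetical local screenshot of a rendered equation.

```python
from PIL import Image
from pix2tex.cli import LatexOCR

# Hypothetical local screenshot of a rendered equation.
img = Image.open("equation.png")

# LatexOCR loads the pretrained ViT-encoder / Transformer-decoder checkpoint.
model = LatexOCR()

# Calling the model on a PIL image returns the predicted LaTeX source as a string.
print(model(img))
```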