Stars
This repository contains the baseline and example submission for the FEVER 8 Shared Task. The baseline is a computationally optimized version of the HerO system (https://github.com/ssu-humane/HerO)…
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Unofficial implementation of Titans, SOTA memory for transformers, in PyTorch
Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
A curated list of awesome papers related to pre-trained models for information retrieval (a.k.a., pretraining for IR).
LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning
The code for HerO: a fact-checking pipeline based on open LLMs (the runner-up in AVeriTeC)
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieval Results in RAG Systems (WWW 2025)
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization
Official inference framework for 1-bit LLMs
Awesome-LLM-RAG: a curated list of advanced retrieval augmented generation (RAG) in Large Language Models
A curated list of retrieval-augmented generation (RAG) in large language models
Links to conference/journal publications in automated fact-checking (resources for the TACL22/EMNLP23 paper).
A list of open LLMs available for commercial use.
MINT-1T: A one trillion token multimodal interleaved dataset.
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
A large-scale information-rich web dataset, featuring millions of real clicked query-document labels
A curated list of awesome instruction tuning datasets, models, papers and repositories.
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
A Korean-language tutorial based on the official LangChain Document, Cookbook, and other practical examples. Through this tutorial, you can learn how to use LangChain more easily and effectively.
Dataset and Code for Multimodal Fact Checking and Explanation Generation (Mocheg)