- Alibaba Group
- Hangzhou, China
- http://zihaolucky.github.io
Stars
FairFace: Face Attribute Dataset for Balanced Race, Gender, and Age
RewardBench: the first evaluation tool for reward models.
DeepRetrieval - 🔥 Training Search Agent with Retrieval Outcomes via Reinforcement Learning
Implementing DeepSeek R1's GRPO algorithm from scratch
Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥
ReCall: Learning to Reason with Tool Call for LLMs via Reinforcement Learning
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation
Official implementation of the paper "Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Video Content".
A fork to add multimodal model training to open-r1
Fully open reproduction of DeepSeek-R1
This repo contains the code for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks" [ICLR25]
LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning
Aligning LMMs with Factually Augmented RLHF
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance
GPT4V-level open-source multi-modal model based on Llama3-8B
pdf-translator translates English PDF files into Japanese, preserving the original layout.
Open-Sora: Democratizing Efficient Video Production for All
SGLang is a fast serving framework for large language models and vision language models.
A Python package for causal inference in quasi-experimental settings
Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent applications built on Langchain and language models such as ChatGLM, Qwen, and Llama | Langchain-Chatchat (formerly langchain-ChatGLM), local knowledge based LLM (like ChatGLM, Qwen and…
This directory includes a giant Japanese-English subtitle corpus. The raw data comes from Stanford's JESC project.
ccks2022 task9 subtask2: identical product recognition
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Large Language Model Text Generation Inference