Chenfeng1271 (Feng Chen) / Starred · GitHub
  • Ph.D. student @ University of Adelaide
  • Sydney, Australia

Highlights

  • Pro

Starred repositories

Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.

Jupyter Notebook 812 13 Updated May 15, 2025

Official repository of T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT

Python 282 15 Updated May 12, 2025

Let's make video diffusion practical!

Python 13,159 1,118 Updated May 4, 2025

Code for: "Long-Context Autoregressive Video Modeling with Next-Frame Prediction"

Python 201 7 Updated Apr 23, 2025

Think or Not Think: A Study of Explicit Thinking in Rule-Based Visual Reinforcement Fine-Tuning

Python 37 Updated Apr 30, 2025

A Self-Training Framework for Vision-Language Reasoning

Python 78 1 Updated Jan 23, 2025

MetaSpatial leverages reinforcement learning to enhance 3D spatial reasoning in vision-language models (VLMs), enabling more structured, realistic, and adaptive scene generation for applications in…

Python 118 5 Updated May 5, 2025

Envolving Temporal Reasoning Capability into LMMs via Temporal Consistent Reward

Python 35 3 Updated Mar 21, 2025

This is the first paper to explore how to effectively use RL for MLLMs and introduce Vision-R1, a reasoning MLLM that leverages cold-start initialization and RL training to incentivize reasoning ca…

Python 563 13 Updated May 7, 2025

Video-R1: Reinforcing Video Reasoning in MLLMs [🔥the first paper to explore R1 for video]

Python 520 25 Updated May 16, 2025

Large Language Model (LLM) Systems Paper List

BD16 1,224 69 Updated May 10, 2025

R1-like Video-LLM for Temporal Grounding

Python 89 Updated Apr 10, 2025

VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.

Python 3,238 263 Updated May 6, 2025
Python 9 Updated Apr 1, 2025

GRPO Algorithm for Llava Architecture (Based on Verl)

Python 16 Updated May 9, 2025

Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.

Python 1,747 194 Updated Jan 16, 2025

[LLaVA-Video-R1]✨First Adaptation of R1 to LLaVA-Video (2025-03-18)

Python 28 2 Updated May 9, 2025

GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models

Python 269 6 Updated Apr 11, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 49,036 5,970 Updated May 16, 2025

Understanding R1-Zero-Like Training: A Critical Perspective

Python 927 43 Updated Apr 15, 2025

R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization

Python 335 Updated Apr 30, 2025

The official implementation of "Neighboring Autoregressive Modeling for Efficient Visual Generation"

Python 48 Updated Apr 5, 2025
Python 29 2 Updated Mar 11, 2025

Collections of Papers and Projects for Multimodal Reasoning.

104 9 Updated Apr 25, 2025

Paper list for Efficient Reasoning.

435 14 Updated May 14, 2025

Autonomously train research-agent LLMs on custom data using reinforcement learning and self-verification.

Jupyter Notebook 618 51 Updated Mar 22, 2025

Sparrow: Data-Efficient Video-LLM with Text-to-Image Augmentation

Jupyter Notebook 29 Updated Mar 28, 2025

Official implementation of UnifiedReward & UnifiedReward-Think

Python 364 9 Updated May 15, 2025

🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation

Python 16,511 1,935 Updated May 16, 2025