mashijie1028 (Shijie Ma) / Starred · GitHub

Shijie Ma (mashijie1028)

👨‍💻 working

Ph.D. student @ Institute of Automation, Chinese Academy of Sciences (CASIA). Previously B.E. @ Tsinghua University.

25 followers · 55 following

Institute of Automation, CAS
Beijing
UTC +08:00
https://mashijie1028.github.io
https://orcid.org/0009-0005-1131-5686
https://scholar.google.com/citations?user=pLVzF3cAAAAJ&hl=en

Starred repositories

EvolvingLMMs-Lab / LongVA

Long Context Transfer from Language to Vision

Python 374 18 Updated Mar 18, 2025

QwenLM / Qwen2-Audio

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 1,718 129 Updated Apr 21, 2025

QwenLM / Qwen3

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 20,732 1,358 Updated May 9, 2025

Breakthrough / PySceneDetect

🎥 Python and OpenCV-based scene cut/transition detection program & library.

Python 3,858 435 Updated May 3, 2025

NVlabs / FRAG

Python 10 Updated Apr 25, 2025

SandAI-org / MAGI-1

MAGI-1: Autoregressive Video Generation at Scale

Python 2,968 158 Updated May 8, 2025

LongVideoHaystack / TStar

Python 43 1 Updated Apr 5, 2025

yunlong10 / Awesome-LLMs-for-Video-Understanding

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

2,266 102 Updated May 4, 2025

ttengwang / Awesome_Long_Form_Video_Understanding

Awesome papers & datasets specifically focused on long-term videos.

270 12 Updated Nov 15, 2024

QwenLM / Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 5,876 442 Updated Aug 7, 2024

QwenLM / Qwen2.5-VL

Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 10,300 731 Updated May 4, 2025

sihyun-yu / REPA

[ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think

Python 1,030 45 Updated Mar 16, 2025

PanguIR / MRAGSurvey

A Survey of Multimodal Retrieval-Augmented Generation

18 1 Updated Apr 17, 2025

mashijie1028 / ProtoGCD

Official code for TPAMI 2025 paper "ProtoGCD: Unified and Unbiased Prototype Learning for Generalized Category Discovery"

Python 19 Updated Apr 9, 2025

LuciusLan / Visual-RAG

Repository for our paper Visual-RAG: Benchmarking Text-to-Image Retrieval Augmented Generation for Visual Knowledge Intensive Queries

3 Updated Apr 15, 2025

aimagelab / ReflectiVA

[CVPR 2025] Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering

Python 28 Updated Mar 31, 2025

ChaofanTao / Autoregressive-Models-in-Vision-Survey

[TMLR 2025🔥] A survey for the autoregressive models in vision.

566 15 Updated Apr 28, 2025

MME-Benchmarks / MME-Unify

MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models

Python 34 2 Updated Apr 10, 2025

altndrr / lmms-owc

Code implementation of our paper: On Large Multimodal Models as Open-World Image Classifiers

Python 18 Updated Mar 26, 2025

mshumer / OpenDeepResearcher

Jupyter Notebook 2,541 346 Updated May 2, 2025

EvolvingLMMs-Lab / multimodal-search-r1

Python 95 7 Updated Apr 8, 2025

Code-kunkun / LamRA

[CVPR 2025] LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant

Python 107 4 Updated Mar 18, 2025

hiyouga / EasyR1

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 2,316 164 Updated May 9, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 24,346 2,237 Updated May 9, 2025

zhengxuJosh / Awesome-RAG-Vision

Awesome-RAG-Vision: a curated list of advanced retrieval augmented generation (RAG) for Computer Vision

147 3 Updated Apr 30, 2025

hymie122 / RAG-Survey

Collecting awesome papers of RAG for AIGC. We propose a taxonomy of RAG foundations, enhancements, and applications in paper "Retrieval-Augmented Generation for AI-Generated Content: A Survey".

1,617 109 Updated Aug 20, 2024

coree / awesome-rag

A curated list of retrieval-augmented generation (RAG) in large language models

269 20 Updated Feb 14, 2025

frutik / Awesome-RAG

332 25 Updated Sep 9, 2024

baaivision / Emu

Emu Series: Generative Multimodal Models from BAAI

Python 1,719 85 Updated Sep 27, 2024

llm-lab-org / Multimodal-RAG-Survey

A Survey on Multimodal Retrieval-Augmented Generation

165 8 Updated Apr 19, 2025

Starred topics

Awesome Lists

video-captioning