Stars
OS-ATLAS: A Foundation Action Model For Generalist GUI Agents
Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities
[ICLR 2025] Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
Official implementation of Dynamic Frame Avatar with a Non-autoregressive Diffusion Framework for Talking Head Video Generation
[ICLR'25 Oral] UGround: Universal GUI Visual Grounding for GUI Agents
TPI-LLM: Serving 70b-scale LLMs Efficiently on Low-resource Edge Devices
Official Repository of VideoLLaMB: Long Video Understanding with Recurrent Memory Bridges
The model, data and code for the visual GUI Agent SeeClick
ScreenAgent: A Computer Control Agent Driven by Visual Language Large Model (IJCAI-24)
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Code to accompany "A Method for Animating Children's Drawings of the Human Figure"