Beijing Jiaotong University
- Beijing
- https://scholar.google.com/citations?user=po1KXtwAAAAJ&hl=en
- https://orcid.org/0000-0002-3743-7738
Highlights
- Pro
Lists (24)
Change Detection
ChatGPT
CLIP
Cloud Detection
Cloud Removal
Computer Vision
Consistency Models
Deepfake Detection
Diffusion
Face Generation
Face Recognition
Image Restoration
Linux
Machine Learning
Music Source Separation
Object Detection
Pansharpening
Point Cloud Completion
Reinforcement Learning
RLHF
Semantic Segmentation
Talking Face Generation
Toolbox
Video Prediction
Stars
Official implementations for paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models
[ICLR 2025] Animate-X: Universal Character Image Animation with Enhanced Motion Representation
UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
SkyReels-V2: Infinite-length Film Generative model
[ICCV 2025] Dynamic Dictionary Learning for Remote Sensing Image Segmentation
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
🤩 Easy-to-use global IM bot platform designed for the LLM era ⚡️ Bots for QQ / QQ Channel / Discord / WeChat (WeCom, personal WeChat) / Telegram / Feishu / DingTalk / Slack 🧩 Integrated with ChatGPT, DeepSee…
TensorFlow code implementation of "MTCAE-DFER: Multi-Task Cascaded Autoencoder for Dynamic Facial Expression Recognition"
HuBERT content encoders for: A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion
[CVPR 2025] EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
Official implementation of Next Block Prediction: Video Generation via Semi-Autoregressive Modeling
Official implementation of Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation
[ICLR 2025] Autoregressive Video Generation without Vector Quantization
MAGI-1: Autoregressive Video Generation at Scale
[ECCVW/AIM 2024] MM2Latent: Text-to-facial image generation and editing in GANs with multimodal assistance
A Model Context Protocol (MCP) server that provides access to the DBLP computer science bibliography database for Large Language Models.
[CVPR 2022] FaceFormer: Speech-Driven 3D Facial Animation with Transformers
Source code for the CVPR'20 paper "Blindly Assess Image Quality in the Wild Guided by A Self-Adaptive Hyper Network"
FairFace: Face Attribute Dataset for Balanced Race, Gender, and Age
[CVPR 2024 Highlight] PLACE: Adaptive Layout-Semantic Fusion for Semantic Image Synthesis
V-Express aims to generate a talking head video under the control of a reference image, an audio clip, and a sequence of V-Kps images.
[CVPR 2025] High-Fidelity Relightable Monocular Portrait Animation with Lighting-Controllable Video Diffusion Model