Liu et al., 2023
Moda: Mapping-once audio-driven portrait animation with dual attentions
- Document ID
- 12958348119986055500
- Author
- Liu Y
- Lin L
- Yu F
- Zhou C
- Li Y
- Publication year
- 2023
- Publication venue
- Proceedings of the IEEE/CVF International Conference on Computer Vision
Snippet
Audio-driven portrait animation aims to synthesize portrait videos that are conditioned by given audio. Animating high-fidelity and multimodal video portraits has a variety of applications. Previous methods have attempted to capture different motion modes and …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00268—Feature extraction; Face representation
- G06K9/00281—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00288—Classification, e.g. identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00335—Recognising movements or behaviour, e.g. recognition of gestures, dynamic facial expressions; Lip-reading
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00362—Recognising human body or animal bodies, e.g. vehicle occupant, pedestrian; Recognising body parts, e.g. hand
- G06K9/00369—Recognition of whole body, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
- G06T13/20—3D [Three Dimensional] animation
- G06T13/40—3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
Similar Documents
Publication | Title |
---|---|
Zhang et al. | Facial: Synthesizing dynamic talking face with implicit attribute learning |
Wang et al. | Seeing what you said: Talking face generation guided by a lip reading expert |
Lu et al. | Live speech portraits: real-time photorealistic talking-head animation |
Ye et al. | Geneface: Generalized and high-fidelity audio-driven 3d talking face synthesis |
Liang et al. | Expressive talking head generation with granular audio-visual control |
Guo et al. | Ad-nerf: Audio driven neural radiance fields for talking head synthesis |
Yi et al. | Audio-driven talking face video generation with learning-based personalized head pose |
Tian et al. | Emo: Emote portrait alive-generating expressive portrait videos with audio2video diffusion model under weak conditions |
Zhu et al. | Arbitrary talking face generation via attentional audio-visual coherence learning |
Liu et al. | Moda: Mapping-once audio-driven portrait animation with dual attentions |
Sinha et al. | Emotion-controllable generalized talking face generation |
Gururani et al. | Space: Speech-driven portrait animation with controllable expression |
Ye et al. | Geneface++: Generalized and stable real-time audio-driven 3d talking face generation |
Hajarolasvadi et al. | Generative adversarial networks in human emotion synthesis: A review |
Bigioi et al. | Speech driven video editing via an audio-conditioned diffusion model |
Zhou et al. | An image-based visual speech animation system |
Ma et al. | Talkclip: Talking head generation with text-guided expressive speaking styles |
Corona et al. | VLOGGER: Multimodal diffusion for embodied avatar synthesis |
Tan et al. | Style2talker: High-resolution talking head generation with emotion style and art style |
Liu et al. | Talking face generation via facial anatomy |
Hong et al. | Dagan++: Depth-aware generative adversarial network for talking head video generation |
Wang et al. | StyleTalk++: A Unified Framework for Controlling the Speaking Styles of Talking Heads |
Wang et al. | Talking faces: Audio-to-video face generation |
Ji et al. | Realtalk: Real-time and realistic audio-driven face generation with 3d facial prior-guided identity alignment network |
Liu et al. | OSM-Net: One-to-Many One-shot Talking Head Generation with Spontaneous Head Motions |