Liu et al., 2023 - Google Patents

MODA: Mapping-once audio-driven portrait animation with dual attentions

Document ID: 12958348119986055500
Authors: Liu Y, Lin L, Yu F, Zhou C, Li Y
Publication year: 2023
Publication venue: Proceedings of the IEEE/CVF International Conference on Computer Vision

Snippet

Audio-driven portrait animation aims to synthesize portrait videos that are conditioned by given audio. Animating high-fidelity and multimodal video portraits has a variety of applications. Previous methods have attempted to capture different motion modes and …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00221 Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
    • G06K9/00268 Feature extraction; Face representation
    • G06K9/00281 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00221 Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
    • G06K9/00288 Classification, e.g. identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6217 Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6267 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00335 Recognising movements or behaviour, e.g. recognition of gestures, dynamic facial expressions; Lip-reading
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00362 Recognising human body or animal bodies, e.g. vehicle occupant, pedestrian; Recognising body parts, e.g. hand
    • G06K9/00369 Recognition of whole body, e.g. static pedestrian or occupant recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36 Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/46 Extraction of features or characteristics of the image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 Animation
    • G06T13/20 3D [Three Dimensional] animation
    • G06T13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings

Similar Documents

Publication Title
Zhang et al. FACIAL: Synthesizing dynamic talking face with implicit attribute learning
Wang et al. Seeing what you said: Talking face generation guided by a lip reading expert
Lu et al. Live speech portraits: real-time photorealistic talking-head animation
Ye et al. GeneFace: Generalized and high-fidelity audio-driven 3D talking face synthesis
Liang et al. Expressive talking head generation with granular audio-visual control
Guo et al. AD-NeRF: Audio driven neural radiance fields for talking head synthesis
Yi et al. Audio-driven talking face video generation with learning-based personalized head pose
Tian et al. EMO: Emote portrait alive - generating expressive portrait videos with audio2video diffusion model under weak conditions
Zhu et al. Arbitrary talking face generation via attentional audio-visual coherence learning
Liu et al. MODA: Mapping-once audio-driven portrait animation with dual attentions
Sinha et al. Emotion-controllable generalized talking face generation
Gururani et al. SPACE: Speech-driven portrait animation with controllable expression
Ye et al. GeneFace++: Generalized and stable real-time audio-driven 3D talking face generation
Hajarolasvadi et al. Generative adversarial networks in human emotion synthesis: A review
Bigioi et al. Speech driven video editing via an audio-conditioned diffusion model
Zhou et al. An image-based visual speech animation system
Ma et al. TalkCLIP: Talking head generation with text-guided expressive speaking styles
Corona et al. VLOGGER: Multimodal diffusion for embodied avatar synthesis
Tan et al. Style2Talker: High-resolution talking head generation with emotion style and art style
Liu et al. Talking face generation via facial anatomy
Hong et al. DaGAN++: Depth-aware generative adversarial network for talking head video generation
Wang et al. StyleTalk++: A Unified Framework for Controlling the Speaking Styles of Talking Heads
Wang et al. Talking faces: Audio-to-video face generation
Ji et al. RealTalk: Real-time and realistic audio-driven face generation with 3D facial prior-guided identity alignment network
Liu et al. OSM-Net: One-to-Many One-shot Talking Head Generation with Spontaneous Head Motions