Habibie et al., 2021 - Google Patents

Learning speech-driven 3D conversational gestures from video

Document ID
12700202956387754433
Author
Habibie I
Xu W
Mehta D
Liu L
Seidel H
Pons-Moll G
Elgharib M
Theobalt C
Publication year
2021
Publication venue
Proceedings of the 21st ACM International Conference on Intelligent Virtual Agents

Snippet

We propose the first approach to synthesize the synchronous 3D conversational body and hand gestures, as well as 3D face and head animations, of a virtual character from speech input. Our algorithm uses a CNN architecture that leverages the inherent correlation …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00221 Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
    • G06K9/00268 Feature extraction; Face representation
    • G06K9/00281 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00335 Recognising movements or behaviour, e.g. recognition of gestures, dynamic facial expressions; Lip-reading
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 Animation
    • G06T13/20 3D [Three Dimensional] animation
    • G06T13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00 Subject matter not provided for in other groups of this subclass
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis

Similar Documents

Publication | Title
Habibie et al. | Learning speech-driven 3D conversational gestures from video
Lu et al. | Live speech portraits: real-time photorealistic talking-head animation
Kim et al. | Neural style-preserving visual dubbing
Fan et al. | FaceFormer: Speech-driven 3D facial animation with transformers
Yi et al. | Generating holistic 3D human motion from speech
Zhang et al. | FACIAL: Synthesizing dynamic talking face with implicit attribute learning
Bhattacharya et al. | Speech2AffectiveGestures: Synthesizing co-speech gestures with generative adversarial affective expression learning
Suwajanakorn et al. | Synthesizing Obama: learning lip sync from audio
Chen et al. | What comprises a good talking-head video generation?: A survey and benchmark
US11514634B2 (en) | Personalized speech-to-video with three-dimensional (3D) skeleton regularization and expressive body poses
Yu et al. | Multimodal inputs driven talking face generation with spatial–temporal dependency
Tian et al. | Audio2Face: Generating speech/face animation from single audio with attention-based bidirectional LSTM networks
Thambiraja et al. | Imitator: Personalized speech-driven 3D facial animation
US20210390945A1 (en) | Text-driven video synthesis with phonetic dictionary
Filntisis et al. | Visual speech-aware perceptual 3D facial expression reconstruction from videos
Yu et al. | Mining audio, text and visual information for talking face generation
Liu et al. | Synthesizing talking faces from text and audio: an autoencoder and sequence-to-sequence convolutional neural network
Liu et al. | Real-time speech-driven animation of expressive talking faces
Nazarieh et al. | A Survey of Cross-Modal Visual Content Generation
Tran et al. | Dyadic Interaction Modeling for Social Behavior Generation
Bhattacharya et al. | Speech2UnifiedExpressions: Synchronous Synthesis of Co-Speech Affective Face and Body Expressions from Affordable Inputs
Wang et al. | Flow2Flow: Audio-visual cross-modality generation for talking face videos with rhythmic head
Chuang | Analysis, synthesis, and retargeting of facial expressions
Liu et al. | A Survey on Deep Multi-modal Learning for Body Language Recognition and Generation
Kumar Das et al. | Audio driven artificial video face synthesis using GAN and machine learning approaches