Liu et al., 2023 - Google Patents

Talking face generation via facial anatomy

Document ID: 10483059587120463585
Author: Liu S; Wang H
Publication year: 2023
Publication venue: ACM Transactions on Multimedia Computing, Communications and Applications

Snippet

To generate the corresponding talking face from a speech audio and a face image, it is essential to match the variations in the facial appearance with the speech audio in subtle movements of different face regions. Nevertheless, the facial movements generated by the …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
- G06T13/20—3D [Three Dimensional] animation
- G06T13/40—3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00268—Feature extraction; Face representation
- G06K9/00281—Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00362—Recognising human body or animal bodies, e.g. vehicle occupant, pedestrian; Recognising body parts, e.g. hand
- G06K9/00369—Recognition of whole body, e.g. static pedestrian or occupant recognition
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
- G06T13/20—3D [Three Dimensional] animation
- G06T13/205—3D [Three Dimensional] animation driven by audio data
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00288—Classification, e.g. identification
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00335—Recognising movements or behaviour, e.g. recognition of gestures, dynamic facial expressions; Lip-reading
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30799—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using low-level visual features of the video content
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality

Similar Documents

Publication	Publication Date	Title
Lu et al.	2021	Live speech portraits: real-time photorealistic talking-head animation
Kim et al.	2019	Neural style-preserving visual dubbing
Guo et al.	2021	AD-NeRF: Audio driven neural radiance fields for talking head synthesis
Wang et al.	2022	One-shot talking face generation from single-speaker audio-visual correlation learning
Thies et al.	2020	Neural voice puppetry: Audio-driven facial reenactment
Song et al.	2022	Everybody’s talkin’: Let me talk as you want
Wen et al.	2020	Photorealistic audio-driven video portraits
Park et al.	2022	SyncTalkFace: Talking face generation with precise lip-syncing via audio-lip memory
US11514634B2 (en)	2022-11-29	Personalized speech-to-video with three-dimensional (3D) skeleton regularization and expressive body poses
Chuang et al.	2005	Mood swings: expressive speech animation
Ye et al.	2022	Audio-driven talking face video generation with dynamic convolution kernels
Liu et al.	2023	Talking face generation via facial anatomy
Liao et al.	2020	Speech2Video synthesis with 3D skeleton regularization and expressive body poses
Yu et al.	2019	Mining audio, text and visual information for talking face generation
Shen et al.	2023	SD-NeRF: Towards lifelike talking head animation via spatially-adaptive dual-driven NeRFs
Liu et al.	2023	MODA: Mapping-once audio-driven portrait animation with dual attentions
Yi et al.	2022	Animating portrait line drawings from a single face photo and a speech signal
Liu et al.	2023	4D facial analysis: A survey of datasets, algorithms and applications
Cheng et al.	2021	Audio-driven talking video frame restoration
Xu et al.	2024	FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio
Wang et al.	2022	Talking faces: Audio-to-video face generation
Ji et al.	2024	RealTalk: Real-time and realistic audio-driven face generation with 3D facial prior-guided identity alignment network
Shen et al.	2024	Talking head generation based on 3D morphable facial model
Sun et al.	2023	Generation of virtual digital human for customer service industry
Jang et al.	2023	That's What I Said: Fully-Controllable Talking Face Generation