Talking face generation via facial anatomy
Liu et al., 2023
- Document ID
- 10483059587120463585
- Author
- Liu S
- Wang H
- Publication year
- 2023
- Publication venue
- ACM Transactions on Multimedia Computing, Communications and Applications
Snippet
To generate the corresponding talking face from a speech audio and a face image, it is essential to match the variations in the facial appearance with the speech audio in subtle movements of different face regions. Nevertheless, the facial movements generated by the …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
- G06T13/20—3D [Three Dimensional] animation
- G06T13/40—3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00268—Feature extraction; Face representation
- G06K9/00281—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00362—Recognising human body or animal bodies, e.g. vehicle occupant, pedestrian; Recognising body parts, e.g. hand
- G06K9/00369—Recognition of whole body, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
- G06T13/20—3D [Three Dimensional] animation
- G06T13/205—3D [Three Dimensional] animation driven by audio data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00288—Classification, e.g. identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00335—Recognising movements or behaviour, e.g. recognition of gestures, dynamic facial expressions; Lip-reading
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30799—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using low-level visual features of the video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Lu et al. | Live speech portraits: real-time photorealistic talking-head animation | |
Kim et al. | Neural style-preserving visual dubbing | |
Guo et al. | Ad-nerf: Audio driven neural radiance fields for talking head synthesis | |
Wang et al. | One-shot talking face generation from single-speaker audio-visual correlation learning | |
Thies et al. | Neural voice puppetry: Audio-driven facial reenactment | |
Song et al. | Everybody’s talkin’: Let me talk as you want | |
Wen et al. | Photorealistic audio-driven video portraits | |
Park et al. | Synctalkface: Talking face generation with precise lip-syncing via audio-lip memory | |
US11514634B2 (en) | Personalized speech-to-video with three-dimensional (3D) skeleton regularization and expressive body poses | |
Chuang et al. | Mood swings: expressive speech animation | |
Ye et al. | Audio-driven talking face video generation with dynamic convolution kernels | |
Liu et al. | Talking face generation via facial anatomy | |
Liao et al. | Speech2video synthesis with 3d skeleton regularization and expressive body poses | |
Yu et al. | Mining audio, text and visual information for talking face generation | |
Shen et al. | Sd-nerf: Towards lifelike talking head animation via spatially-adaptive dual-driven nerfs | |
Liu et al. | Moda: Mapping-once audio-driven portrait animation with dual attentions | |
Yi et al. | Animating portrait line drawings from a single face photo and a speech signal | |
Liu et al. | 4D facial analysis: A survey of datasets, algorithms and applications | |
Cheng et al. | Audio-driven talking video frame restoration | |
Xu et al. | FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio | |
Wang et al. | Talking faces: Audio-to-video face generation | |
Ji et al. | Realtalk: Real-time and realistic audio-driven face generation with 3d facial prior-guided identity alignment network | |
Shen et al. | Talking head generation based on 3d morphable facial model | |
Sun et al. | Generation of virtual digital human for customer service industry | |
Jang et al. | That's What I Said: Fully-Controllable Talking Face Generation |