Chen et al., 2023 - Google Patents

Reason out your layout: Evoking the layout master from large language models for text-to-image synthesis

Document ID: 2317904446450054062
Author: Chen X; Liu Y; Yang Y; Yuan J; You Q; Liu L; Yang H
Publication year: 2023
Publication venue: arXiv preprint arXiv:2311.17126

Snippet

Recent advancements in text-to-image (T2I) generative models have shown remarkable capabilities in producing diverse and imaginative visuals based on text prompts. Despite the advancement, these diffusion models sometimes struggle to translate the semantic content …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
- G06T13/20—3D [Three Dimensional] animation
- G06T13/40—3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00268—Feature extraction; Face representation
- G06K9/00281—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K2209/00—Indexing scheme relating to methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2219/00—Indexing scheme for manipulating 3D models or images for computer graphics

Similar Documents

Publication	Publication Date	Title
Chen et al.	2024	Anydoor: Zero-shot object-level image customization
Jiang et al.	2020	Coherent reconstruction of multiple humans from a single image
Zhuang et al.	2023	Dreameditor: Text-driven 3d scene editing with neural fields
Geng et al.	2019	3d guided fine-grained face manipulation
Zanfir et al.	2021	Thundr: Transformer-based 3d human reconstruction with markers
Huang et al.	2024	Surface reconstruction from point clouds: A survey and a benchmark
Chen et al.	2023	Reason out your layout: Evoking the layout master from large language models for text-to-image synthesis
Xu et al.	2015	Autoscanning for coupled scene reconstruction and proactive object analysis
Xia et al.	2016	Keyframe extraction for human motion capture data based on joint kernel sparse representation
Wong et al.	2015	One shot learning via compositions of meaningful patches
US8934715B2 (en)	2015-01-13	Human pose estimation in visual computing
Mademlis et al.	2008	Combining topological and geometrical features for global and partial 3-D shape retrieval
Lee et al.	2023	Locomotion-action-manipulation: Synthesizing human-scene interactions in complex 3d environments
Sahillioğlu	2018	A genetic isometric shape correspondence algorithm with adaptive sampling
Xiao et al.	2023	R&b: Region and boundary aware zero-shot grounded text-to-image generation
Zhu et al.	2023	H3wb: Human3.6M 3d wholebody dataset and benchmark
Zhang et al.	2024	Brush your text: Synthesize any scene text on images via diffusion model
Fang et al.	2020	A comprehensive pipeline for complex text-to-image synthesis
Han et al.	2023	Chorus: Learning canonicalized 3d human-object spatial relations from unbounded synthesized images
Xu et al.	2024	InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction
Xie et al.	2020	Structure-consistent customized virtual mannequin reconstruction from 3D scans based on optimization
Morariu et al.	2012	Tracking people's hands and feet using mixed network and/or search
Shen et al.	2021	ClipFlip: Multi‐view Clipart Design
Matthews et al.	2011	A sketch-based articulated figure animation tool
Li et al.	2020	Unsupervised learning of landmarks based on inter-intra subject consistencies