Reason out your layout: Evoking the layout master from large language models for text-to-image synthesis
Chen et al., 2023
- Document ID
- 2317904446450054062
- Author
- Chen X
- Liu Y
- Yang Y
- Yuan J
- You Q
- Liu L
- Yang H
- Publication year
- 2023
- Publication venue
- arXiv preprint arXiv:2311.17126
Snippet
Recent advancements in text-to-image (T2I) generative models have shown remarkable capabilities in producing diverse and imaginative visuals based on text prompts. Despite the advancement, these diffusion models sometimes struggle to translate the semantic content …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
- G06T13/20—3D [Three Dimensional] animation
- G06T13/40—3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00268—Feature extraction; Face representation
- G06K9/00281—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K2209/00—Indexing scheme relating to methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2219/00—Indexing scheme for manipulating 3D models or images for computer graphics
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Chen et al. | | Anydoor: Zero-shot object-level image customization |
Jiang et al. | | Coherent reconstruction of multiple humans from a single image |
Zhuang et al. | | Dreameditor: Text-driven 3d scene editing with neural fields |
Geng et al. | | 3d guided fine-grained face manipulation |
Zanfir et al. | | Thundr: Transformer-based 3d human reconstruction with markers |
Huang et al. | | Surface reconstruction from point clouds: A survey and a benchmark |
Chen et al. | | Reason out your layout: Evoking the layout master from large language models for text-to-image synthesis |
Xu et al. | | Autoscanning for coupled scene reconstruction and proactive object analysis |
Xia et al. | | Keyframe extraction for human motion capture data based on joint kernel sparse representation |
Wong et al. | | One shot learning via compositions of meaningful patches |
US8934715B2 (en) | | Human pose estimation in visual computing |
Mademlis et al. | | Combining topological and geometrical features for global and partial 3-D shape retrieval |
Lee et al. | | Locomotion-action-manipulation: Synthesizing human-scene interactions in complex 3d environments |
Sahillioğlu | | A genetic isometric shape correspondence algorithm with adaptive sampling |
Xiao et al. | | R&b: Region and boundary aware zero-shot grounded text-to-image generation |
Zhu et al. | | H3wb: Human3.6m 3d wholebody dataset and benchmark |
Zhang et al. | | Brush your text: Synthesize any scene text on images via diffusion model |
Fang et al. | | A comprehensive pipeline for complex text-to-image synthesis |
Han et al. | | Chorus: Learning canonicalized 3d human-object spatial relations from unbounded synthesized images |
Xu et al. | | InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction |
Xie et al. | | Structure-consistent customized virtual mannequin reconstruction from 3D scans based on optimization |
Morariu et al. | | Tracking people's hands and feet using mixed network and/or search |
Shen et al. | | ClipFlip: Multi‐view Clipart Design |
Matthews et al. | | A sketch-based articulated figure animation tool |
Li et al. | | Unsupervised learning of landmarks based on inter-intra subject consistencies |