Chen et al., 2023 - Google Patents

Reason out your layout: Evoking the layout master from large language models for text-to-image synthesis

Chen et al., 2023

Document ID
2317904446450054062
Author
Chen X
Liu Y
Yang Y
Yuan J
You Q
Liu L
Yang H
Publication year
2023
Publication venue
arXiv preprint arXiv:2311.17126

Snippet

Recent advancements in text-to-image (T2I) generative models have shown remarkable capabilities in producing diverse and imaginative visuals based on text prompts. Despite the advancement, these diffusion models sometimes struggle to translate the semantic content …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36 Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/46 Extraction of features or characteristics of the image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50 Computer-aided design
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 Animation
    • G06T13/20 3D [Three Dimensional] animation
    • G06T13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00221 Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
    • G06K9/00268 Feature extraction; Face representation
    • G06K9/00281 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K2209/00 Indexing scheme relating to methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2219/00 Indexing scheme for manipulating 3D models or images for computer graphics

Similar Documents

Publication Publication Date Title
Chen et al. Anydoor: Zero-shot object-level image customization
Jiang et al. Coherent reconstruction of multiple humans from a single image
Zhuang et al. Dreameditor: Text-driven 3d scene editing with neural fields
Geng et al. 3d guided fine-grained face manipulation
Zanfir et al. Thundr: Transformer-based 3d human reconstruction with markers
Huang et al. Surface reconstruction from point clouds: A survey and a benchmark
Chen et al. Reason out your layout: Evoking the layout master from large language models for text-to-image synthesis
Xu et al. Autoscanning for coupled scene reconstruction and proactive object analysis
Xia et al. Keyframe extraction for human motion capture data based on joint kernel sparse representation
Wong et al. One shot learning via compositions of meaningful patches
US8934715B2 (en) Human pose estimation in visual computing
Mademlis et al. Combining topological and geometrical features for global and partial 3-D shape retrieval
Lee et al. Locomotion-action-manipulation: Synthesizing human-scene interactions in complex 3d environments
Sahillioğlu A genetic isometric shape correspondence algorithm with adaptive sampling
Xiao et al. R&b: Region and boundary aware zero-shot grounded text-to-image generation
Zhu et al. H3wb: Human3.6m 3d wholebody dataset and benchmark
Zhang et al. Brush your text: Synthesize any scene text on images via diffusion model
Fang et al. A comprehensive pipeline for complex text-to-image synthesis
Han et al. Chorus: Learning canonicalized 3d human-object spatial relations from unbounded synthesized images
Xu et al. InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction
Xie et al. Structure-consistent customized virtual mannequin reconstruction from 3D scans based on optimization
Morariu et al. Tracking people's hands and feet using mixed network and/or search
Shen et al. ClipFlip: Multi‐view Clipart Design
Matthews et al. A sketch-based articulated figure animation tool
Li et al. Unsupervised learning of landmarks based on inter-intra subject consistencies