Jin et al., 2024 - Google Patents

MtArtGPT: A multi-task art generation system with pre-trained transformer

Document ID: 16921547868524398835
Author: Jin C; Zhu R; Zhu Z; Yang L; Yang M; Luo J
Publication year: 2024
Publication venue: IEEE Transactions on Circuits and Systems for Video Technology

Snippet

Instruction-tuned large language models are making rapid advances in the field of artificial intelligence, where GPT-4 models have exhibited impressive multi-modal perception capabilities. Such models have been used as the core assistant for many tasks including art …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/3066—Query translation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30731—Creation of semantic tools
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
- G06F17/30873—Retrieval from the Internet, e.g. browsers by navigation, e.g. using categorized browsing, portals, synchronized browsing, visual networks of documents, virtual worlds or tours
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30017—Multimedia data retrieval; Retrieval of more than one type of audiovisual media
- G06F17/30023—Querying
- G06F17/30038—Querying based on information manually generated or based on information not derived from the media content, e.g. tags, keywords, comments, usage information, user ratings
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling

Similar Documents

Publication	Publication Date	Title
Gozalo-Brizuela et al.	2023	ChatGPT is not all you need. A State of the Art Review of large Generative AI models
Gao et al.	2021	Hierarchical representation network with auxiliary tasks for video captioning and video question answering
Yang et al.	2021	Multi-sentence auxiliary adversarial networks for fine-grained text-to-image synthesis
Hao et al.	2018	Integrating both visual and audio cues for enhanced video caption
He et al.	2021	MF-BERT: Multimodal fusion in pre-trained BERT for sentiment analysis
Zhang et al.	2019	Effective subword segmentation for text comprehension
Song et al.	2022	Memorial gan with joint semantic optimization for unpaired image captioning
Lu et al.	2023	Sentiment analysis: Comprehensive reviews, recent advances, and open challenges
Jin et al.	2023	Weakening the dominant role of text: CMOSI dataset and multimodal semantic enhancement network
Wang et al.	2022	A text-guided generation and refinement model for image captioning
Jin et al.	2024	MtArtGPT: A Multi-task Art Generation System with Pre-Trained Transformer
Huang et al.	2023	Recent advances in artificial intelligence for video production system
Ai et al.	2024	DER-GCN: Dialog and Event Relation-Aware Graph Convolutional Neural Network for Multimodal Dialog Emotion Recognition
Mai et al.	2021	A unimodal representation learning and recurrent decomposition fusion structure for utterance-level multimodal embedding learning
CN117216234A (en)	2023-12-12	Artificial intelligence-based speaking operation rewriting method, device, equipment and storage medium
Zhao et al.	2020	Leveraging pre-trained language model for summary generation on short text
Chen et al.	2021	Robotic musicianship based on least squares and sequence generative adversarial networks
Fang et al.	2022	Sense-aware bert and multi-task fine-tuning for multimodal sentiment analysis
Zhou et al.	2023	Let’s all dance: Enhancing amateur dance motions
Kizhner et al.	2022	The history and context of the digital humanities in Russia
Guo et al.	2024	PrompTHis: Visualizing the Process and Influence of Prompt Editing during Text-to-Image Creation
Yi et al.	2024	Diffusion models in text generation: a survey
Lu et al.	2023	Multi-dimensional fusion: transformer and GANs-based multimodal audiovisual perception robot for musical performance art
Kleinberger et al.	2022	Voice at NIME: a Taxonomy of New Interfaces for Vocal Musical Expression
Li et al.	2023	AIGC in China: Current developments and future outlook