Jin et al., 2024 - Google Patents

MtArtGPT: A multi-task art generation system with pre-trained transformer

Document ID
16921547868524398835
Author
Jin C
Zhu R
Zhu Z
Yang L
Yang M
Luo J
Publication year
2024
Publication venue
IEEE Transactions on Circuits and Systems for Video Technology

Snippet

Instruction-tuned large language models are making rapid advances in artificial intelligence, with GPT-4 models exhibiting impressive multi-modal perception capabilities. Such models have been used as the core assistant for many tasks including art …
Continue reading at ieeexplore.ieee.org

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634 Querying
    • G06F17/30657 Query processing
    • G06F17/3066 Query translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30731 Creation of semantic tools
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30861 Retrieval from the Internet, e.g. browsers
    • G06F17/30873 Retrieval from the Internet, e.g. browsers by navigation, e.g. using categorized browsing, portals, synchronized browsing, visual networks of documents, virtual worlds or tours
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30017 Multimedia data retrieval; Retrieval of more than one type of audiovisual media
    • G06F17/30023 Querying
    • G06F17/30038 Querying based on information manually generated or based on information not derived from the media content, e.g. tags, keywords, comments, usage information, user ratings
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30781 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F17/30784 Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/21 Text processing
    • G06F17/22 Manipulating or registering by use of codes, e.g. in sequence of text characters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/02 Computer systems based on biological models using neural network models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computer systems utilising knowledge based models
    • G06N5/02 Knowledge representation
    • G06N5/022 Knowledge engineering, knowledge acquisition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00 Subject matter not provided for in other groups of this subclass
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling

Similar Documents

Publication Title
Gozalo-Brizuela et al. ChatGPT is not all you need. A State of the Art Review of large Generative AI models
Gao et al. Hierarchical representation network with auxiliary tasks for video captioning and video question answering
Yang et al. Multi-sentence auxiliary adversarial networks for fine-grained text-to-image synthesis
Hao et al. Integrating both visual and audio cues for enhanced video caption
He et al. MF-BERT: Multimodal fusion in pre-trained BERT for sentiment analysis
Zhang et al. Effective subword segmentation for text comprehension
Song et al. Memorial gan with joint semantic optimization for unpaired image captioning
Lu et al. Sentiment analysis: Comprehensive reviews, recent advances, and open challenges
Jin et al. Weakening the dominant role of text: CMOSI dataset and multimodal semantic enhancement network
Wang et al. A text-guided generation and refinement model for image captioning
Jin et al. MtArtGPT: A Multi-task Art Generation System with Pre-Trained Transformer
Huang et al. Recent advances in artificial intelligence for video production system
Ai et al. DER-GCN: Dialog and Event Relation-Aware Graph Convolutional Neural Network for Multimodal Dialog Emotion Recognition
Mai et al. A unimodal representation learning and recurrent decomposition fusion structure for utterance-level multimodal embedding learning
CN117216234A (en) Artificial intelligence-based speaking operation rewriting method, device, equipment and storage medium
Zhao et al. Leveraging pre-trained language model for summary generation on short text
Chen et al. Robotic musicianship based on least squares and sequence generative adversarial networks
Fang et al. Sense-aware bert and multi-task fine-tuning for multimodal sentiment analysis
Zhou et al. Let’s all dance: Enhancing amateur dance motions
Kizhner et al. The history and context of the digital humanities in Russia
Guo et al. PrompTHis: Visualizing the Process and Influence of Prompt Editing during Text-to-Image Creation
Yi et al. Diffusion models in text generation: a survey
Lu et al. Multi-dimensional fusion: transformer and GANs-based multimodal audiovisual perception robot for musical performance art
Kleinberger et al. Voice at NIME: a Taxonomy of New Interfaces for Vocal Musical Expression
Li et al. AIGC in China: Current developments and future outlook