[go: up one dir, main page]
More Web Proxy on the site http://driver.im/

Guo et al., 2019 - Google Patents

Deep multimodal representation learning: A survey

Guo et al., 2019

View PDF
Document ID
9278948631551619742
Author
Guo W
Wang J
Wang S
Publication year
Publication venue
Ieee Access

External Links

Snippet

Multimodal representation learning, which aims to narrow the heterogeneity gap among different modalities, plays an indispensable role in the utilization of ubiquitous multimodal data. Due to the powerful representation ability with multiple levels of abstraction, deep …
Continue reading at ieeexplore.ieee.org (PDF) (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62Methods or arrangements for recognition using electronic means
    • G06K9/6267Classification techniques
    • G06K9/6268Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62Methods or arrangements for recognition using electronic means
    • G06K9/6217Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06K9/6261Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation partitioning the feature space
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30017Multimedia data retrieval; Retrieval of more than one type of audiovisual media
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62Methods or arrangements for recognition using electronic means
    • G06K9/6267Classification techniques
    • G06K9/6279Classification techniques relating to the number of classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00Subject matter not provided for in other groups of this subclass
    • G06N99/005Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/46Extraction of features or characteristics of the image
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computer systems based on biological models
    • G06N3/02Computer systems based on biological models using neural network models
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computer systems utilising knowledge based models
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F19/00Digital computing or data processing equipment or methods, specially adapted for specific applications
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L15/18Speech classification or search using natural language modelling

Similar Documents

Publication Publication Date Title
Guo et al. Deep multimodal representation learning: A survey
Zhang et al. Multimodal intelligence: Representation learning, information fusion, and applications
Li et al. A survey of multi-view representation learning
Mu A survey of recommender systems based on deep learning
Baltrušaitis et al. Multimodal machine learning: A survey and taxonomy
Wang et al. Image captioning with deep bidirectional LSTMs and multi-task learning
Dong et al. Predicting visual features from text for image and video caption retrieval
Koohzadi et al. Survey on deep learning methods in human action recognition
Tang et al. Graph-based multimodal sequential embedding for sign language translation
Chen et al. CAAN: Context-aware attention network for visual question answering
Hossain et al. Text to image synthesis for improved image captioning
Zhang et al. Temporal sentence grounding in videos: A survey and future directions
Estevam et al. Zero-shot action recognition in videos: A survey
CN112000818A (en) Cross-media retrieval method and electronic device for texts and images
Sun et al. Video question answering: a survey of models and datasets
Chen et al. New ideas and trends in deep multimodal content understanding: A review
Xu et al. Deep image captioning: A review of methods, trends and future challenges
Cao et al. A review on multimodal zero‐shot learning
Liu et al. Multimodal emotion recognition based on cascaded multichannel and hierarchical fusion
CN117574904A (en) Named entity recognition method based on contrast learning and multi-modal semantic interaction
Abdar et al. A review of deep learning for video captioning
Shou et al. Adversarial representation with intra-modal and inter-modal graph contrastive learning for multimodal emotion recognition
Deng et al. Multimodal affective computing with dense fusion transformer for inter-and intra-modality interactions
Dai et al. Visual relationship detection based on bidirectional recurrent neural network
Zhao et al. Toward Label-Efficient Emotion and Sentiment Analysis.