
Chae et al., 2022 - Google Patents

Uncertainty-based visual question answering: estimating semantic inconsistency between image and knowledge base

Document ID
3689315751260307147
Author
Chae J
Kim J
Publication year
2022
Publication venue
2022 International Joint Conference on Neural Networks (IJCNN)

Snippet

The knowledge-based visual question answering (KVQA) task aims to answer questions that require additional external knowledge as well as an understanding of images and questions. Recent studies on KVQA inject external knowledge in a multi-modal form, and …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6217 Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30781 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F17/30784 Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36 Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/46 Extraction of features or characteristics of the image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computer systems utilising knowledge based models
    • G06N5/02 Knowledge representation
    • G06N5/022 Knowledge engineering, knowledge acquisition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computer systems utilising knowledge based models
    • G06N5/04 Inference methods or devices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00221 Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/02 Computer systems based on biological models using neural network models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06Q DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation, e.g. computer aided management of electronic mail or groupware; Time management, e.g. calendars, reminders, meetings or time accounting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00 Subject matter not provided for in other groups of this subclass

Similar Documents

Publication Title
Lu et al. R-VQA: learning visual relation facts with semantic attention for visual question answering
Zellers et al. From recognition to cognition: Visual commonsense reasoning
Messina et al. Transformer reasoning network for image-text matching and retrieval
Zhang et al. A gated peripheral-foveal convolutional neural network for unified image aesthetic prediction
Chen et al. CAAN: Context-aware attention network for visual question answering
Moayeri et al. Text-to-concept (and back) via cross-model alignment
Zablocki et al. Context-aware zero-shot learning for object recognition
Zhang et al. Relational graph learning for grounded video description generation
Yang et al. Hierarchical scene graph encoder-decoder for image paragraph captioning
Xu et al. Relation-aware compositional zero-shot learning for attribute-object pair recognition
Zhang et al. Hierarchical scene parsing by weakly supervised learning with image descriptions
Wang et al. Deep multi-person kinship matching and recognition for family photos
Khan et al. A deep neural framework for image caption generation using gru-based attention mechanism
Li et al. Inner knowledge-based Img2Doc scheme for visual question answering
CN110659392B (en) Retrieval method and device, and storage medium
Lin et al. Feature Enhancement in Attention for Visual Question Answering.
Chae et al. Uncertainty-based visual question answering: estimating semantic inconsistency between image and knowledge base
CN113158672B (en) Relationship analysis method and device based on news event
Wang et al. Generalised zero-shot learning for entailment-based text classification with external knowledge
CN116089644A (en) Event detection method integrating multi-mode features
Elu et al. Inferring spatial relations from textual descriptions of images
Oura et al. Multimodal Deep Neural Network with Image Sequence Features for Video Captioning
Bose et al. Attention-based multimodal deep learning on vision-language data: models, datasets, tasks, evaluation metrics and applications
Prabhakar et al. Question relevance in visual question answering
CN114003708A (en) Automatic question answering method and device based on artificial intelligence, storage medium and server