Chae et al., 2022 - Google Patents

Uncertainty-based visual question answering: estimating semantic inconsistency between image and knowledge base

Document ID: 3689315751260307147
Author: Chae J; Kim J
Publication year: 2022
Publication venue: 2022 International Joint Conference on Neural Networks (IJCNN)

Snippet

The knowledge-based visual question answering (KVQA) task aims to answer questions that require external knowledge in addition to an understanding of the image and the question. Recent studies on KVQA inject external knowledge in a multi-modal form, and …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation, e.g. computer aided management of electronic mail or groupware; Time management, e.g. calendars, reminders, meetings or time accounting
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass

Similar Documents

Publication	Publication Date	Title
Lu et al.	2018	R-VQA: learning visual relation facts with semantic attention for visual question answering
Zellers et al.	2019	From recognition to cognition: Visual commonsense reasoning
Messina et al.	2021	Transformer reasoning network for image-text matching and retrieval
Zhang et al.	2019	A gated peripheral-foveal convolutional neural network for unified image aesthetic prediction
Chen et al.	2022	CAAN: Context-aware attention network for visual question answering
Moayeri et al.	2023	Text-to-concept (and back) via cross-model alignment
Zablocki et al.	2019	Context-aware zero-shot learning for object recognition
Zhang et al.	2020	Relational graph learning for grounded video description generation
Yang et al.	2020	Hierarchical scene graph encoder-decoder for image paragraph captioning
Xu et al.	2021	Relation-aware compositional zero-shot learning for attribute-object pair recognition
Zhang et al.	2018	Hierarchical scene parsing by weakly supervised learning with image descriptions
Wang et al.	2020	Deep multi-person kinship matching and recognition for family photos
Khan et al.	2022	A deep neural framework for image caption generation using gru-based attention mechanism
Li et al.	2022	Inner knowledge-based Img2Doc scheme for visual question answering
CN110659392B (en)	2022-05-06	Retrieval method and device, and storage medium
Lin et al.	2018	Feature Enhancement in Attention for Visual Question Answering
Chae et al.	2022	Uncertainty-based visual question answering: estimating semantic inconsistency between image and knowledge base
CN113158672B (en)	2024-11-08	Relationship analysis method and device based on news event
Wang et al.	2022	Generalised zero-shot learning for entailment-based text classification with external knowledge
CN116089644A (en)	2023-05-09	Event detection method integrating multi-mode features
Elu et al.	2021	Inferring spatial relations from textual descriptions of images
Oura et al.	2018	Multimodal Deep Neural Network with Image Sequence Features for Video Captioning
Bose et al.	2023	Attention-based multimodal deep learning on vision-language data: models, datasets, tasks, evaluation metrics and applications
Prabhakar et al.	2018	Question relevance in visual question answering
CN114003708A (en)	2022-02-01	Automatic question answering method and device based on artificial intelligence, storage medium and server