Visual question answering: a state-of-the-art review
Manmadhan et al., 2020
- Document ID
- 13994106084701901519
- Author
- Manmadhan S
- Kovoor B
- Publication year
- 2020
- Publication venue
- Artificial Intelligence Review
Snippet
Visual question answering (VQA) is a task that has received immense consideration from two major research communities: computer vision and natural language processing. Recently it has been widely accepted as an AI-complete task which can be used as an …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/30675—Query execution
- G06F17/30684—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G06F17/271—Syntactic parsing, e.g. based on context-free grammar [CFG], unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30017—Multimedia data retrieval; Retrieval of more than one type of audiovisual media
- G06F17/30023—Querying
- G06F17/30038—Querying based on information manually generated or based on information not derived from the media content, e.g. tags, keywords, comments, usage information, user ratings
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30244—Information retrieval; Database structures therefor; File system structures therefor in image databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/18—Digital computers in general; Data processing equipment in general in which a programme is changed according to experience gained by the computer itself during a complete run; Learning machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Manmadhan et al. | | Visual question answering: a state-of-the-art review |
Uppal et al. | | Multimodal research in vision and language: A review of current and emerging trends |
Kumar et al. | | Hybrid context enriched deep learning model for fine-grained sentiment analysis in textual and visual semiotic modality social data |
Samant et al. | | Framework for deep learning-based language models using multi-task learning in natural language understanding: A systematic literature review and future directions |
Li et al. | | Visual to text: Survey of image and video captioning |
Kulkarni et al. | | Babytalk: Understanding and generating simple image descriptions |
Bernardi et al. | | Automatic description generation from images: A survey of models, datasets, and evaluation measures |
Guo et al. | | LD-MAN: Layout-driven multimodal attention network for online news sentiment recognition |
Sharma et al. | | A comprehensive survey on image captioning: from handcrafted to deep learning-based techniques, a taxonomy and open research issues |
Liu et al. | | Fact-based visual question answering via dual-process system |
Yang et al. | | A comprehensive survey on image aesthetic quality assessment |
Chhabra et al. | | Multimodal hate speech detection via multi-scale visual kernels and knowledge distillation architecture |
Sharma et al. | | Evolution of visual data captioning Methods, Datasets, and evaluation Metrics: A comprehensive survey |
Dai et al. | | Visual relationship detection based on bidirectional recurrent neural network |
Paul et al. | | A context-sensitive multi-tier deep learning framework for multimodal sentiment analysis |
Rehman et al. | | Deep Learning Techniques for Future Intelligent Cross-Media Retrieval |
Park et al. | | SAM: cross-modal semantic alignments module for image-text retrieval |
Zhou et al. | | Multimodal embedding for lifelog retrieval |
Jana et al. | | Network embeddings from distributional thesauri for improving static word representations |
Dey et al. | | How Machine Learning is Innovating Today's World: A Concise Technical Guide |
Mangalika | | Object Recognition to Content Based Image Retrieval: A Study of the Developments and Applications of Computer Vision |
Singh et al. | | Neural approaches towards text summarization |
Nag | | Text-based emotion recognition using contextual phrase embedding model |
Town | | Ontology based visual information processing |
Müller-Budack | | Unsupervised quantification of entity consistency between photos and text in real-world news |