[go: up one dir, main page]
More Web Proxy on the site http://driver.im/

Manmadhan et al., 2020 - Google Patents

Visual question answering: a state-of-the-art review

Manmadhan et al., 2020

Document ID
13994106084701901519
Author
Manmadhan S
Kovoor B
Publication year
Publication venue
Artificial Intelligence Review

External Links

Snippet

Visual question answering (VQA) is a task that has received immense consideration from two major research communities: computer vision and natural language processing. Recently it has been widely accepted as an AI-complete task which can be used as an …
Continue reading at link.springer.com (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634Querying
    • G06F17/30657Query processing
    • G06F17/30675Query execution
    • G06F17/30684Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/27Automatic analysis, e.g. parsing
    • G06F17/2705Parsing
    • G06F17/271Syntactic parsing, e.g. based on context-free grammar [CFG], unification grammars
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30017Multimedia data retrieval; Retrieval of more than one type of audiovisual media
    • G06F17/30023Querying
    • G06F17/30038Querying based on information manually generated or based on information not derived from the media content, e.g. tags, keywords, comments, usage information, user ratings
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30244Information retrieval; Database structures therefor; File system structures therefor in image databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30861Retrieval from the Internet, e.g. browsers
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62Methods or arrangements for recognition using electronic means
    • G06K9/6217Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00Subject matter not provided for in other groups of this subclass
    • G06N99/005Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computer systems utilising knowledge based models
    • G06N5/02Knowledge representation
    • G06N5/022Knowledge engineering, knowledge acquisition
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/46Extraction of features or characteristics of the image
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computer systems based on biological models
    • G06N3/02Computer systems based on biological models using neural network models
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/18Digital computers in general; Data processing equipment in general in which a programme is changed according to experience gained by the computer itself during a complete run; Learning machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06QDATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR

Similar Documents

Publication Publication Date Title
Manmadhan et al. Visual question answering: a state-of-the-art review
Uppal et al. Multimodal research in vision and language: A review of current and emerging trends
Kumar et al. Hybrid context enriched deep learning model for fine-grained sentiment analysis in textual and visual semiotic modality social data
Samant et al. Framework for deep learning-based language models using multi-task learning in natural language understanding: A systematic literature review and future directions
Li et al. Visual to text: Survey of image and video captioning
Kulkarni et al. Babytalk: Understanding and generating simple image descriptions
Bernardi et al. Automatic description generation from images: A survey of models, datasets, and evaluation measures
Guo et al. LD-MAN: Layout-driven multimodal attention network for online news sentiment recognition
Sharma et al. A comprehensive survey on image captioning: from handcrafted to deep learning-based techniques, a taxonomy and open research issues
Liu et al. Fact-based visual question answering via dual-process system
Yang et al. A comprehensive survey on image aesthetic quality assessment
Chhabra et al. Multimodal hate speech detection via multi-scale visual kernels and knowledge distillation architecture
Sharma et al. Evolution of visual data captioning Methods, Datasets, and evaluation Metrics: A comprehensive survey
Dai et al. Visual relationship detection based on bidirectional recurrent neural network
Paul et al. A context-sensitive multi-tier deep learning framework for multimodal sentiment analysis
Rehman et al. Deep Learning Techniques for Future Intelligent Cross-Media Retrieval
Park et al. SAM: cross-modal semantic alignments module for image-text retrieval
Zhou et al. Multimodal embedding for lifelog retrieval
Jana et al. Network embeddings from distributional thesauri for improving static word representations
Dey et al. How Machine Learning is Innovating Today's World: A Concise Technical Guide
Mangalika Object Recognition to Content Based Image Retrieval: A Study of the Developments and Applications of Computer Vision
Singh et al. Neural approaches towards text summarization
Nag Text-based emotion recognition using contextual phrase embedding model
Town Ontology based visual information processing.
Müller-Budack Unsupervised quantification of entity consistency between photos and text in real-world news