Kehkashan et al., 2024 - Google Patents

Combinatorial Analysis of Deep Learning and Machine Learning Video Captioning Studies: A Systematic Literature Review

Document ID: 14296671659972291888
Author: Kehkashan T; Alsaeedi A; Yafooz W; Ismail N; Al-Dhaqm A
Publication year: 2024
Publication venue: IEEE Access

Snippet

Recent improvements formulated in the area of video captioning have brought rapid revolutions in its methods and the performance of its models. Machine learning and deep learning techniques are both employed in this regard. However, there is a lack of tracing the …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/289—Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2785—Semantic analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30017—Multimedia data retrieval; Retrieval of more than one type of audiovisual media
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/30—Medical informatics, i.e. computer-based analysis or dissemination of patient or disease data
- G06F19/32—Medical data management, e.g. systems or protocols for archival or communication of medical images, computerised patient records or computerised general medical references
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation, e.g. computer aided management of electronic mail or groupware; Time management, e.g. calendars, reminders, meetings or time accounting
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce, e.g. shopping or e-commerce
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints

Similar Documents

Publication	Publication Date	Title
Bragg et al.	2019	Sign language recognition, generation, and translation: An interdisciplinary perspective
Shen et al.	2022	Towards natural language interfaces for data visualization: A survey
Gan et al.	2022	Vision-language pre-training: Basics, recent advances, and future trends
Jänicke et al.	2017	Visual text analysis in digital humanities
Biondi et al.	2017	A deep learning semantic approach to emotion recognition using the IBM watson bluemix alchemy language
Islam et al.	2021	Exploring video captioning techniques: A comprehensive survey on deep learning methods
Rehm et al.	2013	Strategic research agenda for multilingual Europe 2020
JP7574371B2 (en)	2024-10-28	Natural Language Solutions
Singh et al.	2022	Emoint-trans: A multimodal transformer for identifying emotions and intents in social conversations
Manzoor et al.	2023	Multimodality representation learning: A survey on evolution, pretraining and its applications
Hua et al.	2025	Finematch: Aspect-based fine-grained image and text mismatch detection and correction
Lai et al.	2022	Multimodal sentiment analysis with asymmetric window multi-attentions
Yang et al.	2024	Large Language Models Meet Text-Centric Multimodal Sentiment Analysis: A Survey
Alzu’bi et al.	2023	Multimodal deep learning with discriminant descriptors for offensive memes detection
Chaudhury et al.	2015	Multimedia ontology: representation and applications
Yusuf et al.	2024	Graph neural networks for visual question answering: a systematic review
Mubashira et al.	2020	Transformer Network for video to text translation
Kumari et al.	2024	Emotion aided multi-task framework for video embedded misinformation detection
Sun et al.	2023	Cross-language multimodal scene semantic guidance and leap sampling for video captioning
Suryanto et al.	2023	Evolving Conversations: A Review of Chatbots and Implications in Natural Language Processing for Cultural Heritage Ecosystems
Li et al.	2024	A survey on multimodal benchmarks: In the era of large ai models
Chang et al.	2019	Report of 2017 NSF workshop on multimedia challenges, opportunities and research roadmaps
Parian-Scherb et al.	2024	Gesture retrieval and its application to the study of multimodal communication
Lo et al.	2022	CNERVis: a visual diagnosis tool for Chinese named entity recognition