Kehkashan et al., 2024 - Google Patents
Combinatorial Analysis of Deep Learning and Machine Learning Video Captioning Studies: A Systematic Literature ReviewKehkashan et al., 2024
View PDF- Document ID
- 14296671659972291888
- Author
- Kehkashan T
- Alsaeedi A
- Yafooz W
- Ismail N
- Al-Dhaqm A
- Publication year
- Publication venue
- IEEE Access
External Links
Snippet
Recent improvements formulated in the area of video captioning have brought rapid revolutions in its methods and the performance of its models. Machine learning and deep learning techniques are both employed in this regard. However, there is a lack of tracing the …
- 238000013135 deep learning 0 title abstract description 48
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/289—Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2785—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30017—Multimedia data retrieval; Retrieval of more than one type of audiovisual media
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/30—Medical informatics, i.e. computer-based analysis or dissemination of patient or disease data
- G06F19/32—Medical data management, e.g. systems or protocols for archival or communication of medical images, computerised patient records or computerised general medical references
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation, e.g. computer aided management of electronic mail or groupware; Time management, e.g. calendars, reminders, meetings or time accounting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce, e.g. shopping or e-commerce
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Bragg et al. | Sign language recognition, generation, and translation: An interdisciplinary perspective | |
Shen et al. | Towards natural language interfaces for data visualization: A survey | |
Gan et al. | Vision-language pre-training: Basics, recent advances, and future trends | |
Jänicke et al. | Visual text analysis in digital humanities | |
Biondi et al. | A deep learning semantic approach to emotion recognition using the IBM watson bluemix alchemy language | |
Islam et al. | Exploring video captioning techniques: A comprehensive survey on deep learning methods | |
Rehm et al. | Strategic research agenda for multilingual Europe 2020 | |
JP7574371B2 (en) | Natural Language Solutions | |
Singh et al. | Emoint-trans: A multimodal transformer for identifying emotions and intents in social conversations | |
Manzoor et al. | Multimodality representation learning: A survey on evolution, pretraining and its applications | |
Hua et al. | Finematch: Aspect-based fine-grained image and text mismatch detection and correction | |
Lai et al. | Multimodal sentiment analysis with asymmetric window multi-attentions | |
Yang et al. | Large Language Models Meet Text-Centric Multimodal Sentiment Analysis: A Survey | |
Alzu’bi et al. | Multimodal deep learning with discriminant descriptors for offensive memes detection | |
Chaudhury et al. | Multimedia ontology: representation and applications | |
Yusuf et al. | Graph neural networks for visual question answering: a systematic review | |
Mubashira et al. | Transformer Network for video to text translation | |
Kehkashan et al. | Combinatorial Analysis of Deep Learning and Machine Learning Video Captioning Studies: A Systematic Literature Review | |
Kumari et al. | Emotion aided multi-task framework for video embedded misinformation detection | |
Sun et al. | Cross-language multimodal scene semantic guidance and leap sampling for video captioning | |
Suryanto et al. | Evolving Conversations: A Review of Chatbots and Implications in Natural Language Processing for Cultural Heritage Ecosystems | |
Li et al. | A survey on multimodal benchmarks: In the era of large ai models | |
Chang et al. | Report of 2017 NSF workshop on multimedia challenges, opportunities and research roadmaps | |
Parian-Scherb et al. | Gesture retrieval and its application to the study of multimodal communication | |
Lo et al. | CNERVis: a visual diagnosis tool for Chinese named entity recognition |