Action Segmentation through Self-Supervised Video Features and Positional-Encoded Embeddings
Recommendations
Unsupervised method for video action segmentation through spatio-temporal and positional-encoded embeddings
MMSys '22: Proceedings of the 13th ACM Multimedia Systems Conference
Action segmentation consists of temporally segmenting a video and labeling each segmented interval with a specific action label. In this work, we propose a novel action segmentation method that requires no prior video analysis and no annotated data. Our ...
Self-Supervised Learning for Videos: A Survey
The remarkable success of deep learning in various domains relies on the availability of large-scale annotated datasets. However, obtaining annotations is expensive and requires great effort, which is especially challenging for videos. Moreover, the use ...
SigFormer: Sparse Signal-guided Transformer for Multi-modal Action Segmentation
Multi-modal human action segmentation is a critical and challenging task with a wide range of applications. Nowadays, the majority of approaches concentrate on the fusion of dense signals (i.e., RGB, optical flow, and depth maps). However, the potential ...
Information & Contributors
Information
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- Research-article
Funding Sources
- Air Force Office of Scientific Research
Bibliometrics & Citations
Bibliometrics
Article Metrics
- Total Citations: 0
- Total Downloads: 273
- Downloads (Last 12 months): 230
- Downloads (Last 6 weeks): 17