Computer Science > Computer Vision and Pattern Recognition

arXiv:2212.10596 (cs)

[Submitted on 20 Dec 2022 (v1), last revised 10 Jan 2023 (this version, v2)]

Title: Open-Vocabulary Temporal Action Detection with Off-the-Shelf Image-Text Features

Authors: Vivek Rathod, Bryan Seybold, Sudheendra Vijayanarasimhan, Austin Myers, Xiuye Gu, Vighnesh Birodkar, David A. Ross

Abstract: Detecting actions in untrimmed videos should not be limited to a small, closed set of classes. We present a simple yet effective strategy for open-vocabulary temporal action detection utilizing pretrained image-text co-embeddings. Despite being trained on static images rather than videos, we show that image-text co-embeddings enable open-vocabulary performance competitive with fully-supervised models. We show that the performance can be further improved by ensembling the image-text features with features encoding local motion, like optical-flow-based features, or other modalities, like audio. In addition, we propose a more reasonable open-vocabulary evaluation setting for the ActivityNet dataset, where the category splits are based on similarity rather than random assignment.
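
The abstract does not spell out how the image-text co-embeddings are scored against class names. One common way to use such embeddings for open-vocabulary labelling is to compare each frame embedding with the text embedding of every candidate class name by cosine similarity. The sketch below illustrates only that general idea and is not the authors' pipeline: the function names (`cosine`, `label_frames`) and the tiny hand-written vectors are hypothetical stand-ins for real encoder outputs.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def label_frames(frame_embs, text_embs):
    """Give each frame the class name whose text embedding is most similar.

    frame_embs: list of per-frame vectors (assumed to come from an image encoder).
    text_embs:  dict mapping class name -> vector (assumed to come from the
                matching text encoder, so both live in one shared space).
    """
    labels = []
    for frame in frame_embs:
        scores = {name: cosine(frame, vec) for name, vec in text_embs.items()}
        labels.append(max(scores, key=scores.get))
    return labels


# Toy vectors standing in for real embeddings (hypothetical values).
text_embs = {"surfing": [1.0, 0.0, 0.0], "cooking": [0.0, 1.0, 0.0]}
frame_embs = [[0.9, 0.1, 0.0], [0.2, 0.8, 0.1]]
print(label_frames(frame_embs, text_embs))
```

Because the class list is just a dictionary of text embeddings, new class names can be added at inference time without retraining, which is the property that makes this kind of scoring open-vocabulary.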

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2212.10596 [cs.CV]
	(or arXiv:2212.10596v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2212.10596

Submission history

From: Sudheendra Vijayanarasimhan
[v1] Tue, 20 Dec 2022 19:12:58 UTC (207 KB)
[v2] Tue, 10 Jan 2023 19:44:37 UTC (211 KB)
