Computer Science > Computer Vision and Pattern Recognition

arXiv:2306.11248 (cs)

[Submitted on 20 Jun 2023 (v1), last revised 13 Aug 2023 (this version, v2)]

Title:Dynamic Perceiver for Efficient Visual Recognition

Authors:Yizeng Han, Dongchen Han, Zeyu Liu, Yulin Wang, Xuran Pan, Yifan Pu, Chao Deng, Junlan Feng, Shiji Song, Gao Huang

View PDF

Abstract:Early exiting has become a promising approach to improving the inference efficiency of deep networks. By structuring models with multiple classifiers (exits), predictions for ``easy'' samples can be generated at earlier exits, negating the need for executing deeper layers. Current multi-exit networks typically implement linear classifiers at intermediate layers, compelling low-level features to encapsulate high-level semantics. This sub-optimal design invariably undermines the performance of later exits. In this paper, we propose Dynamic Perceiver (Dyn-Perceiver) to decouple the feature extraction procedure and the early classification task with a novel dual-branch architecture. A feature branch serves to extract image features, while a classification branch processes a latent code assigned for classification tasks. Bi-directional cross-attention layers are established to progressively fuse the information of both branches. Early exits are placed exclusively within the classification branch, thus eliminating the need for linear separability in low-level features. Dyn-Perceiver constitutes a versatile and adaptable framework that can be built upon various architectures. Experiments on image classification, action recognition, and object detection demonstrate that our method significantly improves the inference efficiency of different backbones, outperforming numerous competitive approaches across a broad range of computational budgets. Evaluation on both CPU and GPU platforms substantiate the superior practical efficiency of Dyn-Perceiver. Code is available at this https URL.

Comments:	Accepted at ICCV 2023
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2306.11248 [cs.CV]
	(or arXiv:2306.11248v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2306.11248

Submission history

From: Yizeng Han [view email]
[v1] Tue, 20 Jun 2023 03:00:22 UTC (1,972 KB)
[v2] Sun, 13 Aug 2023 05:44:54 UTC (1,977 KB)

Computer Science > Computer Vision and Pattern Recognition

Title:Dynamic Perceiver for Efficient Visual Recognition

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Computer Vision and Pattern Recognition

Title:Dynamic Perceiver for Efficient Visual Recognition

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators