
Frequency-Guided Spatial Adaptation for Camouflaged Object Detection

Published: 17 January 2025

Abstract

Camouflaged object detection (COD) aims to segment camouflaged objects that exhibit patterns very similar to their surrounding environment. Recent works have shown that enhancing feature representations with frequency information can greatly alleviate the ambiguity between foreground objects and the background. With the emergence of vision foundation models such as InternImage and the Segment Anything Model, adapting a pretrained model to COD with a lightweight adapter module is a novel and promising research direction. Existing adapter modules, however, mainly perform feature adaptation in the spatial domain. In this paper, we propose a frequency-guided spatial adaptation method for COD. Specifically, we transform the input features of the adapter into the frequency domain. By grouping and interacting frequency components located within non-overlapping circles in the spectrogram, different frequency components are dynamically enhanced or weakened, so that the intensity of image details and contour features is adaptively adjusted. At the same time, features that help distinguish object from background are highlighted, indirectly implying the position and shape of the camouflaged object. We conduct extensive experiments on four widely adopted benchmark datasets, and the proposed method outperforms 26 state-of-the-art methods by large margins. Code will be released.
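The frequency-domain grouping described above can be illustrated with a short sketch. The snippet below is a minimal PyTorch example, not the authors' released code: it takes an adapter's input features, computes the centered 2D spectrum, partitions the spectrum into non-overlapping radial bands, predicts one weight per band from a global descriptor of the input, rescales each band, and transforms back to the spatial domain. The module name, the number of bands, and the gating MLP are illustrative assumptions.

```python
import torch
import torch.nn as nn


class FrequencyBandReweight(nn.Module):
    """Illustrative sketch of frequency-guided adaptation (hypothetical,
    not the paper's implementation): rescale radial bands of the 2D
    spectrum with input-conditioned weights."""

    def __init__(self, channels: int, num_bands: int = 4):
        super().__init__()
        self.num_bands = num_bands
        # Predict one weight per frequency band from a global descriptor.
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // 2),
            nn.ReLU(inplace=True),
            nn.Linear(channels // 2, num_bands),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Centered 2D spectrum of the input feature map.
        spec = torch.fft.fftshift(torch.fft.fft2(x, norm="ortho"))

        # Radial distance of each frequency bin from the spectrum center,
        # normalized to [0, 1] and quantized into non-overlapping bands.
        yy = torch.linspace(-1.0, 1.0, h, device=x.device).view(h, 1)
        xx = torch.linspace(-1.0, 1.0, w, device=x.device).view(1, w)
        radius = torch.sqrt(yy ** 2 + xx ** 2) / (2 ** 0.5)
        band_idx = torch.clamp((radius * self.num_bands).long(),
                               max=self.num_bands - 1)

        # Input-conditioned band weights, one set per sample.
        weights = self.gate(x.mean(dim=(2, 3)))   # (B, num_bands)
        weight_map = weights[:, band_idx]          # (B, H, W)
        spec = spec * weight_map.unsqueeze(1)      # broadcast over channels

        # Back to the spatial domain; keep the real part as adapted features.
        return torch.fft.ifft2(torch.fft.ifftshift(spec), norm="ortho").real
```

In this sketch the band weights act like a learned band-pass filter: boosting a band strengthens the details and contours carried by those frequencies, while attenuating it suppresses them, mirroring the adaptive enhancement and weakening of frequency components described in the abstract.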



Published In

IEEE Transactions on Multimedia, Volume 27, 2025, 502 pages

Publisher

IEEE Press
