
Frequency-Guided Spatial Adaptation for Camouflaged Object Detection

Published: 17 January 2025

Abstract

Camouflaged object detection (COD) aims to segment camouflaged objects that exhibit patterns very similar to their surrounding environment. Recent works have shown that enhancing feature representations with frequency information can greatly alleviate the ambiguity between foreground objects and the background. With the emergence of vision foundation models such as InternImage and the Segment Anything Model, adapting a pretrained model to COD with a lightweight adapter module is a novel and promising research direction. Existing adapter modules, however, mainly perform feature adaptation in the spatial domain. In this paper, we propose a frequency-guided spatial adaptation method for COD. Specifically, we transform the input features of the adapter into the frequency domain. By grouping and interacting frequency components located within non-overlapping circles in the spectrogram, different frequency components are dynamically enhanced or weakened, so that the intensity of image details and contour features is adaptively adjusted. At the same time, features that help distinguish object from background are highlighted, indirectly implying the position and shape of the camouflaged object. We conduct extensive experiments on four widely adopted benchmark datasets, and the proposed method outperforms 26 state-of-the-art methods by large margins. Code will be released.
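The frequency-domain grouping described above can be illustrated with a short sketch. The snippet below is a minimal PyTorch example, not the authors' released code: it takes an adapter's input features, computes the centered 2D spectrum, partitions the spectrum into non-overlapping radial bands, predicts one weight per band from a global descriptor of the input, rescales each band, and transforms back to the spatial domain. The module name, the number of bands, and the gating MLP are illustrative assumptions.

```python
import torch
import torch.nn as nn


class FrequencyBandReweight(nn.Module):
    """Illustrative sketch of frequency-guided adaptation (hypothetical,
    not the paper's implementation): rescale radial bands of the 2D
    spectrum with input-conditioned weights."""

    def __init__(self, channels: int, num_bands: int = 4):
        super().__init__()
        self.num_bands = num_bands
        # Predict one weight per frequency band from a global descriptor.
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // 2),
            nn.ReLU(inplace=True),
            nn.Linear(channels // 2, num_bands),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Centered 2D spectrum of the input feature map.
        spec = torch.fft.fftshift(torch.fft.fft2(x, norm="ortho"))

        # Radial distance of each frequency bin from the spectrum center,
        # normalized to [0, 1] and quantized into non-overlapping bands.
        yy = torch.linspace(-1.0, 1.0, h, device=x.device).view(h, 1)
        xx = torch.linspace(-1.0, 1.0, w, device=x.device).view(1, w)
        radius = torch.sqrt(yy ** 2 + xx ** 2) / (2 ** 0.5)
        band_idx = torch.clamp((radius * self.num_bands).long(),
                               max=self.num_bands - 1)

        # Input-conditioned band weights, one set per sample.
        weights = self.gate(x.mean(dim=(2, 3)))   # (B, num_bands)
        weight_map = weights[:, band_idx]          # (B, H, W)
        spec = spec * weight_map.unsqueeze(1)      # broadcast over channels

        # Back to the spatial domain; keep the real part as adapted features.
        return torch.fft.ifft2(torch.fft.ifftshift(spec), norm="ortho").real
```

In this sketch the band weights act like a learned band-pass filter: boosting a band strengthens the details and contours carried by those frequencies, while attenuating it suppresses them, mirroring the adaptive enhancement and weakening of frequency components described in the abstract.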



Published In

IEEE Transactions on Multimedia, Volume 27, 2025, 502 pages

Publisher

IEEE Press
