Abstract
Tiny object detection is one of the key challenges for most generic detectors. The main difficulty lies in extracting effective features of tiny objects. Existing methods usually perform generation-based feature enhancement, which is seriously affected by spurious textures and artifacts, making it difficult to make the tiny-object-specific features visible and clear for detection. To address this issue, we propose a self-reconstructed tiny object detection (SR-TOD) framework. We for the first time introduce a self-reconstruction mechanism in the detection model, and discover the strong correlation between it and the tiny objects. Specifically, we impose a reconstruction head in-between the neck of a detector, constructing a difference map of the reconstructed image and the input, which shows high sensitivity to tiny objects. This inspires us to enhance the weak representations of tiny objects under the guidance of the difference maps. Thus, improving the visibility of tiny objects for the detectors. Building on this, we further develop a Difference Map Guided Feature Enhancement (DGFE) module to make the tiny feature representation more clear. In addition, we further propose a new multi-instance anti-UAV dataset. Extensive experiments demonstrate our effectiveness. The code is available: https://github.com/Hiyuur/SR-TOD.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Bai, Y., Zhang, Y., Ding, M., Ghanem, B.: Finding tiny faces in the wild with generative adversarial network. In: CVPR, pp. 21–30 (2018)
Bai, Y., Zhang, Y., Ding, M., Ghanem, B.: SOD-MTGAN: small object detection via multi-task generative adversarial network. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11217, pp. 210–226. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01261-8_13
Bashir, S.M.A., Wang, Y.: Small object detection in remote sensing images with residual feature aggregation-based super-resolution and object detector network. Remote Sens. 13(9), 1854 (2021)
Bochkovskiy, A., Wang, C.Y., Liao, H.Y.M.: YOLOv4: optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934 (2020)
Bosquet, B., Cores, D., Seidenari, L., Brea, V.M., Mucientes, M., Del Bimbo, A.: A full data augmentation pipeline for small object detection based on generative adversarial networks. PR 133, 108998 (2023)
Cai, Z., Vasconcelos, N.: Cascade R-CNN: delving into high quality object detection. In: CVPR, pp. 6154–6162 (2018)
Cao, B., et al.: Autoencoder-driven multimodal collaborative learning for medical image synthesis. IJCV 131, 1–20 (2023)
Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12346, pp. 213–229. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58452-8_13
Chen, C., Liu, M.-Y., Tuzel, O., Xiao, J.: R-CNN for small object detection. In: Lai, S.-H., Lepetit, V., Nishino, K., Sato, Y. (eds.) ACCV 2016. LNCS, vol. 10115, pp. 214–230. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-54193-8_14
Chen, K., et al.: MMDetection: open MMLab detection toolbox and benchmark. arxiv 2019. arXiv preprint arXiv:1906.07155 (2019)
Cheng, G., et al.: Towards large-scale small object detection: survey and benchmarks. IEEE TPAMI 45(11), 13467–13488 (2023)
Coluccia, A., et al.: Drone-vs-bird detection challenge at IEEE avss2021. In: 2021 17th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), pp. 1–8 (2021)
Dai, X., et al.: Dynamic head: unifying object detection heads with attentions. In: CVPR, pp. 7373–7382 (2021)
Deng, C., Wang, M., Liu, L., Liu, Y., Jiang, Y.: Extended feature pyramid network for small object detection. IEEE TMM 24, 1968–1979 (2021)
Du, D., et al.: Visdrone-det2019: the vision meets drone object detection in image challenge results. In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, pp. 213–226 (2019)
Gao, S.H., Cheng, M.M., Zhao, K., Zhang, X.Y., Yang, M.H., Torr, P.: Res2net: a new multi-scale backbone architecture. IEEE TPAMI 43(2), 652–662 (2019)
Ghiasi, G., Lin, T.Y., Le, Q.V.: NAS-FPN: learning scalable feature pyramid architecture for object detection. In: CVPR, pp. 7036–7045 (2019)
Girshick, R.: Fast R-CNN. In: ICCV, pp. 1440–1448 (2015)
Gong, Y., Yu, X., Ding, Y., Peng, X., Zhao, J., Han, Z.: Effective fusion factor in FPN for tiny object detection. In: WACV, pp. 1160–1168 (2021)
Goodfellow, I., et al.: Generative adversarial nets. In: NeurIPS, pp. 2672–2680 (2014)
Guo, G., Chen, P., Yu, X., Han, Z., Ye, Q., Gao, S.: Save the tiny, save the all: hierarchical activation network for tiny object detection. IEEE TCSVT (2023)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Hu, X., et al.: SINet: a scale-insensitive convolutional neural network for fast vehicle detection. IEEE Trans. Intell. Transp. Syst. 20(3), 1010–1019 (2018)
Jiang, L., Dai, B., Wu, W., Loy, C.C.: Focal frequency loss for image reconstruction and synthesis. In: ICCV, pp. 13919–13929 (2021)
Kisantal, M., Wojna, Z., Murawski, J., Naruniec, J., Cho, K.: Augmentation for small object detection. arXiv preprint arXiv:1902.07296 (2019)
Kong, T., Sun, F., Liu, H., Jiang, Y., Li, L., Shi, J.: FoveaBox: beyound anchor-based object detection. IEEE TIP 29, 7389–7398 (2020)
Law, H., Deng, J.: CornerNet: detecting objects as paired keypoints. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11218, pp. 765–781. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01264-9_45
Li, J., Liang, X., Wei, Y., Xu, T., Feng, J., Yan, S.: Perceptual generative adversarial networks for small object detection. In: CVPR, pp. 1222–1230 (2017)
Li, Y., Chen, Y., Wang, N., Zhang, Z.: Scale-aware trident networks for object detection. In: ICCV, pp. 6054–6063 (2019)
Lin, T.Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: CVPR, pp. 2117–2125 (2017)
Lin, T.Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. In: ICCV, pp. 2980–2988 (2017)
Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
Liu, S., Qi, L., Qin, H., Shi, J., Jia, J.: Path aggregation network for instance segmentation. In: CVPR, pp. 8759–8768 (2018)
Lu, X., Li, B., Yue, Y., Li, Q., Yan, J.: Grid R-CNN. In: CVPR, pp. 7363–7372 (2019)
Meethal, A., Granger, E., Pedersoli, M.: Cascaded zoom-in detector for high resolution aerial images. In: CVPRW, pp. 2045–2054 (2023)
Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: ICML, pp. 807–814 (2010)
Noh, J., Bae, W., Lee, W., Seo, J., Kim, G.: Finding tiny faces in the wild with generative adversarial network. In: ICCV, pp. 9725–9734 (2019)
Qiao, S., Chen, L.C., Yuille, A.: Detectors: detecting objects with recursive feature pyramid and switchable atrous convolution. In: CVPR, pp. 10213–10224 (2021)
Rabbi, J., Ray, N., Schubert, M., Chowdhury, S., Chao, D.: Small-object detection in remote sensing images with end-to-end edge-enhanced GAN and object detector network. Remote Sens. 12(9), 1432 (2020)
Redmon, J., Farhadi, A.: Yolo9000: better, faster, stronger. In: CVPR, pp. 7263–7271 (2017)
Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. arXiv preprint arXiv:1804.02767 (2018)
Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: NeurIPS, vol. 28, pp. 91–99 (2015)
Rodriguez-Ramos, A., Rodriguez-Vazquez, J., Sampedro, C., Campoy, P.: Adaptive inattentional framework for video object detection with reward-conditional training. IEEE Access 8, 124451–124466 (2020)
Ronneberger, O., Fischer, P., Brox, T.: U-net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Russakovsky, O., et al.: Imagenet large scale visual recognition challenge. IJCV 115(3), 211–252 (2015)
Singh, B., Davis, L.S.: An analysis of scale invariance in object detection snip. In: CVPR, pp. 3578–3587 (2018)
Sun, P., et al.: Sparse R-CNN: end-to-end object detection with learnable proposals. In: CVPR, pp. 14454–14463 (2021)
Tan, M., Pang, R., Le, Q.V.: EfficientDet: scalable and efficient object detection. In: CVPR, pp. 10781–10790 (2020)
Tian, Z., Shen, C., Chen, H., He, T.: FCOS: fully convolutional one-stage object detection. In: ICCV, pp. 9627–9636 (2019)
Vu, T., Jang, H., Pham, T.X., Yoo, C.: Cascade RPN: delving into high-quality region proposal network with adaptive convolution. In: NeurIPS, vol. 32, pp. 1432–1442 (2019)
Wang, J., Yang, W., Guo, H., Zhang, R., Xia, G.S.: Tiny object detection in aerial images. In: ICPR, pp. 3791–3798 (2021)
Xie, S., Girshick, R., Dollár, P., Tu, Z., He, K.: Proceedings of the IEEE conference on computer vision and pattern recognition. In: CVPR, pp. 1492–1500 (2017)
Xu, C., Wang, J., Yang, W., Yu, H., Yu, L., Xia, G.S.: RFLA: Gaussian receptive field based label assignment for tiny object detection. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) ECCV 2022. LNCS, vol. 13669, pp. 526–543. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-20077-9_31
Yang, C., Huang, Z., Wang, N.: Querydet: cascaded sparse query for accelerating high-resolution small object detection. In: CVPR, pp. 13668–13677 (2022)
Yang, Z., Liu, S., Hu, H., Wang, L., Lin, S.: RepPoints: point set representation for object detection. In: ICCV, pp. 9657–9666 (2019)
Yu, X., Gong, Y., Jiang, N., Ye, Q., Han, Z.: Scale match for tiny person detection. In: WACV, pp. 1257–1265 (2020)
Yuan, X., Cheng, G., Yan, K., Zeng, Q., Han, J.: Small object detection via coarse-to-fine proposal generation and imitation learning. In: ICCV, pp. 6317–6327 (2023)
Zeiler, M.D., Taylor, G.W., Fergus, R.: Adaptive deconvolutional networks for mid and high level feature learning. In: ICCV, pp. 2018–2025 (2011)
Zhang, H., et al.: Dino: DeTR with improved denoising anchor boxes for end-to-end object detection. In: ICLR (2022)
Zhang, S., Chi, C., Yao, Y., Lei, Z., Li, S.Z.: Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection. In: CVPR, pp. 9759–9768 (2020)
Zhao, J., Zhang, J., Li, D., Wang, D.: Vision-based anti-UAV detection and tracking. IEEE Trans. Intell. Transp. Syst. 23(12), 25323–25334 (2022)
Zhou, Y., Deng, W., Tong, T., Gao, Q.: Guided frequency separation network for real-world super-resolution. In: CVPRW, pp. 428–429 (2020)
Zhu, P., et al.: VisDrone-DET2018: the vision meets drone object detection in image challenge results. In: Leal-Taixé, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11133, pp. 437–468. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11021-5_27
Zhu, X., Su, W., Lu, L., Li, B., Wang, X., Dai, J.: Deformable DeTR: deformable transformers for end-to-end object detection. In: ICLR (2021)
Acknowledgements
This work was sponsored in part by the National Key R&D Program of China 2022ZD0116500, in part by the National Natural Science Foundation of China (62106171, 62222608, U23B2049, 61925602), in part by the Haihe Lab of ITAI under Grant 22HHXCJC00002, in part by Tianjin Natural Science Funds for Distinguished Young Scholar under Grant 23JCJQJC00270, and in part by the Key Laboratory of Big Data Intelligent Computing, Chongqing University of Posts and Telecommunications under Grant BDIC-2023-A-008. This work was also sponsored by CAAI-CANN Open Fund, developed on OpenI Community.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
1 Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Cao, B., Yao, H., Zhu, P., Hu, Q. (2025). Visible and Clear: Finding Tiny Objects in Difference Map. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15075. Springer, Cham. https://doi.org/10.1007/978-3-031-72643-9_1
Download citation
DOI: https://doi.org/10.1007/978-3-031-72643-9_1
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72642-2
Online ISBN: 978-3-031-72643-9
eBook Packages: Computer ScienceComputer Science (R0)