DOI: 10.1145/3664647.3681236
research-article

Timeline and Boundary Guided Diffusion Network for Video Shadow Detection

Published: 28 October 2024

Abstract

Video Shadow Detection (VSD) aims to predict shadow masks for a sequence of video frames. Existing methods suffer from inefficient temporal learning, and few exploit a defining characteristic of shadows, namely their boundaries. Motivated by this, we propose a Timeline and Boundary Guided Diffusion (TBGDiff) network for VSD that jointly accounts for past-future temporal guidance and boundary information. Specifically, we design a Dual Scale Aggregation (DSA) module that improves temporal understanding by rethinking the affinity between the long-term and short-term frames of the clipped video. Next, we introduce Shadow Boundary Aware Attention (SBAA), which uses edge context to capture the characteristics of shadows. Moreover, we are the first to introduce a diffusion model for VSD, for which we explore a Space-Time Encoded Embedding (STEE) that injects temporal guidance into the diffusion process for shadow detection. Benefiting from these designs, our model captures both temporal information and shadow properties. Extensive experiments show that our approach outperforms state-of-the-art methods and verify the effectiveness of its components. We release the code at https://github.com/haipengzhou856/TBGDiff.
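
The abstract names two ingredients that can be illustrated generically: a boundary cue derived from a shadow mask, and a diffusion process operating on that mask. The sketch below is not the TBGDiff implementation (see the released repository for that). It assumes a simple 4-neighbour boundary extraction and the standard DDPM forward noising step with a linear beta schedule; all function names and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def boundary_from_mask(mask: np.ndarray) -> np.ndarray:
    """Return the inner boundary of a binary mask (4-neighbourhood).

    Illustrative only: a pixel is on the boundary if it is shadow but at
    least one of its four neighbours is not.
    """
    m = mask.astype(bool)
    padded = np.pad(m, 1, mode="constant", constant_values=False)
    # A pixel is interior if it and its four neighbours are all shadow.
    interior = (
        padded[1:-1, 1:-1]
        & padded[:-2, 1:-1]
        & padded[2:, 1:-1]
        & padded[1:-1, :-2]
        & padded[1:-1, 2:]
    )
    return m & ~interior


def alpha_bar_schedule(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) for an assumed linear schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def q_sample(x0, t, alpha_bar, rng):
    """Standard DDPM forward step: x_t = sqrt(a_t) x_0 + sqrt(1 - a_t) eps."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps


# Toy example: a 4x4 shadow region inside a 6x6 frame.
mask = np.zeros((6, 6), dtype=np.uint8)
mask[1:5, 1:5] = 1
boundary = boundary_from_mask(mask)

# Map the mask to [-1, 1] before noising, a common convention for diffusion.
x0 = mask.astype(np.float64) * 2.0 - 1.0
alpha_bar = alpha_bar_schedule()
rng = np.random.default_rng(0)
xt, eps = q_sample(x0, t=500, alpha_bar=alpha_bar, rng=rng)
```

A denoising network would be trained to recover the mask (or the noise `eps`) from `xt`, conditioned on image features. How the temporal and boundary guidance enter that conditioning is specific to the paper and is not modelled here.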

Cited By

  • (2024) Diffusion Model for Camouflaged Object Segmentation with Frequency Domain. Electronics 13:19 (3922). DOI: 10.3390/electronics13193922. Online publication date: 3-Oct-2024
  • (2024) Non-Invasive to Invasive: Enhancing FFA Synthesis from CFP with a Benchmark Dataset and a Novel Network. Proceedings of the 1st International Workshop on Multimedia Computing for Health and Medicine (7-15). DOI: 10.1145/3688868.3689194. Online publication date: 28-Oct-2024
  • (2024) Language-Driven Interactive Shadow Detection. Proceedings of the 32nd ACM International Conference on Multimedia (5527-5536). DOI: 10.1145/3664647.3681192. Online publication date: 28-Oct-2024

    Information & Contributors

    Information

    Published In

    MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
    October 2024
    11719 pages
    ISBN: 9798400706868
    DOI: 10.1145/3664647
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. boundary attention
    2. diffusion model
    3. temporal guidance
    4. video shadow detection

    Qualifiers

    • Research-article

    Funding Sources

    • Guangzhou Industrial Information and Intelligent Key Laboratory Project
    • Nansha Key Area Science and Technology Project
    • Guangzhou-HKUST(GZ) Joint Funding Program

    Conference

    MM '24
    MM '24: The 32nd ACM International Conference on Multimedia
    October 28 - November 1, 2024
    Melbourne VIC, Australia

    Acceptance Rates

    MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;
    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Article Metrics

    • Downloads (Last 12 months): 56
    • Downloads (Last 6 weeks): 56
    Reflects downloads up to 11 Dec 2024
