DOI: 10.1145/3666025.3699324
research-article
Open access

ImmerScope: Multi-view Video Aggregation at Edge towards Immersive Content Services

Published: 04 November 2024

Abstract

Multi-camera capture systems are an emerging visual sensing modality. They facilitate the production of immersive content ranging from regular to neural videos. Although immersive content delivery is popular and promising, it suffers from a bandwidth bottleneck when multi-view videos are streamed to the cloud (i.e., multi-view video aggregation). Existing works fail to provide a solution that is both bandwidth-efficient and content-generic. Even the closest effort to ours, which is based on state-of-the-art (SOTA) multi-view video codecs, suffers from underutilized dependency and content distortion. In this paper, we present ImmerScope, a multi-view video aggregation framework at the edge built on a neural multi-view video codec. It outperforms existing solutions by utilizing dependency more fully through neuron connections and by gaining distortion awareness through end-to-end training. Evaluations on diverse multi-camera setups show that ImmerScope achieves at least 64% bandwidth savings over single-view codecs, with quality measured in peak signal-to-noise ratio (PSNR), at a frame rate of 50 fps.
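
To make the evaluation metric concrete, the following is a minimal sketch of how PSNR and a relative bandwidth saving are commonly computed. It is not ImmerScope's evaluation code: the function names `psnr` and `bandwidth_saving`, the pixel values, and the bitrates are all hypothetical and chosen only for illustration.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Return PSNR in dB between two equal-length sequences of pixel values."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the original and the decoded pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: distortion-free
    return 10.0 * math.log10(max_value ** 2 / mse)


def bandwidth_saving(baseline_kbps, candidate_kbps):
    """Return the percentage of bitrate saved relative to a baseline codec."""
    if baseline_kbps <= 0:
        raise ValueError("baseline bitrate must be positive")
    return 100.0 * (baseline_kbps - candidate_kbps) / baseline_kbps


# Hypothetical 8-bit pixel values and bitrates, not results from the paper.
reference = [52, 55, 61, 59]
distorted = [54, 55, 60, 57]
print(f"PSNR: {psnr(reference, distorted):.2f} dB")
print(f"Saving: {bandwidth_saving(1000.0, 360.0):.1f}%")
```

A saving of 64% in this sense means the candidate codec needs 36% of the baseline bitrate; codec comparisons usually make that comparison at matched quality, for example at the same PSNR.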

Published In

SenSys '24: Proceedings of the 22nd ACM Conference on Embedded Networked Sensor Systems
November 2024
950 pages
ISBN: 9798400706974
DOI: 10.1145/3666025
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. multi-view cameras
  2. immersive computing
  3. video streaming
  4. neural codec
  5. edge computing

Acceptance Rates

Overall Acceptance Rate 174 of 867 submissions, 20%
