A Parallel Compression Pipeline for Improving GPU Virtualization Data Transfers
List of Figures

Figure 1. General architecture of remote GPU virtualization solutions.
Figure 2. Comparison between a naive compression approach and the parallel compression pipeline proposed in this paper.
Figure 3. High-level diagram of the implementation of the parallel compression pipeline using n compression/decompression threads and m data chunks.
Figure 4. Scenarios explored in the experiments.
Figure 5. Data used in the experiments.
Figure 6. Study of data chunk sizes using a 1 Gbps network and the Gipfeli compression library. The horizontal line in the plots corresponds to a trade-off equal to zero.
Figure 7. Study of the number of data chunks using a 1 Gbps network and the Gipfeli compression library.
Figure 8. Study of the number of threads using a 1 Gbps network and the Gipfeli compression library.
Figure 9. Results using a 1 Gbps network. Scenarios considered: ‘No Compression’ (rCUDA without compression); ‘Gipfeli Pipeline’ and ‘Lzo Pipeline’ (rCUDA using the parallel compression pipeline with Gipfeli and Lzo, respectively); ‘Gipfeli Naive’ and ‘Lzo Naive’ (rCUDA using Gipfeli and Lzo compression, respectively, without the pipeline). The scenarios are evaluated with datasets of different sizes; see Section 4.1 for details.
Figure 10. Results using a 100 Mbps network. Same scenarios as in Figure 9.
Figure 11. Results using a 10 Mbps network. Same scenarios as in Figure 9.
Figure 12. Normalized bandwidth obtained over a 1 Gbps network by ‘Gipfeli Pipeline’ and ‘Gipfeli Pipeline V2’. The latter only sends the compressed data if its size is smaller than that of the original uncompressed data.
Figure 13. Normalized bandwidth obtained over a 100 Mbps network by ‘Gipfeli Pipeline’ and ‘Gipfeli Pipeline V2’.
Figure 14. Normalized bandwidth obtained over a 10 Mbps network by ‘Gipfeli Pipeline’ and ‘Gipfeli Pipeline V2’.
Figure 15. Percentage of compressed chunks sent to the network with ‘Gipfeli Pipeline V2’ when using data of different sizes.
Figure A1. Study of data chunk sizes using a 100 Mbps network and the Gipfeli compression library with traced data.
Figure A2. Study of the number of data chunks and the number of threads using a 100 Mbps network and the Gipfeli compression library with traced data.
Figure A3. Study of data chunk sizes using a 10 Mbps network and the Gipfeli compression library with traced data.
Figure A4. Study of the number of data chunks and the number of threads using a 10 Mbps network and the Gipfeli compression library with traced data.
Figure A5. Study of data chunk sizes using a 1 Gbps network and the Lz4 compression library with lineal data.
Figure A6. Study of the number of data chunks and the number of threads using a 1 Gbps network and the Lz4 compression library with lineal data.
Figure A7. Study of data chunk sizes using a 1 Gbps network and the Snappy compression library with random data.
Figure A8. Study of the number of data chunks and the number of threads using a 1 Gbps network and the Snappy compression library with random data.
Figure A9. Study of data chunk sizes using a 1 Gbps network and the Lzo compression library with sparse data.
Figure A10. Study of the number of data chunks and the number of threads using a 1 Gbps network and the Lzo compression library with sparse data.
Figure A11. Final results with the best parameters selected per compression library, using a 1 Gbps network and all data types.
Figure A12. Final results with the best parameters selected per compression library, using a 100 Mbps network and all data types.
Figure A13. Final results with the best parameters selected per compression library, using a 10 Mbps network and all data types.
Abstract
1. Introduction
- We design and implement a parallel compression pipeline system for remote GPU virtualization, addressing constraints such as the low-performance networks commonly present in IoT environments and covering different data types (lineal, random, sparse, and traced data).
- We conduct a comprehensive performance analysis to assess the impact of the different parameters on the implemented solution, providing insights for optimizing such systems.
- We demonstrate how existing compression libraries can increase effective network bandwidth when used through a parallel pipeline mechanism, without introducing significant latency.
2. Related Work
2.1. Remote GPU Virtualization Systems
2.2. Compression Solutions
3. A Parallel Compression Pipeline for GPU Virtualization Data Transfers
- Number of compression/decompression threads. Compression and decompression times vary with the size of the data being compressed. It is therefore essential that the main thread, which sends compressed chunks over the network, already has the next chunk compressed and ready before the current chunk transfer finishes. Matching the different speeds of producer and consumer threads requires choosing an appropriate number of compression/decompression threads.
- Number of data chunks. Splitting the data into multiple chunks keeps the compression/decompression threads from idling. An adequate number of chunks should be selected so that those threads always have work to do.
- Size of data chunks. Choosing the best chunk size is also important. A large chunk takes longer to compress than a small one, but it usually achieves a better compression ratio, and better compression can in turn lead to a faster transfer. A sketch of the resulting producer/consumer pipeline is given below.
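To make the interplay of these three parameters concrete, the following is a minimal C++ sketch of the sender side of such a pipeline. It is our illustration, not rCUDA code: `chunk_compress()` and `send_chunk()` are hypothetical placeholders standing in for the compression backend (e.g., one of the libraries of Section 4.2) and the network transport.

```cpp
// Sender side of a parallel compression pipeline: n worker threads
// compress m chunks while the main thread sends them in order, so the
// transfer of chunk i overlaps with the compression of later chunks.
#include <algorithm>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

std::vector<char> chunk_compress(const char* src, size_t len);  // placeholder
void send_chunk(const std::vector<char>& buf);                  // placeholder

void pipelined_send(const char* data, size_t total, size_t chunk_size, int n) {
    const size_t m = (total + chunk_size - 1) / chunk_size;  // number of chunks
    struct Slot { std::vector<char> out; bool ready = false; };
    std::vector<Slot> slots(m);
    std::mutex mtx;
    std::condition_variable cv;
    size_t next = 0;  // next chunk index a worker should take

    auto worker = [&] {
        for (;;) {
            size_t i;
            {
                std::lock_guard<std::mutex> lk(mtx);
                if (next >= m) return;  // no work left
                i = next++;
            }
            const size_t off = i * chunk_size;
            const size_t len = std::min(chunk_size, total - off);
            auto out = chunk_compress(data + off, len);  // CPU-bound stage
            {
                std::lock_guard<std::mutex> lk(mtx);
                slots[i].out = std::move(out);
                slots[i].ready = true;
            }
            cv.notify_all();  // wake the sender if it waits on chunk i
        }
    };

    std::vector<std::thread> pool;
    for (int t = 0; t < n; ++t) pool.emplace_back(worker);

    // Send chunks strictly in order; ideally chunk i+1 is already
    // compressed by the time chunk i has been pushed to the network.
    for (size_t i = 0; i < m; ++i) {
        std::unique_lock<std::mutex> lk(mtx);
        cv.wait(lk, [&] { return slots[i].ready; });
        lk.unlock();
        send_chunk(slots[i].out);
    }
    for (auto& t : pool) t.join();
}
```

The receiver mirrors this structure, with worker threads decompressing chunks as they arrive. Note how too few chunks (m) or too few workers (n) starve the sender, while an oversized chunk delays the first send: exactly the trade-offs enumerated above.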
4. Experimental Results
4.1. Experimental Setup
- Scenario A represents the baseline: the remote GPU virtualization solution rCUDA is used without any compression.
- Scenario B shows an improvement over the previous scenario: naive compression is used to compress the data transfers carried out within rCUDA.
- Scenario C makes use of our parallel compression pipeline system, in which the compression and decompression stages overlap with the data transfers performed within rCUDA.
4.2. Compression Libraries Used in the Experiments
- Snappy [24]. Developed by Google and based on the LZ77 algorithm, this library targets fast compression rather than maximum compression ratio.
- Gipfeli [25]. Also developed by Google and based on LZ77. Gipfeli obtains better compression ratios than Snappy at the cost of increased computation time.
- Lz4 [26]. This LZ77-based compression library focuses on fast compression and decompression (a minimal round trip with its C API is sketched after this list).
- Lzo [27]. This compression library is another LZ77 derivative. It sacrifices compression and decompression speed in exchange for a higher compression ratio.
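As a concrete illustration of how such libraries are driven, below is a minimal compression/decompression round trip using LZ4's C API; the other three libraries offer analogous one-call interfaces. This is a generic usage sketch, not the paper's integration into rCUDA.

```cpp
// Minimal LZ4 round trip: compress a buffer, decompress it back, and
// report the compression ratio (original size / compressed size).
#include <lz4.h>
#include <cstdio>
#include <vector>

int main() {
    // "Lineal" data: bytes counting 0..255 repeatedly (see Section 4.3).
    std::vector<char> src(32 * 1024);
    for (size_t i = 0; i < src.size(); ++i) src[i] = static_cast<char>(i % 256);

    // LZ4_compressBound() gives the worst-case compressed size.
    std::vector<char> dst(LZ4_compressBound(static_cast<int>(src.size())));
    int csize = LZ4_compress_default(src.data(), dst.data(),
                                     static_cast<int>(src.size()),
                                     static_cast<int>(dst.size()));
    if (csize <= 0) return 1;  // compression failed

    std::vector<char> back(src.size());
    int dsize = LZ4_decompress_safe(dst.data(), back.data(), csize,
                                    static_cast<int>(back.size()));
    if (dsize != static_cast<int>(src.size())) return 1;  // corrupt stream

    std::printf("ratio = %.2f\n", double(src.size()) / csize);
    return 0;
}
```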
4.3. Data Used in Experiments
- Lineal data. Bytes start at 0 and increase one by one up to 255; after 255, the sequence starts again from 0.
- Random data. Data are composed of random numbers between 0 and 255.
- Sparse data. Data contain a random run of 0s followed by a random run of numbers between 0 and 255.
- Traced data. Traces obtained from real TensorFlow applications [22]. These are the data actually exchanged between host memory and GPU memory during the execution of TensorFlow applications. We sized the data to make them compatible with the CUDA bandwidth test benchmark. Sketches of the three synthetic generators are given after this list.
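For reference, the three synthetic data types can be generated along the following lines. This is a sketch under our assumptions: the paper does not give generator code, and the run-length bounds for sparse data in particular are our choice. Traced data are recorded from real TensorFlow runs and cannot be synthesized here.

```cpp
// Generators for the lineal, random, and sparse data types of Section 4.3.
#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <random>
#include <vector>

std::vector<uint8_t> lineal(size_t n) {
    std::vector<uint8_t> v(n);
    for (size_t i = 0; i < n; ++i) v[i] = static_cast<uint8_t>(i % 256);
    return v;
}

std::vector<uint8_t> random_bytes(size_t n, std::mt19937& rng) {
    std::uniform_int_distribution<int> byte(0, 255);
    std::vector<uint8_t> v(n);
    for (auto& b : v) b = static_cast<uint8_t>(byte(rng));
    return v;
}

// Alternating runs: a random-length run of zeros, then a random-length
// run of random bytes. The run-length bounds (1..64) are our assumption.
std::vector<uint8_t> sparse(size_t n, std::mt19937& rng) {
    std::uniform_int_distribution<int> byte(0, 255);
    std::uniform_int_distribution<size_t> run(1, 64);
    std::vector<uint8_t> v;
    v.reserve(n);
    bool zeros = true;
    while (v.size() < n) {
        size_t len = std::min(run(rng), n - v.size());
        for (size_t i = 0; i < len; ++i)
            v.push_back(zeros ? 0 : static_cast<uint8_t>(byte(rng)));
        zeros = !zeros;
    }
    return v;
}

int main() {
    std::mt19937 rng(42);  // fixed seed for repeatable experiments
    std::printf("%zu %zu %zu\n", lineal(1 << 20).size(),
                random_bytes(1 << 20, rng).size(), sparse(1 << 20, rng).size());
    return 0;
}
```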
4.4. Finding the Best Parameters for the Parallel Compression Pipeline
4.5. Impact on the Performance of the Parallel Compression Pipeline
4.6. To Send or Not to Send Compressed Data
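The idea evaluated in this section, as used by ‘Gipfeli Pipeline V2’ (Figures 12–15), is to transmit a chunk in compressed form only when compression actually reduced its size. A minimal sketch of that per-chunk decision follows; the one-byte RAW/COMPRESSED flag is our assumed wire format, since the paper excerpt does not specify how the receiver is told which form arrives.

```cpp
// Per-chunk send decision: transmit compressed bytes only when they are
// smaller than the original chunk; otherwise send the chunk as-is.
#include <cstddef>
#include <cstdint>
#include <vector>

enum : uint8_t { RAW = 0, COMPRESSED = 1 };  // assumed one-byte header

std::vector<uint8_t> frame_chunk(const uint8_t* chunk, size_t len,
                                 const std::vector<uint8_t>& compressed) {
    std::vector<uint8_t> frame;
    if (compressed.size() < len) {  // compression paid off
        frame.reserve(1 + compressed.size());
        frame.push_back(COMPRESSED);
        frame.insert(frame.end(), compressed.begin(), compressed.end());
    } else {                        // incompressible (e.g., random data)
        frame.reserve(1 + len);
        frame.push_back(RAW);
        frame.insert(frame.end(), chunk, chunk + len);
    }
    return frame;
}
```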
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A. Additional Material
Appendix A.1. Finding the Best Parameters for the Parallel Compression Pipeline
Appendix A.1.1. Traced Data
Appendix A.1.2. Lineal Data
| Network | Library | Threads | Chunks | Size (KB) | Ratio | Compression + Decompression Time (μs) | Bandwidth (Mb/s) |
|---|---|---|---|---|---|---|---|
| 1 Gbps | Snappy | 2 | 2 | 32 | 18.29 | 18.46 | 8830.11 |
| 1 Gbps | Gipfeli | 1 | 4 | 4 | 8.94 | 6.96 | 3863.49 |
| 1 Gbps | Lz4 | 1 | 2 | 32 | 83.17 | 24.86 | 10,245.63 |
| 1 Gbps | Lzo | 4 | 8 | 4 | 14.52 | 55.32 | 2697.30 |
| 100 Mbps | Snappy | 1 | 4 | 4 | 9.16 | 5.01 | 885.22 |
| 100 Mbps | Gipfeli | 1 | 4 | 4 | 8.94 | 6.96 | 854.13 |
| 100 Mbps | Lz4 | 1 | 2 | 32 | 83.17 | 24.86 | 7423.55 |
| 100 Mbps | Lzo | 2 | 4 | 4 | 14.52 | 55.32 | 1288.82 |
| 10 Mbps | Snappy | 1 | 4 | 4 | 9.16 | 5.01 | 88.79 |
| 10 Mbps | Gipfeli | 1 | 2 | 4 | 8.94 | 6.96 | 86.63 |
| 10 Mbps | Lz4 | 1 | 2 | 128 | 168.26 | 67.77 | 1574.96 |
| 10 Mbps | Lzo | 1 | 2 | 32 | 75.16 | 352.52 | 698.11 |
Appendix A.1.3. Random Data
| Network | Library | Threads | Chunks | Size (KB) | Ratio | Compression + Decompression Time (μs) | Bandwidth (Mb/s) |
|---|---|---|---|---|---|---|---|
| 1 Gbps | Snappy | 1 | 2 | 2 | 0.998 | 6.82 | 919.43 |
| 1 Gbps | Gipfeli | 1 | 2 | 2 | 0.992 | 6.39 | 920.18 |
| 1 Gbps | Lz4 | 2 | 4 | 1 | 0.995 | 26.37 | 895.98 |
| 1 Gbps | Lzo | 8 | 8 | 4 | 0.995 | 359.52 | 878.50 |
| 100 Mbps | Snappy | 1 | 2 | 1 | 0.995 | 5.33 | 97.00 |
| 100 Mbps | Gipfeli | 1 | 2 | 1 | 0.981 | 5.66 | 95.57 |
| 100 Mbps | Lz4 | 1 | 2 | 1 | 0.995 | 26.37 | 97.00 |
| 100 Mbps | Lzo | 1 | 4 | 1 | 0.992 | 80.45 | 96.56 |
| 10 Mbps | Snappy | 2 | 2 | 1 | 0.995 | 5.33 | 9.71 |
| 10 Mbps | Gipfeli | 1 | 2 | 1 | 0.981 | 5.66 | 9.57 |
| 10 Mbps | Lz4 | 1 | 2 | 1 | 0.995 | 26.37 | 9.71 |
| 10 Mbps | Lzo | 2 | 4 | 1 | 0.992 | 80.45 | 9.68 |
Appendix A.1.4. Sparse Data
| Network | Library | Threads | Chunks | Size (KB) | Ratio | Compression + Decompression Time (μs) | Bandwidth (Mb/s) |
|---|---|---|---|---|---|---|---|
| 1 Gbps | Snappy | 1 | 2 | 4 | 20.50 | 7.12 | 3401.36 |
| 1 Gbps | Gipfeli | 1 | 4 | 4 | 18.63 | 7.86 | 3445.01 |
| 1 Gbps | Lz4 | 1 | 2 | 1 | 69.37 | 27.83 | 385.93 |
| 1 Gbps | Lzo | 2 | 8 | 1 | 80.02 | 22.77 | 833.96 |
| 100 Mbps | Snappy | 1 | 2 | 1 | 19.35 | 2.22 | 1889.86 |
| 100 Mbps | Gipfeli | 1 | 2 | 4 | 18.63 | 7.86 | 1727.83 |
| 100 Mbps | Lz4 | 2 | 4 | 1 | 69.37 | 27.83 | 750.15 |
| 100 Mbps | Lzo | 2 | 8 | 1 | 80.02 | 22.77 | 828.20 |
| 10 Mbps | Snappy | 1 | 2 | 4 | 20.50 | 7.12 | 174.77 |
| 10 Mbps | Gipfeli | 1 | 4 | 1 | 15.41 | 2.49 | 141.61 |
| 10 Mbps | Lz4 | 2 | 4 | 2 | 104.31 | 57.15 | 657.80 |
| 10 Mbps | Lzo | 2 | 4 | 1 | 80.02 | 22.77 | 531.81 |
Appendix A.2. Impact on the Performance of the Parallel Compression Pipeline
- Traced data were discussed in Section 4.5. Figure A11d, Figure A12d and Figure A13d show more detailed information for the 1 Gbps, 100 Mbps, and 10 Mbps networks, respectively.
- No compression library improves the bandwidth obtained with random data, neither in the naive version, where all the data are compressed before being sent, nor in the pipeline version, as Figure A11b, Figure A12b and Figure A13b show for the 1 Gbps, 100 Mbps, and 10 Mbps networks. This behavior is consistent with the data in Table A2.
- Lineal data results are presented in Figure A11a for a 1 Gbps network, Figure A12a for a 100 Mbps network, and Figure A13a for a 10 Mbps network. Most compression libraries outperform ‘No Compression’, with the naive versions standing out: their high compression ratio on lineal data greatly increases the effective network bandwidth.
- Compression libraries show interesting behavior with sparse data. On a 1 Gbps network, the fastest libraries (Gipfeli and Snappy) achieve the best bandwidth with their pipeline versions, as seen in Figure A11c. Figure A12c shows how the naive versions of these libraries benefit when the network bandwidth is reduced to 100 Mbps. Finally, the slowest libraries (Lz4 and Lzo) obtain the best bandwidth on a 10 Mbps network, as Figure A13c illustrates. In this last scenario the network is the bottleneck, so the libraries with the best compression ratio, shown in Table A3, achieve the best performance.
References
1. Papadokostaki, K.; Mastorakis, G.; Panagiotakis, S.; Mavromoustakis, C.X.; Dobre, C.; Batalla, J.M. Handling Big Data in the Era of Internet of Things (IoT). In Advances in Mobile Cloud Computing and Big Data in the 5G Era; Springer: Cham, Switzerland, 2017; pp. 3–22.
2. Gubbi, J.; Buyya, R.; Marusic, S.; Palaniswami, M. Internet of Things (IoT): A Vision, Architectural Elements, and Future Directions. Future Gener. Comput. Syst. (FGCS) 2013, 29, 1645–1660.
3. Satyanarayanan, M. The Emergence of Edge Computing. Computer 2017, 50, 30–39.
4. Capra, M.; Peloso, R.; Masera, G.; Ruo Roch, M.; Martina, M. Edge Computing: A Survey on the Hardware Requirements in the Internet of Things World. Future Internet 2019, 11, 100.
5. Cecilia, J.M.; Morales-García, J.; Imbernón, B.; Prades, J.; Cano, J.C.; Silla, F. Using Remote GPU Virtualization Techniques to Enhance Edge Computing Devices. Future Gener. Comput. Syst. (FGCS) 2023, 142, 14–24.
6. NVIDIA Corporation. CUDA (Compute Unified Device Architecture). 2022. Available online: https://developer.nvidia.com/cuda-toolkit (accessed on 15 July 2024).
7. Giunta, G.; Montella, R.; Agrillo, G.; Coviello, G. A GPGPU Transparent Virtualization Component for High Performance Computing Clouds. In Proceedings of the Euro-Par 2010—Parallel Processing: 16th International Euro-Par Conference, Ischia, Italy, 31 August–3 September 2010; pp. 379–391.
8. Silla, F.; Iserte, S.; Reaño, C.; Prades, J. On the Benefits of the Remote GPU Virtualization Mechanism: The rCUDA Case. Concurr. Comput. Pract. Exp. (CCPE) 2017, 29, e4072.
9. Duranton, M.; De Bosschere, K.; Gamrat, C.; Maebe, J.; Munk, H.; Zendra, O. The HiPEAC Vision 2017; HiPEAC High-Performance Embedded Architecture and Compilation: Barcelona, Spain, 2017.
10. Vega, J.; Ruiz, M.; Sánchez, E.; Pereira, A.; Portas, A.; Barrera, E. Real-Time Lossless Data Compression Techniques for Long-Pulse Operation. Fusion Eng. Des. 2007, 82, 1301–1307.
11. Hansson, E.; Karlsson, S. Lossless Message Compression. Dissertation, 2013. Available online: https://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-21434 (accessed on 15 July 2024).
12. Liang, Y.; Li, Y. An Efficient and Robust Data Compression Algorithm in Wireless Sensor Networks. IEEE Commun. Lett. 2014, 18, 439–442.
13. Uthayakumar, J.; Elhoseny, M.; Shankar, K. Highly Reliable and Low-Complexity Image Compression Scheme Using Neighborhood Correlation Sequence Algorithm in WSN. IEEE Trans. Reliab. 2020, 69, 1398–1423.
14. Welton, B.; Kimpe, D.; Cope, J.; Patrick, C.M.; Iskra, K.; Ross, R. Improving I/O Forwarding Throughput with Data Compression. In Proceedings of the 2011 IEEE International Conference on Cluster Computing (Cluster), Austin, TX, USA, 26–30 September 2011; pp. 438–445.
15. Wiseman, Y. Unlimited and Protected Memory for Flight Data Recorders. Aircr. Eng. Aerosp. Technol. (AEAT) 2016, 88, 866–872.
16. Routray, S.K.; Javali, A.; Sharmila, K.; Semunigus, W.; Pappa, M.; Ghosh, A.D. Lossless Compression Techniques for Low Bandwidth Networks. In Proceedings of the 2020 3rd International Conference on Intelligent Sustainable Systems (ICISS), Thoothukudi, India, 3–5 December 2020; pp. 823–828.
17. Hu, N. Network Aware Data Transmission with Compression. In Proceedings of the Fourth Student Symposium on Computer Systems (SOCS-4), Pittsburgh, PA, USA, 6 October 2001; p. 33.
18. Krintz, C.; Sucu, S. Adaptive On-the-Fly Compression. IEEE Trans. Parallel Distrib. Syst. (TPDS) 2005, 17, 15–24.
19. Peterson, P.A.; Reiher, P.L. Datacomp: Locally Independent Adaptive Compression for Real-World Systems. In Proceedings of the 2016 IEEE 36th International Conference on Distributed Computing Systems (ICDCS), Nara, Japan, 27–30 June 2016; pp. 211–220.
20. Chowdhury, M.R.; Tripathi, S.; De, S. Adaptive Multivariate Data Compression in Smart Metering Internet of Things. IEEE Trans. Ind. Inform. 2020, 17, 1287–1297.
21. Kim, Y.; Choi, S.; Lee, D.; Jeong, J.; Kwak, J.; Lee, J.; Lee, G.; Lee, S.; Park, K.; Jeong, J.; et al. Low-Overhead Compressibility Prediction for High-Performance Lossless Data Compression. IEEE Access 2020, 8, 37105–37123.
22. Peñaranda, C.; Reaño, C.; Silla, F. Smash: A Compression Benchmark with AI Datasets from Remote GPU Virtualization Systems. In Proceedings of the International Conference on Hybrid Artificial Intelligence Systems (HAIS), Salamanca, Spain, 5–7 September 2022; Springer: Cham, Switzerland, 2022; pp. 236–248.
23. Peñaranda, C.; Reaño, C.; Silla, F. Exploring the Use of Data Compression for Accelerating Machine Learning in the Edge with Remote Virtual Graphics Processing Units. Concurr. Comput. Pract. Exp. (CCPE) 2022, 35, e7328.
24. Google. Snappy—A Fast Compressor/Decompressor. 2021. Available online: https://github.com/google/snappy (accessed on 15 July 2024).
25. Google. Gipfeli, a High-Speed Compression Library. 2022. Available online: https://github.com/google/gipfeli (accessed on 15 July 2024).
26. LZ4. Lz4 Website. Available online: https://lz4.github.io/lz4/ (accessed on 25 October 2022).
27. Oberhumer, M.F. Lzo Website. Available online: http://www.oberhumer.com/opensource/lzo/ (accessed on 25 October 2022).
28. Chen, H.; Liu, L.; Meng, J.; Lu, W. AFC: An Adaptive Lossless Floating-Point Compression Algorithm in Time Series Database. Inf. Sci. 2024, 654, 119847.
29. Gao, R.; Li, Z.; Tan, G.; Li, X. BeeZip: Towards an Organized and Scalable Architecture for Data Compression. In Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, La Jolla, CA, USA, 27 April–1 May 2024; Volume 3, pp. 133–148.
30. Afroozeh, A.; Felius, L.; Boncz, P. Accelerating GPU Data Processing Using FastLanes Compression. In Proceedings of the 20th International Workshop on Data Management on New Hardware, Santiago, Chile, 10 June 2024; pp. 1–11.
31. Jaranilla, C.; Choi, J. Requirements and Trade-Offs of Compression Techniques in Key–Value Stores: A Survey. Electronics 2023, 12, 4280.
32. Gao, C.; Xu, X.; Yang, Z.; Lin, L.; Li, J. QZRAM: A Transparent Kernel Memory Compression System Design for Memory-Intensive Applications with QAT Accelerator Integration. Appl. Sci. 2023, 13, 10526.
33. Karandikar, S.; Udipi, A.N.; Choi, J.; Whangbo, J.; Zhao, J.; Kanev, S.; Lim, E.; Alakuijala, J.; Madduri, V.; Shao, Y.S.; et al. CDPU: Co-Designing Compression and Decompression Processing Units for Hyperscale Systems. In Proceedings of the 50th Annual International Symposium on Computer Architecture, Orlando, FL, USA, 17–21 June 2023; pp. 1–17.
Parameter values explored in the experiments (each column is an independent set of candidate values):

| Number of Threads | Number of Chunks | Size of Chunks |
|---|---|---|
| 1 | 2 | 512 B |
| 2 | 4 | 1 KB |
| 4 | 8 | 2 KB |
| 8 | 16 | 4 KB |
|  | 32 | 32 KB |
|  | 64 | 128 KB |
| Number of Threads | Number of Chunks | Size of Chunks |
|---|---|---|
| 1 | 4 | 2 KB |
| 2 | 4 | 4 KB |
| 4 | 16 | 4 KB |
| 8 | 16 | 4 KB |
Best parameters found per compression library and network using traced data (cf. Appendix A.1.1):

| Network | Library | Threads | Chunks | Size (KB) | Ratio | Compression + Decompression Time (μs) | Bandwidth (Mb/s) |
|---|---|---|---|---|---|---|---|
| 1 Gbps | Snappy | 1 | 2 | 4 | 2.25 | 17.81 | 1345.68 |
| 1 Gbps | Gipfeli | 2 | 4 | 4 | 2.36 | 55.39 | 1565.04 |
| 1 Gbps | Lz4 | 4 | 8 | 2 | 2.24 | 74.26 | 1190.93 |
| 1 Gbps | Lzo | 8 | 8 | 2 | 2.52 | 159.54 | 763.29 |
| 100 Mbps | Snappy | 1 | 4 | 1 | 1.49 | 7.96 | 127.41 |
| 100 Mbps | Gipfeli | 1 | 4 | 1 | 1.54 | 21.84 | 135.02 |
| 100 Mbps | Lz4 | 1 | 4 | 1 | 1.73 | 40.43 | 134.24 |
| 100 Mbps | Lzo | 2 | 4 | 1 | 1.94 | 91.56 | 148.42 |
| 10 Mbps | Snappy | 1 | 4 | 1 | 1.49 | 7.96 | 12.74 |
| 10 Mbps | Gipfeli | 1 | 4 | 1 | 1.54 | 21.84 | 13.51 |
| 10 Mbps | Lz4 | 1 | 2 | 1 | 1.73 | 40.43 | 13.48 |
| 10 Mbps | Lzo | 1 | 2 | 1 | 1.94 | 91.56 | 14.87 |
| Library | Data | Compression + Decompression Time (μs) | Compression Ratio |
|---|---|---|---|
| Gipfeli | Chunk (1 KB) | 21.84 | 1.54 |
| Gipfeli | Whole (8 MB) | 40,468.28 | 2.22 |
| Lzo | Chunk (1 KB) | 91.56 | 1.94 |
| Lzo | Whole (8 MB) | 1,662,723.10 | 2.65 |
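The chunk-versus-whole trade-off in the table above can be reproduced with a few lines of code. The sketch below uses Snappy's C++ API rather than Gipfeli or Lzo purely for brevity; absolute numbers will of course depend on the data and the hardware.

```cpp
// Compress and decompress one 1 KB chunk and the whole 8 MB buffer,
// comparing round-trip time and compression ratio for each case.
#include <snappy.h>
#include <chrono>
#include <cstdio>
#include <string>
#include <vector>

static double ratio_and_time(const char* data, size_t len, double* us) {
    std::string out, back;
    auto t0 = std::chrono::steady_clock::now();
    snappy::Compress(data, len, &out);          // compress...
    snappy::Uncompress(out.data(), out.size(), &back);  // ...and back
    auto t1 = std::chrono::steady_clock::now();
    *us = std::chrono::duration<double, std::micro>(t1 - t0).count();
    return double(len) / out.size();            // original / compressed
}

int main() {
    std::vector<char> buf(8 * 1024 * 1024);     // "lineal"-style contents
    for (size_t i = 0; i < buf.size(); ++i) buf[i] = char(i % 256);

    double us_chunk, us_whole;
    double r_chunk = ratio_and_time(buf.data(), 1024, &us_chunk);
    double r_whole = ratio_and_time(buf.data(), buf.size(), &us_whole);
    std::printf("chunk (1 KB): ratio %.2f, %.2f us\n", r_chunk, us_chunk);
    std::printf("whole (8 MB): ratio %.2f, %.2f us\n", r_whole, us_whole);
    return 0;
}
```

As in the table, compressing the whole buffer yields a better ratio, but its round-trip time is orders of magnitude larger, which is precisely why the pipeline operates on chunks.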