DOI: 10.1145/3144769.3144776
short-paper

Cosmological Particle Data Compression in Practice

Published: 12 November 2017

Abstract

In cosmological simulations, trillions of particles are handled and several terabytes of particle data are generated in each time step. Transferring this data directly from memory to disk in an uncompressed way results in a massive load on I/O and storage systems. Hence, one goal of domain scientists is to compress the data before storing it to disk while minimizing the loss of information. In this in situ scenario, the available time for the compression of one time step is limited. Therefore, the evaluation of compression techniques has shifted from only focusing on compression rates to including throughput and scalability. This study aims to evaluate and compare state-of-the-art compression techniques applied to particle data. For the investigated compression techniques, quantitative performance indicators such as compression rates, throughput, scalability, and reconstruction errors are measured. Based on these factors, this study offers a comprehensive analysis of the individual techniques and discusses their applicability for in situ compression. Based on this study, future challenges and directions in the compression of cosmological particle data are identified.
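The performance indicators named in the abstract (compression rate, throughput, and reconstruction error) can be illustrated with a minimal sketch. It is not the study's benchmark: the synthetic uniform coordinates, the use of zlib as the lossless coder, the quantization step of 1/64, and all function names are assumptions chosen only to keep the example self-contained.

```python
import random
import struct
import time
import zlib


def make_particles(n, seed=0):
    """Return n synthetic float32 coordinates in [0, 256) as raw bytes (hypothetical data)."""
    rng = random.Random(seed)
    values = [rng.uniform(0.0, 256.0) for _ in range(n)]
    return struct.pack(f"<{n}f", *values)


def quantize(raw, step):
    """Lossy step (assumed for illustration): snap each value to a multiple of `step`."""
    n = len(raw) // 4
    values = struct.unpack(f"<{n}f", raw)
    return struct.pack(f"<{n}f", *(round(v / step) * step for v in values))


def max_abs_error(original, reconstructed):
    """Largest absolute difference between two float32 buffers of equal length."""
    n = len(original) // 4
    a = struct.unpack(f"<{n}f", original)
    b = struct.unpack(f"<{n}f", reconstructed)
    return max(abs(x - y) for x, y in zip(a, b))


def measure(raw, level=6):
    """Compress with zlib and report compression ratio and throughput."""
    start = time.perf_counter()
    packed = zlib.compress(raw, level)
    elapsed = time.perf_counter() - start
    restored = zlib.decompress(packed)
    return {
        "ratio": len(raw) / len(packed),  # uncompressed size / compressed size
        "throughput_mb_s": len(raw) / 1e6 / elapsed if elapsed > 0 else float("inf"),
        "round_trip_exact": restored == raw,  # zlib itself loses nothing
    }


if __name__ == "__main__":
    raw = make_particles(100_000)
    step = 1.0 / 64.0  # power of two, so quantized values are exact in float32
    lossy = quantize(raw, step)
    print("lossless: ", measure(raw))
    print("quantized:", measure(lossy), "max error:", max_abs_error(raw, lossy))
```

In this sketch the quantized buffer bounds the absolute error by half the step and typically compresses better than the raw floats, which shows the trade-off between compression rate and reconstruction error. Real particle data and the compressors evaluated in the paper may behave differently.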

Published In

ISAV'17: Proceedings of the In Situ Infrastructures on Enabling Extreme-Scale Analysis and Visualization
November 2017
53 pages
ISBN:9781450351393
DOI:10.1145/3144769
© 2017 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Case Study
  2. Cosmology
  3. In Situ
  4. Lossless Compression
  5. Lossy Compression

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Conference

SC '17
Acceptance Rates

ISAV'17 Paper Acceptance Rate 9 of 28 submissions, 32%;
Overall Acceptance Rate 23 of 63 submissions, 37%
