Abstract
Rucio is an open-source software framework that provides scientific collaborations with the functionality to organize, manage, and access their data at scale. The data can be spread across heterogeneous data centers at widely distributed locations [1]. Since its commissioning in 2014, Rucio has become the de facto standard for scientific data management, even outside the CERN community [6]. The wealth of data that Rucio gathers about transfers presents a unique opportunity to better understand the complex mechanisms involved in file transfers across the Worldwide LHC Computing Grid (WLCG). This work studies a recently published dataset [4] to reconstruct the lifetime of transfers, and it reveals important information that can be used to predict the Time To Complete (TTC) of transfers across the WLCG.
Supported by CONICET - IFLP - CERN - LINTI.
References
1. Barisits, M., et al.: Rucio - scientific data management, February 2019. arXiv e-prints. arXiv:1902.09857. https://arxiv.org/abs/1902.09857
2. Begy, V., Barisits, M., Lassnig, M., Schikuta, E.: Forecasting network throughput of remote data access in computing grids. J. Comput. Sci. 44, 101158 (2020). https://doi.org/10.1016/j.jocs.2020.101158. http://www.sciencedirect.com/science/article/pii/S1877750320304592
3. Bogado, J., Monticelli, F., Diaz, J., Lassnig, M., Vukotic, I.: Modelling high-energy physics data transfers. In: 2018 IEEE 14th International Conference on e-Science (e-Science), pp. 334–335 (2018). https://doi.org/10.1109/eScience.2018.00081
4. Bogado, J., Lassnig, M., Monticelli, F., Díaz, J., Beermann, T.: ATLAS Rucio transfers dataset. Zenodo, December 2020. https://doi.org/10.5281/zenodo.4320937
5. Kiryanov, A., Álvarez Ayllón, A., Salichos, M., Keeble, O.: FTS3 - a file transfer service for grids, HPCs and clouds. In: International Symposium on Grids and Clouds 2015, p. 028, March 2016. https://doi.org/10.22323/1.239.0028
6. Lassnig, M., et al.: Rucio beyond ATLAS: experiences from Belle II, CMS, DUNE, EISCAT3D, LIGO/VIRGO, SKA, XENON. Technical report, ATL-SOFT-PROC-2020-017, CERN, Geneva, March 2020. https://doi.org/10.1051/epjconf/202024511006. https://cds.cern.ch/record/2711755
7. Lassnig, M., Toler, W., Vamosi, R., Bogado, J.: Machine learning of network metrics in ATLAS distributed data management. J. Phys. Conf. Ser. 898, 062009 (2017). https://doi.org/10.1088/1742-6596/898/6/062009
8. Zheng, A.: Evaluating Machine Learning Models. O’Reilly Media, Inc., Sebastopol (2015)
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Bogado, J., Lassnig, M., Monticelli, F., Díaz, J. (2021). Modelling Network Throughput of Large-Scale Scientific Data Transfers. In: Naiouf, M., Rucci, E., Chichizola, F., De Giusti, L. (eds) Cloud Computing, Big Data & Emerging Topics. JCC-BD&ET 2021. Communications in Computer and Information Science, vol 1444. Springer, Cham. https://doi.org/10.1007/978-3-030-84825-5_10
DOI: https://doi.org/10.1007/978-3-030-84825-5_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-84824-8
Online ISBN: 978-3-030-84825-5
eBook Packages: Computer Science (R0)