Abstract
We investigate the benefits that an energy-aware implementation of the runtime in charge of the concurrent execution of ILUPACK, a sophisticated preconditioned iterative solver for sparse linear systems, produces on the time-power-energy balance of the application. Furthermore, to connect the experimental results with the theory, we propose several simple yet accurate power models that capture the variations in average power resulting from the introduction of the energy-aware strategies into ILUPACK’s runtime, as well as the impact of the P-states, on two distinct platforms based on multicore technology from AMD and Intel.
Notes
1 EXAFLOPS = \(10^{18}\) floating-point arithmetic operations (flops) per second.
In a separate experiment [21], it was determined that the aggregate power supplied by the 3-Volt and 5-Volt lines during the execution of ILUPACK on these two platforms remains practically constant and is negligible compared with that measured from the 12-Volt lines.
References
Duranton, M., et al.: The HiPEAC vision for advanced computing in Horizon 2020 (2013).
Feng, W.C., Feng, X., Ge, R.: Green supercomputing comes of age. IT Professional 10(1), 17–23 (2008).
Dongarra, J., et al.: The international ExaScale software project roadmap. Int. J. High Perform. Comput. Appl. 25(1), 3–60 (2011).
The Green500 list. http://www.green500.org. Accessed Aug 2014
The Top500 list (2010). http://www.top500.org. Accessed Aug 2014
Albers, S.: Energy-efficient algorithms. Commun. ACM 53(5), 86–96 (2010)
ILUPACK. http://www.icm.tu-bs.de/bolle/ilupack/. Accessed Aug 2014
Aliaga, J.I., Bollhöfer, M., Martín, A.F., Quintana-Ortí, E.S.: Exploiting thread-level parallelism in the iterative solution of sparse linear systems. Parallel Comput. 37(3), 183–202 (2011)
Aliaga, J.I., Bollhöfer, M., Martín, A.F., Quintana-Ortí, E.S.: Parallelization of multilevel ILU preconditioners on distributed-memory multiprocessors. In: State of the Art in Scientific and Parallel Computing—PARA 2010. Lecture Notes in Computer Science, vol. 7133, pp. 162–172 (2012)
Aliaga, J.I., Dolz, M.F., Martín, A.F., Mayo, R., Quintana-Ortí, E.S.: Leveraging task-parallelism in energy-efficient ILU preconditioners. In: Proceedings of the Second International Conference on ICT as Key Technology Against Global Warming—ICT-GLOW. Lecture Notes in Computer Science, vol. 7453, pp. 55–63 (2012).
HP Corp., Intel Corp., Microsoft Corp., Phoenix Tech. Ltd., and Toshiba Corp. Advanced configuration and power interface specification, revision 5.0 (2011).
Valentini, G.L., Lassonde, W., Khan, S.U., Samee, U., Min-Allah, N., Madani, S.A., Li, J., Zhang, L., Wang, L., Ghani, N., Kolodziej, J., Li, H., Zomaya, A.Y., Xu, C., Balaji, P., Vishnu, A., Pinel, F., Pecero, J.E., Kliazovich, D., Bouvry, P.: An overview of energy efficiency techniques in cluster computing systems. Cluster Comput. 16(1), 3–15 (2013)
Dolz, M.F., Fernández, J.C., Mayo, R., Quintana-Ortí, E.S.: Energy saving cluster roll: power saving system for clusters. In: Architecture of Computing Systems ARCS. Lecture Notes in Computer Science, vol. 5974, pp. 162–173 (2010).
Pinheiro, E., Bianchini, R., Carrera, E.V., Heath, T.: Dynamic cluster reconfiguration for power and performance. In: Proceedings of the Workshop on Compilers and Operating Systems for Low Power, pp. 75–93 (2003).
Wang, L., Khan, S.U., Chen, D., Kolodziej, J., Ranjan, R., Xu, C., Zomaya, A.Y.: Energy-aware parallel task scheduling in a cluster. Future Generation Comp. Syst. 29(7), 1661–1670 (2013)
Alonso, P., Dolz, M.F., Mayo, R., Quintana-Ortí, E.S.: Energy-efficient execution of dense linear algebra algorithms on multi-core processors. Cluster Comput. 16(3), 497–509 (2013)
Kolodziej, J., Khan, S.U., Wang, L., Byrski, A., Min-Allah, N., Madani, S.A.: Hierarchical genetic-based grid scheduling with energy optimization. Cluster Comput. 16(3), 591–609 (2013)
The STREAM benchmark: Computer memory bandwidth. http://www.streambench.org/. Accessed Aug 2014
Gunther, S., Deval, A., Burton, T.: Energy-efficient computing: power-management system on the Intel Nehalem family of processors. Intel Technol. J. 15(1), 211 (2010).
Schöne, R., Hackenberg, D., Molka, D.: Memory performance at reduced CPU clock speeds: an analysis of current x86_64 processors. In: Proceedings of the 2012 USENIX Conference on Power-Aware Computing and Systems (2012).
Diouri, M.E.M., Dolz, M.F., Glück, O., Lefèvre, L., Alonso, P., Catalán, S., Mayo, R., Quintana-Ortí, E.S.: Solving some mysteries in power monitoring of servers: take care of your wattmeters! In: Proceedings of the Energy Efficiency in Large Scale Distributed Systems Conference—EE-LSDS (2013).
Alonso, P., Badia, R.M., Labarta, J., Barreda, M., Dolz, M.F., Mayo, R., Quintana-Ortí, E.S., Reyes, R.: Tools for power-energy modelling and analysis of parallel scientific applications. In: Proceedings of the 2012 41st International Conference on Parallel Processing—ICPP, pp. 420–429 (2012).
Barrachina, S., Barreda, M., Catalán, S., Dolz, M.F., Fabregat, G., Mayo, R., Quintana-Ortí, E.S.: An integrated framework for power-performance analysis of parallel scientific workloads. In: Proceedings of the Third International Conference on Smart Grids, Green Communications and IT Energy-aware Technologies (2013).
Paraver: the flexible analysis tool. http://www.cepba.upc.es/paraver. Accessed Aug 2014
Alonso, P., Dolz, M.F., Mayo, R., Quintana-Ortí, E.S.: Modeling power and energy of the task-parallel Cholesky factorization on multicore processors. Comput. Sci. Res. Develop. 29(2), 105–112 (2014)
Hill, D.L., Huff, T., Kulick, S., Safranek, R.: The Uncore: a modular approach to feeding the high-performance cores. Intel Technol. J. 14(3), 30 (2010).
AnandTech Forums. Power-consumption scaling with clockspeed and Vcc for the i7-2600K (2011). http://forums.anandtech.com/showthread.php?t=2195927. Accessed Aug 2014
SMP superscalar project home page. http://www.bsc.es/plantillaG.php?cat_id=385. Accessed Aug 2014
FLAME project home page. http://www.cs.utexas.edu/users/flame/. Accessed Aug 2014
NVIDIA Corporation. CUDA toolkit 4.0 readiness for CUDA applications, 4.0 edition, March (2011).
MVAPICH: MPI over InfiniBand, 10GigE/iWARP and RoCE. http://mvapich.cse.ohio-state.edu/. Accessed Aug 2014
Acknowledgments
The researchers from the Universidad Jaume I were supported by project CICYT TIN2011-23283 of the Ministerio de Ciencia e Innovación and FEDER and the FPU program of the Ministerio de Educación, Cultura y Deporte. A. F. Martín was funded by the Generalitat de Catalunya under the program “Ajuts per a la incorporació, amb caràcter temporal, de personal investigador júnior a les universitats públiques del sistema universitari català PDJ 2013”.
Cite this article
Aliaga, J.I., Barreda, M., Dolz, M.F. et al. Assessing the impact of the CPU power-saving modes on the task-parallel solution of sparse linear systems. Cluster Comput 17, 1335–1348 (2014). https://doi.org/10.1007/s10586-014-0402-z
DOI: https://doi.org/10.1007/s10586-014-0402-z