Abstract
Power consumption is quickly becoming a first-class optimization metric for many high-performance parallel computing platforms. Voltage scaling is one of the techniques most often employed to reduce it, and prior research has applied it to individual components such as networks, CPUs, and memories. In contrast to most existing voltage scaling efforts, which target a single component, this paper proposes and experimentally evaluates a voltage/frequency scaling algorithm that considers the CPUs and the communication links of a mesh network at the same time. More specifically, it scales the voltages/frequencies of the CPUs in the nodes and of the communication links among them in a coordinated fashion (instead of one after another) so that energy savings are maximized without impacting execution time. Our experiments with several tree-based sparse matrix computations reveal that the proposed integrated voltage scaling approach is very effective in practice, bringing 13% and 17% energy savings over the pure CPU and pure communication link voltage scaling schemes, respectively. The results also show that these savings are consistent across different network sizes and different sets of voltage/frequency levels.
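The difference between scaling CPUs and links "in a coordinated fashion" and scaling them "one after another" can be illustrated with a toy example. The sketch below is not the algorithm of the paper: it assumes a single node that computes and then sends a message before a fixed deadline, a hypothetical set of three voltage/frequency levels shared by the CPU and the link, the usual approximation that dynamic energy for a fixed amount of work grows with the square of the supply voltage, and made-up weights (`cpu_w`, `link_w`) for the relative power of the two components. All names and numbers are illustrative assumptions.

```python
from itertools import product

# Hypothetical (voltage in volts, relative frequency) levels; not taken from the paper.
LEVELS = [(1.2, 1.0), (1.0, 0.75), (0.8, 0.5)]
TOL = 1e-9  # tolerance for floating-point deadline comparisons


def energy(voltage, work, weight):
    # Dynamic energy for a fixed amount of work grows roughly with V^2.
    return weight * voltage ** 2 * work


def slowest_fit(work, budget):
    # Lowest-voltage level whose stretched time (work / frequency) fits the budget.
    fits = [lv for lv in LEVELS if work / lv[1] <= budget + TOL]
    return min(fits, key=lambda lv: lv[0])


def one_after_another(cpu_work, link_work, deadline, cpu_w, link_w):
    # Scale the CPU first (link assumed at full speed), then give the link what is left.
    cpu = slowest_fit(cpu_work, deadline - link_work)
    link = slowest_fit(link_work, deadline - cpu_work / cpu[1])
    total = energy(cpu[0], cpu_work, cpu_w) + energy(link[0], link_work, link_w)
    return total, cpu, link


def coordinated(cpu_work, link_work, deadline, cpu_w, link_w):
    # Consider every (CPU level, link level) pair together and keep the
    # cheapest pair that still meets the deadline.
    feasible = [
        (energy(c[0], cpu_work, cpu_w) + energy(l[0], link_work, link_w), c, l)
        for c, l in product(LEVELS, LEVELS)
        if cpu_work / c[1] + link_work / l[1] <= deadline + TOL
    ]
    return min(feasible, key=lambda t: t[0])


# Toy instance: 4 time units of compute, 2 of communication, deadline of 8,
# and a link that is assumed to be five times as power-hungry as the CPU.
seq = one_after_another(4.0, 2.0, 8.0, cpu_w=1.0, link_w=5.0)
joint = coordinated(4.0, 2.0, 8.0, cpu_w=1.0, link_w=5.0)
```

In this toy instance the sequential scheme spends the slack on the CPU first and leaves the link at a middle level, while the joint search keeps the CPU at full speed and runs the link at the lowest level, which costs less energy under the assumed weights. Exhaustive enumeration is used only because the example is tiny; it says nothing about how the paper searches the space of levels.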
Cite this article
Son, S.W., Malkowski, K., Chen, G. et al. Reducing energy consumption of parallel sparse matrix applications through integrated link/CPU voltage scaling. J Supercomput 41, 179–213 (2007). https://doi.org/10.1007/s11227-007-0113-9
DOI: https://doi.org/10.1007/s11227-007-0113-9