Abstract
Redundancy is the traditional technique for increasing system reliability. With modern technology, slack time can serve not only as temporal redundancy but also as an opportunity for energy management schemes to scale down processing speed and supply voltage and thus save energy. In this paper, we consider a system that consists of multiple servers providing reliable service. Assuming that servers have self-detection mechanisms to detect faults, we first propose an efficient parallel recovery scheme that processes service requests in parallel to increase the number of faults that can be tolerated and thus the system reliability. Then, for a given request arrival rate, we explore the optimal number of active servers needed either to minimize system energy consumption while achieving k-fault tolerance or to maximize the number of faults tolerated under a limited energy budget. Analytical results are presented to show the trade-off between energy savings and the number of faults tolerated.
Proc. of the Fifth European Dependable Computing Conference, Apr. 2005.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhu, D., Melhem, R., Mossé, D. (2005). Energy Efficient Configuration for QoS in Reliable Parallel Servers. In: Dal Cin, M., Kaâniche, M., Pataricza, A. (eds) Dependable Computing - EDCC 5. EDCC 2005. Lecture Notes in Computer Science, vol 3463. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11408901_9
DOI: https://doi.org/10.1007/11408901_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25723-3
Online ISBN: 978-3-540-32019-7
eBook Packages: Computer Science, Computer Science (R0)