Abstract
Energy management in computing systems has been a major concern in both academia and industry for several decades. However, in complex real-time distributed computing systems used in data-intensive applications such as space-based control systems or life maintenance systems, where a failure may cause catastrophic results, reliability is as important as timeliness. If left unchecked, the high power consumption and deteriorating reliability of IC chips will restrict the availability of future generations of such computing systems. Although fault tolerance and energy-aware task scheduling have each been addressed independently by researchers, the joint management of energy and reliability needs further exploration. With this in view, the main objective of this chapter is to analyze energy-aware fault-tolerant task scheduling for real-time distributed computing systems.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Bansal, S., Bansal, R.K., Arora, K. (2023). Energy Conscious Scheduling for Fault-Tolerant Real-Time Distributed Computing Systems. In: Pandey, S., Shanker, U., Saravanan, V., Ramalingam, R. (eds) Role of Data-Intensive Distributed Computing Systems in Designing Data Solutions. EAI/Springer Innovations in Communication and Computing. Springer, Cham. https://doi.org/10.1007/978-3-031-15542-0_1
DOI: https://doi.org/10.1007/978-3-031-15542-0_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-15541-3
Online ISBN: 978-3-031-15542-0
eBook Packages: Engineering, Engineering (R0)