
Energy Conscious Scheduling for Fault-Tolerant Real-Time Distributed Computing Systems

Chapter in: Role of Data-Intensive Distributed Computing Systems in Designing Data Solutions

Abstract

Energy management in computing systems has been a major concern in both academia and industry for several decades. However, in complex real-time distributed computing systems used in data-intensive systems such as space-based control systems or life maintenance systems, where a failure may have catastrophic results, reliability is as important as timeliness. If left unchecked, the high power consumption and deteriorating reliability of IC chips will restrict the availability of future generations of such computing systems. Although fault tolerance and energy-aware task scheduling have each been handled independently by researchers, the problem of addressing energy and reliability management simultaneously needs further exploration. With this in view, the main objective of this chapter is to analyze energy-aware fault-tolerant task scheduling for real-time distributed computing systems.



Corresponding author

Correspondence to Kiran Arora.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Bansal, S., Bansal, R.K., Arora, K. (2023). Energy Conscious Scheduling for Fault-Tolerant Real-Time Distributed Computing Systems. In: Pandey, S., Shanker, U., Saravanan, V., Ramalingam, R. (eds) Role of Data-Intensive Distributed Computing Systems in Designing Data Solutions. EAI/Springer Innovations in Communication and Computing. Springer, Cham. https://doi.org/10.1007/978-3-031-15542-0_1

  • DOI: https://doi.org/10.1007/978-3-031-15542-0_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15541-3

  • Online ISBN: 978-3-031-15542-0

  • eBook Packages: Engineering, Engineering (R0)
