Abstract
Reconfigurable devices (RDs) offer significant advantages in real-time embedded systems, but they are susceptible to soft errors. This paper addresses the problem of improving the reliability of independent periodic real-time hardware tasks on RDs through hybrid fault-tolerant scheduling, which combines static and dynamic real-time scheduling techniques. First, the proposed algorithm statically schedules the primary tasks and reserves area and time on the RD for possible backup tasks; passive backup tasks may overlap in these reserved regions (backup overloading). Next, at run time, an event-triggered dispatcher dynamically determines which candidate backup copy should be configured on the overloaded reserved areas. In the static phase, reliability, task deadlines, and RD area limitations determine how backups are overloaded. In the dynamic phase, the execution result of each primary task (success or failure) determines which backup task the dispatcher configures on the reserved area. Experimental results show that the hybrid scheduling technique improves the mean time to failure of the system by an average factor of 1.22 compared with a similar state-of-the-art study.
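The dynamic phase described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `ReservedSlot` and `on_primary_finished` are hypothetical, and the rule that a slot serves at most one failed primary is an assumption made only for illustration. The sketch shows how a dispatcher might react to a primary task's completion event and decide whether a backup should be configured on an area shared by several passive backups.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class ReservedSlot:
    """Area/time window set aside for backups in the static phase (hypothetical model)."""
    candidates: Set[str]               # primaries whose passive backups overlap here
    configured: Optional[str] = None   # primary whose backup is configured, if any


def on_primary_finished(slot: ReservedSlot, task: str, succeeded: bool) -> Optional[str]:
    """Completion-event handler: return the task whose backup to configure, or None."""
    if task not in slot.candidates:
        raise ValueError(f"{task} has no backup reserved in this slot")
    slot.candidates.discard(task)      # this primary's outcome is now known
    if succeeded:
        return None                    # backup not needed; the slot stays available
    if slot.configured is not None:
        return None                    # assumption: one backup per overloaded slot
    slot.configured = task             # configure this primary's backup
    return task


# Usage: two primaries share one reserved slot; only the failed one gets its backup.
slot = ReservedSlot(candidates={"T1", "T2"})
first = on_primary_finished(slot, "T1", succeeded=True)
second = on_primary_finished(slot, "T2", succeeded=False)
```

Here `first` is `None` because T1 succeeded, while `second` is `"T2"`, so the shared area is spent only on the task that actually failed. This is the reason overloading passive backups can save area without giving up recovery.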
Notes
Throughout the remainder of the present paper, both transient and intermittent faults are referred to as soft errors (SEs).
Acknowledgements
The authors sincerely thank Dr. Reza Ramezani, assistant professor at the University of Isfahan, who generously provided many constructive comments that greatly assisted the research.
About this article
Cite this article
Ghavidel, A., Sedaghat, Y. & Naghibzadeh, M. Hybrid scheduling to enhance reliability of real-time tasks running on reconfigurable devices. J Supercomput 76, 4701–4730 (2020). https://doi.org/10.1007/s11227-019-02976-6
DOI: https://doi.org/10.1007/s11227-019-02976-6