Failure Identification Using Model-Implemented Fault Injection with Domain Knowledge-Guided Reinforcement Learning
Figure 1. The reinforcement learning algorithm's block diagram.
Figure 2. Overview of the proposed method.
Figure 3. Fault models. (a) Fault model of data changing; (b) fault model of noise addition.
Figure 4. Adaptive cruise control model operational modes.
Figure 5. Adaptive cruise control model.
Figure 6. Autonomous emergency braking model modes.
Figure 7. Fault injection setup for ACC using RL.
Figure 8. Fault value and activation time of the trained agent.
Figure 9. Result of the FI with the trained agent on ACC.
Figure A1. Severity level of found faults for the RL algorithms with default hyperparameters and with tuned hyperparameters on ACC.
Figure A2. Example of other faults.
Abstract
1. Introduction
- Firstly, we introduce a reinforcement learning-based method to identify dynamic faults: the RL agent guides the fault injection simulation, selecting which faults to inject into the system model in order to expose failures.
- Secondly, we focus on extracting high-level domain knowledge based on the model under test (e.g., its architecture, input boundaries, and states), the safety requirements, and the model's temporal behavior. This domain knowledge prunes the fault space for the RL agent and focuses it on the most important fault parameters, such as the fault location and the fault value boundary; the agent then automatically explores the remaining fault space to find the other fault parameters. Deriving the fault location and fault value boundary from the model composition maximizes the efficiency of fault injection.
- Thirdly, we provide guidelines for configuring the different parts of the reinforcement learning agent based on the extracted domain knowledge, and we define a technique to shape the reward function using temporal logic and the safety requirements.
- Finally, we validate our approach using two experimental case studies and discuss the limitations and benefits. Our case studies are Simulink models of an adaptive cruise control system and an autonomous emergency braking system, two important systems in modern vehicles.
2. Background and Related Works
2.1. Background
2.1.1. Fault Injection
2.1.2. Reinforcement Learning
2.1.3. System Requirements and Signal Temporal Logic
2.2. Related Work
2.2.1. Test Automation Methods
2.2.2. Historical Data
2.2.3. Machine Learning
2.2.4. Using Domain Knowledge
2.2.5. Statistical Methods
2.2.6. Abstracting of System
3. Reinforcement Learning-Based Catastrophic Fault Identification Method
3.1. Collection of Domain Knowledge
- Fault model: A fault is a disturbance that can deviate a system from its correct service. A fault model is a representation of a real-world fault that we use to evaluate its impact. For example, stuck-at-zero and stuck-at-high (e.g., at the power-supply voltage) are two common hardware faults; we can model them in a generic form such that the outcome matches the real-world fault. Each of our fault models has two parameters: the amplitude of the fault and its activation time (see the sketch after this list).
- Signal boundaries: This involves knowledge about the range of each input, output, or intermediate signal. It supports the definition of the input space and avoids simulating invalid conditions and out-of-range signals. It also keeps the injected fault representative: a fault value that stays within the specified range mimics a real-world disturbance, whereas one outside it does not.
- Model architecture: This defines how the model is built up in terms of high-level functional blocks, how those blocks are connected and transfer data, how the controllers work, how many states the model has, etc. This helps determine the parts where faults should be injected. The fault injection target is selected by the test engineer, who can use further information about critical components, system constraints, complexity, and risk to prioritize targets.
- System specification: This relates to model properties or any metric, such as performance, accuracy, robustness, or dependability (i.e., safety and reliability), that quantitatively characterizes the model.
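To make the first item concrete, the sketch below implements the two fault models used later (data change and noise addition) as plain signal transformations, each parameterized by a fault amplitude and an activation time. The function names and example trace are illustrative, not taken from the paper's tooling.

```python
import numpy as np

def data_change_fault(signal, t, fault_value, t_activate):
    """Data-change fault: from t_activate onward, the signal is
    replaced by the injected fault value."""
    faulty = signal.copy()
    faulty[t >= t_activate] = fault_value
    return faulty

def noise_addition_fault(signal, t, amplitude, t_activate, rng=None):
    """Noise-addition fault: from t_activate onward, uniform noise
    bounded by the fault amplitude is added to the signal."""
    rng = np.random.default_rng() if rng is None else rng
    faulty = signal.copy()
    active = t >= t_activate
    faulty[active] += rng.uniform(-amplitude, amplitude, size=active.sum())
    return faulty

# Example: a 10 s velocity trace sampled at the 0.1 s solver step
t = np.arange(0.0, 10.0, 0.1)
velocity = 20.0 + 0.5 * t
faulty_velocity = data_change_fault(velocity, t, fault_value=0.0, t_activate=5.0)
```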
3.2. Reinforcement Learning Configuration
- Create environment: provide tabular information defining the action signal, observation signal, reward function, and fault model in the model under test, so that the problem is formulated as a reinforcement learning problem (a minimal environment sketch follows this list). The extended model (the model under test together with all the mentioned signals and models) is considered the environment for the agent.
- Create agent: define a policy representation and configure the agent's learning algorithm. An appropriate agent can be chosen based on Table 1 and can use neural networks (NNs) internally; the neural network enables the agent to handle complex environments.
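Since the experiments rely on OpenAI Gym and Stable-Baselines3, one plausible realization of the "create environment" step is a custom Gym environment wrapping the model under test. The skeleton below is a minimal sketch under that assumption; `simulate_model` is a hypothetical hook standing in for the Simulink co-simulation, and the two-signal observation is illustrative.

```python
import numpy as np
import gym
from gym import spaces

class FaultInjectionEnv(gym.Env):
    """Wraps the model under test: actions choose fault parameters,
    observations expose model signals, and the reward follows the
    shaped specification (Section 3.2.2)."""

    def __init__(self, simulate_model, fault_range, sim_time):
        super().__init__()
        # Action: normalized fault value and activation time in [-1, 1]
        self.action_space = spaces.Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32)
        # Observation: e.g., relative distance and ego velocity
        self.observation_space = spaces.Box(low=-np.inf, high=np.inf, shape=(2,), dtype=np.float32)
        self.simulate_model = simulate_model  # hypothetical co-simulation hook
        self.fault_range = fault_range
        self.sim_time = sim_time

    def step(self, action):
        fault_value = self._denorm(action[0], *self.fault_range)
        t_activate = self._denorm(action[1], 0.0, self.sim_time)
        obs, reward, failed = self.simulate_model(fault_value, t_activate)
        return obs, reward, True, {"failure": failed}  # one simulation per episode

    def reset(self):
        return np.zeros(2, dtype=np.float32)

    @staticmethod
    def _denorm(a, lo, hi):
        """Map a normalized action in [-1, 1] back to its physical range."""
        return lo + (a + 1.0) * (hi - lo) / 2.0
```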
3.2.1. Define Action and Observation Signal
- System/subsystem: a subsystem or system of interest in the model for the fault injection simulation.
- Signal name: the signal name of interest for the fault injection simulation.
- Fault type: the specified fault model for the simulation.
- Fault range: a fault range must be specified so that the agent can bound its exploration of the fault space. These ranges also let us scale signals to [−1, 1], mainly for training performance (see the encoding sketch after this list).
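As an illustration of the tabular definition above, the fault specification can be encoded directly as data; the rows below mirror the ACC fault table reported with the experiments, and the dictionary structure itself is an assumption about how such a table might be represented.

```python
# Hypothetical encoding of the fault-specification table: each row names
# the injection target, the fault model, and the value range used both to
# bound exploration and to scale actions to [-1, 1].
fault_table = [
    {"subsystem": "Ego Vehicle", "signal": "Velocity",          "fault": "data_change", "range": (-3.0, 2.0)},
    {"subsystem": "Sensor",      "signal": "Radar",             "fault": "noise",       "range": (-1.0, 1.0)},
    {"subsystem": "Sensor",      "signal": "Acceleration",      "fault": "noise",       "range": (-2.0, 2.0)},
    {"subsystem": "Controller",  "signal": "Relative Distance", "fault": "data_change", "range": (0.0, 200.0)},
]

def to_normalized(value, rng):
    """Scale a fault value from its physical range to [-1, 1]."""
    lo, hi = rng
    return 2.0 * (value - lo) / (hi - lo) - 1.0
```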
3.2.2. Reward Signals
- Variable/signal: The first element identifies the signal or variable in the model related to the specification.
- Relation: The second element determines the expected relationship between the signal or variable and the reward magnitude. There are two types of relation, inverse (I) and direct (D). A direct relation means that the reward value must increase as the specified signal increases and decrease as it decreases; an inverse relation means the opposite: the reward value must decrease as the signal increases and increase as it decreases.
- Strength: To specify the magnitude (sensitivity) of the relation, three levels are defined for I and D: strong strong (SS), strong (S), and normal (no explicit prefix). In total, there are six magnitude indicators: SSD, SD, D, SSI, SI, and I. In the proposed method, S indicates multiplication by ten and SS indicates multiplication by one hundred.
- Sign: The last element defines the impact of the signal on the reward value. A positive (P) sign increases the reward value (a motivation); a negative (N) sign decreases it (a penalty). One possible encoding of this scheme is sketched after this list.
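The sketch below shows one possible encoding of the (variable, relation and strength, sign) scheme, with S = 10 and SS = 100 as stated; the way direction and sign are combined here is our reading, and the actual composition in the paper may differ.

```python
STRENGTH = {"SS": 100.0, "S": 10.0, "": 1.0}

def reward_term(value, relation, sign):
    """One shaped-reward contribution.
    relation: one of SSD, SD, D, SSI, SI, I; sign: P or N."""
    strength = STRENGTH[relation[:-1]]                 # SS, S, or normal
    direction = 1.0 if relation[-1] == "D" else -1.0   # direct vs. inverse
    polarity = 1.0 if sign == "P" else -1.0            # motivation vs. penalty
    return polarity * direction * strength * value

# Example signal values at one simulation step (illustrative only)
rel_distance, sim_time, velocity = 45.0, 12.3, 22.0

# ACC reward table: (Relative Distance, SD, N), (Time, D, N), (Velocity, D, P)
reward = (reward_term(rel_distance, "SD", "N")
          + reward_term(sim_time, "D", "N")
          + reward_term(velocity, "D", "P"))
```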
3.2.3. Fault Injection Models
3.2.4. Agent Type Selection
Agent | Type | Action Space |
---|---|---|
Q-Learning Agents (Q) [60] | Value-Based | Discrete |
Deep Q-Network Agents (DQN) [61] | Value-Based | Discrete |
Policy Gradient Agents (PG) [62] | Policy-Based | Discrete or Continuous |
Actor–Critic Agents (AC) [63] | Actor–Critic | Discrete or Continuous |
Advantage Actor–Critic Agents (A2C) [64] | Actor–Critic | Discrete or Continuous |
Proximal Policy Optimization Agents (PPO) [65] | Actor–Critic | Discrete or Continuous |
Deep Deterministic Policy Gradient Agents (DDPG) [66] | Actor–Critic | Continuous |
Twin-Delayed Deep Deterministic Agents (TD3) [67] | Actor–Critic | Continuous |
Soft Actor–Critic Agents (SAC) [68] | Actor–Critic | Continuous |
3.3. Experiment Composition
3.4. Hyperparameter Tuning, Training, Data Processing, and Visualization
Algorithm 1. Algorithm of the proposed approach.
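The hyperparameter-tuning step can be realized with Optuna and Stable-Baselines3, both of which the experiments use. The sketch below tunes a SAC agent on the illustrative FaultInjectionEnv from Section 3.2; the search ranges, training budget, and scoring are assumptions, not the paper's settings.

```python
import optuna
from stable_baselines3 import SAC

def objective(trial):
    # Search ranges are illustrative, not the paper's.
    params = {
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 1e-2, log=True),
        "gamma": trial.suggest_float("gamma", 0.9, 0.9999),
        "tau": trial.suggest_float("tau", 0.001, 0.05),
        "batch_size": trial.suggest_categorical("batch_size", [64, 128, 256]),
    }
    # FaultInjectionEnv and simulate_model come from the earlier sketch.
    env = FaultInjectionEnv(simulate_model, fault_range=(-3.0, 2.0), sim_time=10.0)
    model = SAC("MlpPolicy", env, **params, verbose=0)
    model.learn(total_timesteps=2_000)
    # Score the trial by the reward of one greedy rollout
    obs = env.reset()
    action, _ = model.predict(obs, deterministic=True)
    _, reward, _, _ = env.step(action)
    return reward

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```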
4. Experimental Study
4.1. Case Studies
4.1.1. Adaptive Cruise Control
4.1.2. Autonomous Emergency Braking
4.2. Domain Knowledge for ACC
4.2.1. Define Action and Observation Signal
- Signal boundaries: the relative distance (relative distance = position of the lead car − position of the ego car) is between 0 and 200 m; the velocity is between 0 and 30 m/s; and the acceleration is between −3 and 2 m/s²;
- Model architecture: two models define the control algorithm, a spacing control model and a speed control model; there are two sensors (acceleration and radar) and two important signals (velocity and relative distance);
- System specification: the ACC safety requirements, expressed as STL formulas (see the robustness sketch after this list);
- Running scenario: at the beginning of the simulation, the acceleration of the vehicle is increased and then decreased;
- Simulation configuration: a fixed-step solver is used with a step size of 0.1 s.
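A specification such as "the relative distance always exceeds the safe distance" can be checked quantitatively via STL robustness. The sketch below hand-rolls the robustness of G(d_rel − d_safe > 0) over an illustrative trace instead of using a dedicated STL library; the safe-distance formula and signal profiles are assumptions.

```python
import numpy as np

def always_greater_robustness(d_rel, d_safe):
    """Robustness of the STL formula G(d_rel - d_safe > 0): the
    worst-case margin over the trace. Negative means the safety
    requirement is violated; the magnitude indicates severity."""
    return float(np.min(d_rel - d_safe))

# Illustrative trace at the 0.1 s solver step
t = np.arange(0.0, 10.0, 0.1)
v_ego = 25.0 + 0.2 * t        # ego velocity in m/s (assumed, within [0, 30])
d_safe = 10.0 + 1.4 * v_ego   # default spacing + time gap * velocity (assumed)
d_rel = 60.0 - 3.0 * t        # gap shrinking toward the lead car

rho = always_greater_robustness(d_rel, d_safe)
print("violated" if rho < 0 else "satisfied", rho)
```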
4.2.2. Reward Shaping
4.2.3. Fault Injection Models
4.2.4. Agent Type Selection
4.2.5. Experiment Composition
4.2.6. Training, Data Processing, and Visualization
4.3. Experimental Questions and Experimental Setup
4.4. Results
4.5. Validation
4.6. Threats to Validity
5. Discussion
- High computation cost of training and tuning: The agent requires many simulation trials to reach appropriate hyperparameters and many training iterations to converge on the desired result.
- Appropriateness of the domain knowledge: The test engineer needs accurate and sufficient information to set up the agent and framework. If the data are incorrect or too abstract, the agent faces an enormous or wrong fault space.
- Dependency of the results on toolkits, hardware, and the number of iterations: The proposed framework's results vary across CPUs, GPUs, and software versions, especially in execution time.
- No performance guarantee, and training may not converge: Discovering dynamic faults depends on a proper agent and experiment configuration. If the agent finds no fault, this does not mean that no fault exists; adjusting and re-running the agent may still uncover faults.
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
A2C | Advantage Actor Critic |
AC | Actor–Critic |
ACC | Adaptive Cruise Control |
AEB | Autonomous Emergency Braking |
CPS | Cyber-Physical System |
CPU | Central Processing Unit |
DDPG | Deep Deterministic Policy Gradient |
DQN | Deep Q-Network Agent |
FCW | Forward Collision Warning |
FI | Fault Injection |
FMEA | Failure Mode and Effects Analysis |
FMECA | Failure Mode, Effects, and Criticality Analysis |
FMI | Functional Mock-up Interface |
FMU | Functional Mock-up Unit |
FTA | Fault Tree Analysis |
GB | Gigabyte |
GPU | Graphics Processing Unit |
HP | Hyperparameter |
HPT | Hyperparameter tuning |
MDP | Markov Decision Process |
ML | Machine Learning |
MUT | Model Under Test |
NN | Neural Network |
PG | Policy Gradient |
PPO | Proximal Policy Optimization |
RAM | Random Access Memory |
RBFI | Random-Based Fault Injection |
RL | Reinforcement Learning |
SAC | Soft Actor–Critic Agent |
STL | Signal Temporal Logic |
SUT | System Under Test |
TD3 | Twin-Delayed Deep Deterministic |
TTC | Time-To-Collision |
VHDL | Very High-Speed Integrated Circuit Hardware Description Language |
Appendix A. DDPG Algorithm’s Pseudo-Code
Algorithm A1. DDPG algorithm.
Appendix B. Default Hyperparameters
Algo | Policy | Lambda/Tau | Gamma | Learning Rate | Batch Size |
---|---|---|---|---|---|
A2C | MlpPolicy | 1.0 | 0.99 | 0.0007 | 5 |
SAC | MlpPolicy | 0.005 | 0.99 | 0.0003 | 256 |
DDPG | MlpPolicy | 0.005 | 0.99 | 0.001 | 100 |
TD3 | MlpPolicy | 0.005 | 0.99 | 0.001 | 100 |
PPO | MlpPolicy | 0.95 | 0.99 | 0.0003 | 64 |
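For reference, these values match the Stable-Baselines3 defaults; a minimal sketch of instantiating one agent with the tabulated settings follows, assuming the FaultInjectionEnv and simulate_model from the earlier environment sketch.

```python
from stable_baselines3 import DDPG

# Instantiate DDPG with the tabulated defaults (tau, gamma, learning rate,
# batch size); env reuses the illustrative environment from Section 3.2.
env = FaultInjectionEnv(simulate_model, fault_range=(-3.0, 2.0), sim_time=10.0)
model = DDPG("MlpPolicy", env,
             tau=0.005, gamma=0.99, learning_rate=0.001, batch_size=100)
model.learn(total_timesteps=10_000)
```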
Appendix C. Results
Appendix C.1. Severity of Found Faults
Appendix C.2. Effect of Hyperparameters and Reward Type on Number of Found Faults
Use Case | Specification (No HPT) | Specification (with HPT) | Custom (No HPT) | Custom (with HPT) | STL (No HPT) | STL (with HPT) |
---|---|---|---|---|---|---|
ACC | 1 | 2 | 6 | 5 | 3 | 3 |
AEB | 1 | 1 | 0 | 2 | 0 | 3 |
Use Case | A2C Spec. | A2C Cus. | A2C STL | DDPG Spec. | DDPG Cus. | DDPG STL | PPO Spec. | PPO Cus. | PPO STL | SAC Spec. | SAC Cus. | SAC STL | TD3 Spec. | TD3 Cus. | TD3 STL |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ACC | 1 | 2 | 2 | 0 | 2 | 0 | 1 | 0 | 2 | 0 | 5 | 0 | 1 | 2 | 2 |
AEB | 1 | 1 | 2 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
Appendix C.3. Additional Example of Faults
References
- Lee, E.A. Cyber physical systems: Design challenges. In Proceedings of the 2008 11th IEEE International Symposium on Object and Component-Oriented Real-Time Distributed Computing (ISORC), Orlando, FL, USA, 5–7 May 2008; pp. 363–369. [Google Scholar]
- Ammann, P.; Offutt, J. Introduction to Software Testing; Cambridge University Press: Cambridge, UK, 2016. [Google Scholar]
- Dafflon, B.; Moalla, N.; Ouzrout, Y. The challenges, approaches, and used techniques of CPS for manufacturing in Industry 4.0: A literature review. Int. J. Adv. Manuf. Technol. 2021, 113, 2395–2412. [Google Scholar] [CrossRef]
- Avizienis, A.; Laprie, J.; Randell, B.; Landwehr, C. Basic concepts and taxonomy of dependable and secure computing. IEEE Trans. Dependable Secur. Comput. 2004, 1, 11–33. [Google Scholar] [CrossRef]
- Koopman, P. The heavy tail safety ceiling. In Proceedings of the Automated and Connected Vehicle Systems Testing Symposium, Greenville, SC, USA, 20–21 June 2018; Volume 1145. [Google Scholar]
- Hsueh, M.C.; Tsai, T.K.; Iyer, R.K. Fault injection techniques and tools. Computer 1997, 30, 75–82. [Google Scholar] [CrossRef]
- Benso, A.; Prinetto, P. Fault Injection Techniques and Tools for Embedded Systems Reliability Evaluation; Springer: Berlin/Heidelberg, Germany, 2003; Volume 23. [Google Scholar]
- Arlat, J.; Costes, A.; Crouzet, Y.; Laprie, J.C.; Powell, D. Fault injection and dependability evaluation of fault-tolerant systems. IEEE Trans. Comput. 1993, 42, 913–923. [Google Scholar] [CrossRef]
- ISO. Road Vehicles–Functional Safety. 2011. Available online: https://www.iso.org/standard/43464.html (accessed on 14 February 2023).
- Koopman, P.; Ferrell, U.; Fratrik, F.; Wagner, M. A Safety Standard Approach for Fully Autonomous Vehicles. In Proceedings of the Computer Safety, Reliability, and Security, Turku, Finland, 11–13 September 2019; Romanovsky, A., Troubitsyna, E., Gashi, I., Schoitsch, E., Bitsch, F., Eds.; Springer International Publishing: Cham, Switzerland, 2019; pp. 326–332. [Google Scholar]
- ISO/PAS 21448; Road Vehicles-Safety of the Intended Functionality (SOTIF). ISO: Geneva, Switzerland, 2019.
- Ubar, R.; Devadze, S.; Raik, J.; Jutman, A. Parallel X-fault simulation with critical path tracing technique. In Proceedings of the 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010), Dresden, Germany, 8–12 March 2010; pp. 879–884. [Google Scholar]
- Zheng, H.; Fan, L.; Yue, S.; Liu, L. A Monte Carlo-based control signal generator for single event effect (SEE) fault injection. In Proceedings of the 2009 European Conference on Radiation and Its Effects on Components and Systems, Bruges, Belgium, 14–18 September 2009; pp. 247–251. [Google Scholar] [CrossRef]
- Moradi, M.; Van Acker, B.; Vanherpen, K.; Denil, J. Model-Implemented Hybrid Fault Injection for Simulink (Tool Demonstrations). In Cyber Physical Systems. Model-Based Design; Chamberlain, R., Taha, W., Törngren, M., Eds.; Springer International Publishing: Cham, Switzerland, 2019; pp. 71–90. [Google Scholar]
- Koopman, P.; Wagner, M. Toward a Framework for Highly Automated Vehicle Safety Validation; SAE Technical Paper Tech. Rep.; SAE International: Warrendale, PA, USA, 2018; pp. 79–99. [Google Scholar] [CrossRef]
- Virtual Test Drive. Available online: https://hexagon.com/products/virtual-test-drive (accessed on 12 February 2023).
- Wicker, M.; Huang, X.; Kwiatkowska, M. Feature-Guided Black-Box Safety Testing of Deep Neural Networks. arXiv 2018, arXiv:1710.07859. [Google Scholar]
- Svenningsson, R.; Vinter, J.; Eriksson, H.; Törngren, M. MODIFI: A MODel-Implemented Fault Injection Tool. In Proceedings of the Computer Safety, Reliability, and Security, Vienna, Austria, 14–17 September 2010; Schoitsch, E., Ed.; Springer: Berlin/Heidelberg, Germany, 2010; pp. 210–222. [Google Scholar]
- Lange, T.; Balakrishnan, A.; Glorieux, M.; Alexandrescu, D.; Sterpone, L. On the estimation of complex circuits functional failure rate by machine learning techniques. In Proceedings of the 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks–Supplemental Volume (DSN-S), Portland, OR, USA, 24–27 June 2019; pp. 35–41. [Google Scholar]
- Corso, A.; Moss, R.; Koren, M.; Lee, R.; Kochenderfer, M. A Survey of Algorithms for Black-Box Safety Validation of Cyber-Physical Systems. J. Artif. Intell. Res. 2021, 72, 377–428. [Google Scholar] [CrossRef]
- Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction; MIT Press: Cambridge, MA, USA, 2018. [Google Scholar]
- Moradi, M.; Oakes, B.J.; Saraoglu, M.; Morozov, A.; Janschek, K.; Denil, J. Exploring Fault Parameter Space Using Reinforcement Learning-based Fault Injection. In Proceedings of the 2020 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W), Valencia, Spain, 29 June–2 July 2020; pp. 102–109. [Google Scholar] [CrossRef]
- Stott, D.T.; Ries, G.; Hsueh, M.C.; Iyer, R.K. Dependability analysis of a high-speed network using software-implemented fault injection and simulated fault injection. IEEE Trans. Comput. 1998, 47, 108–119. [Google Scholar] [CrossRef]
- Abboush, M.; Bamal, D.; Knieke, C.; Rausch, A. Hardware-in-the-Loop-Based Real-Time Fault Injection Framework for Dynamic Behavior Analysis of Automotive Software Systems. Sensors 2022, 22, 1360. [Google Scholar] [CrossRef]
- Bodmann, P.R.; Papadimitriou, G.; Junior, R.L.R.; Gizopoulos, D.; Rech, P. Soft Error Effects on Arm Microprocessors: Early Estimations versus Chip Measurements. IEEE Trans. Comput. 2021, 71, 2358–2369. [Google Scholar] [CrossRef]
- Kiamanesh, B.; Behravan, A.; Obermaisser, R. Fault Injection with Multiple Fault Patterns for Experimental Evaluation of Demand-Controlled Ventilation and Heating Systems. Sensors 2022, 22, 8180. [Google Scholar] [CrossRef]
- Raman, V.; Donzé, A.; Maasoumy, M.; Murray, R.M.; Sangiovanni-Vincentelli, A.; Seshia, S.A. Model predictive control with signal temporal logic specifications. In Proceedings of the 53rd IEEE Conference on Decision and Control, Los Angeles, CA, USA, 15–17 December 2014; pp. 81–87. [Google Scholar]
- Vander Heyden, Y.; Nijhuis, A.; Smeyers-Verbeke, J.; Vandeginste, B.; Massart, D. Guidance for robustness/ruggedness tests in method validation. J. Pharm. Biomed. Anal. 2001, 24, 723–753. [Google Scholar] [CrossRef] [PubMed]
- Wang, Y.; Mäntylä, M.; Eldh, S.; Markkula, J.; Wiklund, K.; Kairi, T.; Raulamo-Jurvanen, P.; Haukinen, A. A self-assessment instrument for assessing test automation maturity. In Proceedings of the Evaluation and Assessment on Software Engineering, Copenhagen, Denmark, 15–17 April 2019; pp. 145–154. [Google Scholar]
- Utting, M.; Legeard, B. Practical Model-Based Testing: A Tools Approach; Elsevier: Amsterdam, The Netherlands, 2010. [Google Scholar]
- Nguyen, C.D.; Marchetto, A.; Tonella, P. Combining model-based and combinatorial testing for effective test case generation. In Proceedings of the 2012 International Symposium on Software Testing and Analysis, Minneapolis, MN, USA, 15–20 July 2012; pp. 100–110. [Google Scholar]
- Elsayed, E.A. Overview of Reliability Testing. IEEE Trans. Reliab. 2012, 61, 282–291. [Google Scholar] [CrossRef]
- Wang, D.; Li, S.; Li, C.; Zhang, Y. Fault Diagnosis Analysis and Application of DC-DC Power Supply based on FMEA and FTA. In Proceedings of the 2021 6th Asia Conference on Power and Electrical Engineering (ACPEE), Chongqing, China, 8–11 April 2021; pp. 130–135. [Google Scholar]
- Dugan, J.B.; Sullivan, K.J.; Coppit, D. Developing a low-cost high-quality software tool for dynamic fault-tree analysis. IEEE Trans. Reliab. 2000, 49, 49–59. [Google Scholar] [CrossRef]
- Sanghavi, M.; Tadepalli, S.; Boyle, T.J.; Downey, M.; Nakayama, M.K. Efficient Algorithms for Analyzing Cascading Failures in a Markovian Dependability Model. IEEE Trans. Reliab. 2017, 66, 258–280. [Google Scholar] [CrossRef]
- Malhotra, M.; Trivedi, K.S. Dependability modeling using Petri-nets. IEEE Trans. Reliab. 1995, 44, 428–440. [Google Scholar] [CrossRef]
- Kanoun, K.; Ortalo-Borrel, M. Fault-tolerant system dependability-explicit modeling of hardware and software component-interactions. IEEE Trans. Reliab. 2000, 49, 363–376. [Google Scholar] [CrossRef]
- Lee, R.; Mengshoel, O.J.; Saksena, A.; Gardner, R.W.; Genin, D.; Silbermann, J.; Owen, M.P.; Kochenderfer, M.J. Adaptive Stress Testing: Finding Failure Events with Reinforcement Learning. arXiv 2018, arXiv:1811.02188. [Google Scholar] [CrossRef]
- da Rosa, F.R.; Garibotti, R.; Ost, L.; Reis, R. Using Machine Learning Techniques to Evaluate Multicore Soft Error Reliability. IEEE Trans. Circuits Syst. I Regul. Pap. 2019, 66, 2151–2164. [Google Scholar] [CrossRef]
- Cotroneo, D.; De Simone, L.; Liguori, P.; Natella, R. Fault Injection Analytics: A Novel Approach to Discover Failure Modes in Cloud-Computing Systems. IEEE Trans. Dependable Secur. Comput. 2020, 19, 1476–1491. [Google Scholar] [CrossRef]
- Li, G.; Li, Y.; Jha, S.; Tsai, T.; Sullivan, M.; Hari, S.K.S.; Kalbarczyk, Z.; Iyer, R. AV-FUZZER: Finding Safety Violations in Autonomous Driving Systems. In Proceedings of the 2020 IEEE 31st International Symposium on Software Reliability Engineering (ISSRE), Coimbra, Portugal, 12–15 October 2020; pp. 25–36. [Google Scholar] [CrossRef]
- Karunakaran, D.; Worrall, S.; Nebot, E. Efficient statistical validation with edge cases to evaluate Highly Automated Vehicles. arXiv 2020, arXiv:2003.01886. [Google Scholar]
- Ritz, F.; Phan, T.; Müller, R.; Gabor, T.; Sedlmeier, A.; Zeller, M.; Wieghardt, J.; Schmid, R.; Sauer, H.; Klein, C.; et al. SAT-MARL: Specification Aware Training in Multi-Agent Reinforcement Learning. In Proceedings of the 13th International Conference on Agents and Artificial Intelligence 2021, Vienna, Austria, 4–6 February 2021. [Google Scholar] [CrossRef]
- Moradi, M.; Gomes, C.; Oakes, B.J.; Denil, J. Optimizing Fault Injection in FMI Co-Simulation through Sensitivity Partitioning. In Proceedings of the SummerSim ’19 2019 Summer Simulation Conference; Society for Computer Simulation International, San Diego, CA, USA, 22–23 July 2019. [Google Scholar] [CrossRef]
- FMI. Functional Mock-Up Interface for Model Exchange and Co-Simulation; Technical Report; FMI Development Group: Tokyo, Japan, 2014. [Google Scholar]
- Gabbar, H.A.; Damilola, A.; Sayed, H.E. Trend analysis using real time fault simulation for improved fault diagnosis. In Proceedings of the 2007 IEEE International Conference on Systems, Man and Cybernetics, Montreal, QC, Canada, 7–10 October 2007; pp. 3829–3833. [Google Scholar]
- Li, Z.; Menon, H.; Mohror, K.; Bremer, P.T.; Livant, Y.; Pascucci, V. Understanding a Program’s Resiliency through Error Propagation. In Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming; Association for Computing Machinery: New York, NY, USA, 2021; pp. 362–373. [Google Scholar]
- Fremont, D.J.; Dreossi, T.; Ghosh, S.; Yue, X.; Sangiovanni-Vincentelli, A.L.; Seshia, S.A. Scenic: A language for scenario specification and scene generation. In Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, Phoenix, AZ, USA, 22–26 June 2019; pp. 63–78. [Google Scholar]
- Leveugle, R.; Calvez, A.; Maistri, P.; Vanhauwaert, P. Statistical fault injection: Quantified error and confidence. In Proceedings of the 2009 Design, Automation Test in Europe Conference Exhibition, Nice, France, 20–24 April 2009; pp. 502–506. [Google Scholar] [CrossRef]
- Xu, X.; Li, M.-L. Understanding soft error propagation using Efficient vulnerability-driven fault injection. In Proceedings of the IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2012), Boston, MA, USA, 25–28 June 2012; pp. 1–12. [Google Scholar] [CrossRef]
- Iooss, B.; Lemaître, P. A review on global sensitivity analysis methods. Oper. Res. Comput. Sci. Interfaces Ser. 2015, 59, 101–122. [Google Scholar] [CrossRef]
- Kaaniche, M.; Romano, L.; Kalbarczyk, Z.; Iyer, R.; Karcich, R. A hierarchical approach for dependability analysis of a commercial cache-based RAID storage architecture. In Proceedings of the Digest of Papers, Twenty-Eighth Annual International Symposium on Fault-Tolerant Computing, Munich, Germany, 23–25 June 1998; Cat. No. 98CB36224. pp. 6–15. [Google Scholar] [CrossRef]
- Sartor, A.L.; Becker, P.H.; Beck, A.C. A Fast and Accurate Hybrid Fault Injection Platform for Transient and Permanent Faults. Des. Autom. Embedded Syst. 2019, 23, 3–19. [Google Scholar] [CrossRef]
- Schneider, E.; Kochte, M.A.; Wunderlich, H.J. Multi-Level Timing Simulation on GPUs. In Proceedings of the ASPDAC ’18 23rd Asia and South Pacific Design Automation Conference, Jeju, Republic of Korea, 22–25 January 2018; pp. 470–475. [Google Scholar]
- Liu, F.; Ozev, S. Statistical Test Development for Analog Circuits Under High Process Variations. IEEE Trans.-Comput.-Aided Des. Integr. Circuits Syst. 2007, 26, 1465–1477. [Google Scholar] [CrossRef]
- Hu, Y.; Wang, W.; Jia, H.; Wang, Y.; Chen, Y.; Hao, J.; Wu, F.; Fan, C. Learning to Utilize Shaping Rewards: A New Approach of Reward Shaping. arXiv 2020, arXiv:2011.02669. [Google Scholar]
- Grzes, M. Reward shaping in episodic reinforcement learning. In Proceedings of the 16th Conference On Autonomous Agents And MultiAgent Systems, Sao Paulo, Brazil, 8–12 May 2017; pp. 565–573. [Google Scholar]
- Peng, X.B.; Abbeel, P.; Levine, S.; van de Panne, M. DeepMimic: Example-guided Deep Reinforcement Learning of Physics-based Character Skills. ACM Trans. Graph. 2018, 37, 143:1–143:14. [Google Scholar] [CrossRef]
- Laud, A.D. Theory and Application of Reward Shaping in Reinforcement Learning; University of Illinois at Urbana-Champaign: Champaign, IL, USA, 2004. [Google Scholar]
- Watkins, C.J.; Dayan, P. Q-learning. Mach. Learn. 1992, 8, 279–292. [Google Scholar] [CrossRef]
- Mnih, V.; Kavukcuoglu, K.; Silver, D.; Graves, A.; Antonoglou, I.; Wierstra, D.; Riedmiller, M. Playing Atari with Deep Reinforcement Learning. arXiv 2013, arXiv:1312.5602. [Google Scholar]
- Williams, R.J. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach. Learn. 1992, 8, 229–256. [Google Scholar] [CrossRef]
- Lowe, R.; Wu, Y.; Tamar, A.; Harb, J.; Abbeel, P.; Mordatch, I. Multi-agent actor–critic for mixed cooperative-competitive environments. arXiv 2017, arXiv:1706.02275. [Google Scholar]
- Mnih, V.; Badia, A.P.; Mirza, M.; Graves, A.; Lillicrap, T.P.; Harley, T.; Silver, D.; Kavukcuoglu, K. Asynchronous Methods for Deep Reinforcement Learning. arXiv 2016, arXiv:1602.01783. [Google Scholar]
- Schulman, J.; Wolski, F.; Dhariwal, P.; Radford, A.; Klimov, O. Proximal Policy Optimization Algorithms. arXiv 2017, arXiv:1707.06347. [Google Scholar]
- Lillicrap, T.P.; Hunt, J.J.; Pritzel, A.; Heess, N.; Erez, T.; Tassa, Y.; Silver, D.; Wierstra, D. Continuous control with deep reinforcement learning. arXiv 2019, arXiv:1509.02971. [Google Scholar]
- Fujimoto, S.; van Hoof, H.; Meger, D. Addressing Function Approximation Error in Actor–Critic Methods. arXiv 2018, arXiv:1802.09477. [Google Scholar]
- Haarnoja, T.; Zhou, A.; Abbeel, P.; Levine, S. Soft Actor–Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor. arXiv 2018, arXiv:1801.01290. [Google Scholar]
- Zoph, B.; Le, Q.V. Neural Architecture Search with Reinforcement Learning. arXiv 2017, arXiv:1611.01578. [Google Scholar]
- Denil, J.; Mosterman, P.J.; Vangheluwe, H. Rule-based model transformation for, and in simulink. In Proceedings of the Symposium on Theory of Modeling & Simulation-DEVS Integrative, Tampa, FL, USA, 13–16 April 2014; pp. 1–8. [Google Scholar]
- Oakes, B.J.; Moradi, M.; Van Mierlo, S.; Vangheluwe, H.; Denil, J. Machine Learning-Based Fault Injection for Hazard Analysis and Risk Assessment. In Proceedings of the Computer Safety, Reliability, and Security, York, UK, 8–10 September 2021; Habli, I., Sujan, M., Bitsch, F., Eds.; Springer International Publishing: Cham, Switzerland, 2021; pp. 178–192. [Google Scholar]
- Winner, H.; Witte, S.; Uhler, W.; Lichtenberg, B. Adaptive cruise control system aspects and development trends. In SAE Transactions; SAE International: Warrendale, PA, USA, 1996; pp. 1412–1421. [Google Scholar]
- MathWorks. Adaptive Cruise Control System Using Model Predictive Control. Available online: https://nl.mathworks.com/help/mpc/ug/adaptive-cruise-control-using-model-predictive-controller.html (accessed on 12 February 2023).
- MathWorks. Autonomous Emergency Braking with Sensor Fusion. Available online: https://nl.mathworks.com/help/driving/ug/autonomous-emergency-braking-with-sensor-fusion.html (accessed on 12 February 2023).
- Moradi, M.; Van Acker, B.; Denil, J. Failure Identification using Model-Implemented Fault Injection with Domain Knowledge-Guided Reinforcement Learning. Zenodo 2022. [Google Scholar] [CrossRef]
- Brockman, G.; Cheung, V.; Pettersson, L.; Schneider, J.; Schulman, J.; Tang, J.; Zaremba, W. OpenAI Gym. arXiv 2016, arXiv:1606.01540. [Google Scholar]
- Akiba, T.; Sano, S.; Yanase, T.; Ohta, T.; Koyama, M. Optuna: A Next-generation Hyperparameter Optimization Framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Anchorage, AK, USA, 4–8 August 2019. [Google Scholar]
- Raffin, A.; Hill, A.; Gleave, A.; Kanervisto, A.; Ernestus, M.; Dormann, N. Stable-Baselines3: Reliable Reinforcement Learning Implementations. J. Mach. Learn. Res. 2021, 22, 1–8. [Google Scholar]
- Wohlin, C.; Runeson, P.; Höst, M.; Ohlsson, M.C.; Regnell, B.; Wesslén, A. Experimentation in Software Engineering; Springer: Berlin/Heidelberg, Germany, 2012. [Google Scholar]
System/Subsystem | Signal Name | Fault Type | Fault Range |
---|---|---|---|
Ego Vehicle | Velocity | Data change | [−3, 2] |
Sensor | Radar | Noise | [−1, 1] |
Sensor | Acceleration | Noise | [−2, 2] |
Controller | Relative Distance | Data change | [0, 200] |
System/Subsystem | Signal Name | Range |
---|---|---|
Controller | Safe Distance | [0, 50] |
Controller | Relative Velocity | [0, 200] |
Sensor | Longitudinal Velocity | [0, 60] |
Sensor | Lateral Velocity | [0, 10] |
Variable/Signal | Relation and Strength | Sign |
---|---|---|
Relative Distance | SD | N |
Time | D | N |
Velocity | D | P |
Use Case | RL without HPT | RL with HPT |
---|---|---|
ACC | 5.8 h | 114.8 h |
AEB | 5.9 h | 155.7 h |
Use Case | RL without HPT | RL with HPT | Small RBFI | Large RBFI |
---|---|---|---|---|
ACC | 10 | 10 | 0 | 1 |
AEB | 1 | 6 | 0 | 3 |
Use Case | RL without HPT | RL with HPT | Small RBFI | Large RBFI |
---|---|---|---|---|
ACC | 118.5 | 136.1 | - | 8.28 |
AEB | 1.34 | 1.34 | - | 1.92 |
Use Case | SAC (Default HP) | SAC (with HPT) | DDPG (Default HP) | DDPG (with HPT) | PPO (Default HP) | PPO (with HPT) | TD3 (Default HP) | TD3 (with HPT) | A2C (Default HP) | A2C (with HPT) |
---|---|---|---|---|---|---|---|---|---|---|
ACC | 2 | 3 | 2 | 0 | 1 | 2 | 1 | 4 | 4 | 1 |
AEB | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 4 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).