Fault-Space Approximation using Basic-Block Fault Injection

Due to shrinking transistor sizes and operating voltages, transient hardware faults caused by single event upsets (SEUs), also called soft errors, are becoming a growing challenge for safety-critical systems. SEUs can be caused by radiation, for example, and their effects can be mitigated by fault-tolerance mechanisms. Testing such mechanisms under realistic conditions (for example with radiation cannons) is nearly impossible because radiation is, in general, non-deterministic.

Fault-tolerance mechanisms are therefore commonly tested by performing extensive fault injection (FI) experiments that mimic either the physical causes of SEUs or their effects, and then observing the system’s behaviour. The biggest advantage of FI is that the experiment results are repeatable.

The space of possible fault injections has two dimensions: time (every cycle) and location (every bit). One of the main questions of FI is when and where an injection is useful for testing the safety and robustness of a system. Evaluating all possible injections in this huge fault space is effectively impossible.
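
To give a feeling for the size of the fault space, the following minimal sketch multiplies the two dimensions. The function name and the target parameters (8 registers of 32 bits, one million cycles) are assumptions made purely for illustration and do not describe any particular system.

```python
def fault_space_size(cycles: int, bits: int) -> int:
    """Number of (cycle, bit) coordinates for single-bit-flip injections."""
    return cycles * bits


# Hypothetical target: 8 registers of 32 bits each, run for 1,000,000 cycles.
REGISTER_BITS = 8 * 32
size = fault_space_size(1_000_000, REGISTER_BITS)
print(size)  # 256000000
```

Even this small, assumed configuration yields hundreds of millions of possible single-bit injections, which is why some form of pruning or approximation is needed.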

During the execution of a function, faults can be injected into registers within the part of the fault space that the function covers. Again, there are many possible injection points, and their number grows with the number of executed instructions.

Compilers usually decompose source code into so-called basic blocks, which form the vertices of a control flow graph. A basic block is a straight-line code sequence with no branches in except at its entry point and no branches out except at its exit point. Basic blocks can be recovered by disassembling a program and are relatively pleasant to work with.
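
As a rough illustration of how basic blocks can be recovered from a linear instruction listing, the sketch below marks block "leaders" (the first instruction, every branch target, and every instruction following a branch) and cuts the listing at those points. The toy instruction format is hypothetical and not the output of a real disassembler; only instructions with an explicit target are treated as branches.

```python
from typing import List, Optional, Tuple

# Toy instruction: (mnemonic, branch target index or None).
# Hypothetical format for illustration only.
Instr = Tuple[str, Optional[int]]


def basic_blocks(program: List[Instr]) -> List[Tuple[int, int]]:
    """Return basic blocks as inclusive (first, last) instruction index pairs."""
    if not program:
        return []
    leaders = {0}  # the first instruction always starts a block
    for i, (_, target) in enumerate(program):
        if target is not None:
            leaders.add(target)        # a branch target starts a block
            if i + 1 < len(program):
                leaders.add(i + 1)     # so does the instruction after a branch
    starts = sorted(leaders)
    ends = [s - 1 for s in starts[1:]] + [len(program) - 1]
    return list(zip(starts, ends))


program = [
    ("mov", None),  # 0
    ("cmp", None),  # 1
    ("jne", 5),     # 2: conditional branch to instruction 5
    ("add", None),  # 3
    ("jmp", 6),     # 4: unconditional branch to instruction 6
    ("sub", None),  # 5
    ("ret", None),  # 6
]
print(basic_blocks(program))  # [(0, 2), (3, 4), (5, 5), (6, 6)]
```

Each resulting pair is one vertex of the control flow graph; the last instruction of each pair is the block's exit point.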

One idea for an approximation is to generalise the register injections: inject only at the exit point of a basic block, and only into registers whose value, written in that block, would be read by a different part of the program.
It should then be compared whether both approaches (register injection and basic-block injection) lead to the same behaviour, and whether this approximation is useful while reducing the number of required FIs.
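
One possible reading of this idea can be sketched on a toy execution trace. The trace format, the register names, and both helper functions are assumptions for illustration; they are not part of FAIL* and do not use its API. Injection points are counted per register rather than per bit to keep the example small.

```python
from typing import List, Set, Tuple

Instr = Tuple[Set[str], Set[str]]  # (registers read, registers written)
Block = List[Instr]                # one executed basic block


def read_before_overwrite(reg: str, later_blocks: List[Block]) -> bool:
    """True if reg is read in a later block before being overwritten."""
    for block in later_blocks:
        for reads, writes in block:
            if reg in reads:
                return True
            if reg in writes:
                return False
    return False


def exit_point_injections(trace: List[Block]) -> List[Tuple[int, str]]:
    """Select (block index, register) pairs to inject at block exit points."""
    targets = []
    for i, block in enumerate(trace):
        written: Set[str] = set()
        for _, writes in block:
            written |= writes
        for reg in sorted(written):
            if read_before_overwrite(reg, trace[i + 1:]):
                targets.append((i, reg))
    return targets


trace = [
    [({"r0"}, {"r1"}), ({"r1"}, {"r2"})],  # block 0: writes r1 and r2
    [({"r2"}, {"r3"}), (set(), {"r1"})],   # block 1: reads r2, overwrites r1
    [({"r1", "r3"}, {"r0"})],              # block 2: reads r1 and r3
]
REGISTERS = ["r0", "r1", "r2", "r3"]

# Baseline: every register at every executed instruction.
per_instruction = sum(len(block) for block in trace) * len(REGISTERS)
per_block = exit_point_injections(trace)
print(per_instruction)  # 20
print(per_block)        # [(0, 'r2'), (1, 'r1'), (1, 'r3')]
```

In this toy trace the approximation reduces 20 register-level injection points to 3. Whether the reduced set leads to the same observed behaviour as the full set is exactly the open question to be evaluated.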

FIs can be performed with FAIL*, a fault injection tool written in C++. It is able to simulate fault injections in x86 processors and should be used and extended to implement the idea above.

  • H. Schirmeier, M. Hoffmann, C. Dietrich, M. Lenz, D. Lohmann, and O. Spinczyk. FAIL*: An open and versatile fault-injection framework for the assessment of software-implemented hardware fault tolerance. In Proceedings of the 11th European Dependable Computing Conference (EDCC '15), pages 245–255. IEEE Computer Society Press, Sept. 2015.
  • H. Schirmeier, M. Hoffmann, R. Kapitza, D. Lohmann, and O. Spinczyk. FAIL*: Towards a versatile fault-injection experiment framework. In G. Mühl, J. Richling, and A. Herkersdorf, editors, 25th International Conference on Architecture of Computing Systems (ARCS '12), Workshop Proceedings, volume 200 of Lecture Notes in Informatics, pages 201–210. German Society of Informatics, Mar. 2012.

Further Reading