CLASSY-FI: Cross-Layer Application-Specific Synthesis and Analysis of Fault Injection

Since the identification of physical causes for soft errors in the 1970s, the sensitivity of circuits to soft errors has been increasing as supply voltages and structure sizes shrink. Functional safety standards, such as ISO 26262-9, demand explicit measures to assess (and, if necessary, mitigate) the effect of soft errors on safety and robustness. This is commonly done by performing extensive fault injection (FI) experiments on the target system that try to mimic the effects of transient faults (by changing logic signals) and then observing the system’s behavior with respect to its functional specification.

Logic faults can be injected on the pin, flip-flop, ISA, or even program level, and it is an open question which level is “best” for assessing a system’s robustness. Higher levels provide higher fault-injection efficiency, but lower levels are closer to the physical reality, and various researchers have shown that the injection level of the commonly assumed single event upsets (SEUs) can have a considerable impact on the results. In general, the lower the injection level (e.g., flip-flop level vs. ISA level), the more precisely we mimic the effects of real SEUs in the hardware.
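
To make ISA-level injection concrete, the following is a minimal sketch and not part of the CLASSY-FI tooling. It assumes a toy three-instruction register machine; the instruction format and the names `run` and `campaign` are invented for illustration. Every single-bit flip in every register before every instruction is injected, and the result is compared against the fault-free golden run.

```python
from itertools import product

# Hypothetical toy program: (opcode, destination, source a, source b).
PROGRAM = [
    ("mov", "r2", "r1", None),  # r2 = r1
    ("add", "r0", "r1", "r2"),  # r0 = r1 + r2 (8-bit wrap-around)
    ("and", "r0", "r0", "r3"),  # r0 = r0 & r3
]
INITIAL = {"r0": 0, "r1": 3, "r2": 0, "r3": 0x0F}


def run(program, regs, fault=None):
    """Execute the toy program and return r0 as the observable result.

    fault is (step, register, bit): the bit is flipped right before the
    instruction at position `step` executes (a transient single-bit upset).
    """
    regs = dict(regs)  # never modify the caller's initial state
    for step, (op, dst, a, b) in enumerate(program):
        if fault is not None and fault[0] == step:
            _, reg, bit = fault
            regs[reg] ^= 1 << bit
        if op == "mov":
            regs[dst] = regs[a]
        elif op == "add":
            regs[dst] = (regs[a] + regs[b]) & 0xFF
        elif op == "and":
            regs[dst] = regs[a] & regs[b]
    return regs["r0"]


def campaign(program, initial, width=8):
    """Inject every (step, register, bit) fault and classify the outcome."""
    golden = run(program, initial)
    outcomes = {"benign": 0, "sdc": 0}  # sdc = silent data corruption
    for fault in product(range(len(program)), initial, range(width)):
        result = run(program, initial, fault=fault)
        outcomes["benign" if result == golden else "sdc"] += 1
    return golden, outcomes


golden, outcomes = campaign(PROGRAM, INITIAL)
print(golden, outcomes)
```

Even this tiny program has 3 × 4 × 8 = 96 fault-space points on the ISA level; on the flip-flop level, every pipeline and control flip-flop in every cycle would add further points, which is where the efficiency gap between the levels comes from.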

The goal of the CLASSY-FI project is to derive constructive methods and techniques for scalable, yet precise and complete FI to experimentally assess the robustness of safety-critical embedded control systems against soft errors. The key idea behind the CLASSY-FI method is an application-specific cross-layer data-flow analysis, by which we consider the program–hardware-specific fault-propagation structure systematically on different levels. This fosters a hybrid approach for FI on multiple levels: Virtually cover all faults in the lower-level fault space (→ precision and completeness), but do the actual injections in the higher-level fault space (→ scalability) whenever possible and semantically equivalent. Otherwise, inject on the lower level, supported by problem-specific hardware-assisted fault injection (HAFI) techniques.
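
The hybrid idea can be sketched as a partitioning of the lower-level fault space. This is an illustrative sketch only: the flip-flop naming scheme, the fixed `CYCLES_PER_INSN`, and the equivalence map `isa_equivalent` are invented assumptions; in the project, such equivalences would have to come from the cross-layer data-flow analysis rather than from a naming convention.

```python
CYCLES_PER_INSN = 2  # assumption: every instruction takes exactly two cycles


def isa_equivalent(ff_fault):
    """Map a flip-flop-level fault to an equivalent ISA-level fault, if any.

    Hypothetical rule: flips in register-file flip-flops ("rf.<reg>[<bit>]")
    are treated as equivalent to an ISA-level register bit flip at the
    instruction executing in that cycle; all other flip-flops (pipeline
    latches, control state) have no ISA-level equivalent.
    """
    cycle, ff_name = ff_fault
    if not ff_name.startswith("rf."):
        return None
    reg, bit = ff_name[3:].rstrip("]").split("[")
    return (cycle // CYCLES_PER_INSN, reg, int(bit))


def plan_campaign(ff_faults, equivalence):
    """Cover all FF-level faults, injecting on the ISA level where possible."""
    isa_injections = {}  # ISA-level fault -> FF-level faults it covers
    ff_injections = []   # faults that must be injected on the lower level
    for fault in ff_faults:
        isa_fault = equivalence(fault)
        if isa_fault is None:
            ff_injections.append(fault)
        else:
            isa_injections.setdefault(isa_fault, []).append(fault)
    return isa_injections, ff_injections


FF_FAULTS = [
    (0, "rf.r1[3]"),
    (1, "rf.r1[3]"),
    (2, "rf.r1[3]"),
    (0, "ex.latch[0]"),
]
isa_injections, ff_injections = plan_campaign(FF_FAULTS, isa_equivalent)
print(isa_injections)
print(ff_injections)
```

In this toy setting, four flip-flop-level faults are covered by two ISA-level injections plus one lower-level injection, which would be left to a HAFI platform.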

Technically, CLASSY-FI aims at a methodology for the automatic generation of application–hardware-specific fault space (FS) construction and FI implementations that are highly tailored towards the actual system under test. Scientifically, we thereby provide new insights into the questions: (1) How well (quantitatively) does precise ISA-level FI cover precise lower-level FI, also with respect to single flip-flop (FF) faults that evolve to ISA-level multi-bit errors? (2) Is it feasible to reach full fault-space coverage on FF level (for reasonably sized systems) by an application-specific multi-layer fault-space analysis and automatic derivation of campaign-tailored hybrid FI platforms? (3) What is the influence of the μ-architecture on ISA-level FI coverage, and to what degree can ISA-level FI results be reused in case of incremental evolution of the μ-architecture?


Latest News

2019-12-03 Fault-Space Regions at PRDC '19

Oskar Pusz presents our paper Program-Structure–Guided Approximation of Large Fault Spaces at the 24th IEEE Pacific Rim International Symposium on Dependable Computing (PRDC '19) in Kyoto, Japan. In the paper, we describe an approach that uses program-structure information to reduce the number of required fault injections while still aiming for full fault-space coverage. Results show that injections can be reduced by up to 76 percent with a deviation of less than 2.7 percent, while the locality of the results regarding silent data corruptions is preserved with only a low deviation.

2018-06-27 Cross-Layer Fault-Space Pruning at DAC 2018

Our paper Cross-Layer Fault-Space Pruning for Hardware-Assisted Fault Injection is presented by Christian Dietrich at the 55th Design Automation Conference in San Francisco. The paper describes a method to calculate fault-masking terms that are used to dynamically prune the fault space of a flip-flop-level fault injection. Thereby, we can shrink the fault space by up to 20 percent.
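
The general idea of pruning with fault-masking terms can be illustrated as follows. This is a minimal sketch under invented assumptions, not the method from the paper: the two-flip-flop circuit, the signal names, and the hand-written masking term are hypothetical. In the toy circuit, `data_q` is reloaded from the input every cycle and is only forwarded to `out_q` when `en` is 1, so a flip of `data_q` in a cycle with `en == 0` cannot propagate.

```python
# Hypothetical per-cycle input stimulus: (data input, enable).
INPUTS = [(1, 1), (0, 0), (1, 0), (1, 1)]
FLIP_FLOPS = ["data_q", "out_q"]

# Masking terms: predicates over the signal values of one cycle that are
# true when a flip of that flip-flop in that cycle is guaranteed masked.
MASKING_TERMS = {
    "data_q": lambda signals: signals["en"] == 0,
}


def simulate(inputs, fault=None):
    """Simulate the toy circuit; return the out_q value seen in each cycle.

    fault is (cycle, flip_flop): that 1-bit flip-flop is inverted in the
    given cycle, before the next clock edge.
    """
    data_q, out_q = 0, 0
    observed = []
    for cycle, (data_in, en) in enumerate(inputs):
        if fault == (cycle, "data_q"):
            data_q ^= 1
        elif fault == (cycle, "out_q"):
            out_q ^= 1
        observed.append(out_q)
        # Clock edge: data_q is always reloaded, out_q only if enabled.
        data_q, out_q = data_in, (data_q if en else out_q)
    return observed


def prune_fault_space(trace, flip_flops, masking_terms):
    """Split the (cycle, flip_flop) fault space into remaining and pruned."""
    remaining, pruned = [], []
    for cycle, signals in enumerate(trace):
        for ff in flip_flops:
            term = masking_terms.get(ff)
            if term is not None and term(signals):
                pruned.append((cycle, ff))
            else:
                remaining.append((cycle, ff))
    return remaining, pruned


TRACE = [{"en": en} for _, en in INPUTS]
remaining, pruned = prune_fault_space(TRACE, FLIP_FLOPS, MASKING_TERMS)
print(len(remaining), pruned)
```

Re-running `simulate` with each pruned fault and comparing against the fault-free run is a simple way to check that a masking term is sound for a given stimulus.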


PRDC Conference B
Program-Structure–Guided Approximation of Large Fault Spaces
Oskar Pusz, Daniel Kiechle, Christian Dietrich, Daniel Lohmann. 2019 24th Pacific Rim International Symposium on Dependable Computing (PRDC'19). IEEE Computer Society Press, 2019.
PDF Slides 10.1109/PRDC47002.2019.00044 [BibTex]
DAC Conference A
Cross-Layer Fault-Space Pruning for Hardware-Assisted Fault Injection
Christian Dietrich, Achim Schmider, Oskar Pusz, Guillermo Payá-Vayá, Daniel Lohmann. Proceedings of the 55th Annual Design Automation Conference 2018 (DAC '18). ACM Press, 2018.
PDF Slides Raw Data 10.1145/3195970.3196019 [BibTex]


Open Topics

Fault-Space Pruning by Dynamic Register-Usage Recording

Type: Master's thesis
Status: open
Supervisors: Christian Dietrich, Daniel Lohmann
In this thesis, the SAIL compiler should be extended so that the generated C emulator records all dynamic reads of and writes to the processor's state registers. This information should then be integrated into the FAIL* toolchain to inject only into those state registers that are actually used by a given executed instruction.
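
The pruning effect of such a register-usage trace can be sketched as follows. This is an illustrative sketch only: the `TraceEntry` format and the example trace are hypothetical and do not reflect the SAIL or FAIL* interfaces, and "used" is interpreted here as "read by the instruction", which is an assumption.

```python
from typing import NamedTuple


class TraceEntry(NamedTuple):
    """One executed instruction with the state registers it touched."""
    step: int
    reads: frozenset
    writes: frozenset


# Hypothetical register set and recorded trace of three instructions.
REGISTERS = ["r0", "r1", "r2", "r3", "sreg"]
TRACE = [
    TraceEntry(0, frozenset({"r1"}), frozenset({"r2"})),
    TraceEntry(1, frozenset({"r1", "r2"}), frozenset({"r0"})),
    TraceEntry(2, frozenset({"r0", "sreg"}), frozenset({"r0", "sreg"})),
]


def full_fault_space(trace, registers):
    """Every register before every executed instruction."""
    return {(entry.step, reg) for entry in trace for reg in registers}


def used_fault_space(trace):
    """Only registers that the executed instruction actually reads."""
    return {(entry.step, reg) for entry in trace for reg in entry.reads}


full = full_fault_space(TRACE, REGISTERS)
used = used_fault_space(TRACE)
print(len(full), len(used))
```

A fault in a register that the current instruction does not read is not lost: it stays latent until a later instruction reads the register (where it is covered by that instruction's injection point) or overwrites it (where it is masked).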

Currently Running

Formalizing the Execution Semantics of the AVR Instruction Set with the Description Language SAIL

Type: Bachelor's thesis
Status: reserved
Supervisors: Christian Dietrich, Oskar Pusz, Daniel Lohmann
Student: Luca Nedaskovskij
Implementing the AVR instruction-set architecture in SAIL to generate emulators automatically.

Transient-Fault Resilience of a Capability-enabled Processor Platform

Type: Master's thesis
Status: running
Supervisors: Christian Dietrich, Daniel Lohmann
Student: Malte Bargholz
Integration of SAIL-based MIPS and CHERI emulators into the FAIL* fault-injection tool and quantitative fault-resilience comparison.

Finished Theses

Data-Flow Analysis for Fault-Equivalence Set Forming on the ISA Layer

Type: Bachelor's thesis
Status: finished
Supervisors: Oskar Pusz, Christian Dietrich, Daniel Lohmann
Student: Zena Obeidi (submitted: 01 Mar 2019)

Acceleration of Fault-Injection Campaigns through Early Timeout Detection

Type: Master's thesis
Status: finished
Supervisors: Oskar Pusz, Daniel Lohmann
Student: Felix Siegel (submitted: 22 May 2020)
Developing methods to avoid unnecessary run time in fault-injection campaigns.

Schotbruch: Automated Derivation of Injection Platforms for Transient Hardware Faults from Formal Processor Models

Type: Master's thesis
Status: finished
Supervisors: Christian Dietrich, Daniel Lohmann
Student: Marcel Budoj (submitted: 08 May 2019)
Use the SAIL language to integrate ISA implementations into a fault-injection framework. Different CPU architectures shall be evaluated for reliability. [PDF]