Research Activities
The Computer Engineering Group focuses its research on the test and diagnosis of nanoscale "systems-on-a-chip" (SoCs), as well as on fault-tolerant design and the verification of fault tolerance properties. Nanoscale manufacturing processes are characterized by an increasing vulnerability to defects and by increasing parameter variations both among and inside chips. The "International Technology Roadmap for Semiconductors" expects that by 2019 feature sizes of 7 nm will be reached, and that with current design strategies yields would go down to 20% or even 10%. Furthermore, systems will become more susceptible to transient faults during operation, and reliability will no longer be an issue restricted to safety-critical systems.
To overcome these problems, innovative diagnosis techniques are needed to support the early and efficient identification of manufacturing problems. Furthermore, robust system design becomes mandatory so that defects can be tolerated to a certain extent. However, this in turn makes testing a challenging task.
Faster-than-at-speed Test
Although small delay defects do not change the circuit outputs, they may indicate potential early life failures. If they cannot be propagated along long paths, the test must be performed at a frequency higher than the intended nominal frequency. This may in turn lead to problems in response compaction and evaluation, as the results at the endpoints of longer paths may not yet have stabilized. The research activities in this area aim at optimal selection of test frequencies and at the development of appropriate techniques for response compaction.
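The trade-off described above can be illustrated with a minimal sketch. It assumes a strongly simplified timing model that is not taken from the group's work: each path has a fixed fault-free delay, a defect adds a fixed extra delay to the path it lies on, and an endpoint is captured exactly at the test clock period. The names `Path` and `evaluate_period` and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    endpoint: str
    delay_ns: float  # fault-free propagation delay (hypothetical value)


def evaluate_period(paths, defect_ns, period_ns):
    """Evaluate one test clock period under the simplified model.

    Returns (detecting, unstable):
      detecting: endpoints where a defect of size defect_ns on that path
                 would cause a late arrival and hence a wrong capture
      unstable:  endpoints whose fault-free response has not stabilized
                 at this period and must be masked during compaction
    """
    detecting, unstable = [], []
    for p in paths:
        if p.delay_ns > period_ns:
            unstable.append(p.endpoint)
        elif p.delay_ns + defect_ns > period_ns:
            detecting.append(p.endpoint)
    return detecting, unstable


# Hypothetical example: three paths, nominal period 4.0 ns, 0.5 ns defect.
paths = [Path("ff_a", 1.0), Path("ff_b", 2.5), Path("ff_c", 3.8)]
for period in (4.0, 2.8, 1.2):
    print(period, evaluate_period(paths, 0.5, period))
```

In this toy setting the defect is visible at the nominal period only on the longest path. Shorter periods expose it on the shorter paths, at the price of endpoints whose responses are not yet stable, which is why frequency selection and response compaction have to be considered together.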
Testing Robust Systems
For robust systems it is no longer sufficient to identify and sort out failing chips during manufacturing test. Instead, a measure of the remaining robustness of the system is needed ("quality binning").
Diagnosis
Diagnosis plays an important role throughout the life cycle of a system. The research focus here is on embedded techniques supporting volume and in-field diagnosis. Furthermore, advanced diagnosis approaches are developed which do not only point to fault locations but also classify faults into permanent, transient, and intermittent faults. This is particularly important for robust architectures, where transient faults may be uncritical, but permanent and intermittent faults can lead to system failures.
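As a rough illustration of the classification step, the sketch below sorts a fault by how often a failing test fails again when it is repeated. This repetition heuristic is an assumption made for the example and not the group's diagnosis method; the function name `classify_fault` and its thresholds are hypothetical.

```python
def classify_fault(observations):
    """Classify a fault from repeated runs of the same test.

    observations: list of bools, True meaning the test failed in that run.
    Assumed heuristic (illustration only):
      fails in every run      -> permanent
      fails in exactly one run -> transient
      fails in some runs      -> intermittent
    With a single run, a failure cannot be distinguished further and is
    reported as permanent, so several repetitions are needed in practice.
    """
    failures = sum(observations)
    if failures == 0:
        return "no fault observed"
    if failures == len(observations):
        return "permanent"
    if failures == 1:
        return "transient"
    return "intermittent"


print(classify_fault([True, True, True, True]))     # permanent
print(classify_fault([True, False, False, False]))  # transient
print(classify_fault([True, False, True, False]))   # intermittent
```

A real classifier would also have to account for operating conditions and for the time between repetitions, since an intermittent fault may stay inactive for long periods.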
Approximate Fault Tolerance
Approximate computing exploits application-specific fault tolerance for design optimization. To protect approximate designs against arbitrary faults, specific fault tolerance measures must be applied. While approximate computing and fault tolerance have so far been considered orthogonal approaches, the work here focuses on a unified strategy for "approximate fault tolerance".