Cloud native EDA tools & pre-optimized hardware platforms
Electronics engineers are being thrust into the automotive market like never before. The move to electrify automobiles, along with the advent of self-driving cars, means that silicon designers will be designing ever more sophisticated automotive ICs. But cars aren’t like most other electronic systems; it’s imperative that they cause no harm should they fail.
This brings us to the realm of functional safety, a new layer of responsibility that used to affect the small number of engineers working on military and aerospace designs. It has now gone mainstream, enabled by the ISO 26262 standard that governs how functional safety is to be assured in electronic systems and designs for automobiles. That means that there is a much broader set of engineers affected by functional safety as their companies pivot towards the rapidly growing market for automotive applications.
The challenge is that functional safety adds non-trivial tasks to the design flow. Before the current automotive push, it not only required a lot of expertise in knowing how to implement safety mechanisms such as redundancy; but, it also went against the grain of typical design goals such as optimal area. This doesn’t fit so well into a competitive commercial market where power, performance and area are the key driving goals. Engineers need design tools that allow them to meet their functional-safety requirements while maintaining competitive productivity. Areas that can provide dramatic benefit in simplifying ISO 26262 requirements for IC design are:
The ISO 26262 standard identifies four levels of safety, expressed as Automotive Safety Integrity Levels (ASIL) A through D, with A applying to systems with relatively no safety implications and D referring to systems with higher risk that a failure could cause harm to lives. Many of the newly electrified automotive components are classified as ASIL D; while there has been movement from ASIL B and C designs to ASIL D as autonomous driving moves towards reality. This means that many new systems-on-chip (SoCs) are being developed for ASIL D applications, and those SoCs have to be reliable for a minimum of 10 years.
Figure 1: Automotive design migration to ASIL D with increased autonomous driving capabilities
In order to be compliant with ISO 26262 requirements, companies need to perform a software tool qualification assessment of the EDA tools they use to establish that the design tool will not introduce or fail to detect a functional safety issue in the design. EDA vendors have simplified this process by getting third-party ANSI-accredited assessors to perform ISO 26262 individual tool or tool-chain certification.
Such certification isn’t trivial, however. As an example, Synopsys involved around 100 people over almost a year to achieve certification for its digital and custom/AMS tools. Certification doesn’t imply that a tool is guaranteed never to fail, but rather that, for each possible failure scenario, a solution has been identified to mitigate the failure. These mitigations are documented as conditions-of-use (CoU) and assumptions-of-use (AoU). These CoUs and AoUs are documented in a functional safety manual, which, for every tool, provides guidance for tool use cases when doing safety-critical designs. The information can then be used in the company’s software tool qualification.
Random hardware faults represent uncontrollable events like single-event upsets (SEUs), such as a solar flare, which can cause glitches or loss of state. They can’t be outright eliminated, so their risks must be mitigated through redundancy and other safety mechanisms. The higher the ASIL, such as ASIL D, the more safety mechanisms are required to achieve the metric goal, such as calculations required for single point fault metric (SPFM). For example, ISO 26262 mandates that designs reach the following metrics for different ASIL levels:
Table 1: Increasing requirements for SPFM moving from ASIL B to ASIL D
Mitigation for random hardware faults is responsible for the bulk of the additional design implementation effort. The first step is to identify those logic paths that are vulnerable to such faults. All of the registers along those paths must then be made either redundant or error-tolerant. There are three ways of implementing redundancy, and the one chosen depends on the ASIL level. The options are:
Figure 2: Safety Register types that can be implemented as safety mechanisms
Another safety mechanism that can be implemented, particularly with designs that have processor cores, is dual-core lock-step (DCLS); and, while it can’t correct a fault, it can detect that a fault has occurred in either one of the cores. Two cores with identical input logic run in parallel, and their output is run through a comparator. If the output values are different from each other, it indicates a fault. This can then raise a signal so that the system can take some kind of remedial action, like moving into a known-safe state until the effects of the fault can be neutralized. Here again, careful layout of the two cores is important for ensuring that they are truly independent of each other such that cells or buffers from one core are not placed in the other, and that there is truly physical separation of the two cores. In addition, routes should not be shared or traverse from one core to the other.
While these steps are important for automotive designs, they can also be extremely time-consuming if done manually, especially for new automotive designers. Automation is the key to meeting the requirements of both a competitive market and ISO 26262. But, in order to automate this, we first need a way of expressing our functional-safety intent for consumption by the tools. This makes it possible for the tools to implement the various safety mechanisms.
The first step is to analyze the safety critical paths to identify those that must be enhanced with safety circuits. Once identified, those paths can then have the redundancy or fault-tolerant registers automatically inserted, placing them carefully with proper separation and, if needed, with taps on either side of the register. This makes them less susceptible to disturbances from the SEU fault. The rest of the tools must then respect those specific elements, since the natural inclination of tools is to optimize redundancy away. Finally, a verification step is needed to confirm that all of the desired circuits have been created, placed, and routed correctly.
One important consideration, however, is the ability to have a backup tool, which is always a good practice for verifying that safety elements are implemented. If you’re relying on a single tool for a task like, say, synthesis, then you really want another tool that can confirm that the task was performed correctly. For synthesis, you have numerous different verification tools that can act as that backup – including the new functionality needed to confirm the redundancy of safety circuits.
There’s no question that safety-critical design involves tasks and considerations that are not required for other types of design. And no amount of tool automation would make it possible to proceed in a manner oblivious to the safety requirement of the design. But, with certified tools, you have the ability to specify safety intent, you have automation to simplify the analysis and insertion of redundant circuits, and you have plenty of documentation. Given those tools and a modicum of training, SoCs for ASIL-D applications can be created more efficiently, removing much of the friction that has been previously associated with implementation of a safety-critical design.