What is the purpose of Failure Mode and Effects Analysis?

Started by learn 7 months ago, 3 replies, latest reply 6 months ago, 77 views
Our embedded controller is for an Advanced Driver Assistance System (ADAS). It has two software layers: application software and low-level software.

Our controller has many low-level functions, such as several communication protocols, different types of memory, digital outputs, PWM outputs, and more.

What might be the benefit of performing Failure Mode and Effects Analysis (FMEA) on these low-level functions? Is the purpose of this exercise to catch low-level software design problems? Also, would this analysis help in debugging?

How do we perform a thorough FMEA on the low-level software of a typical automotive ADAS controller?
For example, one failure mode may be loss of I2C communication or intermittent I2C communication. How do we come up with all the potential effects of this failure? How do we come up with all its potential causes?

How do we identify all possible failure modes?
Reply by beningjw, October 19, 2020

The purpose of an FMEA is to identify potential failure modes and then develop a strategy to address them. Many teams overlook potential failures such as a SPI bus lock-up. The FMEA identifies these, and for each one you work out what effect it has on the system and what it means to the user. If it is a minor inconvenience, you may not care about mitigating the issue. If the failure could cause death or injury, you need to figure out how to counter it and minimize the chance that it occurs.

Too many designers assume their system will never fail. The FMEA forces them to really think through how it can fail.
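The process described above (list a failure mode, record its effect, judge how bad it is, decide whether to mitigate) can be captured as a simple worksheet. The following is only a minimal sketch in C. The `fmea_row` type, its field names, the helper functions, and every rating in the example rows are hypothetical placeholders. Ranking by a Risk Priority Number (severity x occurrence x detection, each rated 1 to 10) is one common FMEA convention, not something this thread prescribes, and the thresholds are assumptions a real project would take from its own safety process.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical worksheet row; field names are illustrative only. */
typedef struct {
    const char *function;     /* low-level function under analysis */
    const char *failure_mode; /* how that function can fail */
    const char *effect;       /* what the system or user experiences */
    const char *cause;        /* one potential cause of the failure */
    int severity;             /* 1 (minor inconvenience) .. 10 (death or injury) */
    int occurrence;           /* 1 (remote) .. 10 (frequent) */
    int detection;            /* 1 (always detected) .. 10 (cannot be detected) */
} fmea_row;

/* Risk Priority Number: severity x occurrence x detection. */
int fmea_rpn(const fmea_row *row)
{
    assert(row != NULL);
    assert(row->severity >= 1 && row->severity <= 10);
    assert(row->occurrence >= 1 && row->occurrence <= 10);
    assert(row->detection >= 1 && row->detection <= 10);
    return row->severity * row->occurrence * row->detection;
}

/*
 * Assumed policy: a row needs mitigation when its severity alone is high
 * (a safety-relevant effect), or when its RPN exceeds a project threshold.
 * Returns 1 if mitigation is needed, 0 otherwise.
 */
int fmea_needs_mitigation(const fmea_row *row, int severity_limit, int rpn_limit)
{
    return (row->severity >= severity_limit) || (fmea_rpn(row) > rpn_limit);
}

/* Example rows; all ratings are made-up placeholders. */
static const fmea_row example_rows[] = {
    { "SPI sensor read", "Bus lock-up",
      "Stale sensor data reaches the application",
      "Peripheral stops responding mid-transfer", 9, 3, 4 },
    { "PWM output", "Duty-cycle jitter",
      "Slight flicker on an indicator lamp",
      "Timer interrupt latency", 2, 4, 3 },
};
```

With an assumed severity limit of 8 and RPN limit of 100, the first row is flagged for mitigation on severity alone, while the second is a candidate for "accept and move on". The value of the exercise is less in the arithmetic than in being forced to write down each row.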

Reply by rmilne, October 19, 2020

A complete waste of time IMHO. A company I worked for brought in consultants to teach this stuff, and it was obvious that it was designed for molecules and made little sense for software/firmware. The time I would spend fantasizing about a failure mode is better spent writing defensive code, with test code that pushes against all possible boundaries. For board-level buses, one must conduct tests to check that they operate within spec under various conditions (e.g., temperature, BCI, etc.). If redundant systems are required for higher safety, then that requires another set of tests.
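As a rough illustration of "defensive code with test code", here is a minimal C sketch of a bus read that validates its arguments and retries a bounded number of times, plus a test double that injects intermittent failures. Everything here is hypothetical: `bus_status`, `bus_read_fn`, `bus_read_with_retry`, and `flaky_read` are invented names, not part of any vendor driver or AUTOSAR API, and a real driver would also need timeouts and bus recovery.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { BUS_OK = 0, BUS_TIMEOUT, BUS_NACK, BUS_BAD_ARG } bus_status;

/* Hypothetical driver hook: reads len bytes from device addr into buf. */
typedef bus_status (*bus_read_fn)(uint8_t addr, uint8_t *buf, size_t len, void *ctx);

/*
 * Defensive wrapper: rejects invalid arguments, then tries the read up to
 * max_attempts times. Returns BUS_OK on the first success, otherwise the
 * status of the last failed attempt.
 */
bus_status bus_read_with_retry(bus_read_fn read, void *ctx, uint8_t addr,
                               uint8_t *buf, size_t len, unsigned max_attempts)
{
    bus_status status = BUS_TIMEOUT;

    if (read == NULL || buf == NULL || len == 0 || max_attempts == 0) {
        return BUS_BAD_ARG;
    }
    for (unsigned attempt = 0; attempt < max_attempts; ++attempt) {
        status = read(addr, buf, len, ctx);
        if (status == BUS_OK) {
            break;
        }
    }
    return status;
}

/* Test double: times out failures_left times, then succeeds. */
typedef struct {
    unsigned failures_left;
    unsigned calls;
} flaky_bus;

bus_status flaky_read(uint8_t addr, uint8_t *buf, size_t len, void *ctx)
{
    flaky_bus *bus = (flaky_bus *)ctx;
    (void)addr; /* the address is irrelevant to this stub */

    bus->calls++;
    if (bus->failures_left > 0) {
        bus->failures_left--;
        return BUS_TIMEOUT;
    }
    for (size_t i = 0; i < len; ++i) {
        buf[i] = 0xA5; /* recognizable fill pattern */
    }
    return BUS_OK;
}
```

A test can set `failures_left` to 2 with three attempts allowed to model an intermittent bus that recovers, or to a large value to model a dead bus, and then check both the returned status and the number of calls made.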

Investing in a safety package from your chip manufacturer can help if you need ISO 26262 certification. Likewise for AUTOSAR packages.

Reply by mr_bandit, November 5, 2020

beningjw: "If the failure could cause death or injury". I would add: or if the failure would cost a huge amount of money.

Spot on. This is critical in avionics DO-178C projects.

I was on an avionics DAL A project (failure meant the airplane crashes). FMEA is a required step.

A formal FMEA is not needed for most projects. However, it can be done informally, and it is useful to think through some basic cases.