FMEA and Functional FMEA as a generic analysis method

FMEA and Functional FMEA
A generic approach
Tor Stålhane, NTNU, IDI

FMEA problems
Trammell and Davis:
• Have to go through the FMEA process with a lot of unimportant failure modes.
• Component-by-component review takes a considerable amount of time => teams get frustrated since most components assessed had minimal or no impact on the system
Kmenta and Ishii:
• Performed too late, FMEA does not affect key product / process decisions
• FMEA does not capture key failures
• FMEA is often an afterthought “checklist exercise”
• The process is tedious
• The Risk Priority Number gives a distorted measure of risk.

Our solution – a generic approach
Generic failure modes instead of component-specific failure modes to obtain:
• A consistent, sufficient and uniform set of failure modes.
• Simplified FMEA, using only a sufficient level of detail.
• Not having to invent failure modes => frees the analyst to focus on failure causes, consequences and prevention.
Generic failure modes => reuse of
• Results from components and component assemblies.
• Actions which are used to mitigate the failure modes.

Using generic failure modes
Beware: Generic failure modes
• Are not a replacement for using your head
• Are most useful in the early stages where we still have a lot of choices when it comes to
  – architecture
  – barrier solutions
• Should be used as guide words in the analysis

Generic failure modes - CESAR
| Component type | Failure mode |
|---|---|
| Software systems - control system, e.g., a PLC | Omission – something is not done, no action |
| | Commission – something more is done |
| | Wrong action |
| | Too late – right action but too late |
| Hardware component, e.g. a pump or a sensor | No action |
| | Wrong action |

Generic failure modes - NRC
| ID | Failure mode | Elaboration | Remarks |
|---|---|---|---|
| A1 | Fail to perform the function at the required time | Deviation from requirement in time domain | Omission, No action, No output, Reacts too late |
| A2 | Fail to perform the function with correct value | Deviation from requirement in value domain | Wrong output |
| A3 | Performance of an unwanted function | Deviation from expected performance | Commission, Wrong action |
| A4 | Interference or unexpected coupling with another module | Deviation from expected system performance due to module interaction | Commission |

Harrison’s human failure modes
| No | Failure mode | Comments |
|---|---|---|
| 1 | Errors of omission | Something that should have been done is not done |
| 2 | Errors of commission | Something more has been done |
| 3 | Errors of sequence | Actions have been done in the wrong sequence |
| 4 | Errors of repetition | An action has been done too many / few times |
| 5 | Too much / little | Too much or too little has been done |
| 6 | Too early / late / too long | An action was done too early / too late / for too long |
Enables the inclusion of operators in the safety analysis

Simple FMEA – early phases
Worksheet columns: Unit description, Failure description (Failure mode, Failure cause), Failure effect on the next level, Recommendations
FMEA in the early phases of product development has much in common with Preliminary Hazard Analysis – PHA
“Recommendations” should be used for
• Possible barriers
• Design changes
• Requirements changes

Functional FMEA
Worksheet columns: Function (function description), Functional failure mode, Effects, Cause, Current detection method, Comments
Generic functional failure modes – should be used as guidewords:
• Over
• Under
• No
• Intermittent
• Unintended

FMEA vs. Functional FMEA – 1
We can use
• FMEA when focusing on the components’ inner workings
• Functional FMEA when focusing on the functionality that a component provides to its environment - usually parameter values

FMEA vs. Functional FMEA – 2
FMEA for each control unit. Functional FMEA for the boiler, which provides the real value / service to the users
[Figure: boiler diagram with numbered components]

Extended FMEA – Trammell and Davis
Worksheet header fields: Project title, FMEA type (Design / System), Group, Prepared by, Core team
Worksheet columns: Requirement, Failure mode, Effect of failure, SEV, Cause or mechanism, OCC, Current design and control, DET, RPN, Recommendations, SEV, OCC, DET, RPN
• SEV: severity of failure mode if it occurs
• OCC: rate of occurrence of the failure mode for
  – Current design
  – After design changes
• DET: rate of detection – how often do we detect this failure mode if it occurs.
• RPN: SEV * P(event | not detected) * P(not detected)

Why generic fault trees
Several organizations have developed generic fault trees for systems in their domains. A fault tree
• does not identify how a failure propagates through a system to cause an accident
• can be used to see which failures can cause the top event in the fault tree.
An FMEA can show whether the root cause events can occur for the system under consideration.

Generic fault tree – water supply failure
[Figure: fault tree]

Generic fault tree – blowout
[Figure: fault tree]

Combining FMEA and FTA
[Figure]

Generic hazards
Murphy’s law: “If something can go wrong, it will”. However: What can go wrong depends on the system’s operating environment. A generic hazard list is a list of hazards that are possible in a certain domain or environment.

Simple hazard lists
Hazard list item fields:
• Task description / location
• Reference no.
• Hazard description
• Date last reviewed
• Risk Assessment Matrix score: Initial, Current
• Controls / Barriers: Reference, Owner, Status

Grow your own hazard list
You should start with a list of the domain-specific generic hazards.
Add items identified from brainstorming and experience from:
• Developers
• Maintenance personnel
• Users
Remember to include all barriers – also those that are just planned for.

From hazard to catastrophe
To analyse a potentially dangerous event we need to consider
• Where will it hurt us – e.g. people, assets, environment or reputation
• How likely is it that it will hurt us – the accident likelihood
• How can we prevent the accident – barriers and controllability

Consequences
| Severity | People | Assets | Environment | Reputation |
|---|---|---|---|---|
| 0 | No injury or health effect | No damage | No effect | No impact |
| 1 | Slight injury or health effect | Slight damage | Slight effect | Slight impact |
| 2 | Minor injury or health effect | Minor damage | Minor effect | Minor impact |
| 3 | Major injury or health effect | Moderate damage | Moderate effect | Moderate impact |
| 4 | PTD (permanent total disability) or up to 3 fatalities | Major damage | Major effect | Major impact |
| 5 | More than 3 fatalities | Massive damage | Massive effect | Massive impact |

Likelihood
| Level | Description |
|---|---|
| A | Never heard of in the industry |
| B | Heard of in the industry |
| C | Has happened in the organization or more than once per year in the industry |
| D | Has happened at the location or more than once per year in the organization |
| E | Has happened more than once per year at the location |

Controls / barriers
Controls and barriers are used actively in some risk assessment models – e.g. ISO 26262. All hazard list items should include all barriers that are
• Tried – been used before
• Possible – known from literature, other industries etc.

Controls, barriers and the FMEA results
• Prevention. Change the design or implementation to remove or reduce the probability of the failure’s occurrence => preventing a risk (potential problem) from becoming a real problem
• Handling. Prevent the failure’s consequences
• Reduction. Reduce or control the failure’s consequences.
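
The hazard list items, the risk matrix axes and the barrier categories above can be tied together in a small record type. The following is only a minimal sketch: the class and field names, the example hazard and its scores are hypothetical, and reporting the worst severity across the four consequence categories as the matrix cell is an assumption, since the slides do not say how the categories are combined.

```python
from dataclasses import dataclass, field

# Axes taken from the risk matrix slides.
CATEGORIES = ("people", "assets", "environment", "reputation")
SEVERITIES = range(0, 6)            # 0 (no effect) .. 5 (massive)
LIKELIHOODS = tuple("ABCDE")        # A (never heard of) .. E (more than once per year at the location)
# "tried" and "possible" come from the barrier slide; "planned" covers
# barriers that are only planned for.
BARRIER_STATUSES = ("tried", "possible", "planned")


@dataclass
class Barrier:
    description: str
    status: str  # one of BARRIER_STATUSES

    def __post_init__(self):
        if self.status not in BARRIER_STATUSES:
            raise ValueError(f"unknown barrier status: {self.status!r}")


@dataclass
class HazardItem:
    description: str
    severity: dict[str, int]        # consequence category -> severity 0..5
    likelihood: str                 # "A".."E"
    barriers: list[Barrier] = field(default_factory=list)

    def __post_init__(self):
        if self.likelihood not in LIKELIHOODS:
            raise ValueError(f"unknown likelihood: {self.likelihood!r}")
        if not self.severity:
            raise ValueError("at least one consequence category is required")
        for category, level in self.severity.items():
            if category not in CATEGORIES or level not in SEVERITIES:
                raise ValueError(f"invalid severity entry: {category!r}={level!r}")

    def matrix_cell(self) -> str:
        # Assumption: the worst consequence category decides the row.
        return f"{max(self.severity.values())}{self.likelihood}"


# Hypothetical example item; all values are made up for illustration.
item = HazardItem(
    description="Too high pressure in vessel",
    severity={"people": 4, "assets": 3, "environment": 1, "reputation": 2},
    likelihood="C",
    barriers=[
        Barrier("Safety valve", "tried"),
        Barrier("Redundant pressure sensor", "planned"),
    ],
)
print(item.matrix_cell())  # 4C
```

Keeping the barrier status on each barrier makes it easy to list which barriers are already tried and which are only possible or planned when the hazard list is reviewed.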

Barriers
[Figure]

Failure propagation
[Figure]

From failure to failure mode
[Figure]

Cause – consequence diagram
[Figure: cause-consequence diagram with the labels Sensor error, Wrong temperature, Wrong command, Increase heat, Too high pressure, Too hot vessel, Safety valve OK (Y/N), Sensor s.a. low (Y/N), Combustibles (Y/N), Explosion, Fire, CU, EQ, EN]

Input Focused FMEA – IF-FMEA
Component ID:
List of input sources:
| Output failure mode | Description | Component input deviation | Component malfunction |
|---|---|---|---|
| FM1 | Description of FM1 | Input deviations that can cause FM1 | Component failures that can cause FM1 |
| FM2 | … | … | … |
[Figure: Input source 1, Input source 2 and Input source 3 feed the Component, which produces the Output]

[Figure: Temp. sensor and Pressure sensor feed the Temp. controller]
Component: Temp. controller
Input sources: Temp. sensor, Pressure sensor
Output: On / Off signal to heating unit
Generic functional failure modes – should be used as guidewords:
• Over
• Under
• No
| Output failure mode | Description | Input deviation | Component malfunction |
|---|---|---|---|
| Turn on when should have kept off | The controller should have kept inactive (no change needed) or should have turned the heat on | Under – too low from temp and/or pressure | Omission: fail to update based on sensor signals. Commission: something not related to temp or pressure input is done instead. Wrong: controller sends wrong signal |
| … | … | … | … |

Example – initial hospital system
[Figure: X-ray machine, PC, Network, X-ray database]

FMEA with generic failure modes
Worksheet columns: Unit description, Failure description (Failure mode, Failure cause), Failure effect on the next level, Recommendations
PC and medic
• Wrong action: Get wrong X-ray
• No action: Get no info
X-ray machine and operator
• Wrong action: X-ray linked to wrong person
• No action: Don’t work at all
Recommendations for the PC and X-ray machine rows: Bar-code reader for patient ID; Reliability requirements
Network
• Wrong action: Sends info to wrong address. Recommendation: Reliability requirements
• No action: No traffic. Recommendations: Reliability requirements; Local, temporary database
X-ray database
• Wrong action: Returns wrong info. Recommendation: Reliability requirements
• No action: No info returned. Recommendation: Mirror database

Example – final hospital system
[Figure: X-ray machine, PC, Network, Temp database, Controller, X-ray database, Mirror database, Bar code reader]
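
The hospital example applies the same two generic failure modes to every unit, so the worksheet rows can be generated up front and the analyst only fills in causes, effects and recommendations. The sketch below illustrates that idea; the names `FmeaRow` and `blank_worksheet` are hypothetical, and only the units, failure modes and failure effects are taken from the example above.

```python
from dataclasses import dataclass
from itertools import product

# Generic failure modes used in the hospital example.
GENERIC_FAILURE_MODES = ("Wrong action", "No action")


@dataclass
class FmeaRow:
    unit: str
    failure_mode: str
    failure_cause: str = ""
    effect_on_next_level: str = ""
    recommendations: str = ""


def blank_worksheet(units, failure_modes=GENERIC_FAILURE_MODES):
    """Return one empty row per (unit, generic failure mode) pair."""
    return [FmeaRow(unit, mode) for unit, mode in product(units, failure_modes)]


UNITS = ("PC and medic", "X-ray machine and operator", "Network", "X-ray database")

# Failure effects on the next level, as listed in the example.
EFFECTS = {
    ("PC and medic", "Wrong action"): "Get wrong X-ray",
    ("PC and medic", "No action"): "Get no info",
    ("X-ray machine and operator", "Wrong action"): "X-ray linked to wrong person",
    ("X-ray machine and operator", "No action"): "Don't work at all",
    ("Network", "Wrong action"): "Sends info to wrong address",
    ("Network", "No action"): "No traffic",
    ("X-ray database", "Wrong action"): "Returns wrong info",
    ("X-ray database", "No action"): "No info returned",
}

worksheet = blank_worksheet(UNITS)
for row in worksheet:
    row.effect_on_next_level = EFFECTS[(row.unit, row.failure_mode)]

for row in worksheet:
    print(f"{row.unit:28} | {row.failure_mode:12} | {row.effect_on_next_level}")
```

Generating the rows from the unit list keeps the set of failure modes uniform across units, which is the point of the generic approach; swapping in another generic list (for example the functional guidewords Over, Under, No) only changes the `failure_modes` argument.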