Industrial Process Safety
Lessons from major accidents and their application in traditional workplace safety and health

Graham D. Creedy, P.Eng., FCIC, FEIC
Formerly Senior Manager, Responsible Care®, Canadian Chemical Producers’ Association (now Chemistry Industry Association of Canada)
System Safety Society Spring Event, May 26, 2011

Overview
• How I got into this
• The evolution of the philosophy of industrial safety and prevention of major accidents
• Some key insights and concepts
• How these apply to management of workplace safety in various sectors and at different levels of the organization

Some history
• 1984 Bhopal accident is wake-up call to chemical industry
• Industry responsibility to understand and control hazards and risks
• Responsible Care launched in Canada
  – Principles, codes, commitment, tools, support, progress tracking, verification
• Major Industrial Accidents Council of Canada 1987-1999

Safety Performance by Industry Sector
Injuries & illnesses per 200,000 hours worked (2002)
[Bar chart; horizontal axis 0.0 to 12.0. Sectors shown: Services; Finance, insurance & real estate; Wholesale & retail trade; Transportation & public utilities; Petroleum and coal products; Chemicals and allied products; Printing & publishing; Pulp & paper; Textiles & apparel; Food & food products; Transportation equipment; Electronic and electrical equipment; Industrial machinery & equipment; Primary metal industries; Construction; Mining; Agriculture, forestry & fishing]
Source: US Bureau of Labor Statistics (www.bls.gov/iif)

Relative risks of fatal accidents in the workplace of selected occupations

  Occupation                                                 Relative risk
  Fishers (as an occupation)                                 35.1
  Timber cutters (as an occupation)                          29.7
  Airplane pilots (as an occupation)                         14.9
  Garbage collectors                                         12.9
  Roofers                                                     8.4
  Taxi drivers                                                8.2
  Farm occupations                                            6.5
  Protective services (fire fighters, police guards, etc.)    2.7
  “Average job”                                               1.0
  Grocery store employees                                     0.91
  Chemical and allied products                                0.81
  Finance, insurance and real estate                          0.23

Sanders, R.E., J.
Hazardous Materials 115 (2004), p. 143, citing Toscano (1997)

Chemistry Industry Association of Canada Member Performance
CIAC website: www.canadianchemistry.ca
Staff contact: Stephanie Butler, 613-237-6215 x 245

Incident Pyramid
  1        Serious/Disabling/Fatalities
  10       Medical Aid Case
  30       Property Loss/1st Aid Treatment
  600      Near Misses
  10,000   Unsafe Behaviors/Conditions
A “proactive” approach focuses on these categories, but be careful – you may miss the really serious ones!

Terminology
• Process hazard – A physical situation with potential to cause harm to people, property or the environment
• Risk (acute) – probability x consequences of an undesired event occurring

They thought they were safe
• “Good” companies can be lulled into a false sense of security by their performance in personal safety and health
• They may not realise how vulnerable they are to a major accident until it happens
• Subsequent investigations typically show that there were multiple causes, and many of these were known long before the event
BP Deepwater Horizon

Why and how defences fail
• People often assume systems work as intended, despite warning signs
• Examples of good performance are cited as representing the whole, while poor ones are overlooked or soon forgotten
• Analysis of failure modes and effects should include human and organizational aspects as well as equipment, physical and IT systems

Process safety management
• Recognition of the seriousness of consequences and the mechanisms of causation leads to a focus on the process rather than the individual worker
• Many of the key decisions influencing safety may be beyond the control of the worker or even the site – they may be made by people at another site, country or organization
• Causes differ from those for personnel safety
• Need to look at the whole – materials, equipment and systems – and consider individuals and procedures as part of the system
• Management system approach for control
Flixborough, Bhopal, Pasadena

Scope
(elements of process safety management)
1. Accountability
2. Process Knowledge and Documentation
3. Capital Project Review and Design Procedures
4. Process Risk Management
5. Management of Change
6. Process and Equipment Integrity
7. Human Factors
8. Training and Performance
9. Incident Investigation
10. Company Standards, Codes and Regulations
11. Audits and Corrective Actions
12. Enhancement of Process Safety Knowledge
CCPS: Guidelines for Technical Management of Chemical Process Safety

Functions of a management system
[Diagram: the functions Planning, Organizing, Implementing and Controlling, with the labels Leadership, Structure, Direction, Measurement and Results]
CCPS: Guidelines for Technical Management of Chemical Process Safety

Features and characteristics of a management system for process safety
Planning
• Explicit goals and objectives
• Well-defined scope
• Clear-cut desired outputs
• Consideration of alternative achievement mechanisms
• Well-defined inputs and resource requirements
• Identification of needed tools and training
Organizing
• Strong sponsorship
• Clear lines of authority
• Explicit assignments of roles and responsibilities
• Formal procedures
• Internal coordination and communication
Implementing
• Detailed work plans
• Specific milestones for accomplishments
• Initiating mechanisms
Controlling
• Performance standards and measurement methods
• Checks and balances
• Performance measurement and reporting
• Internal reviews
• Variance procedures
• Audit mechanisms
• Corrective action mechanisms
• Procedure renewal and reauthorization
CCPS: Guidelines for Technical Management of Chemical Process Safety

Examples of PSM management systems concerns at different organizational levels
[Figure: the planning, organizing, implementing and controlling functions shown at the strategic, managerial and task levels]
CCPS: Guidelines for Technical Management of Chemical Process Safety

Self-assessment of Current Status
Process Safety Management Requirements to Achieve the ESSENTIAL Level
For each survey question,
indicate the level of awareness and use at the site by marking the appropriate box, based on the following:
  A  Widespread and comprehensive use wherever significant hazard potential exists.
  B  Moderate use, but coverage is uneven from unit to unit or not comprehensive in view of potential hazards.
  C  Appropriate personnel are aware of this item and its application, but little or no actual use.
  D  Little awareness or use of this item.
Mark the box labeled "Help" if this is an item where you are in urgent need of guidance. We’ll have a team member contact you with advice on how and where to get the information or help.

(A page from the Site Self-Assessment Tool. Each question has boxes for Current Status (A, B, C, D) and Want Help.)

1. Accountability: Objectives and Goals
   (a) Are responsibilities clearly defined and communicated, with those responsible held accountable?
   (b) Is there a system for control of contractor operations?
2. Process Knowledge and Documentation
   (a) Are the safety, health and environmental hazards of materials on site clearly defined?
   (b) Is there current comprehensive documentation covering the process operating basis, including both normal and abnormal conditions?
3. Process Safety Review Procedures for Capital Projects
   (a) Are all project proposals for new or modified facilities subjected to documented hazard reviews before approval to proceed?
   (b) Are systems established to ensure that the facility is built as designed?
   (c) Is there an effective link between design modifications and operating procedures?
4. Process Risk Management
   (a) Is there a system, conducted by competent personnel, to identify and assess the process hazards from materials present at this site?
   (b) Are corrective actions defined and implementation followed up?
   (c) Are the above items formally documented?
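
As an illustration of how responses from a page like this might be tallied, here is a minimal Python sketch. The A to D ratings and the "Help" box come from the survey page; the `Response` class, the `summarize` function, the question labels such as "1(a)", the sample answers, and the rule that a C or D rating needs follow-up are all assumptions made for this example, not part of the CIAC tool.

```python
from collections import Counter
from dataclasses import dataclass

# Rating letters as defined on the self-assessment page.
RATINGS = {
    "A": "widespread and comprehensive use",
    "B": "moderate but uneven use",
    "C": "aware, but little or no actual use",
    "D": "little awareness or use",
}


@dataclass
class Response:
    element: str             # PSM element heading from the survey page
    question: str            # illustrative question label, e.g. "1(a)"
    rating: str              # one of "A", "B", "C", "D"
    want_help: bool = False  # True when the "Help" box is marked


def summarize(responses):
    """Tally ratings and list the questions that need follow-up.

    Assumption (not from the source): a question needs follow-up when it
    is rated C or D, or when the "Help" box is marked.
    """
    for r in responses:
        if r.rating not in RATINGS:
            raise ValueError(f"unknown rating {r.rating!r} for {r.question}")
    tally = Counter(r.rating for r in responses)
    follow_up = [r.question for r in responses
                 if r.rating in ("C", "D") or r.want_help]
    return tally, follow_up


# Hypothetical answers for one site; the ratings are invented for illustration.
site = [
    Response("Accountability: Objectives and Goals", "1(a)", "A"),
    Response("Accountability: Objectives and Goals", "1(b)", "B", want_help=True),
    Response("Process Knowledge and Documentation", "2(a)", "A"),
    Response("Process Risk Management", "4(a)", "C"),
]

tally, follow_up = summarize(site)
print(dict(tally))   # counts per rating letter
print(follow_up)     # question labels flagged for follow-up
```

Keeping the follow-up rule in one place makes it easy to change if a site or association applies a different threshold for what counts as meeting a given level.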

Use of self-assessment tool for collective progress reporting and action
As of August 29, 2008 compared with past five years (some site changes)
Target for meeting Essential level: June 30, 2003
[Chart: number of sites (axis 0 to 160) at each level (Excellent, Enhanced, Essential, Almost at Essential, "In Progress") by year. Sites reporting: 2002 (137), 2003 (141), 2004 (134), 2005 (143), 2006 (139), 2007 (145), 2008 (129)]

Process-Related Incident Measure (PRIM)
2007 Findings: All Elements
PRIM Incident Cause Analysis, 1998/1999 to 2007
[Bar chart: incidents analyzed (axis 0 to 40) by PSM element possibly involved, for each of the 12 PSM elements, with one series per year from 98/99 to 2007]

Assessing an organization’s safety effectiveness
• What is the safety policy and culture (written, unwritten)?
• How are the following handled?
  – Establishing what has to be done
    • Benchmarking
    • Communicating
    • Assigning accountabilities
  – Ensuring that it gets done
    • Monitoring and corrective action
    • Evidence (documentation) and audit process
    • Resourcing – not only for ideal but for anticipated conditions
    • Balancing with other priorities
• How are exceptions handled?

Consider targets in groups
• Those who:
  – Don’t care
  – Don’t know (and perhaps don’t know that they don’t know)
  – Did know, but may have forgotten or could have gaps in application (and perhaps don’t realize it)

Excellent guidance exists – but how is it being used?

The New Product Introduction Curve
[Graph: percent adoption]
• Can be applied to adoption of new ideas
• Categories differ by ability and, more importantly, motivation

Accountability
• Management commitment at all levels
• Status of process safety compared to other organizational objectives such as output, quality and cost
• Objectives must be supported by appropriate resources
• Be accessible for guidance, communicate and lead

Management of Change
• Change of process technology
• Change of facility
• Organizational changes
• Variance procedures
• Permanent changes
• Temporary changes

Process and Equipment Integrity
• Design to handle all anticipated conditions, not just ideal or typical ones
• Make sure what you get is what you designed (construction, installation)
• Test to make sure the design is indeed valid
• Make sure it stays that way
  – Preventative maintenance
  – Ongoing maintenance
  – Review
• Be especially careful of automatic safeguards

• Consider operator as fallible human performing tasks in background
• Design for error tolerance, not just prevention
  – detection
  – correction
Buncefield, UK

Realization of significance of sociocultural factors in human thought processes and hence in behaviours

Human behaviour aspects
• People, and most organizations, don’t intend to get hurt (have accidents)
• To understand why they do leads us eventually into understanding human behaviour, both at the individual and organizational level, and involves (listed from more to less familiar to engineers):
  – Physical interface
    • Ergonomics
  – Psychological interface
    • Perception, decision-making, control actions
  – Human thought processes
    • Basis for reaching decisions
    • Ideal versus actual behaviour
  – Social psychology
    • Relationships with others
    • Organizational behaviour

Human behaviour modes
• Instead of looking at the ways in which people can fail, look at how they function normally:
• Skill-based
  – Rapid responses to internal states with only occasional attention to
    external info to check that events are going according to plan
  – Often starts out as rule-based
• Rule-based
  – IF…, THEN…
  – Rules need not make sense – they only need to work, and one has to know the conditions under which a particular rule applies
• Knowledge-based
  – Used when no rules apply but some appropriate action must be found
  – Slowest, but most flexible

Reason’s “Cheese Model”
The ‘Swiss cheese’ model of organisational accidents
[Diagram: hazards pass through successive layers of defences to cause losses; some holes are due to active failures, other holes are due to latent conditions]
James Reason, The Management of Safety, SSAP Launch Event, 17/02/2004
James Reason, presentation to Eurocontrol, 2004

Active and latent failures
• Active
  – Immediately adverse effect
  – Similar to “unsafe act”
• Latent
  – Effect may not be noticeable for some time, if at all
  – Similar to “resident pathogen”. Unforeseen trigger conditions could activate the pathogens and defences could be undermined or unexpectedly outflanked

A Classic Example of a Latent Failure
• Hazard of material known, but lack of awareness of potential system failure mode leads to defective procedure design through management decision
Epichlorohydrin fire, Avonmouth, UK

And another
Danvers, MA, Nov 2006: solvent explosion at printing ink factory
• Hazards known, but defences compromised by apparently benign change
• Latent error in procedure design creates vulnerability to likely execution error
US Chemical Safety Board

And another
• Hazard of material not obvious (despite history)
• Latent error allowed dust to accumulate, creating conditions for subsequent events
Scottsbluff, NE 1996
Port Wentworth, GA 2007

Lessons from other fields
• Aerospace and nuclear show how significant human and organizational aspects can be even where the obvious signs of failure are technical in nature
• Finance shows:
  – Relevance of such factors without technical distractions
  – How fast a system can deteriorate once controls are relaxed
  – How
    wrong risk assessments can influence bad policy decisions

Relevance of organizational factors
“The relevance of organizational factors has also been graphically and tragically revealed in the inquiry reports of recent UK transportation and offshore oil disasters. Prior to ..., senior managers in all the organizations propounded the pre-eminence of safety. They believed in the efficacy of the regulatory system, in the adequacy of their existing programs, and in their confidence of the skills and motivation of their staff. The inquiry reports reveal that their belief in safety was a mirage, their systems inadequate, and operator errors and violations commonplace. The inquiry reports stated that ultimate responsibility lay with complacent directors and managers who had failed to ensure that their good intentions were translated into a practical and monitored reality. Moreover, the weaknesses so starkly revealed were not matters of substantial concern to the regulatory authorities before the accidents.”
HSC, 1993

Factors that can influence likelihood of failure
• Organizational culture
  – “the way we do things around here – when no-one is looking”
  – increasingly being recognized as one of the most important factors in major accidents
  – perceived balance between output, cost and safety is heavily dependent on this culture, and influences whether personnel work in a certain way because they believe the company and their co-workers feel it is the right way to do things, or whether they are simply “going through the motions.”

In general, safety gets better as society learns more
[Graph: Standard of Safety against Time]

But the rate of improvement is not steady
[Graph: Standard of Safety (x 10) against Time]

In fact, the curve can be one of periodic rapid gains followed by gradual but increasing declines
Note how the rate of decay can be expected to increase due to normalization of deviance
[Graph: Standard of Safety (x 100) against Time]

Organizational Culture Model
James W.
Bayer, Senior VP Mfg, Lyondell Chemical Company
[Two-by-two diagram: People (weak to strong) against Systems (weak to strong); quadrants labelled Tribal, Operational Excellence, Chaotic and Bureaucratic]

Preservation – or loss – of corporate memory
• Demographic effects
  – Fewer staff
  – Experienced cohort leaving or left
  – Skills transfer senior > (middle) > junior
  – Replacements understand the way something is done, but not why it is done that way, the potential consequences of doing it differently and how to detect and recover from undesired actions
• “We are starting to see lowered standards of design and supervision that fifteen years ago would have been unthinkable in the chemical industry” (Challenger, 2004)

• What does an organization’s investigation of its failures reveal about its:
  – Culture
  – Management system?

• Knowledge – Never realized problem could occur (benchmarking error)
  • Was it treated as a unique deficiency?
  • Was there a broader review of the benchmarking process to find if there are other areas where knowledge could be deficient?
• Policy – Thought situation would be acceptable but didn’t realize full implications until it happened
  • Does it appear to be acceptable now?
  • Was review of policy and accountability limited or broad in scope?
• System design – Even if everything had been done as intended, problem would still have occurred
  • How comprehensive was analysis of system deficiencies and practicality of solutions?
  • How effective is action plan and follow through?
  • Was review of system design limited or broad in scope?
• System execution (management system error) – Problem occurred because someone or something did not perform as intended
  • Did analysis consider why execution was not as intended?
  • Was corrective action appropriate and balanced?
  • Was review of system execution limited or broad in scope?

Dealing with a Safety (or Engineering) Problem
• Finding out who you’re dealing with
  – Where is the organization on the curve?
    (generally, and re the specific issue or problem)
  – Where are the people you’re dealing with on the curve? (generally, and re the issue or problem)
• Finding out what to do
  – “Benchmark” – don’t try to reinvent the wheel unless you’re sure there isn’t one already (or you’ve time and it’s fun to do so)
  – Find out what others are doing about it
  – Read the instructions
  – Identify/define the issue
  – If it’s likely to be regulated, check with government agencies, trade associations, web, internet
  – If not regulated but likely good industry practice, check suppliers, other users of same material or item, other users of similar items, other industry contacts – but test the info!!! (cross-check, ask if it makes sense)
  – Check standard reference works (Lees, CCPS, etc.)
• Doing it
  – Try to think of all situations that are likely to occur (process, equipment, people)
  – “KISS”, keep it user-friendly, show basis for decisions if practical to do so
  – Follow up afterwards to see how it’s working

Questions?