Safety I/II/III/IV?
Different perspectives on safety management exist. Traditionally (Safety I), safety is defined as the absence of accidents and incidents. Safety II views safety as the ability to succeed under varying conditions. Safety III incorporates systems theory and defines safety as freedom from unacceptable losses. Risk science can improve safety management by integrating uncertainties and risk into modeling and analysis, enhancing the understanding of complex systems, and providing decision support for risk assessments. To properly define, assess, and manage safety, the relationship between safety and risk must be considered. Safety should be understood as the absence of unacceptable risks, and risk assessments should focus on enhancing risk understanding. Contemporary risk management and risk science embrace resilience-based thinking and methods to address future expected and surprising events.
Terje Aven aims to stimulate a discussion on integrating risk and safety sciences. Different perspectives exist on safety:
- Safety I defines safety as the absence of accidents and incidents, focusing on failures and malfunctions in systems. It employs traditional technical risk assessment methods, such as fault tree analysis, event tree analysis, and probabilistic risk assessment, to estimate and evaluate risk across potential scenarios, and it responds to unacceptable risks by eliminating causes or improving barriers. Safety I assumes that accurate models of the actual system or activity can be established and used to control and avoid risk events, treating systems as tractable and predictable. Its safety management principle is primarily reactive, responding to historical events and risk calculations. It is based on simple cause-effect relationships and the concept of root causes to explain accidents. Safety I is suitable for simple, linear, tractable, and complicated systems, where the principles of functioning are known and a detailed description of the system can be provided. Safety I primarily relies on learning from failures, mistakes, and accidents. Failure does play an important role in learning and progress, and the value of learning from mistakes is supported by evidence from various fields. However, safety should not be defined as the absence of risk if risk is defined solely through probabilities.
- Safety II views safety as the ability to succeed under varying conditions, emphasizing the importance of making sensible adjustments to cope with situational demands. Safety II argues that many real-life systems are intractable and that accurate models do not exist for them. It focuses on intractable sociotechnical systems, where the principles of functioning are only partly known, descriptions are elaborate with many details, and the system changes before descriptions can be completed. Safety II explains accidents by referring to combinations of variables rather than establishing a direct causal link. The effects of these combinations are governed by functional resonance, where the variability of different functions can interact and produce disproportionate outcomes. Safety II criticizes traditional risk assessments for their limitations in capturing important aspects of safety, such as variabilities, human and organizational factors, and dependencies between system elements. It is proactive, focusing on variability and continuously trying to anticipate developments and future events. Learning is considered central: Safety II emphasizes learning from what goes right, recognizing that understanding performance variability and adjustments is crucial for improving performance. It suggests that learning from small but frequent events is more effective than focusing on rare events with severe outcomes. Aven writes that the safety concept should not only be linked to success but should also encompass undesirable events or consequences. The definition of Safety II, which emphasizes the ability to succeed, requires clarification of what constitutes "no success" or failure. It is crucial to specify which undesirable events or consequences are associated with safety in order to make meaningful judgments about its magnitude.
- Safety III, proposed by Leveson, incorporates systems theory and defines safety as freedom from unacceptable losses. It aims to prevent hazards, control risks, and allow for flexibility and resilience in handling unexpected events. Safety III emphasizes the use of process models by humans to understand complex phenomena, leaving out unimportant factors. It concentrates on preventing hazards and losses while learning from accidents, incidents, and system performance audits. Safety III holds that accidents are caused by inadequate control over hazards and emphasizes the need to understand causality to prevent accidents and reduce their consequences. Linear causality, causal loops, and multiple systemic factors are all considered in understanding accidents, but traditional linear models alone are regarded as insufficient for complex phenomena. Safety III introduces STAMP (Systems-Theoretic Accident Model and Processes) as an alternative model that treats the system as a whole, considers emergent properties, and focuses on dynamic control rather than failure prevention. Safety III considers both linear and more complex sociotechnical systems. It emphasizes hazard analysis as a proactive tool to understand the system and identify hazards, highlighting the limitations of traditional risk assessments for analyzing complex systems. Learning is considered central.
Aven further writes that the severity of undesirable events or consequences should be taken into account, rather than defining safety solely based on the absence of such events. Comparing two activities, both with no accidents or losses during a specific period, the definitions of Safety I, II, and III fail to differentiate between them. But intuitively, one activity may be much safer than the other if it has a lower probability of leading to extreme outcomes or severe losses.
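The comparison of two accident-free activities can be made concrete with a small numerical sketch. The annual probabilities and the ten-year window below are hypothetical values chosen only for illustration, and years are assumed to be independent; none of these numbers come from Aven's paper. The sketch also uses probabilities alone, which is a simplification.

```python
def prob_clean_record(p_severe_per_year: float, years: int) -> float:
    """Probability of observing zero severe events over `years` years,
    assuming each year is independent (an illustrative simplification)."""
    return (1.0 - p_severe_per_year) ** years


# Hypothetical annual probabilities of a severe loss for two activities.
activities = {"A": 1e-4, "B": 1e-2}

for name, p in activities.items():
    clean = prob_clean_record(p, years=10)
    print(f"Activity {name}: annual p = {p:g}, "
          f"P(no severe event in 10 years) = {clean:.4f}")
```

Under these assumptions both activities will most likely show a clean ten-year record, although B's assumed probability of a severe loss is a hundred times higher than A's. A definition of safety based only on observed outcomes cannot separate the two.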
Another issue is related to uncertainty and the future. The definitions of Safety I, II, and III do not address uncertainty, which is a crucial aspect when considering safety. Uncertainty refers to the lack of knowledge about the consequences of an activity. To incorporate uncertainty, the concept of risk is introduced, which reflects the potential for an activity to have undesirable consequences. Risk has two dimensions: (i) events and their consequences, and (ii) the associated uncertainties. Probability is commonly used to measure uncertainty, but it is not sufficient on its own. The strength of the knowledge supporting the probabilities must also be considered, including factors such as assumptions, data, agreement among experts, and understanding of the phenomena involved.
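One way to picture this view of risk is as a record that keeps the probability together with a judgment of the knowledge behind it. The sketch below is only an illustration: the class names, the three-level strength scale, the threshold, and the example entries are all hypothetical and are not taken from Aven's paper.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class KnowledgeStrength(Enum):
    # Hypothetical three-level scale for the knowledge behind a probability.
    WEAK = 1
    MEDIUM = 2
    STRONG = 3


@dataclass(frozen=True)
class RiskDescription:
    event: str                      # what could happen
    consequence: str                # what it could lead to
    probability: float              # measure of uncertainty
    knowledge: KnowledgeStrength    # how well supported the probability is
    assumptions: Tuple[str, ...] = ()


def needs_scrutiny(risk: RiskDescription, p_threshold: float = 1e-3) -> bool:
    """Flag a risk if its probability is high OR its knowledge base is weak.
    The threshold is an assumed value for illustration."""
    return (risk.probability >= p_threshold
            or risk.knowledge is KnowledgeStrength.WEAK)


well_known = RiskDescription("valve leak", "minor spill", 1e-5,
                             KnowledgeStrength.STRONG)
poorly_known = RiskDescription("novel failure mode", "major release", 1e-5,
                               KnowledgeStrength.WEAK,
                               assumptions=("no operating data available",))

print(needs_scrutiny(well_known))    # False
print(needs_scrutiny(poorly_known))  # True
```

The two example entries carry the same probability, yet only the one resting on weak knowledge is flagged, which is the point of not relying on probability alone.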
The judgment of safety is closely linked to the judgment of risk. A safe activity implies a sufficiently low risk, considering both known and unknown factors. The concept of unknown unknowns, events that are completely unknown to the scientific environment, can be acknowledged in the risk assessment qualitatively but not quantitatively. When judging safety as high or low, the evaluation of risk, including unknown unknowns, follows the same logic and rationale.
In light of these discussions, an alternative definition of Safety I is proposed: absence of unacceptable risks. Safety can be considered achieved when the risk is judged to be sufficiently low and the risks are deemed acceptable based on various factors such as standards, comparisons with similar activities, costs, and risk perception. Accepting a considerable risk for an activity may mean that safety is not achieved. The judgment of safety depends on weighing the risks and making a judgment call based on the concept of risk and the absence of unacceptable risks.
Risk assessments provide an improved understanding of the system or activity under study, including the interactions between subsystems and components. Different types of models, such as system models and probability models, are developed to capture this understanding. System models are commonly used in risk assessments and reflect variability and causality. They express the relationship between input quantities (variables) and the output quantity (variable) of the system. These models can be simple, linear, and causal, representing the state of components and the overall system state. The accuracy of system models is important, as there may be a deviation between the actual output and the model output. For simple and linear systems, proper modeling can control this error and provide accurate approximations. For intractable and complex systems, accurate modeling becomes challenging, and alternative models like FRAM (Functional Resonance Analysis Method) and STAMP are needed. These models focus on understanding the system rather than accurately estimating risk. Uncertainties need to be considered and incorporated into these models to apply them effectively in risk assessments. Probability models are another category of models used in risk assessments. A probability model for the system is typically expressed as a function of probability models for the characteristics of its components. Frequentist probabilities, as well as knowledge-based (subjective, judgmental) probabilities, can be used depending on the situation. Safety II and Safety III approaches have potential for further development by incorporating uncertainties and risk in modeling and analysis. This would enhance the understanding of safety and support decision-making.
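For a simple, tractable system, the link between a system model and a probability model can be sketched as follows. The example system (one valve in series with two redundant pumps), the component probabilities, and the independence assumption are all hypothetical and serve only to illustrate the idea; this is not a model from Aven's paper.

```python
from itertools import product


def system_works(valve: bool, pump1: bool, pump2: bool) -> bool:
    """System model: maps component states (inputs) to the system state (output).
    Hypothetical structure: the valve must work and at least one pump must work."""
    return valve and (pump1 or pump2)


def system_reliability(p_valve: float, p_pump1: float, p_pump2: float) -> float:
    """Probability model: probability that the system works, as a function of
    the component probabilities, assuming independent components."""
    component_probs = (p_valve, p_pump1, p_pump2)
    total = 0.0
    # Enumerate every combination of component states.
    for states in product([True, False], repeat=3):
        if system_works(*states):
            prob = 1.0
            for works, p in zip(states, component_probs):
                prob *= p if works else 1.0 - p
            total += prob
    return total


print(round(system_reliability(0.99, 0.9, 0.9), 4))  # 0.9801
```

The result equals 0.99 × (1 − 0.1 × 0.1), the closed-form expression for this structure. The approach depends on the system model being an accurate description, which is the condition that is hard to meet for intractable systems.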
Safety research is concerned with determining causal relationships between factors and outcomes. Under the definition used, a factor is a cause of an outcome if it is sufficient for the outcome to occur, precedes the outcome, covaries with the outcome, and no competing explanation better accounts for the relationship.
Aven uses the term "resilience management" instead of "resilience engineering" to encompass a broader range of contexts. Contemporary risk science recognizes the importance of resilience and resilience management as key elements of risk management. Strengthening resilience reduces risks, especially concerning potential surprises and unforeseen events. Risk management knowledge is essential for resilience management, as illustrated by the example of ventilators during the COVID-19 pandemic. However, resilience management and risk management are often treated as separate activities with limited scientific and professional interactions. Safety I, characterized by historical data and probabilities, is associated with a traditional, narrow perspective on risk. Safety II, on the other hand, emphasizes resilience management and proactive adjustment of performance to maintain system functionality. Aven argues that current practice and standards do not fully capture the message of Safety II and that a shift is needed in risk assessments to accommodate resilience-based thinking.
In the conclusion, Aven highlights the potential for further development of the Safety I, II, and III perspectives by integrating knowledge from risk science. Safety cannot be properly defined, assessed, and managed without considering risk. The traditional probabilistic perspectives often referenced in safety science do not align with contemporary risk science knowledge, and incorporating this knowledge can strengthen safety science. Modern risk science concepts and principles provide a suitable framework for understanding, characterizing, communicating, and managing safety in all types of applications. They consider uncertainty, potential surprises, and robustness/resilience. Risk assessments aimed at improving risk understanding can offer valuable decision support for intractable and complex systems. There is a close relationship between the safety and risk fields and contributions from both are needed for their future development. Recent efforts have called for strengthening the foundations of safety science and risk science and integrating theories and practices. This work is essential.
Conclusions
Safety cannot be meaningfully defined without considering its relationship to risk. Safety should be understood as the antonym of risk or as the absence of unacceptable risks.
Safety II and Safety III provide system models that complement linear causal models, particularly for assessing intractable and complex systems. These models can be improved by incorporating uncertainty and knowledge considerations.
Risk assessments focused on enhancing risk understanding can provide valuable decision support, even for intractable and complex systems.
Safety and risk management should be better integrated to ensure and improve performance and prevent undesirable events. Contemporary risk management and risk science embrace resilience-based thinking and methods to address future expected and surprising events.
Risk assessment plays a role in Safety III and the STAMP (Systems-Theoretic Accident Model and Processes) approach, even though Safety III is not based on uncertainties and risk. Both Safety II and Safety III adopt a systems approach, which aligns with the view of risk management and risk science. Risk assessments can be strengthened and integrated further into Safety III and STAMP using contemporary risk science.
While the common perspective is that system performance improvements can be a goal, they should not be considered inherent aspects of safety and resilience concepts. The antifragility concept embraces stressors, failures, and mistakes to improve the system over time.
Source: Aven, T. (2022), A risk science perspective on the discussion concerning Safety I, Safety II and Safety III, Reliability Engineering & System Safety, Volume 217, January 2022, 108077.