ATS.OR.200 Safety management system

Regulation (EU) 2017/373

An air traffic services provider shall have in place a safety management system (SMS), which may be an integral part of the management system required in point ATM/ANS.OR.B.005, that includes the following components:

(1) Safety policy and objectives

(i) Management commitment and responsibility regarding safety which shall be included in the safety policy.

(ii) Safety accountabilities regarding the implementation and maintenance of the SMS and the authority to make decisions regarding safety.

(iii) Appointment of a safety manager who is responsible for the implementation and maintenance of an effective SMS.

(iv) Coordination of emergency response planning with other service providers and aviation undertakings that interface with the ATS provider during the provision of its services.

(v) SMS documentation that describes all the elements of the SMS, the associated SMS processes and the SMS outputs.

(2) Safety risk management

(i) A process to identify hazards associated with its services which shall be based on a combination of reactive, proactive and predictive methods of safety data collection.

(ii) A process that ensures analysis, assessment and control of the safety risks associated with identified hazards.

(iii) A process to ensure that its contribution to the risk of aircraft accidents is minimised as far as is reasonably practicable.

(3) Safety assurance

(i) Safety performance monitoring and measurement means to verify the safety performance of the organisation and validate the effectiveness of the safety risk controls.

(ii) A process to identify changes which may affect the level of safety risk associated with its service and to identify and manage the safety risks that may arise from those changes.

(iii) A process to monitor and assess the effectiveness of the SMS to enable the continuous improvement of the overall performance of the SMS.

(4) Safety promotion

(i) Training programme that ensures that the personnel are trained and competent to perform their SMS duties.

(ii) Safety communication that ensures that the personnel are aware of the SMS implementation.

GENERAL — NON-COMPLEX ATS PROVIDERS

(a) The safety policy should include a commitment to improve towards the highest safety standards, comply with all the applicable legal requirements, meet all the applicable standards, consider the best practices and provide the appropriate resources.

(b) In cooperation with other stakeholders, the air traffic services provider should develop, coordinate and maintain an emergency response plan (ERP) that ensures orderly and safe transition from normal to emergency operations and return to normal operations. The ERP should determine the actions to be taken by the air traffic services provider or specified individuals in an emergency and reflect the size, nature and complexity of the activities performed by the air traffic services provider.

(c) Safety risk management may be performed using hazard checklists or similar risk management tools or processes, which are integrated into the activities of the air traffic services provider.

(d) An air traffic services provider should manage safety risks related to changes. Management of changes should be a documented process to identify external and internal changes that may have an adverse effect on safety. It should make use of the air traffic services provider’s existing hazard identification, risk assessment and mitigation processes.

(e) An air traffic services provider should identify persons who fulfil the role of safety managers and who are responsible for coordinating the safety management system (SMS). These persons may be accountable managers or individuals with an operational role in the air traffic services provider.

(f) Within the air traffic services provider, responsibilities should be identified for hazard identification, risk assessment and mitigation.

SAFETY POLICY — COMPLEX ATS PROVIDERS

(a) The safety policy should:

(1) be signed by the accountable manager;

(2) reflect organisational commitments regarding safety and its proactive and systematic management;

(3) be communicated, with visible endorsement, throughout the air traffic services provider; 

(4) include safety reporting principles;

(5) include a commitment to:

(i) improve towards the highest safety standards;

(ii) comply with all the applicable legal requirements, meet all the applicable standards and consider the best practices;

(iii) provide appropriate resources; and

(iv) enforce safety as one primary responsibility of all managers and staff;

(6) include the safety reporting procedures;

(7) clearly indicate which types of operational behaviours are unacceptable, and include the conditions under which disciplinary action would not apply; and

(8) be periodically reviewed to ensure it remains relevant and appropriate.

(b) Senior management should:

(1) continually promote the safety policy to all personnel and demonstrate their commitment to it;

(2) provide necessary human and financial resources for its implementation; and

(3) establish safety objectives and performance standards.

SAFETY POLICY — COMPLEX ATS PROVIDERS

Operational behaviour, when disciplinary action would not apply, could be where someone is not blamed for reporting something which would not have been otherwise detected.

SAFETY POLICY — COMPLEX ATS PROVIDERS

(a) The safety policy should state that the purpose of safety reporting and internal investigations is to improve safety, not to apportion blame to individuals.

(b) An air traffic services provider may combine the safety policy with the policy required by ATM/ANS.OR.B.005(a)(2).

SAFETY POLICY — NON-COMPLEX ATS PROVIDERS

(a) The safety policy should state that the purpose of safety reporting is to improve safety, not to apportion blame to individuals.

(b) An air traffic services provider may combine the safety policy with the policy required by ATM/ANS.OR.B.005(a)(2).

ACCOUNTABILITIES — COMPLEX ATS PROVIDERS

The SMS of the air traffic services provider should ensure that:

(a) everyone involved in the safety aspects of the provision of air traffic services has an individual safety responsibility for their own actions;

(b) managers are responsible for the safety performance of their respective departments or divisions; and

(c) the top management of the provider carries an overall safety responsibility.

SAFETY ACTION GROUP — COMPLEX ATS PROVIDERS

(a) A safety action group may be established as a standing group or as an ad hoc group to assist or act on behalf of the safety review board as defined in point (b) of AMC2 ATS.OR.200(1)(ii);(iii).

(b) More than one safety action group may be established depending on the scope of the task and the specific expertise required.

(c) The safety action group should report to and take strategic direction from the safety review board and should comprise managers, supervisors and personnel from operational areas.

(d) The safety action group should:

(1) monitor operational safety;

(2) resolve identified risks;

(3) assess the impact on safety of operational changes; and

(4) ensure that safety actions are implemented within agreed timescales.

(e) The safety action group should review the effectiveness of previous safety recommendations and safety promotion.

(f) Members of the safety action group should participate in the local runway safety team as per GM2 ADR.OR.D.027 ‘Safety programmes’.

ORGANISATION AND ACCOUNTABILITIES

An air traffic services provider should:

(a) identify the safety manager who, irrespective of other functions, has ultimate responsibility and accountability, on behalf of the organisation, for the implementation and maintenance of the SMS;

(b) clearly define lines of safety accountability throughout the organisation, including a direct accountability for safety on the part of senior management;

(c) identify the accountabilities of all members of management, irrespective of other functions, as well as of employees, with respect to the safety performance of the SMS;

(d) document and communicate safety responsibilities, accountabilities and authorities throughout the organisation; and

(e) define the levels of management with authority to make decisions regarding safety risk tolerability.

ORGANISATION AND ACCOUNTABILITIES — COMPLEX ATS PROVIDERS

The SMS of the air traffic services provider should encompass safety by including a safety manager and a safety review board in the organisational structure.

(a) Safety manager

(1) The safety manager should act as the focal point and be responsible for the development, administration and maintenance of an effective SMS. He or she should be independent of line management, and accountable directly to the highest organisational level.

(2) The role of the safety manager should, as a minimum, be to:

(i) ensure that hazard identification, risk analysis and management are undertaken in accordance with the SMS processes;

(ii) monitor the implementation of actions taken to mitigate risks;

(iii) provide periodic reports on safety performance;

(iv) ensure maintenance of safety management documentation;

(v) ensure that there is safety management training available and that it meets acceptable standards;

(vi) provide advice on safety matters; and

(vii) monitor initiation and follow-up of internal occurrence/accident investigations.

(3) The safety manager should have:

(i) adequate practical experience and expertise in air traffic services or a similar area;

(ii) adequate knowledge of safety and quality management;

(iii) adequate knowledge of the working methods and operating procedures; and

(iv) comprehensive knowledge of the applicable requirements in the area of air traffic services.

(b) Safety review board

(1) The safety review board should be a high-level committee that considers matters of strategic safety in support of the accountable manager’s safety accountability.

(2) The board should be chaired by the accountable manager and composed of heads of functional areas.

(3) The safety review board should, as a minimum:

(i) monitor safety performance against safety policy and objectives;

(ii) ensure that any safety action is taken in a timely manner; and

(iii) monitor the effectiveness of the air traffic services provider’s SMS processes.

(4) The safety review board should ensure that appropriate resources are allocated to achieve the planned safety performance.

(5) The safety manager or any other relevant person may attend, as appropriate, safety review board meetings. He or she may communicate to the accountable manager all information, as necessary, to allow decision-making based on safety data.

SAFETY MANAGER — COMPLEX ATS PROVIDERS 

(a) Depending on the size of the air traffic services provider and the nature and complexity of their activities, the safety manager may be assisted by additional safety personnel in the performance of all the safety-management-related tasks.

(b) Regardless of the organisational set-up, it is important that the safety manager remains the unique focal point as regards the development, administration and maintenance of the air traffic services provider’s SMS.

SAFETY MANAGER — NON-COMPLEX AIR TRAFFIC SERVICES PROVIDERS

In the case of a non-complex air traffic services provider, the function of the safety manager could be combined with another function within the organisation provided that sufficient independence is guaranteed.

COORDINATION OF EMERGENCY RESPONSE PLANNING FOR ATS PROVIDERS — COMPLEX ATS PROVIDERS

(a) An air traffic services provider should develop, coordinate and maintain a plan for its response to an emergency. It should:

(1) reflect the nature and complexity of the activities performed by the air traffic services provider;

(2) ensure an orderly and safe transition from normal to emergency operations;

(3) ensure safe continuation of operations or return to normal operations as soon as practicable; and

(4) ensure coordination with the ERPs of other organisations, where appropriate.

(b) For emergencies occurring at the aerodrome or in its surroundings, the plan should be aligned with the aerodrome ERP and be coordinated with the aerodrome operator.

TYPES OF EMERGENCIES

At least the following types of emergencies may be considered:

(a) aircraft emergencies;

(b) natural phenomena (e.g. extreme weather conditions);

(c) acts of terrorism;

(d) loss of the ability to communicate with the aircraft; and

(e) loss of the air traffic services unit.

COORDINATION OF THE EMERGENCY RESPONSE PLANNING FOR ATS PROVIDERS — COMPLEX ATS PROVIDERS

For aerodrome-related emergencies, please refer to GM4 ADR.OPS.B.005(a) ‘Aerodrome Emergency Planning’.

SAFETY MANAGEMENT MANUAL (SMM) — COMPLEX ATS PROVIDERS

The safety management manual should be the key instrument for communicating the approach to safety for the air traffic services provider. The SMM should document all aspects of safety management, including but not limited to the:

(a) scope of the SMS;

(b) safety policy and objectives;

(c) safety accountability of the accountable manager;

(d) safety responsibilities, accountabilities and authorities of key safety personnel throughout the air traffic services provider;

(e) documentation control procedures;

(f) hazard identification and safety risk management schemes;

(g) safety performance monitoring;

(h) incident investigation and reporting;

(i) emergency response planning;

(j) management of change (including organisational changes with regard to safety responsibilities and changes to functional systems); and

(k) safety promotion.

SAFETY RECORDS — COMPLEX ATS PROVIDERS

Safety records that should be maintained and retained include but are not limited to:

(a) certificates;

(b) limited certificates;

(c) declarations;

(d) safety policy;

(e) safety accountabilities/responsibilities;

(f) safety occurrences;

(g) emergency response plan;

(h) SMS documentation;

(i) training and competence;

(j) occurrence reports;

(k) safety risk assessments including safety assessment of changes to the functional system;

(l) determination of either complex or non-complex organisation; and

(m) approved alternative means of compliance.

SAFETY MANAGEMENT MANUAL (SMM) — COMPLEX ATS PROVIDERS

The SMM may be contained in (one of) the manual(s) of the air traffic services provider.

SAFETY ASSURANCE — COMPLEX ATS PROVIDERS

(a) Leading indicators

(1) Metrics that measure inputs to the safety system (either within an organisation, a sector or across the total aviation system) to manage and improve safety performance.

(2) Leading indicators measure the specific features of the aviation safety system designed to support continuous improvement and to give an indication of likely future safety performance. They are designed to help identify whether the providers and regulators are taking actions and/or have processes in place that are effective in lowering the risk.

(b) Lagging indicators

Metrics that measure the outcome of the service delivery by measuring events that have already occurred and that impact safety performance. There are two subsets of lagging indicators:

(1) Outcome indicators: These include only the occurrences that one aims to prevent, for example fatal or catastrophic accidents. Depending on the system, the severity of the occurrences that are included as outcome indicators can be adjusted to include all accidents and serious incidents.

(2) Precursor indicators: These indicators do not manifest themselves in accidents or serious incidents. They indicate less severe system failures or ‘near misses’, and are used to assess how frequently the system comes close to severe failure. Because they are typically more numerous than outcome indicators, they can be used for trend monitoring.

(c) Safety management system

In the case of a complex air traffic services provider, the SMS should include all of these measures. Risk management efforts, however, should be targeted at leading indicators and precursor events. The reason for doing this is to reduce the number of accidents and serious incidents.

(d) Differing levels of safety performance monitoring

(1) Measurements of safety in terms of undesirable events, such as accidents and incidents, are examples of ‘lagging indicators’, which can capture safety performance a posteriori. Such indicators give valuable signals to all involved in air traffic services — providers, regulators, and recipients — of the levels of safety being experienced and of the ability of the organisations concerned to take appropriate mitigation action.

However, other types of measurement — ‘leading indicators’ — can give a wider perspective of the safety ‘health’ of the functional system, and focus on systemic issues, such as safety maturity and SMS performance.

(2) A holistic approach to performance monitoring is an essential input to decision-making with regard to safety. It is important to ensure that good safety performance is attributable to good performance of the SMS, not simply to lack of incidents or accidents. It is also essential that the metrics chosen match the requirements of the stakeholders and decision-makers involved in safety improvement.

(3) As shown in the diagram, stakeholders in the wider aviation industry and the general public require relatively small numbers of safety indicators (safety performance indicators or key performance indicators) which can give an instant ‘feel’ for the overall position regarding safety performance. Conversely, those involved in the management of services concerned need a more detailed set of metrics on which to base decisions regarding the management of the services and facilities being reviewed.

[Figure: monitoring metrics diagram]
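As an illustration only, the distinction between leading indicators and the two subsets of lagging indicators can be sketched as a small data model with a trend calculation for trend monitoring. The indicator names, the yearly values and the use of a least-squares slope are assumptions made for this sketch; the guidance does not prescribe any particular metric or calculation.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    LEADING = "leading"
    OUTCOME = "lagging/outcome"
    PRECURSOR = "lagging/precursor"


@dataclass
class Indicator:
    name: str
    kind: Kind
    yearly_values: list  # hypothetical counts or scores, oldest first


def trend(values):
    """Least-squares slope per period; positive means the values are rising."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


# Hypothetical indicators, invented for illustration.
indicators = [
    Indicator("safety training completion (%)", Kind.LEADING, [80, 85, 90, 95]),
    Indicator("separation minima infringements", Kind.PRECURSOR, [12, 15, 18, 21]),
    Indicator("accidents", Kind.OUTCOME, [0, 0, 0, 0]),
]

for ind in indicators:
    print(f"{ind.kind.value:18} {ind.name:35} slope={trend(ind.yearly_values):+.1f}")
```

In this sketch the precursor indicator shows a rising trend even though the outcome indicator is flat at zero, which is the situation in which precursor monitoring is most useful.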

CONTINUOUS IMPROVEMENT OF THE SMS — COMPLEX ATS PROVIDERS

An air traffic services provider should continuously improve the effectiveness of its SMS by:

(a) developing and maintaining a formal process to identify the causes of substandard performance of the SMS;

(b) establishing one or more mechanisms to determine the implications of substandard performance of the SMS;

(c) establishing one or more mechanisms to eliminate or mitigate the causes of substandard performance of the SMS; and

(d) developing and maintaining a process for the proactive evaluation of facilities, equipment, documentation, processes and procedures (through internal audits, surveys, etc.).

CONTINUOUS IMPROVEMENT OF THE SMS — COMPLEX ATS PROVIDERS

(a) Substandard performance of the SMS can manifest itself in two ways. Firstly, where the SMS processes themselves do not fit their purpose (e.g. not adequately enabling the air traffic services provider to identify, manage and mitigate hazards and their associated risks) resulting in the safety performance of the service being impacted in a negative way. Secondly, where the SMS processes fit their purpose, but are not applied correctly or adequately by the personnel whose safety accountabilities and responsibilities are discharged through the application of the SMS. Personnel who have safety accountabilities and responsibilities are considered an essential part of the effectiveness of the SMS and viewed as part of the SMS.

(b) Therefore, by detecting substandard performance of the SMS, the air traffic services provider can take action to improve the SMS processes themselves or to improve the application of the SMS processes by those with safety accountabilities and responsibilities resulting in an improvement to the safety performance.

(c) Continuous improvement of the effectiveness of the safety management processes can be achieved through:

(1) proactive and reactive evaluations of facilities, equipment, documentation, processes and procedures through safety audits and surveys; and

(2) reactive evaluations in order to verify the effectiveness of the system for control and mitigation of risks.

(d) In the same way that continuous improvement is sought through safety performance monitoring and measurement (see GM1 ATM/ANS.OR.B.005(a)(3) and GM1 ATS.OR.200(3)(i)) by the use of leading and lagging indicators, continuous improvement of the SMS provides the air traffic services provider with safety assurance for the service.

(e) As with safety performance monitoring, the continuous improvement of the SMS lends itself to a process that can be summarised as:

(1) Identify where there are potential weaknesses or opportunities for improvement;

(2) Identify what goes right and disseminate as best practice;

(3) Identify what can be done to tackle weaknesses or lead to improvement;

(4) Set performance standards for the actions identified;

(5) Monitor performance against the standards;

(6) Take corrective actions to improve performance; and

(7) Repeat the process by using the continuous improvement model below:

[Figure: continuous improvement model]

(f) Given that the SMS is required in order to manage safety, it can be assumed that by continuously improving the effectiveness of the SMS, ATS providers should be able to better manage, mitigate and ultimately control the safety risks associated with the provision of their services.

TRAINING AND COMMUNICATION — COMPLEX ATS PROVIDERS

(a) Training

(1) All personnel should receive safety training as appropriate for their safety responsibilities.

(2) Adequate records of all safety training provided should be kept.

(b) Communication

(1) The ATS provider should establish communication about safety matters that:

(i) ensures that all personnel are aware of the safety management activities as appropriate for their safety responsibilities;

(ii) conveys critical information, especially relating to assessed risks and analysed hazards;

(iii) explains why particular actions are taken; and

(iv) explains why safety procedures are introduced or changed.

(2) Regular meetings with personnel where information, actions and procedures are discussed, may be used to communicate safety matters.

GM1 ATS.OR.200(4)(i) Safety management system

ED Decision 2017/001/R

TRAINING — COMPLEX ATS PROVIDERS

The safety training programme may consist of self-instruction (e.g. newsletters, flight safety magazines), classroom training, e-learning or similar training provided by training organisations.

ATS.OR.205 Safety assessment and assurance of changes to the functional system

Regulation (EU) 2017/373

(a) For any change notified in accordance with point ATM/ANS.OR.A.045(a)(1), the air traffic services provider shall:

(1) ensure that a safety assessment is carried out covering the scope of the change, which is:

(i) the equipment, procedural and human elements being changed;

(ii) interfaces and interactions between the elements being changed and the remainder of the functional system;

(iii) interfaces and interactions between the elements being changed and the context in which it is intended to operate;

(iv) the life cycle of the change from definition to operations including transition into service;

(v) planned degraded modes of operation of the functional system; and

(2) provide assurance, with sufficient confidence, via a complete, documented and valid argument that the safety criteria identified via the application of point ATS.OR.210 are valid, will be satisfied and will remain satisfied.

(b) An air traffic services provider shall ensure that the safety assessment referred to in point (a) comprises:

(1) the identification of hazards;

(2) the determination and justification of the safety criteria applicable to the change in accordance with point ATS.OR.210;

(3) the risk analysis of the effects related to the change;

(4) the risk evaluation and, if required, risk mitigation for the change such that it can meet the applicable safety criteria;

(5) the verification that:

(i) the assessment corresponds to the scope of the change as defined in point (a)(1);

(ii) the change meets the safety criteria;

(6) the specification of the monitoring criteria necessary to demonstrate that the service delivered by the changed functional system will continue to meet the safety criteria.

GENERAL

(a) The safety assessment should be conducted by the air traffic services provider itself. It may also be carried out by another organisation, on its behalf, provided that the responsibility for the safety assessment remains with the air traffic services provider.

(b) A safety assessment needs to be performed when a change affects a part of the functional system managed by the provider of air traffic services and that is being used in the provision of its (air traffic) services. The safety assessment or the way it is conducted does not depend on whether the change is a result of a business decision or a decision to improve safety.

SCOPE OF THE CHANGE

(a) The description of the elements being changed includes the nature, functionality, location, performance, maintenance tasks, training and responsibilities of these elements, where applicable. The description of interfaces and interactions, between machines and between humans and machines, should include communication means (e.g. language, phraseology, protocol, format, order and timing) and transmission means, where applicable. In addition, it includes the description of the context in which they operate.

(b) There are two main aspects to consider in evaluating the scope of a change:

(1) The interactions within the changed functional system;

(2) The interactions within the changing functional system, i.e. those that occur during transitions from the current functional system to the changed functional system. During such transitions, components are replaced/installed in the functional system. These installation activities are interactions within the changing functional system and are to be included within the scope of the change.

As each transition can be treated as a change to the functional system, the identification of both of the above follows a common approach, described below.

(c) The scope of the change is defined as the set of the changed components and affected components. In order to identify the affected components and the changed components, it is necessary to:

(1) know which components will be changed;

(2) know which components’ behaviour might be directly affected by the changed components, although they are not themselves changed;

(3) detect indirectly affected components by identifying:

(i) new interactions introduced by the changed or directly affected components; and/or

(ii) interactions with changed or directly affected components via the environment.

(4) Furthermore, directly and indirectly affected components will be identified as a result of applying the above iteratively to any directly and indirectly affected components that have been identified previously.

The scope of the change is the set of changed, directly affected and indirectly affected components identified when the iteration identifies no new components.
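The iterative identification described in point (c) can be read as a closure computation: start from the changed components and keep adding affected components until an iteration finds nothing new. The following is a minimal sketch under that reading. The component names and the `interactions` mapping are hypothetical, and in practice identifying which component can influence which is the analytical work that this sketch takes as an input.

```python
from collections import deque


def scope_of_change(changed, interactions):
    """Return the changed components plus all directly and indirectly affected ones.

    `interactions` maps a component to the components whose behaviour it can
    influence, either directly or via the environment (an assumed input).
    """
    scope = set(changed)
    frontier = deque(changed)
    while frontier:  # ends when an iteration identifies no new components
        component = frontier.popleft()
        for neighbour in interactions.get(component, ()):
            if neighbour not in scope:
                scope.add(neighbour)
                frontier.append(neighbour)
    return scope


# Hypothetical functional system, invented for illustration.
interactions = {
    "surveillance_tracker": ["controller_display"],
    "controller_display": ["atco_procedures"],
    "atco_procedures": [],
    "voice_comms": ["atco_procedures"],
}

result = scope_of_change({"surveillance_tracker"}, interactions)
print(sorted(result))
# ['atco_procedures', 'controller_display', 'surveillance_tracker']
```

Here `voice_comms` stays outside the scope because, in this hypothetical mapping, nothing in the scope influences its behaviour.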

(d) The context in which the changed service is intended to operate (see ATS.OR.205(a)(1)(iii)) includes the interface through which the service will be delivered to its users.

TRAINING

If the change modifies the way people interact with the rest of the functional system, then training might be required before the change becomes operational. Care should be taken when training operational staff before the change is operational, as the training may change the behaviour of the operational staff when they interact with the existing functional system before any other part of the change is made, and so may have to be treated as a transitional stage of the change.

For example, as a result of training, air traffic controllers (ATCOs) may come to expect information or alerts to be presented differently. People may also need refresher training periodically in order to ensure that their performance does not degrade over time. The training needed before operation forms part of the design of the change, while the refresher training is part of the maintenance of the functional system after the change is in operation.

DESCRIPTION OF THE SCOPE — ‘MULTI-ACTOR CHANGE’

In reference to ‘multi-actor change’, please refer to GM1 ATM/ANS.OR.C.005(b)(1) Safety support assessment and assurance of changes to the functional system.

INTERACTIONS

The identification of changed interactions is necessary in order to identify the scope of the change because any changed behaviour in the system comes about via a changed interaction. A changed interaction occurs at an interface of the functional system and the context in which it operates. Consequently, identification of both interfaces and interactions is needed to be sure that all interactions have identified interfaces and all interfaces have identified interactions. From this, all interactions and interfaces that will be changed can be identified.
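The two-way check described above (every interaction has an identified interface, and every interface has an identified interaction) can be sketched as a simple consistency check. The interface and interaction names below are hypothetical, and representing each interaction by a single interface is a simplifying assumption of this sketch.

```python
def cross_check(interfaces, interactions):
    """Report interactions lacking an identified interface and interfaces lacking an interaction.

    interfaces:   set of identified interface names
    interactions: dict mapping interaction name -> interface name (None if not yet identified)
    """
    used = {iface for iface in interactions.values() if iface is not None}
    interactions_without_interface = sorted(
        name for name, iface in interactions.items() if iface not in interfaces
    )
    interfaces_without_interaction = sorted(interfaces - used)
    return interactions_without_interface, interfaces_without_interaction


# Hypothetical data, invented for illustration.
interfaces = {"radar_feed", "hmi", "datalink"}
interactions = {
    "track_update": "radar_feed",
    "alert_display": "hmi",
    "handover_message": None,
}

missing_interface, unused_interface = cross_check(interfaces, interactions)
print(missing_interface)  # ['handover_message']
print(unused_interface)   # ['datalink']
```

Both lists being empty would indicate that the bookkeeping is consistent; it would not by itself show that the identification is complete.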

FORM OF ASSURANCE

The air traffic services provider should ensure that the assurance required by ATS.OR.205(a)(2) is documented in a safety case.

COMPLETENESS OF THE ARGUMENT

The argument should be considered complete when it shows, as applicable, that:

(a) the safety assessment in ATS.OR.205(b) has produced a sufficient set of non-contradictory valid safety criteria;

(b) safety requirements have been placed on the elements changed and on those elements affected by the change;

(c) the safety requirements as implemented meet the safety criteria;

(d) all safety requirements have been traced from the safety criteria to the level of the architecture at which they have been satisfied;

(e) each component satisfies its safety requirements;

(f) each component operates as intended, without adversely affecting the safety; and

(g) the evidence is derived from known versions of the components and the architecture and known sets of products, data and descriptions that have been used in the production or verification of those versions.
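Several of the conditions above are traceability conditions and lend themselves to a mechanical cross-check. The sketch below is one possible way to record safety requirements and list the gaps relevant to points (b), (d) and (e). The record fields and identifiers are assumptions of this example, and such a check covers only the bookkeeping; it does not establish that the argument itself is valid.

```python
from dataclasses import dataclass, field


@dataclass
class Requirement:
    ident: str
    criteria: list    # safety criteria this requirement is traced from
    component: str    # element on which it is placed ("" if not yet allocated)
    evidence: list = field(default_factory=list)  # references to verification evidence


def completeness_gaps(criteria, requirements):
    """List traceability gaps between safety criteria, requirements and components."""
    gaps = []
    traced = {c for r in requirements for c in r.criteria}
    for c in criteria:
        if c not in traced:
            gaps.append(f"criterion {c} has no safety requirement")
    for r in requirements:
        if not r.criteria:
            gaps.append(f"requirement {r.ident} is not traced to a criterion")
        if not r.component:
            gaps.append(f"requirement {r.ident} is not placed on a component")
        if not r.evidence:
            gaps.append(f"requirement {r.ident} has no verification evidence")
    return gaps


# Hypothetical identifiers, invented for illustration.
criteria = ["SC1", "SC2"]
requirements = [
    Requirement("SR1", ["SC1"], "display", ["test-report-12"]),
    Requirement("SR2", ["SC1"], "tracker"),
]

gaps = completeness_gaps(criteria, requirements)
for gap in gaps:
    print(gap)
```

With this data the check reports that criterion SC2 has no safety requirement and that requirement SR2 has no verification evidence.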

COMPLETENESS OF THE ARGUMENT

(a) Sufficiency of safety criteria

(1) A sufficient set of safety criteria is one where the safety goal of the change is validly represented by the set of individual safety criteria, each criterion of which must be valid in its own right and not contradict another criterion or any other subset of criteria. A valid criterion is a correct, complete and unambiguous statement of the desired property. An individual valid criterion does not necessarily represent a complete safety criterion. An example of an invalid criterion is ‘the maximum take-off weight must not exceed 225 tonnes’, because weight is measured in newtons and not in tonnes. An example of an incomplete criterion is ‘the accuracy must be 5 m’, because no reliability attribute is present. This implies that the accuracy must always be within 5 m, which is impossible in practice.

(2) Optimally, a sufficient set of criteria would consist of the minimum set of non-overlapping valid criteria and it is preferable to a set containing overlapping criteria.

(3) Criteria that are not relevant, i.e. ones that do not address the safety goal of the change at all, should be removed from the set as they contribute nothing, may contradict other valid criteria and may serve to confuse.

(4) There are two forms of overlap: complete overlap and partial overlap.

(i) In the first case, one or more criteria can be removed and the set would remain sufficient, i.e. there are unnecessary criteria.

(ii) In the second case (partially overlapping criteria), if any criterion were to be removed, the set would not be sufficient. Consequently, all criteria are necessary; however, validating the set would be much more difficult. Showing that a set of criteria with significant overlap does not contain contradictions is extremely difficult and consequently prone to error.

(5) It may, in fact, be simpler to develop an architecture that supports non-overlapping criteria than to attempt to validate a partially overlapping set of criteria.
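To make the two forms of overlap concrete, the sketch below uses a deliberately simplified model in which each criterion is represented by the set of aspects of the safety goal that it addresses. This set-based model, the aspect labels and the criterion names are all assumptions made for illustration; real safety criteria are statements whose overlap cannot generally be reduced to set intersection.

```python
from itertools import combinations


def redundant_criteria(criteria, goal):
    """Criteria that could each be removed while the remainder still covers the goal."""
    redundant = []
    for name in criteria:
        rest = set().union(*(cov for n, cov in criteria.items() if n != name))
        if goal <= rest:
            redundant.append(name)
    return sorted(redundant)


def partial_overlaps(criteria, goal):
    """Pairs of criteria that share coverage although neither can be removed."""
    removable = set(redundant_criteria(criteria, goal))
    return sorted(
        (a, b)
        for a, b in combinations(sorted(criteria), 2)
        if criteria[a] & criteria[b] and a not in removable and b not in removable
    )


# Hypothetical aspects h1..h4 of the safety goal and criteria C1..C4.
goal = {"h1", "h2", "h3", "h4"}
criteria = {
    "C1": {"h1", "h2"},
    "C2": {"h2", "h3"},
    "C3": {"h4"},
    "C4": {"h4"},
}

print(redundant_criteria(criteria, goal))  # ['C3', 'C4']
print(partial_overlaps(criteria, goal))    # [('C1', 'C2')]
```

C3 and C4 overlap completely, so either one (but not both) could be removed. C1 and C2 overlap partially: both are necessary, and it is this kind of pair whose mutual consistency is harder to validate.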

(b) Safety requirements

(1) The safety requirements are design characteristics/items of the functional system to ensure that the system operates as specified. Based on the verification/demonstration of these characteristics/items, it could be concluded that the safety criteria are met.

(2) The highest layer of safety requirements represents the desired safety behaviour of the change at its interface with the operational context.

(3) In almost all cases, verification that a system behaves as specified cannot be accomplished, to an acceptable level of confidence, at the level of its interface with its operational environment. To this end, the system verification should be decomposed into verifiable parts, taking into account the following principles:

(i) Verification relies on requirements placed on these parts via a hierarchical decomposition of the top level requirements, in accordance with the constraints imposed by the chosen architecture.

(ii) At the lowest level, this decomposition places requirements on elements, where verification that the implementation satisfies its requirements can be achieved by testing.

(iii) At higher levels in the architecture, during integration, verified elements of different types are combined into subsystems/components, in order to verify more complete parts of the system.

(iv) While they cannot be fully tested, other verification techniques may be used to provide sufficient levels of confidence that these subsystems/components do what they are supposed to do.

(v) Consequently, since decomposing the system into verifiable parts relies on establishing requirements for those parts, then safety requirements are necessary.

(4) The architecture may not have requirements. During development, the need to argue satisfaction of safety criteria, which cannot be performed at the system level for any practical system, drives the architecture because verifiability depends on the decomposition of the system into verifiable parts.

(c) Satisfaction of safety criteria

(1) The concept laid down in AMC2 ATS.OR.205(a)(2) is that, provided each element meets its safety requirements, the system will meet its safety criteria. This will be true provided (2) and (3) below are met.

(2) The activity needed to meet this objective consists of obtaining sufficient confidence that the set of safety requirements is complete and correct, i.e. that:

(i) the architectural decomposition of the elements leads to a complete and correct set of safety requirements being allocated to each sub-element;

(ii) each safety requirement is a correct, complete and unambiguous statement of the desired behaviour and does not contradict another requirement or any other subset of requirements; and

(iii) the safety requirements allocated to an element necessitate the complete required safety behaviour of the element in the target environment.

(3) This should take into account specific aspects such as:

(i) the possible presence of functions within the element that produce unnecessary behaviour. For instance, in the case where a previously developed element is used, activities should be undertaken to identify all the possible behaviours of the element. If any of these behaviours is not needed for the foreseen use, then additional requirements may be needed to make sure that these functions will not be solicited or inadvertently activated in operation or that the effects of any resulting behaviour are mitigated;

(ii) other requirements that are not directly related to the desired behaviour of the functional system. These requirements often relate to technical aspects of the system or its components. Activities should ensure that each of these requirements does not compromise the safety of the system, i.e. does not contradict the safety requirements or criteria.

(d) Traceability of requirements

The traceability requirement can be met by tracing to the highest-level element in the architectural hierarchy that has been shown to satisfy its requirements, by verifying it in isolation.

(e) Satisfaction of safety requirements

(1) The component view taken must be able to support verification, i.e. the component must be verifiable.

(2) Care should be taken in selecting subsystems that are to be treated as components for verification to ensure that they are small and simple enough to be verifiable.

(f) Adverse effects on safety

(1) Interactions of all changed components or components affected by the change, operating in their defined context, have to be identified and assessed for safety in order to be able to show that they do not adversely affect safety. This assessment must include the failure conditions for all components and the behaviour of the services delivered to the component including failures in those services.

(2) Interactions between changing components, as they are installed during transitions into operation, and the context in which they operate have to be identified and assessed for safety in order to be able to show that they do not adversely affect safety. This assessment must include the failure conditions for all installation activities.

In some cases, installing components during transition into operation may cause disruption to services other than the one being changed. These services fall within the scope of the change (see GM1 ATM/ANS.OR.A.045(c); (d)), and consequently the safety effects of failures of these services, due to failures of the installation activities, have to be assessed as well and, if necessary, their impacts mitigated.

(3) Interactions in complex systems are dealt with in ATM/ANS.OR.A.045(e)(1).

(g) Configuration identification

(1) AMC2 ATS.OR.205(a)(2), point (f) is only about configuration of the evidence and should not be interpreted as configuration management of the changed functional system. However, since the safety case is based on a set of elements and the way they are joined together, the safety case will only be valid if the configuration remains as described in the safety case.

(2) Evidence for the use of a component should rely on testing activities considering the actual usage domains and contexts. When the same component is used in different parts of the system or in different systems, it may not be possible to rely on testing in a single context since it is unlikely that the contexts for each use will be the same or can be covered by a single set of test conditions. This applies equally to the reuse of evidence gathered from testing subsystems.

ASSURANCE — SOFTWARE

(a)  When a change to a functional system includes the introduction of new software or modifications to existing software, the ATS provider should ensure the existence of documented software assurance processes necessary to produce evidence and arguments that demonstrate that the software behaves as intended (software requirements), with a level of confidence consistent with the criticality of the required application.

(b)  The ATS provider should use the software experience gained to confirm that the software assurance processes are effective and, when used, the allocated software assurance levels (SWALs) and the rigour of the assurances are appropriate. For that purpose, the effects from a software malfunction (i.e. the inability of a programme to perform a required function correctly) or failure (i.e. the inability of a programme to perform a required function) reported according to the relevant requirements on reporting and assessment of service occurrences should be assessed in comparison with the effects identified for the system concerned as per the severity classification scheme.

ASSURANCE — SOFTWARE ASSURANCE PROCESSES

(a)  The software assurance processes should provide evidence and arguments that, as a minimum, demonstrate the following:

(1)  The software requirements correctly state what is required by the software, in order to meet the upper level requirements, including the allocated system safety requirements as identified by the safety assessment of changes to the functional system (AMC2 ATS.OR.205(a)(2)). For that purpose, the software requirements should:

(i)  be correct, complete and compliant with the upper level requirements; and

(ii)  specify the functional behaviour, in nominal and downgraded modes, timing performances, capacity, accuracy, resource usage on the target hardware, robustness to abnormal operating conditions and overload tolerance, as appropriate, of the software.

(2)  The traceability is addressed in respect of all software requirements as follows:

(i)  Each software requirement should be traced to the same level of design at which its satisfaction is demonstrated.

(ii)  Each software requirement allocated to a component should either be traced to an upper level requirement, or its need should be justified and it should be assessed as not affecting the satisfaction of the safety requirements allocated to the component.

(3)  The software implementation does not contain functions that adversely affect safety.

(4)  The functional behaviour, timing performances, capacity, accuracy, resource usage on the target hardware, robustness to abnormal operating conditions and overload tolerance, of the implemented software comply with the software requirements.

(5)  The software verification is correct and complete, and is performed by analysis and/or testing and/or equivalent means, as agreed with the competent authority.

(b)  The evidence and arguments produced by the software assurance processes should be derived from:

(1)  a known executable version of the software;

(2)  a known range of configuration data; and

(3)  a known set of software items and descriptions, including specifications, that have been used in the production of that version, or can be justified as applicable to that version.

(c)  The software assurance processes should determine the rigour to which the evidence and arguments are produced.

(d)  The software assurance processes should include the necessary activities to ensure that the software life cycle data can be shown to be under configuration control throughout the software life cycle, including the possible evolutions due to changes or the correction of problems. They should include, as a minimum:

(1)  configuration identification, traceability and status accounting activities, including archiving procedures;

(2)  problem reporting, tracking and corrective actions management; and

(3)  retrieval and release procedures.

(e)  The software assurance processes should also cover the particularities of specific types of software such as COTS, non-development software and previously developed software where generic assurance processes cannot be applied. The software assurance processes should include other means to give sufficient confidence that the software meets the safety objectives and requirements, as identified by the safety risk assessment and mitigation processes. If sufficient assurance cannot be provided, complementary mitigation means aimed at decreasing the impact of specific failure modes of this type of software should be applied. These may include, but are not limited to:

(1) software and/or system architectural considerations;

(2) existing service level experience; and

(3) monitoring.
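The traceability condition in point (a)(2)(ii) lends itself to a simple automated check. The following is a minimal sketch under stated assumptions: the record layout (`SoftwareRequirement`), the function name `untraceable` and all identifiers such as `SW-1` or `SYS-1` are hypothetical, and a real requirements tool would hold far more information.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SoftwareRequirement:
    """Hypothetical record for one software requirement allocated to a component."""
    req_id: str
    parent_id: Optional[str] = None        # upper level requirement it traces to
    justification: Optional[str] = None    # recorded need when there is no parent
    safety_impact_assessed: bool = False   # True once shown not to affect safety requirements


def untraceable(requirements, upper_level_ids):
    """Return the IDs of requirements that are neither traced nor justified and assessed."""
    problems = []
    for req in requirements:
        traced = req.parent_id in upper_level_ids
        justified = bool(req.justification) and req.safety_impact_assessed
        if not (traced or justified):
            problems.append(req.req_id)
    return problems


# Invented example data.
upper = {"SYS-1", "SYS-2"}
reqs = [
    SoftwareRequirement("SW-1", parent_id="SYS-1"),
    SoftwareRequirement("SW-2", justification="maintenance logging",
                        safety_impact_assessed=True),
    SoftwareRequirement("SW-3", justification="legacy feature"),  # not yet assessed
    SoftwareRequirement("SW-4", parent_id="SYS-9"),               # unknown parent
]

print(untraceable(reqs, upper))
```

In this invented data set, `SW-3` and `SW-4` are reported: the first is justified but not assessed, and the second points to an upper level requirement that does not exist.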

ASSURANCE — SOFTWARE ASSURANCE PROCESS

In reference to the terms ‘correct and complete software verification’, ‘software timing performances’, ‘software capacity’, ‘software accuracy’, ‘software resource usage’, ‘software robustness’, ‘overload tolerance’, ‘software life cycle data’ and ‘COTS’, please refer to GM1 to AMC6 ATM/ANS.OR.C.005(a)(2) ‘Safety support assessment and assurance of changes to the functional system’.

ASSURANCE — SOFTWARE ASSURANCE LEVELS

(a)  The assurance required by AMC4 ATS.OR.205(a)(2) can be provided with a level of confidence consistent with the criticality of the software in order to generate an appropriate and sufficient body of evidence to help to establish the required confidence in the argument.

(b)  The use of the SWAL concept can be helpful to provide an explicit link between the criticality of the software and the rigour of the assurance.

(c)  The use of multiple SWALs would also allow the possibility of managing several criticalities of the different software components within the system (with partitioning or other architectural strategies) by the same set of software assurance processes. When the software assurance processes employ several SWALs, they should define for each SWAL the rigour of the assurances to achieve compliance with the objectives set out in AMC4 ATS.OR.205(a)(2). As a minimum:

(1)  the rigour should increase as the criticality of the service supported by the software solution increases; and

(2)  the variation in rigour of the evidence and arguments per SWAL should include a classification of the activities and objectives according to the following criteria:

(i)  required to be achieved with independence, i.e. the verification process activities are performed by a person (or persons) other than the developer of the item being verified;

(ii)  required to be achieved; and

(iii)  not required.
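The classification in point (c)(2) can be pictured as a table of objectives against SWALs. The sketch below is illustrative only: the objective names, the number of SWALs and the convention that SWAL1 is the most critical level are assumptions, not content of this guidance. It checks a weak form of point (c)(1), namely that no objective becomes more rigorous at a less critical SWAL.

```python
from enum import IntEnum


class Rigour(IntEnum):
    """The three classifications of point (c)(2), ordered by increasing rigour."""
    NOT_REQUIRED = 0
    REQUIRED = 1
    REQUIRED_WITH_INDEPENDENCE = 2


# Hypothetical objectives; each list runs from SWAL1 (assumed most critical) to SWAL4.
OBJECTIVES = {
    "requirements review": [
        Rigour.REQUIRED_WITH_INDEPENDENCE, Rigour.REQUIRED_WITH_INDEPENDENCE,
        Rigour.REQUIRED, Rigour.REQUIRED,
    ],
    "structural coverage analysis": [
        Rigour.REQUIRED_WITH_INDEPENDENCE, Rigour.REQUIRED,
        Rigour.NOT_REQUIRED, Rigour.NOT_REQUIRED,
    ],
    "problem reporting": [Rigour.REQUIRED] * 4,
}


def rigour_is_monotonic(objectives):
    """Check that no objective is more rigorous at a less critical SWAL."""
    return all(
        earlier >= later
        for levels in objectives.values()
        for earlier, later in zip(levels, levels[1:])
    )


# A deliberately inconsistent table: rigour rises as criticality falls.
INCONSISTENT = {
    "requirements review": [
        Rigour.NOT_REQUIRED, Rigour.REQUIRED, Rigour.REQUIRED, Rigour.REQUIRED,
    ],
}

print(rigour_is_monotonic(OBJECTIVES))
print(rigour_is_monotonic(INCONSISTENT))
```

An ordered enumeration is used so that the comparison between classifications is explicit; a provider's own processes may of course define the variation in rigour differently.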

ASSURANCE — SOFTWARE ASSURANCE LEVELS ALLOCATION

The process to allocate a SWAL to software consistently with its foreseen criticality, as identified by the risk assessment and mitigation process, should consider the following elements:

(a)  The allocated SWAL should relate the rigour of the software assurances to the foreseen criticality of the software by combining the severity classification scheme in use with the likelihood of occurrence of a certain adverse effect.

(b)  The allocated SWAL should be commensurate with the worst credible effect that software malfunctions (i.e. the inability of a programme to perform a required function correctly) or failures (i.e. the inability of a programme to perform a required function) may cause. It should, in particular, take into account the risks associated with software malfunctions or failures and the architecture and/or procedural defences.

(c)  The software components that cannot be shown to be independent of one another should be allocated to the SWAL of the most critical of the dependent components. In this context, the term ‘software components’ is understood to be a building block that can be fitted or connected together with other reusable blocks of software to combine and create a custom software application, and ‘independent software components’ are those software components which are not rendered inoperative by the same failure condition.

(d)  The allocated SWALs should be consistent with the levels defined in the software assurance processes of the ATS provider and of the non-ATS provider(s), when the safety case is based on the evidence presented in the corresponding safety support case(s).
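Point (c) can be sketched as a grouping problem: components that cannot be shown to be independent are grouped, and each group takes the SWAL of its most critical member. The sketch below assumes that a lower SWAL number means higher criticality and treats dependence as transitive, which is a conservative simplification. The function name `allocate_swals` and the component names are hypothetical.

```python
def allocate_swals(initial, dependent_pairs):
    """Give every group of mutually dependent components its most critical SWAL.

    initial: dict mapping component name to SWAL number (assumed: 1 = most critical).
    dependent_pairs: pairs of components that cannot be shown to be independent.
    """
    parent = {name: name for name in initial}

    def find(name):
        # Follow parent links to the representative of the group.
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    # Merge the groups of every dependent pair.
    for a, b in dependent_pairs:
        parent[find(a)] = find(b)

    # Record the most critical (lowest-numbered) SWAL found in each group.
    most_critical = {}
    for name, swal in initial.items():
        root = find(name)
        most_critical[root] = min(swal, most_critical.get(root, swal))

    return {name: most_critical[find(name)] for name in initial}


# Invented example: three components share failure conditions, one is independent.
initial = {"tracker": 2, "display": 3, "recorder": 4, "logger": 4}
pairs = [("tracker", "display"), ("display", "recorder")]

print(allocate_swals(initial, pairs))
```

In this invented example, `tracker`, `display` and `recorder` all end up at SWAL 2, while the independent `logger` keeps SWAL 4.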

ASSURANCE — EXAMPLES OF EXISTING INDUSTRIAL STANDARDS

(a)  The service provider is responsible for the definition of the software assurance processes. In this definition of processes, the service provider may consider the guidance material contained in existing industrial standards for software assurance considerations. It should be noted that not all standards address all aspects required and the service provider may need to define additional software assurance processes. The guidance material typically includes:

(1)  objectives of the software life cycle processes;

(2)  activities for satisfaction of those objectives;

(3)  descriptions of the evidence, in the form of software life cycle data, that indicates that the objectives have been satisfied;

(4)  variations according to the SWAL, to accommodate the different levels of rigour of the software assurances; and

(5)  particular aspects (e.g. previously developed software) that may be applicable to certain applications.

(b)  The following table presents some of the existing industrial standards (at the latest available issue) used by the stakeholders:

Document title | Reference | Date
Software Integrity Assurance Considerations for Communication, Navigation, Surveillance and Air Traffic Management (CNS/ATM) Systems | EUROCAE ED-109A/RTCA DO-278A | January 2012
Guidelines for ANS Software Safety Assurance | EUROCAE ED-153 | August 2009
Functional safety of electrical/electronic/programmable electronic safety-related systems – Part 3: Software requirements | IEC 61508 – Part 3 | April 2010
Software Considerations in Airborne Systems and Equipment Certification | EUROCAE ED-12C/RTCA DO-178C | January 2012

EUROCAE ED-109A/RTCA DO-278A and EUROCAE ED-12C/RTCA DO-178C make reference to some external documents (supplements), which are an integral part of the standard for the use of some particular technologies and development techniques. The supplements are the following:

(1)  Formal Methods Supplement to ED-12C and ED-109A (EUROCAE ED-216/RTCA DO-333)

(2)  Object-Oriented Technology and related Techniques Supplement to ED-12C and ED-109A (EUROCAE ED-217/RTCA DO-332)

(3)  Model-Based Development and Verification Supplement to ED-12C and ED-109A (EUROCAE ED-218/RTCA DO-331)

When tools are used during the software development lifecycle, EUROCAE ED-215/RTCA DO-330 ‘Software Tool Qualification Considerations’ may be considered in addition to EUROCAE ED-12C/RTCA DO-178C and EUROCAE ED-109A/RTCA DO-278A.

(c)  The definition of the software assurance processes may be based on one of these industrial standards, without combining provisions from different standards, since the consistency and validation of each of the industrial standards have only been performed at individual level by each specific standardisation group.

ASSURANCE — SWAL COORDINATION

(a)  Within the scope of this Regulation, only the ATS provider can identify hazards, assess the associated risks and mitigate or propose mitigating measures where necessary. This requirement is also applicable to software assurance evidence which may include information on the mitigation measures established to address software failures or unintended behaviours.

(b)  ATS and non-ATS providers may rely on different sets of software assurance processes and, if applicable, different sets of SWALs.

(c)  For a particular change to the functional system, the safety assessment performed by the ATS provider, and documented in the safety case, may rely on evidence associated with the services provided by a non-ATS provider, as documented in its corresponding safety support case. The safety case should, as a minimum, demonstrate that the rigour of the assurances produced by the non-ATS provider within the safety support case provides the adequate level of confidence for the purpose of the ATS safety demonstration in the safety case.

(d)  If SWALs are used, the ATS provider should evaluate the adequacy of the SWALs defined in the software assurance processes of the non-ATS providers and the consistency of the allocated SWALs for the parts of the functional system affected by the change at the non-ATS provider.

SAFETY CRITERIA

‘Safety criteria will remain satisfied’ means that the safety criteria continue to be satisfied after the change is implemented and put into operation. The safety case needs to provide assurance that the monitoring requirements of ATS.OR.205(b)(6) are suitable for demonstrating, during operation, that the safety criteria remain satisfied and, therefore, the argument remains valid.

ASSURANCE LEVELS

The use of assurance level concepts, e.g. design assurance levels (DAL), software assurance levels (SWAL), hardware assurance levels (HWAL), can be helpful in generating an appropriate and sufficient body of evidence to help establish the required confidence in the argument.

SAFETY REQUIREMENTS

The following non-exhaustive list contains examples of safety requirements that specify:

(a) for equipment, the complete behaviour, in terms of functions, accuracy, timing, order, format, capacity, resource usage, robustness to abnormal conditions, overload tolerance, availability, reliability, confidence and integrity;

The complete behaviour is limited to the scope of the change. Safety requirements should only apply to the parts of a system affected by the change. In other words, if parts of a system can be isolated from each other and only some parts are affected by the change, then these are the only parts that are of concern;

(b) for people, their performance in terms of tasks (e.g. accuracy, response times, acceptable workload, reliability, confidence, skills, and knowledge in relation to their tasks);

(c) for procedures, the circumstances for their enactment, the resources needed to perform the procedure (i.e. people and equipment), the sequence of actions to be performed and the timing and accuracy of the actions; and

(d) interactions between all parts of the system.

SAFETY ASSESSMENT METHODS

(a) The air traffic services provider can use a standard safety assessment method or it can use its own safety assessment method to assist with structuring the process. However, the application of a method is not a guarantee of the quality of the results. It is therefore not sufficient for a safety case to claim that the assurance provided is adequate due to compliance with a standard or method.

(b) There are databases available that describe different safety assessment methods, tools and techniques (for example, http://www.nlr.nl/downloads/safety-methods-database.pdf or http://www.scsc.org.uk/) that can be used by the air traffic services provider. The provider must ensure that the safety assessment method is adequate for the change being assessed and that the assumptions inherent in the use of the method are recognised and accommodated appropriately.

COMPLETENESS OF HAZARD IDENTIFICATION

The air traffic services provider should ensure that hazard identification:

(a) targets complete coverage of any condition, event, or circumstance related to the change, which could, individually or in combination, induce a harmful effect;

(b) has been performed by personnel trained and competent for this task; and

(c) need only include hazards that are generally considered as credible.

HAZARDS TO BE IDENTIFIED

The following hazards should be identified:

(a) New hazards, i.e. those introduced by the change relating to the:

(1) failure of the functional system; and

(2) normal operation of the functional system; and

(b) Already existing hazards that are affected by the change and are related to:

(1) the existing parts of the functional systems; and

(2) hazards outside the functional system, for example, those inherent to aviation.

HAZARD IDENTIFICATION

(a) Completeness of hazard identification

In order to achieve completeness in the identification of hazards, it might be beneficial to aggregate hazards and to formulate them in a more abstract way, e.g. at the service level. This might in turn have drawbacks when analysing and evaluating the risk of the hazards. The appropriate level of detail in the set of hazards and their formulation, therefore, depends on the change and the way the safety assessment is executed.

Only credible hazards need to be identified. A credible hazard is one that has a material effect on the risk assessment. A hazard will not be considered credible when it is either highly improbable that the hazard will occur or that the accident trajectories it initiates will materialise. In other words, a hazard need not be considered if it can be shown that it induces an insignificant risk.

(b) Sources of hazards

(1) Hazards introduced by failures or nominal operations of the ATM/ANS functional systems may include the following factors and processes:

(i) design factors, including equipment, procedural and task design;

(ii) operating practices, including the application of procedures under actual operating conditions and the unwritten ways of operating;

(iii) communications, including means, terminology, order, timing and language and including human–human, human–machine and machine–machine communications;

(iv) installation issues;

(v) equipment and infrastructure, including failures, outages, error tolerances, nuisance alerts, defect defence systems and delays; and

(vi) human performance, including restrictions due to fatigue and medical conditions, and physical limitations, when considered relevant to the change assessment.

(2) Hazards introduced in the context in which the ATM/ANS functional system operates may include the following factors and processes:

(i) wrong, insufficient or delayed information and inadequate services delivered by third parties;

(ii) personnel factors, including working conditions, company policies for and actual practice of recruitment, training and allocation of resources, when considered relevant to the change;

(iii) organisational factors, including the incompatibility of production and safety goals, the allocation of resources, operating pressures and the safety culture;

(iv) work environment factors such as ambient noise, temperature, lighting, annoyance, ergonomics and the quality of man–machine interfaces; and

(v) external threats such as fire, electromagnetic interference and sources of distraction, when considered relevant to the change.

(3) The hazards introduced in the context in which the ATM/ANS services are delivered may include the following factors and processes:

(i) errors, failures, non-compliance and misunderstandings between the airborne and ground domains;

(ii) traffic complexity, including traffic growth, fleet mix and different types of traffic, when considered relevant to the change;

(iii) wrong, insufficient or delayed information delivered by third parties;

(iv) inadequate service provisioning by third parties; and

(v) external physical factors, including terrain, weather phenomena, volcanoes and animal behaviour, when considered relevant to the change.

(c) Methods to identify hazards

(1) In order to identify hazards, the air traffic services provider may use a combination of tools and techniques, including functional analysis, what-if techniques, brainstorming sessions, expert judgement, literature search (including accident and incident reports) and queries of accident and incident databases.

(2) The air traffic services provider needs to make sure that the method is appropriate for the change and produces (either individually or in combination) a valid (necessary and sufficient) set of hazards. This may be aided by drawing up a list of the functions associated with the part of the functional system being changed. The air traffic services provider needs to make sure that its personnel who use these methods and techniques are appropriately trained to apply them.

DETERMINATION OF THE SAFETY CRITERIA FOR THE CHANGE

When determining the safety criteria for the change being assessed, the air traffic services provider should, in accordance with ATS.OR.210, ensure that:

(a) the safety criteria support a risk analysis that is:

(1) relative or absolute, i.e. refers to:

(i) the difference in safety risk of the system due to the change (relative); or

(ii) the difference in safety risk of the system and a similar system (can be absolute or relative); and

(iii) the safety risk of the system after the change (absolute); and

(2) objective, whether risk is expressed numerically or not;

(b) the safety criteria are measurable to an adequate degree of certainty;

(c) the set of safety criteria can be represented totally by safety risks, by other measures that relate to safety risk or a mixture of safety risks and these other measures;

(d) the set of safety criteria covers the change and the safety criteria selected are consistent with the overall safety objectives established by the air traffic services provider through its SMS and represented by its annual and business plan and safety key performance indicators; and

(e) where a safety risk or a proxy cannot be compared against its related safety criteria with acceptable certainty, the safety risk should be constrained and actions should be taken, in the long term, so as to manage safety and ensure that the air traffic services provider’s overall safety objectives are met.

COMPLETENESS OF RISK ANALYSIS

The air traffic services provider should ensure that the risk analysis is carried out by personnel trained and competent to perform this task and should also ensure that:

(a) a complete list of harmful effects in relation to the identified:

(1) hazards, when the safety criteria are expressed in terms of safety risk, or proxies, when the safety criteria are expressed in relation to proxies; and

(2) hazards introduced due to implementation

is produced; and

(b) the risk contributions of all hazards and proxies are evaluated; and

(c) risk analysis is conducted in terms of risk or in terms of proxies or a combination of them, using specific measurable properties that are related to operational safety risk; and

(d) results can be compared against the safety criteria.

SEVERITY CLASSIFICATION OF ACCIDENTS LEADING TO HARMFUL EFFECTS

When performing a risk analysis in terms of risk, the air traffic services provider should ensure that the harmful effects of all hazards are allocated a safety severity category and that, where there is more than one safety severity category of harm, any severity classification scheme satisfies the following criteria:

(a) The scheme is independent of the causes of the accidents that it classifies, i.e. the severity of the worst accident does not depend upon whether it was caused by an equipment malfunction or human error;

(b) The scheme permits unique assignment of every harmful effect to a severity category;

(c) The severity categories are expressed in terms of a single scalar quantity and in terms relevant to the field of their application;

(d) The level of granularity (i.e. the span of the categories) is appropriate to the field of their application;

(e) The scheme is supported by rules for assigning a harmful effect unambiguously to a severity category; and

(f) The scheme is consistent with the air traffic services provider’s views of the severity of the harmful effects covered and can be shown to incorporate societal views of their severity.
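Criteria (b), (c) and (e) can be illustrated with a scheme in which each category is a range on a single scalar quantity. The sketch below is one possible representation, not a prescribed one: the scalar measure, the lower bounds and the category labels `D` to `A` are all invented. Because the categories are defined by strictly increasing lower bounds, every value maps to exactly one category.

```python
from bisect import bisect_right

# Hypothetical scheme on an assumed scalar severity measure.
# Each category runs from its lower bound up to, but not including, the next bound.
LOWER_BOUNDS = [0.0, 1.0, 10.0, 100.0]
CATEGORIES = ["D", "C", "B", "A"]  # invented labels, least to most severe


def validate_scheme(lower_bounds, categories):
    """Raise ValueError unless the bounds define non-overlapping, gap-free categories."""
    if len(lower_bounds) != len(categories):
        raise ValueError("one lower bound is needed per category")
    if any(a >= b for a, b in zip(lower_bounds, lower_bounds[1:])):
        raise ValueError("lower bounds must be strictly increasing")


def classify(value, lower_bounds=LOWER_BOUNDS, categories=CATEGORIES):
    """Assign a scalar severity value to exactly one category."""
    validate_scheme(lower_bounds, categories)
    if value < lower_bounds[0]:
        raise ValueError("value lies below the scope of the scheme")
    # bisect_right counts the bounds that are <= value; the last of them wins.
    return categories[bisect_right(lower_bounds, value) - 1]


print(classify(0.5))
print(classify(1.0))
print(classify(250.0))
```

The rule for boundary values is explicit here (a value equal to a bound belongs to the higher category), which is the kind of assignment rule that criterion (e) calls for.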

RISK EVALUATION

The air traffic services provider should ensure that the risk evaluation includes:

(a) an assessment of the identified hazards for a notified change, including possible mitigation means, in terms of risk or in terms of proxies or a combination of them;

(b) a comparison of the risk analysis results against the safety criteria taking the uncertainty of the risk assessment into account; and

(c) the identification of the need for risk mitigation or reduction in uncertainty or both.
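Points (b) and (c) can be sketched as a comparison between an uncertainty interval and a criterion. The following is one possible interpretation, given as an illustration only: it assumes that the safety criterion is an upper limit on risk and that the uncertainty of the risk analysis is summarised as a low and a high estimate. The function name `evaluate`, the outcome wording and the numerical values are hypothetical.

```python
def evaluate(risk_low, risk_high, criterion):
    """Compare an uncertainty interval for risk with a criterion (assumed upper limit)."""
    if risk_low > risk_high:
        raise ValueError("risk_low must not exceed risk_high")
    if risk_high <= criterion:
        # Even the pessimistic estimate satisfies the criterion.
        return "criterion met"
    if risk_low > criterion:
        # Even the optimistic estimate exceeds the criterion.
        return "risk mitigation needed"
    # The interval straddles the criterion, so the comparison is inconclusive.
    return "reduce uncertainty or mitigate risk"


# Invented values, expressed per unit of exposure.
CRITERION = 1e-7

print(evaluate(2e-8, 6e-8, CRITERION))
print(evaluate(3e-7, 9e-7, CRITERION))
print(evaluate(5e-8, 4e-7, CRITERION))
```

The third case shows why uncertainty matters: the central estimate might satisfy the criterion, yet the result does not support a conclusion until the uncertainty is reduced or the risk is mitigated.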

RISK ANALYSIS IN TERMS OF PROXIES — EXAMPLES

Point (c) of AMC1 ATS.OR.205(b)(2) allows safety assessment to be performed in terms of risk, proxies or a combination of risk and proxies. This GM provides two examples to illustrate the use of proxies in safety analysis.

(a) Use of proxies when assessing the safety of a wind farm installation

(1) A wind farm is to be introduced on or near an aerodrome. It is assumed that before the introduction of the wind farm, the safety risk of the air traffic services being provided at the aerodrome was acceptable. If the safety risk can be returned to this level after the introduction of the wind farm, the change would also be acceptable.

A diagram showing the effects this has on the risk at the aerodrome is shown below:

Figure 1: Evaluation of risks after the introduction of wind farm

(2) The risk due to the introduction of the wind farm will rise from ① to ②, if not mitigated, because:

(i) turbulence will increase and so may destabilise manoeuvring of aircraft;

(ii) the movement of the blades will cause radio interference (communications radio and surveillance radar) and so communications may be lost or aircraft may be hidden from view on the radar screen; and

(iii) the flicker in the peripheral vision of ATCOs, caused by the rotation of the blades, may capture attention and increase their perception error rate.

(3) The problem of analysing the safety impact can be split into these areas of concern since they do not interact or overlap and so satisfy the independence criterion (b) of AMC2 ATS.OR.210(a). However, whilst it can be argued that each is a circumstantial hazard and that in each case a justifiable qualitative relationship can be established linking the hazard with the resulting accident (so satisfying the causality criterion (a) of AMC2 ATS.OR.210(a)), the actual or quantitative logical relationship is, in each case, extremely difficult to determine. Conditions for seeking proxies have, therefore, been established:

(i) Performing a risk evaluation using actual risk may not be worthwhile due to the considerable cost and effort involved; and

(ii) The first two criteria for proxies have been satisfied.

Consequently, it may be possible to find proxies that can be used more simply and effectively than performing an analysis based on risk.

(4) The solutions proposed below are for illustrative purposes only. There are many other solutions and, for each change, several should be investigated. In this example, the following proxies, which satisfy the measurability criterion (c) of AMC2 ATS.OR.210(a), are used to set safety criteria:

(i) Turbulence can be measured and predicted by models so the level of turbulence can be a proxy.

In this example, let’s assume the only significant effect of turbulence is to light aircraft using a particular taxiway. It is possible to predict the level of turbulence at different sites on the aerodrome and an alternative taxiway is found where the level of turbulence after the introduction of the wind farm will be less than that currently encountered on the present taxiway. This can be confirmed during operation after the change by monitoring.

(ii) Signal quality can also be predicted by models and measured so it can be used as a proxy.

In this example, it is possible to move the communications transmitter and receiver aerials so that communications are not affected by interference. Sites can be found using modelling and the signal quality confirmed prior to moving the aerials by trial installations during periods when the aerodrome is not operating.

(iii) Human error rate in detecting events on the manoeuvring area can be measured in simulations and can be used as a proxy.

It is suggested that increasing the opaqueness of the glass in the control tower will reduce the effects of flicker on the ATCOs, but there is no direct relationship between the transmissivity and the effects of flicker. It is, therefore, decided to make a simulation of the control tower and measure the effects of flicker on human error rate using glass of different levels of transmissivity.

However, there is a conflict between increasing the opaqueness of the glass to reduce the effects of flicker and decreasing it to improve direct vision, which is needed so that manoeuvring aircraft can be seen clearly. In other words, the simulation predicts a minimum for the human error rate that relates to a decrease, as the effects of flicker decrease, followed by an increase, as the effects of a lack of direct vision increase. This minimum is greater than the human error rate achieved by the current system and so the risk of the wind farm, in respect of flicker, cannot be completely mitigated. This is shown by the red box with a question mark in it on the diagram.

(5) Finally, the argument for the performance of surveillance radars is commonly performed using risk. This can be repeated in this case since the idea is to filter the effects of the interference without increasing the risk. Moreover, if necessary, a system may be added (or a current one improved) to reduce the risk simply and economically and the effects of the additional system may be argued using risk.

(6) Since risks can be combined, the safety impacts of the changes to the surveillance radar by filtering the effects of the interference together with the addition of another system or the improvement of the current system can be established by summing the risks associated with these two kinds of change.

(7) In these circumstances, it is not possible to argue objectively that the risk of introducing the wind farm has been mitigated, as risks cannot be summed with proxies. This demonstrates the difficulties of using proxies. However, it may be possible to argue convincingly, albeit subjectively, that installing another system or improving the current system improves the current level of risk by a margin large enough to provide adequate compensation for the unmitigated effects of flicker.

(8) In summary, this example shows how proxies and risks can be combined in a single assurance case to argue that a change to a functional system can be introduced safely. It also demonstrates that the strategies available to demonstrate safety are not generic, but are dependent on identifying analysable qualities or quantities related to specific properties of the system or service that are impacted by the change.

(b) Use of proxies when changing to electronic flight strips

(1) An air traffic services provider considers the introduction of a digital strip system in one of its air traffic control towers to replace the paper flight progress strips currently in use. This change is expected to have an impact on several aspects of the air traffic control service that is provided such as the controller’s recollection of the progress of the flight, the mental modelling of the traffic situation and the communication and task allocation between controllers. A change of the medium, from paper to digital, might, therefore, have implications on the tower operations, and, hence, on the safety of the air traffic. The actual relation between the change of the strip medium and the risk for the traffic is, however, difficult to establish.

(2) The influence of the quantity on the risk is globally known, but cannot easily be quantified. One difficulty is that strip management is at the heart of the air traffic control operations: the set of potential sequences of events from a strip management error to an accident or incident is enormous. This set includes, for example, the loss of the call sign at the moment a ground controller needs to intervene in a taxiway conflict, and whether this results in an incident depends, for example, on the visibility. This set also includes the allocation of a wrong standard instrument departure (SID) to an aircraft, and whether this results in an accident depends, for example, on the runway configuration.

Figure 2: Notional Bow Tie Model of a strip management error

(3) The Bow Tie Model of a strip management error has, figuratively speaking, a vertically stretched right part. This expresses that a hazard — such as the loss of a single strip — may have many different outcomes which heavily depend on factors that have nothing to do with the cause of the hazard — factors such as the status of the aircraft corresponding to the absent strip, that aircraft’s position on the aerodrome, the traffic situation and the visibility.

(4) Another difficulty with the relationship between the change of the medium and the risk to the air traffic is that several human and cultural aspects are involved. The difficulty lies in the largely unknown causal relationship between these human and cultural aspects and the occurrences of accidents and incidents. As an example of this, it is noted that strip manipulation — like moving a strip into another bay, or making a mark to indicate that a landing clearance is given — assists a controller in distinguishing the potential from the actual developments. The way of working with paper strips generates impressions in a wider variety than digital strips by their physical nature: handling paper strips has tactile, auditory and social aspects. This difference in these aspects may lead to a difference in the quality of the controller’s situation awareness which may lead to a difference in the efficacy of the controller’s instructions and advisories, which may lead to a difference in the occurrence of accidents and incidents. However, the relation between the change of the medium and the risk for the air traffic is difficult to assess and would require a great deal of effort, time and experimentation to quantify.

Figure 3: Relation between the change of flight strip and the risk

(5) There is probably a relation between the change of the flight progress strip medium and the risk for air traffic: a new human–machine interface may have an effect on the situation awareness of some individual controllers in some circumstances, which might have an effect on whether, when and what instructions are given, and this in turn influences the aircraft movements, and, hence, the risks. The question by what amount risks increase or decrease is very hard to answer. 

(6) Performing a risk evaluation using actual risk may not be worthwhile due to the difficulties and considerable cost and effort involved in assessing the risk of the change directly. Therefore, the use of proxies might be preferred. A quantity is only considered an appropriate proxy if it satisfies the criteria in point AMC2 ATS.OR.210(a):

(i) Causality: The quantity used as proxy can be expected to be influenced by the change, and the risk can be expected to be influenced by the quantity. In addition to this causal relationship, a criterion can be formulated and agreed upon that expresses by which amount the value of the quantity may shift due to the change. Note that the influence of the proxy on the risk cannot easily be quantified, otherwise it might be more beneficial to use risk as a measure and the quantity as an auxiliary function.

(ii) Measurability: The influence of the change on the quantity can be assessed before as well as after the change.

(iii) Independence: When the proxy selected does not cover all hazards, a set of proxies should be used. Any proxy of that set should be sufficiently isolated from other proxies to be treated independently.

Figure 4: Relation between proxy and risk

(7) There is a relationship between the change and the proxy, and there is a relationship between the proxy and the risk to traffic. The first relationship can be assessed (indicated by the ‘!’), while the second cannot (indicated by the ‘?’). An acceptance criterion is typically formulated for the amount the proxy value might increase or decrease.

(8) Proxy 1: Head-down time. The head-down time is a good proxy as it satisfies the conditions of:

(i) Causality: It is known that more head-down time leads to a higher risk but there is no well-established or generally accepted statement in literature in terms of: ‘x % more head-down time implies y % more accidents’, not to mention for the specific circumstances of the specific air traffic control tower. The causal relationship indicated in Figure 4 can be established because:

(A) the head-down time can be expected to change as the manipulation, writing and reading of digital strips might cost more, or perhaps less, attention and effort than the handling of paper strips;

(B) the loss of head-up time of ground and runway controllers implies less surveillance, at least less time for the out-of-the-window-view in good visibility, and this implies a later or less probable detection of conflicts; and

(C) an example of an acceptance criterion reads: ‘The introduction of the digital strip system does not lead to a significant increase in the head-down time’.

(ii) Measurability: The influence of the change on the head-down time can be assessed before the change by means of real-time human-in-the-loop experiments in which controllers are tasked to handle equal amounts of traffic in equal circumstances, one time using paper strips and another time using digital strips. The percentage of head-down time can then be determined by observing the controllers by cameras and eye-trackers.

(9) Proxy 2: Fraction of erroneous SID allocations. The fraction of erroneous SID allocations is a good proxy as it satisfies the conditions of:

(i) Causality: It can be imagined that an erroneous SID selected in the flight management system (FMS) might lead to accidents, but the precise conditional probability is small and difficult to estimate as it depends on several external factors such as the flight paths of the correct and incorrect SIDs, the presence of other traffic, the timing and geometry of the trajectories, the cloud base or the vigilance of the controller. The causal relationship indicated in Figure 4 can be established because:

(A) the number of incorrect SIDs indicated on electronic strips can be expected to be less than on paper strips, because of the possibilities of systematic checks with respect to runway allocation, runway configuration, SID allocation of the predecessor and destination in the flight plan;

(B) the allocation of an incorrect SID to an aircrew might lead to a situation in which the aircraft manoeuvres in an unanticipated way, possibly leading to a conflict with another aircraft, for example departing from a parallel runway; and

(C) an example of an acceptance criterion reads: ‘The introduction of the digital strip system should lead to a decrease of the fraction of erroneous SID allocations of more than 20 %’.

(ii) Measurability: The influence of the change on the fraction of erroneous SID allocations can be assessed before the change by means of an analysis of the causes and occurrences of such errors and the estimated efficacy of the systematic checks. The fractions can be assessed after the change by the statistics of the event reports.

(10) Finally, the last condition of independence of proxies is also satisfied. For the purpose of this example, the proxies in (8) and (9) form a set of independent proxies that are complete, i.e. they cover all identified hazards introduced by the replacement of paper strips by a digital strip system.
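The two example acceptance criteria above can be checked mechanically once the proxy values before and after the change are available. The following is a minimal sketch only: the class and function names, the 5 % tolerance used to interpret ‘significant increase’ (the text does not quantify it) and all measured values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProxyMeasurement:
    """Value of a proxy before and after the change (same units). Hypothetical type."""
    name: str
    before: float
    after: float

    def relative_change(self) -> float:
        """Relative change of the proxy; a negative value means a decrease."""
        if self.before == 0:
            raise ValueError("relative change is undefined for a zero baseline")
        return (self.after - self.before) / self.before


def head_down_time_acceptable(m: ProxyMeasurement, tolerance: float = 0.05) -> bool:
    """Proxy 1 criterion: no significant increase in head-down time.

    'Significant' is not quantified in the text; `tolerance` is an assumed
    upper limit on the relative increase.
    """
    return m.relative_change() <= tolerance


def sid_error_fraction_acceptable(m: ProxyMeasurement,
                                  required_decrease: float = 0.20) -> bool:
    """Proxy 2 criterion: the fraction of erroneous SID allocations
    decreases by more than 20 %."""
    return m.relative_change() < -required_decrease


# Hypothetical values, for illustration only.
head_down = ProxyMeasurement("head-down time (fraction of time)", 0.30, 0.31)
sid_errors = ProxyMeasurement("erroneous SID allocations (fraction)", 0.010, 0.007)

results = {
    head_down.name: head_down_time_acceptable(head_down),
    sid_errors.name: sid_error_fraction_acceptable(sid_errors),
}
# The change is argued acceptable only if every proxy criterion is met.
change_acceptable = all(results.values())
```

Each proxy is evaluated separately, which relies on the independence condition; the overall result only combines the individual verdicts and does not add proxy values together.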

RISK MITIGATION

When the risk evaluation results show that the safety criteria cannot be satisfied, then the air traffic services provider should either abandon the change or propose additional means of mitigating the risk. If risk mitigation is proposed, then the air traffic services provider should ensure that it identifies:

(a) all of the elements of the functional system, e.g. training and procedures, that need to be reconsidered; and

(b) for each part of the amended change, those parts of the safety assessment (requirements from (1) to (6) listed in ATS.OR.205(b)) that need to be repeated in order to demonstrate that the safety criteria will be satisfied.

RISK ANALYSIS IN TERMS OF SAFETY RISK

(a) Risk analysis

When a risk assessment of a set of hazards is executed, in terms of risk:

(1) the frequency or probability of the occurrence of the hazard should be determined;

(2) the possible sequences of events from the occurrence of a hazardous event to the occurrence of an accident, which may be referred to as accident trajectories, should be identified. The contributing factors and circumstances that distinguish the different trajectories from one another should also be identified, as should any mitigations between a hazardous event and the associated accident;

(3) the potential harmful effects of the accident, including those resulting from a simultaneous occurrence of a combination of hazards, should be identified;

(4) the severity of these harmful effects should be assessed, using a defined severity scheme according to point (f) of AMC2 ATS.OR.205(b)(3); and

(5) the risk of the potential harmful effects of all the accidents, given the occurrence of the hazard, should be determined, taking into account the probabilities that the mitigations may fail as well as succeed, and that particular accident trajectories will be followed when particular contributing factors and circumstances occur.

(b) Severity schemes

The severity determination should take place according to a severity classification scheme.

The purpose of a severity classification scheme is to facilitate the management and control of risk. A severity class is, in effect, a container within which accidents can be placed if their severities are considered similar. Each container can be given a value which represents the consequences, i.e. small for accidents causing little harm and big for accidents causing a lot of harm. The sum of the probabilities of all the accidents assigned to a severity class, multiplied by the value related to that severity class, is the risk associated with that class. If the value that represents severity for all classes is scalar, then the total risk is the sum of the risks in each severity class.

(1) Single-risk value severity schemes

Such schemes use a single severity category to represent harm to humans. Other categories representing other kinds of harm e.g. damage to aircraft and loss of separation, may be present but do not represent harm to humans. In these circumstances, risk analysis would actually be reduced to frequency/probability analysis.

(2) Multiple-risk value severity schemes

Multiple-risk value severity schemes, which use a number of severity categories to classify different levels of harm, facilitate the management and control of risk in a number of ways. At the simplest level, the distribution of accidents across the severity classes gives a picture of whether the risk profile of a system is well balanced. For example, many accidents in the top and bottom severity classes with few in between suggests an imbalance in risk, perhaps due to an undue amount of attention having been paid to some types of accident at the expense of others. More detailed management and control of risk includes:

(i) Severity classes may be used as the basis for reporting accident statistics.

(ii) Severity classes combined with frequency (or probability) classes can be used to define criteria for decision-making regarding risk acceptance.

(iii) The total risk associated with one or more severity classes can be managed and controlled. For example, the sum of the risk from all severity classes represents the total risk and may be used as a basis for making decisions about changes.

(iv) Similarly, the risk associated with accident types of different levels of severity can be compared. For example, comparing runway infringement accidents with low-speed taxiway accidents would allow an organisation to focus its efforts on mitigating the accident type with greatest risk.

(c) The air traffic services provider should coordinate its severity scheme(s) when performing multi-actor changes to ensure adequate assessment. This includes coordination with air traffic services providers outside of the EU.
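The calculation described in points (a) and (b) can be illustrated with a short sketch: the hazard frequency is combined with the conditional probability of each accident trajectory, and the resulting accident frequencies are weighted by the value of their severity class and summed. The names, the class labels ‘B’ and ‘C’, the scalar class values and all probabilities are hypothetical, and the mitigations are assumed to fail independently of one another.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Trajectory:
    """One accident trajectory following a hazard (hypothetical structure)."""
    accident: str
    severity_class: str
    # Probability that the contributing factors and circumstances
    # for this trajectory are present.
    p_circumstances: float
    # Failure probabilities of the mitigations between hazard and accident.
    p_mitigation_failures: Tuple[float, ...]

    def conditional_probability(self) -> float:
        """P(accident | hazard), assuming the mitigations fail independently."""
        p = self.p_circumstances
        for p_fail in self.p_mitigation_failures:
            p *= p_fail
        return p


def risk_per_class(hazard_frequency: float,
                   trajectories: Iterable[Trajectory],
                   class_values: Dict[str, float]) -> Dict[str, float]:
    """Risk per severity class: accident frequency times the class value,
    summed over all trajectories assigned to that class."""
    risk: Dict[str, float] = defaultdict(float)
    for t in trajectories:
        accident_frequency = hazard_frequency * t.conditional_probability()
        risk[t.severity_class] += accident_frequency * class_values[t.severity_class]
    return dict(risk)


# Hypothetical figures, for illustration only.
trajectories = [
    Trajectory("conflict in low visibility", "B", 0.1, (0.01,)),
    Trajectory("conflict in good visibility", "C", 0.9, (0.01, 0.1)),
]
class_values = {"B": 10.0, "C": 1.0}  # assumed scalar severity values

risks = risk_per_class(1e-2, trajectories, class_values)
# With scalar class values, the total risk is the sum over the classes.
total_risk = sum(risks.values())
```

Because the class values are scalar, the per-class risks can be added; with non-scalar severity values, this final summation would not be meaningful.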

VERIFICATION

The air traffic services provider should ensure that verification activities of the safety assessment process include verification that:

(a) the full scope of the change is addressed throughout the whole assessment process, i.e. all the elements of the functional system or environment of operation that are changed and those unchanged elements that depend upon them and on which they depend are identified;

(b) the way the service behaves complies with and does not contradict any applicable requirements placed on the changed service or the conditions attached to the provider's certificate;

(c) the specification of the way the service behaves is complete and correct;

(d) the specification of the operational context is complete and correct;

(e) the risk analysis is complete as per AMC1 ATS.OR.205(b)(3);

(f) the safety requirements are correct and commensurate with the risk analysis;

(g) the design is complete and correct with reference to the specification and correctly addresses the safety requirements;

(h) the design was the one analysed; and

(i) the implementation, to the intended degree of confidence, corresponds to that design and behaves only as specified in the given operational context.

OUTCOME OF RISK EVALUATION

The purpose of risk evaluation is to evaluate the risk of the change and to compare that against the safety criteria with the following outcomes in mind:

(a) A possible (desired) outcome is that the assessed risk satisfies the safety criteria. This implies that the change is assessed as sufficiently safe to implement.

(b) Another possible outcome is that the assessed risk does not satisfy the safety criteria. This might lead to the decision to refine the risk analysis, to the decision to add mitigating means, or to the decision to abandon the change.

RISK EVALUATION — UNCERTAINTY

(a) The outcome of a risk analysis is uncertain due to modelling, estimates, exclusion of rare circumstances or contributing factors, incident and safety event underreporting, false or unclear evidence, different expert opinions, etc. The uncertainty may be indicated explicitly, e.g. by means of an uncertainty interval, or implicitly, e.g. by means of a reference to the sources the estimates are based upon.

(b) Where possible sequences of events, contributing factors and circumstances are excluded in order to simplify the risk estimate, which may be necessary to make the estimate of risks feasible, arguments and evidence justifying this should be provided in the safety case. This may result in increasing the uncertainty of the risk estimations.
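When the uncertainty is stated explicitly as an interval, the comparison with a safety criterion can be sketched as follows. This is one possible reading, not a prescribed method: the conservative decision rule (the whole interval must lie at or below the criterion), the names and the numerical values are all assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    SATISFIED = "safety criterion satisfied"
    NOT_SATISFIED = "not satisfied: mitigate or abandon the change"
    INCONCLUSIVE = "inconclusive: reduce uncertainty, mitigate, or both"


@dataclass(frozen=True)
class RiskEstimate:
    """Assessed risk expressed as an uncertainty interval (hypothetical type)."""
    lower: float
    upper: float

    def __post_init__(self) -> None:
        if self.lower > self.upper:
            raise ValueError("lower bound exceeds upper bound")


def evaluate(estimate: RiskEstimate, criterion: float) -> Outcome:
    """Compare an uncertain risk estimate against a maximum acceptable risk.

    Assumed rule: the criterion is met only if the whole interval lies
    at or below it, and it is failed only if the whole interval lies above it.
    """
    if estimate.upper <= criterion:
        return Outcome.SATISFIED
    if estimate.lower > criterion:
        return Outcome.NOT_SATISFIED
    # The interval straddles the criterion, so no verdict can be given yet.
    return Outcome.INCONCLUSIVE


# Hypothetical values, for illustration only.
criterion = 1e-8
outcome_a = evaluate(RiskEstimate(1e-9, 5e-9), criterion)
outcome_b = evaluate(RiskEstimate(2e-8, 5e-8), criterion)
outcome_c = evaluate(RiskEstimate(5e-9, 2e-8), criterion)
```

Excluding event sequences to simplify the estimate would, in this picture, widen the interval and make the inconclusive outcome more likely.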

RISK EVALUATION — FORMS OF RISK EVALUATION

The risk evaluation can take several forms, even within the safety assessment of a single change, depending on the nature of the risk analysis and the safety criteria:

(a) If a set of safety requirements has been created and can be unambiguously and directly related to the safety criteria, then the risk evaluation takes the form of justifying that these requirements satisfy the safety criteria;

(b) If the safety criteria have been established in terms of the likelihood of the hazards and the severity of their effects, then the risk evaluation takes the form of verifying that the assessed risks satisfy the safety criteria in terms of risks; and

(c) If the values of all relevant proxies have been determined, then the risk evaluation takes the form of verifying that these values satisfy the safety criteria in terms of proxies.

TYPE OF RISK MITIGATION

Risk mitigation may be achieved in the following ways:

(a) an improvement of the performance of a functional subsystem;

(b) an additional change of the ATM/ANS functional system;

(c) an improvement of the services delivered by third parties;

(d) a change in the physical environment; or

(e) any combination of the above-mentioned methods.

VERIFICATION OF SAFETY CRITERIA

As the complete behaviour of the change is reflected in satisfying the safety criteria for the change, no safety requirements are set at system or change level. Nevertheless, safety requirements can be placed on the architecture and the components affected by the change.

MONITORING OF INTRODUCED CHANGE

The air traffic services provider should ensure that within the safety assessment process for a change, the monitoring criteria, that are to be used to demonstrate that the safety case remains valid during the operation of the changed functional system, are identified and documented. These criteria are specific to the change and should be such that they indicate that:

(a) the assumptions made in the argument remain valid;

(b) critical proxies remain as predicted in the safety case and are no more uncertain; and

(c) other properties that may be affected by the change remain within the bounds predicted by the safety case.

MONITORING OF INTRODUCED CHANGE

(a) Monitoring is intended to maintain confidence in the safety case during operation of the changed functional system. At entry into service, the safety criteria become performance criteria rather than design criteria. Monitoring is, therefore, only applicable following entry into service of the change.

(b) Monitoring is likely to be of internal parameters of the functional system that provide a good indication of the performance of the service. These parameters may not be directly observable at the service level, i.e. at the interface of the service with the operational context. For example, where a function is provided by multiple redundant resources, the availability of the function will be so high that monitoring it may not be useful. However, monitoring the availability of individual resources, which fail much more often, may be a useful indicator of the performance of the overall function.
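The redundancy example in point (b) can be made concrete with a simple availability calculation. The sketch below assumes identical resources that fail independently and a function that remains available as long as at least one resource is available; the function names and the figures are illustrative assumptions.

```python
def function_availability(resource_availability: float, n_resources: int) -> float:
    """Availability of a function provided by n redundant resources.

    Assumes identical, independent resources and that one available
    resource is enough to provide the function.
    """
    if not 0.0 <= resource_availability <= 1.0:
        raise ValueError("availability must lie between 0 and 1")
    if n_resources < 1:
        raise ValueError("at least one resource is required")
    return 1.0 - (1.0 - resource_availability) ** n_resources


def expected_outage_hours(availability: float, hours_observed: float) -> float:
    """Expected unavailable time over an observation period."""
    return (1.0 - availability) * hours_observed


# Hypothetical figures: three resources, each available 99 % of the time,
# observed over one year (8760 hours).
HOURS_PER_YEAR = 8760.0
single = 0.99
combined = function_availability(single, 3)

resource_outage = expected_outage_hours(single, HOURS_PER_YEAR)
function_outage = expected_outage_hours(combined, HOURS_PER_YEAR)
```

Under these assumptions each resource is expected to be unavailable for tens of hours per year, whereas the function as a whole is unavailable for well under an hour, so only the resource-level figure yields enough observations to serve as a monitoring indicator.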

ATS.OR.210 Safety criteria

Regulation (EU) 2017/373

(a) An air traffic services provider shall determine the safety acceptability of a change to a functional system, based on the analysis of the risks posed by the introduction of the change, differentiated on basis of types of operations and stakeholder classes, as appropriate.

(b) The safety acceptability of a change shall be assessed by using specific and verifiable safety criteria, where each criterion is expressed in terms of an explicit, quantitative level of safety risk or another measure that relates to safety risk.

(c) An air traffic services provider shall ensure that the safety criteria:

(1) are justified for the specific change, taking into account the type of change;

(2) when fulfilled, predict that the functional system after the change will be as safe as it was before the change or the air traffic services provider shall provide an argument justifying that:

(i) any temporary reduction in safety will be offset by future improvement in safety; or

(ii) any permanent reduction in safety has other beneficial consequences;

(3) when taken collectively, ensure that the change does not create an unacceptable risk to the safety of the service;

(4) support the improvement of safety whenever reasonably practicable.

AMC1 ATS.OR.210(a) Safety criteria

ED Decision 2017/001/R

OTHER MEASURES RELATED TO SAFETY RISKS

When the air traffic services provider specifies the safety criteria with reference to another measure that relates to safety risk, it should use one or more of the following:

(a) proxies;

(b) recognised standards and/or codes of practice; and

(c) the safety performance of the existing functional system or a similar system elsewhere.

AMC2 ATS.OR.210(a) Safety criteria

ED Decision 2017/001/R

OTHER MEASURES RELATED TO SAFETY RISKS — PROXIES

Proxies for safety risk, used as safety criteria for those parts of the functional system affected by the change, can only be employed when:

(a) a justifiable causal relationship exists between the proxy and the harmful effect, e.g. proxy increase/decrease causes risk increase/decrease;

(b) a proxy is sufficiently isolated from other proxies to be treated independently; and

(c) the proxy is measurable, quantitatively or qualitatively, to an adequate degree of certainty.

GM1 ATS.OR.210(a) Safety criteria

ED Decision 2017/001/R

SAFETY CRITERIA IN TERMS OF PROXIES FOR SAFETY RISKS

(a) In the safety assessment of functional systems, it may not always be possible or desirable to specify safety criteria in terms of quantitative values of risk. Instead, safety criteria may be defined in terms of other measures that are related to risk. These measures are called proxies and they need to meet the requirements for a proxy as stated in AMC2 ATS.OR.210(a). For examples of their use, see GM1 to AMC1 ATS.OR.205(b)(4).

(b) A proxy is some measurable property that can be used to represent the value of something else. In the safety assessment of functional systems, the value of a proxy may be used as a substitute for a value of risk, providing it meets the requirements for a proxy as stated in AMC2 ATS.OR.210(a). Examples of proxies are the frequency of airspace infringements, runway incursions, false alert rate, head-down time, limited sight, level of situation awareness, fraction of read back errors, reduced vigilance, amount of turbulence, distraction of controller’s attention, inappropriate pilot behaviour, system availability, information integrity and service continuity.

An example of the concept of using a different but specific quantity to assess an actually relevant quantity is the transposition/measure of an aircraft’s altitude which is in terms of barometric pressure or the transposition/measure of an aircraft’s airspeed which is in terms of dynamic pressure.

(c) A proxy is a measure of a certain property along the causal trajectory between the hazard/event and the harmful effects of the hazard/event in question (see Figure 5). The causal relationship between the proxy and the accident must be justified in the safety case, i.e. it must satisfy AMC2 ATS.OR.210(a). This means that the accident trajectory must be modelled and analysed such that the causal relationship can be assured but without the need to evaluate the quantitative nature of this relationship. It is assumed that since the proxy lies between the hazard/event and the accident, then there is a quantitative causal relationship between the rate of the hazard/event’s occurrence and the rate of the proxy’s occurrence. As a consequence, the variation of values of the proxy correlates with values of the hazard/event’s rate of occurrence and the value of the rate at which the harmful effects occur, i.e. the accident rate, and this relationship is a monotonically increasing one. This means that when the proxy value, e.g. Proxy1, increases/decreases, the associated risk value of the related accident, e.g. Accident1, increases/decreases accordingly.

Figure 5: Use of proxies along accident trajectories

(d) Proxies might be preferred where the extra effort needed to identify, describe and analyse a complete set of sequences of events from the occurrence of a hazard to the occurrence of an accident or incident has no added value in the safety assessment. The intrinsic reasons for the amount of the extra effort are the number of significantly different event sequences, the complexity of some accident scenarios, the existence of many barriers preventing the occurrence of a hazard developing into an accident and the lack of evidence on the probability of some events or the frequency of occurrence of some external circumstances and factors. The usage of proxies might then make the safety assessment more tractable and comprehensible and increase the quality of the risk analysis.

(e) The main advantages of proxies are the easy recognition of safety issues by operational staff involved in the safety assessment, and the direct focus on the analysis and mitigation of the identified hazards and safety issues introduced or affected by the change.

(f) The main disadvantage of using proxies is that it is not possible to express risk by a uniform measure. However, the value of the proxy should be measurable.

(g) For further details on the use of proxies, please refer to GM1 to AMC1 ATS.OR.205(b)(4), which contains two examples to assist in the selection and use of proxies in safety analysis.

ATS.OR.215 Licensing and medical certification requirements for air traffic controllers

Regulation (EU) 2017/373

An air traffic services provider shall ensure that air traffic controllers are properly licensed and hold a valid medical certificate, in accordance with Regulation (EU) 2015/340.