From e908f889208531ef2a64163380bda1a3a8b33d23 Mon Sep 17 00:00:00 2001
From: Simon Hein
Date: Wed, 5 Apr 2023 09:00:50 +0000
Subject: [PATCH] docs: safety: Add safety documentation

Add the safety documentation to the Zephyr documentation tree.
Starts with the safety overview which describes the general goals
for the safety in zephyr and introduces a process overview how
a safety certification can be achieved in the zephyr project.

Signed-off-by: Simon Hein
---
 doc/index-tex.rst                           |   1 +
 doc/index.rst                               |   1 +
 doc/safety/images/zephyr-safety-process.svg |   1 +
 doc/safety/index.rst                        |  13 +
 doc/safety/safety_overview.rst              | 264 ++++++++++++++++++++
 5 files changed, 280 insertions(+)
 create mode 100644 doc/safety/images/zephyr-safety-process.svg
 create mode 100644 doc/safety/index.rst
 create mode 100644 doc/safety/safety_overview.rst

diff --git a/doc/index-tex.rst b/doc/index-tex.rst
index 6f55f862750..09ef9e9459e 100644
--- a/doc/index-tex.rst
+++ b/doc/index-tex.rst
@@ -23,3 +23,4 @@ Zephyr Project Documentation
    contribute/index.rst
    project/index.rst
    security/index.rst
+   safety/index.rst
diff --git a/doc/index.rst b/doc/index.rst
index 85962ad79b3..43f98f75976 100644
--- a/doc/index.rst
+++ b/doc/index.rst
@@ -111,6 +111,7 @@ Sections
    contribute/index.rst
    project/index.rst
    security/index.rst
+   safety/index.rst
    samples/index.rst
    boards/index.rst
    releases/index.rst
diff --git a/doc/safety/images/zephyr-safety-process.svg b/doc/safety/images/zephyr-safety-process.svg
new file mode 100644
index 00000000000..d607ff01cce
--- /dev/null
+++ b/doc/safety/images/zephyr-safety-process.svg
@@ -0,0 +1 @@
+
[SVG markup not preserved. Text labels of the diagram: Certification side; Releases; Main; LTS Releases; LTS; Safety scope; Backport / Bugfix; Issue / PR within safety scope or safety relevance; Auditable; Decision how to handle; Safety committee / Architect; Technical Steering Committee (TSC); Safety Documentation; Code update; Documentation update; Certified Releases; step markers 1 to 5]
\ No newline at end of file
diff --git a/doc/safety/index.rst b/doc/safety/index.rst
new file mode 100644
index 00000000000..1eb55d12a32
--- /dev/null
+++ b/doc/safety/index.rst
@@ -0,0 +1,13 @@
+.. _safety_section:
+
+Safety
+########
+
+These documents describe the processes, developer guidelines and requirements
+for ensuring safety is addressed within the Zephyr project.
+
+.. toctree::
+   :maxdepth: 1
+   :glob:
+
+   safety_overview.rst
diff --git a/doc/safety/safety_overview.rst b/doc/safety/safety_overview.rst
new file mode 100644
index 00000000000..9a17434e6d0
--- /dev/null
+++ b/doc/safety/safety_overview.rst
@@ -0,0 +1,264 @@
+.. _safety_overview:
+
+Zephyr Safety Overview
+########################
+
+Introduction
+************
+
+This document is the safety documentation providing an overview of the safety-relevant activities
+and what the Zephyr Project and the Zephyr Safety Working Group / Committee try to achieve.
+
+This overview is provided for people who are interested in the functional safety development part
+of the Zephyr RTOS and project members who want to contribute to the safety aspects of the
+project.
+
+Overview
+********
+
+In this section we give the reader an overview of what the general goal of the safety certification
+is, what standard we aim to achieve and what quality standards and processes need to be implemented
+to reach such a safety certification.
+
+Safety Document update
+**********************
+
+This document is a living document and may evolve over time as new requirements, guidelines, or
+processes are introduced.
+
+#. Changes will be submitted from the interested party(ies) via pull requests to the Zephyr
+   documentation repository.
+
+#. The Zephyr Safety Committee will review these changes and provide feedback or acceptance of
+   the changes.
+
+#. Once accepted, these changes will become part of the document.
+
+General safety scope
+********************
+
+The general scope of the Safety Committee is to achieve a certification for the `IEC 61508
+`__ standard and the Safety Integrity Level (SIL) 3 /
+Systematic Capability (SC) 3 for a limited source scope (see certification scope TBD). Since the
+code base is pre-existing, we use the route 3s/1s approach defined by the IEC 61508 standard.
+
+Route 3s
+   *Assessment of non-compliant development, which is basically route 1s applied to existing
+   sources.*
+
+Route 1s
+   *Compliant development. Compliance with the requirements of this standard for the avoidance and
+   control of systematic faults in software.*
+
+Summary of the IEC 61508 standard
+=================================
+
+The IEC 61508 standard is a widely recognized international standard for functional safety of
+electrical, electronic, and programmable electronic safety-related systems. Here's an overview of
+some of the key safety aspects of the standard:
+
+#. **Hazard and Risk Analysis**: The IEC 61508 standard requires a thorough analysis of potential
+   hazards and risks associated with a system in order to determine the appropriate level of safety
+   measures needed to reduce those risks to acceptable levels.
+
+#. **Safety Integrity Level (SIL)**: The standard introduces the concept of Safety Integrity Level
+   (SIL) to classify the level of risk reduction required for each safety function. The higher the
+   SIL, the greater the level of risk reduction required.
+
+#. **System Design**: The IEC 61508 standard requires a systematic approach to system design that
+   includes the identification of safety requirements, the development of a safety plan, and the
+   use of appropriate safety techniques and measures to ensure that the system meets the required
+   SIL.
+
+#. **Verification and Validation**: The standard requires rigorous testing and evaluation of the
+   safety-related system to ensure that it meets the specified SIL and other safety requirements.
+   This includes verification of the system design, validation of the system's functionality, and
+   ongoing monitoring and maintenance of the system.
+
+#. **Documentation and Traceability**: The IEC 61508 standard requires a comprehensive
+   documentation process to ensure that all aspects of the safety-related system are fully
+   documented and that there is full traceability from the safety requirements to the final system
+   design and implementation.
+
+Overall, the IEC 61508 standard provides a framework for the design, development, and
+implementation of safety-related systems that aims to reduce the risk of accidents and improve
+overall safety. By following the standard, organizations can show that their safety-related
+systems are designed and implemented to the required level of safety integrity.
+
+Quality
+*******
+
+Quality is a mandatory expectation for software across the industry. The code base of the project
+must achieve various software quality goals in order to be considered an auditable code base from a
+safety perspective and to be usable for certification purposes. But software quality is not an
+additional requirement caused by functional safety standards. Functional safety considers quality
+as an existing pre-condition and therefore the "quality managed" status should be pursued for any
+project regardless of the functional safety goals. The following list describes the quality goals
+which need to be reached to achieve an auditable code base:
+
+1. Basic software quality standards
+
+   a. :ref:`coding_guidelines` (including: static code analysis, coding style, etc.)
+   b. Requirements and requirements tracing
+   c. Test coverage
+
+2. Software architecture design principles
+
+   a. Layered architecture model
+   b. Encapsulated components
+   c. Encapsulated single functionality (if not reasonable and manageable in safety)
+
+Basic software quality standards - Safety view
+==============================================
+
+In this chapter the Safety Committee describes why it needs the above listed quality goals as a
+pre-condition and what needs to be done to achieve an auditable code base from the safety
+perspective. Generally speaking, it can be said that all of these quality measures regarding safety
+are used to minimize the error rate during code development.
+
+Coding Guidelines
+-----------------
+
+The coding guidelines are the basis for a common understanding and a unified ruleset and development
+style for industrial software products. For safety the coding guidelines are essential and serve
+a further purpose beyond providing a unified ruleset. It is also necessary to prove that the
+developers follow a unified development style to prevent **systematic errors** in the process of
+developing software and thus to minimize the overall **error rate** of the complete software
+system.
+
+The **IEC 61508 standard** also sets a pre-condition and recommendation towards the use of coding
+standards / guidelines to reduce the likelihood of errors.
+
+Requirements and requirements tracing
+-------------------------------------
+
+Requirements and requirements management are not only important for software development, but also
+very important in terms of safety. On the one hand, they specify and describe in detail and on a
+technical level what the software should do, and on the other hand, they are an important and
+necessary tool to verify whether the described functionality is implemented as expected. For this
+purpose, tracing the requirements down to the code level is used. With the requirements management
+and tracing in hand, it can be verified whether the functionality has been tested and
+implemented correctly, thus minimizing the systematic error rate.
+
+The IEC 61508 standard also highly recommends requirements and requirements tracing, which in
+practice makes them a must-have for the certification.
+
+Test coverage
+-------------
+
+High test coverage, in turn, is evidence for safety that the code does precisely what it
+was developed for and does not execute any unforeseen instructions. If the entire code is tested
+and has a high (ideally 100%) test coverage, it has the additional advantage of quickly detecting
+faulty changes and further minimizing the error rate. However, it must be noted that different
+requirements apply to safety for test coverage, and various metrics must be considered, which are
+prescribed by the IEC 61508 standard for the SIL 3 / SC 3 target. The following must be fulfilled,
+among other things:
+
+* Structural test coverage (entry points) 100%
+* Structural test coverage (statements) 100%
+* Structural test coverage (branches) 100%
+
+If 100% cannot be reached (e.g. statement coverage of defensive code), that part needs to be
+described and justified in the documentation.
+
+Software architecture design principles
+=======================================
+
+To create and maintain a structured software product, it is also necessary to review the individual
+software architecture designs and implement them in accordance with safety standards, because some
+designs and implementations are not reasonable for safety. Only then can the overall software and
+code base be used as auditable code. However, most of these software architecture designs have
+already been implemented in the Zephyr project and need to be verified by the Safety Committee /
+Safety Working Group and the safety architect.
+
+Layered architecture model
+--------------------------
+
+The **IEC 61508 standard** strongly recommends a modular approach to software architecture. This
+approach has been pursued in the Zephyr project from the beginning with its layered architecture.
+The idea behind this architecture is to organize modules or components with similar functionality
+into layers. As a result, each layer can be assigned a specific role in the system. This model has
+the advantage in safety that interfaces between different components and layers can be shown at a
+very high level, and thus it can be determined which functionalities are safety-relevant and can be
+limited. Furthermore, various analyses and documents can be built on top of this architecture,
+which are important for certification and the responsible certification body.
+
+Encapsulated components
+-----------------------
+
+Encapsulated components are an essential part of the architecture design for safety at this point.
+The most important aspect is the separation of safety-relevant components from non-safety-relevant
+components, including their associated interfaces. This ensures that the components have no
+**repercussions** on other components.
+
+Encapsulated single functionality (if not reasonable and manageable in safety)
+------------------------------------------------------------------------------
+
+Another requirement for the overall system and software environment is that individual
+functionalities can be disabled within components. If a function is absolutely
+unacceptable for safety (e.g. complete dynamic memory management), it must be possible to turn
+that individual functionality off. The Zephyr Project already offers such a
+possibility through the use of Kconfig and its flexible configurability.
+
+Processes and workflow
+**********************
+
+.. figure:: images/zephyr-safety-process.svg
+   :align: center
+   :alt: Safety process and workflow overview
+   :figclass: align-center
+
+   Safety process and workflow overview
+
+The diagram describes the rough process defined by the Safety Committee to ensure safety in the
+development of the Zephyr project. To ensure understanding, a few points need to be highlighted and
+some details explained regarding the role of the safety architect and the role of the Safety
+Committee in the whole process. The diagram only describes the paths that are possible when a
+change is related to safety.
+
+#. On the main branch, the safety scope of the project should be identified, which typically
+   represents a small subset of the entire code base. This subset should then be made auditable
+   during normal development on “main”, which means that special attention is paid to quality goals
+   (`Quality`_) and safety processes within this scope. The safety architect works alongside the
+   Technical Steering Committee (TSC) in this area, monitoring the development process to ensure
+   that the architecture meets the safety requirements.
+
+#. At this point, the safety architect plays an increasingly important role. For PRs/issues that
+   fall within the safety scope, the safety architect should ideally be involved in the discussions
+   and decisions on minor changes in the safety scope to be able to react to safety-relevant
+   changes that are not conformant. If a pull request or issue introduces a significant and
+   influential change or improvement that requires extended discussion or decision-making, the
+   safety architect should bring it to the attention of the Safety Committee or the Technical
+   Steering Committee (TSC) as appropriate, so that they can make a decision on the best course of
+   action.
+
+#. This section describes the certification side. At this point, the code base has to be in an
+   "auditable" state, and ideally no further changes should be necessary or made to the code base.
+   There is still a path from the main branch to this area. This is needed in case a serious bug or
+   important change is found or implemented on the main branch in the safety scope, after the LTS
+   and the auditable branch were created. In this case, the Safety Committee, together with the
+   safety architect, must decide whether this bug fix or change should be integrated into the LTS
+   so that the bug fix or change could also be integrated into the auditable branch. This
+   integration can take three forms: a code change only, an update to the safety documentation
+   only, or both.
+
+#. This describes the safety process required for the certification itself. Here, the final
+   analyses, tests, and documents that must be produced during the certification are created
+   and conducted, as prescribed by the certifying authority and the standard being certified
+   against. If the certification body approves everything at this stage and the safety
+   process is completed, a safety release can be created and published.
+
+#. This transition from the auditable branch to the main branch should only occur in exceptional
+   circumstances, specifically when something has been identified during the certification process
+   that needs to be quickly adapted on the “auditable” branch in order to obtain certification. In
+   order to prevent this issue from arising again during the next certification, there needs to be
+   a path to merge these changes back into the main branch so that they are not lost, and to have
+   them ready for the next certification if necessary.
+
+.. important::
+   Safety should neither block the project nor limit its room to grow in any way.
+
+.. important::
+   **TODO:** Find and define ways, guidelines and processes which minimally impact the daily work
+   of the maintainers, reviewers and contributors, as well as the safety architect,
+   but which are also suitable for safety.