doc: move usermode under references

Move usermode documentation to be under api reference and not under
kernel.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Anas Nashif 2019-03-25 15:50:40 -04:00
commit 21de8733c6
9 changed files with 1 additions and 1 deletions


@ -0,0 +1,190 @@
.. _usermode:
User Mode
#########
This section describes access policies for kernel objects, how system calls
are defined, and how memory may be managed to support user mode threads.
For details on creating threads that run in user mode, please see
:ref:`lifecycle_v2`.
Threat Model
************
User mode threads are considered to be untrusted by Zephyr and are therefore
isolated from other user mode threads and from the kernel. A flawed or
malicious user mode thread cannot leak or modify the private data/resources
of another thread or the kernel, and cannot interfere with or
control another user mode thread or the kernel.
Example use-cases of Zephyr's user mode features:
- The kernel can protect against many unintentional programming errors which
could otherwise silently or spectacularly corrupt the system.
- The kernel can sandbox complex data parsers such as interpreters, network
protocols, and filesystems such that malicious third-party code or data
cannot compromise the kernel or other threads.
- The kernel can support the notion of multiple logical "applications", each
with their own group of threads and private data structures, which are
isolated from each other if one crashes or is otherwise compromised.
Design Goals
============
For threads running in a non-privileged CPU state (hereafter referred to as
'user mode') we aim to protect against the following:
- We prevent access to memory not specifically granted, or incorrect access to
memory that has an incompatible policy, such as attempting to write to a
read-only area.
- Threads are automatically granted access to their own stack memory
region, and all other stacks are inaccessible.
- By default, program text and read-only data are accessible to all threads
on a read-only basis, kernel-wide. This policy may be adjusted.
- If the optional "application memory" feature is enabled, then all
non-kernel globals defined in the application and libraries will be
accessible.
- We prevent use of device drivers or kernel objects not specifically granted,
with the permission granularity on a per object or per driver instance
basis.
- We detect and reject kernel or driver API calls with incorrect parameters
that would otherwise cause a crash or corruption of data structures private
to the kernel. This includes:
- Using the wrong kernel object type.
- Using parameters outside of proper bounds or with nonsensical values.
- Passing memory buffers that the calling thread does not have sufficient
access to read or write, depending on the semantics of the API.
- Use of kernel objects that are not in a proper initialization state.
- We ensure the detection and safe handling of user mode stack overflows.
- We prevent invoking system calls to functions excluded by the kernel
configuration.
- We prevent disabling of or tampering with kernel-defined and hardware-
enforced memory protections.
- We prevent re-entry from user to supervisor mode except through the kernel-
defined system calls and interrupt handlers.
- We prevent the introduction of new executable code by user mode threads,
except to the extent to which this is supported by kernel system calls.
We are specifically not protecting against the following attacks:
- The kernel itself, and any threads that are executing in supervisor mode,
are assumed to be trusted.
- The toolchain and any supplemental programs used by the build system are
assumed to be trusted.
- The kernel build is assumed to be trusted. There is considerable build-time
logic for creating the tables of valid kernel objects, defining system calls,
and configuring interrupts. The .elf binary files that are worked with
during this process are all assumed to be trusted code.
- We can't protect against mistakes made in memory domain configuration done in
kernel mode that expose private kernel data structures to a user thread. RAM
for kernel objects should always be configured as supervisor-only.
- It is possible to make top-level declarations of user mode threads and
assign them permissions to kernel objects. In general, all C and header
files that are part of the kernel build producing zephyr.elf are assumed to
be trusted.
- We do not protect against denial of service attacks through thread CPU
starvation. Zephyr has no thread priority aging and a user thread of a
particular priority can starve all threads of lower priority, and also other
threads of the same priority if time-slicing is not enabled.
- There are build-time defined limits on how many threads can be active
simultaneously, after which creation of new user threads will fail.
- Stack overflows for threads running in supervisor mode may be caught,
but the integrity of the system cannot be guaranteed.
High-level Policy Details
*************************
Broadly speaking, we accomplish these thread-level memory protection goals
through the following mechanisms:
- Any user thread will only have access to its own stack memory by default.
Access to any other RAM will need to be done on the thread's behalf through
system calls, or specifically granted by a supervisor thread using the
:ref:`memory_domain` APIs. Newly created threads inherit the memory domain
configuration of the parent. Threads may communicate with each other
by having shared membership of the same memory domains, or via kernel objects
such as semaphores and pipes.
- User threads cannot directly access memory belonging to kernel objects.
Although pointers to kernel objects are used to reference them, actual
manipulation of kernel objects is done through system call interfaces. Device
drivers and thread stacks are also considered kernel objects. This ensures
that any data inside a kernel object that is private to the kernel cannot be
tampered with.
- User threads by default have no permission to access any kernel object or
driver other than their own thread object. Such access must be granted by
another thread that is either in supervisor mode or has permission on both
the receiving thread object and the kernel object being granted access to.
The creation of new threads has an option to automatically inherit
permissions of all kernel objects granted to the parent, except the parent
thread itself.
- For performance and footprint reasons Zephyr normally does little or no
parameter error checking for kernel object or device driver APIs. Access from
user mode through system calls involves an extra layer of handler functions,
which are expected to rigorously validate access permissions and type of
the object, check the validity of other parameters through bounds checking or
other means, and verify proper read/write access to any memory buffers
involved.
- Thread stacks are defined in such a way that exceeding the specified stack
space will generate a hardware fault. The way this is done specifically
varies per architecture.
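The checks a handler function is expected to perform before touching a kernel object can be sketched as a small standalone model. This is only an illustration: ``struct obj_meta``, ``validate_object()`` and the enumerations are hypothetical names, not part of the kernel API, and a ``NULL`` metadata pointer stands in for "address not found among the known kernel objects".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object types; the kernel has its own enumeration. */
enum obj_type { OBJ_SEM, OBJ_PIPE };

/* Hypothetical metadata record kept in kernel-only memory. */
struct obj_meta {
	enum obj_type type;
	bool initialized;
};

enum check_result {
	CHECK_OK,
	BAD_OBJECT,       /* address is not a known kernel object */
	BAD_TYPE,         /* known object, but of the wrong type */
	NO_PERMISSION,    /* caller was never granted access */
	NOT_INITIALIZED   /* object exists but was not initialized */
};

/* Run the checks in the order described above; stop at the first failure. */
enum check_result validate_object(const struct obj_meta *meta,
				  enum obj_type expected,
				  bool caller_has_permission)
{
	if (meta == NULL) {
		return BAD_OBJECT;
	}
	if (meta->type != expected) {
		return BAD_TYPE;
	}
	if (!caller_has_permission) {
		return NO_PERMISSION;
	}
	if (!meta->initialized) {
		return NOT_INITIALIZED;
	}
	return CHECK_OK;
}
```

Only when every check passes would a handler go on to call the implementation function. The model reports the first failing check, but it does not attempt to show how a real handler reports a failure.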
Constraints
***********
All kernel objects, thread stacks, and device driver instances must be defined
at build time if they are to be used from user mode. Dynamic use-cases for
kernel objects will need to go through pre-defined pools of available objects.
There are some constraints if additional application binary data is loaded
for execution after the kernel starts:
- Loaded object code will not be able to define any kernel objects that will be
recognized by the kernel. This code will instead need to use APIs for
requesting kernel objects from pools.
- Similarly, since the loaded object code will not be part of the kernel build
process, this code will not be able to install interrupt handlers,
instantiate device drivers, or define system calls, regardless of what
mode it runs in.
- Loaded object code that does not come from a verified source should always
be entered with the CPU already in user mode.
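The idea of requesting kernel objects from a pre-defined pool, rather than defining them in loaded code, can be illustrated with a minimal standalone model. The names ``demo_sem``, ``demo_sem_request()`` and ``demo_sem_return()`` and the pool size are assumptions made for this sketch; they are not kernel APIs.

```c
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 2 /* hypothetical capacity, fixed at build time */

/* Hypothetical stand-in for a kernel object type. */
struct demo_sem {
	unsigned int count;
	unsigned int limit;
};

/* All objects exist at build time; only their ownership changes at runtime. */
static struct demo_sem pool[POOL_SIZE];
static bool in_use[POOL_SIZE];

/* Hand out a free object, or NULL when the pool is exhausted. */
struct demo_sem *demo_sem_request(void)
{
	for (size_t i = 0; i < POOL_SIZE; i++) {
		if (!in_use[i]) {
			in_use[i] = true;
			return &pool[i];
		}
	}
	return NULL;
}

/* Give an object back; pointers that are not pool members are ignored. */
void demo_sem_return(struct demo_sem *sem)
{
	for (size_t i = 0; i < POOL_SIZE; i++) {
		if (sem == &pool[i]) {
			in_use[i] = false;
			return;
		}
	}
}
```

Because every object in the pool is declared at build time, the set of valid object addresses is known in advance, which is the property the constraints above depend on.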
.. toctree::
:maxdepth: 2
kernelobjects.rst
syscalls.rst
memory_domain.rst
mpu_stack_objects.rst
mpu_userspace.rst
usermode_sharedmem.rst


@ -0,0 +1,274 @@
.. _kernelobjects:
Kernel Objects
##############
A kernel object can be one of three classes of data:
* A core kernel object, such as a semaphore, thread, pipe, etc.
* A thread stack, which is an array of :c:type:`struct _k_thread_stack_element`
and declared with :c:macro:`K_THREAD_STACK_DEFINE()`
* A device driver instance (struct device) that belongs to one of a defined
set of subsystems
The set of known kernel objects and driver subsystems is defined in
include/kernel.h as :cpp:enum:`k_objects`.
Kernel objects are completely opaque to user threads. User threads work
with addresses to kernel objects when making API calls, but may never
dereference these addresses; doing so will cause a memory protection fault.
All kernel objects must be placed in memory that is not accessible by
user threads.
Since user threads may not directly manipulate kernel objects, all use of
them must go through system calls. In order to perform a system call on
a kernel object, checks are performed by system call handler functions
that the kernel object address is valid and that the calling thread
has sufficient permissions to work with it.
Permission on an object also has the semantics of a reference to an object.
This is significant for certain object APIs which do temporary allocations,
or objects which themselves have been allocated from a runtime memory pool.
If an object loses all references, two events may happen:
* If the object has an associated cleanup function, the cleanup function
may be called to release any runtime-allocated buffers the object was using.
* If the object itself was dynamically allocated, the memory for the object
will be freed.
Object Placement
****************
Kernel objects that are only used by supervisor threads have no restrictions
and can be located anywhere in the binary, or even declared on stacks. However,
to prevent accidental or intentional corruption by user threads, they must
not be located in any memory that user threads have direct access to.
In order for a static kernel object to be usable by a user thread via system
call APIs, several conditions must be met on how the kernel object is declared:
* The object must be declared as a top-level global at build time, such that it
appears in the ELF symbol table. It is permitted to declare kernel objects
with static scope. The post-build script ``gen_kobject_list.py`` scans the
generated ELF file to find kernel objects and places their memory addresses
in a special table of kernel object metadata. Kernel objects may be members
of arrays or embedded within other data structures.
* Kernel objects must be located in memory reserved for the kernel. They
must not be located in any memory partitions that are user-accessible.
* Any memory reserved for a kernel object must be used exclusively for that
object. Kernel objects may not be members of a union data type.
Kernel objects that are found but do not meet the above conditions will not be
included in the generated table that is used to validate kernel object pointers
passed in from user mode.
The debug output of the ``gen_kobject_list.py`` script may be useful when
debugging why some object was unexpectedly not being tracked. This
information will be printed if the script is run with the ``--verbose`` flag,
or if the build system is invoked with verbose output.
Dynamic Objects
***************
Kernel objects may also be allocated at runtime if
:option:`CONFIG_DYNAMIC_OBJECTS` is enabled. In this case, the
:cpp:func:`k_object_alloc()` API may be used to instantiate an object from
the calling thread's resource pool. Such allocations may be freed in two
ways:
* Supervisor threads may call :cpp:func:`k_object_free()` to force a dynamic
object to be released.
* If an object's references drop to zero (which happens when no threads have
permissions on it) the object will be automatically freed. User threads
may drop their own permission on an object with
:cpp:func:`k_object_release()`, and their permissions are automatically
cleared when a thread terminates. Supervisor threads may additionally
revoke references for another thread using
:cpp:func:`k_object_access_revoke()`.
Because permissions are also used for reference counting, it is important for
supervisor threads to acquire permissions on objects they are using even though
the access control aspects of the permission system are not enforced.
Implementation Details
======================
The ``gen_kobject_list.py`` script is a post-build step which finds all the
valid kernel object instances in the binary. It accomplishes this by parsing
the DWARF debug information present in the generated ELF file for the kernel.
Any instances of structs or arrays corresponding to kernel objects that meet
the object placement criteria will have their memory addresses placed in a
special perfect hash table of kernel objects generated by the 'gperf' tool.
When a system call is made and the kernel is presented with a memory address
of what may or may not be a valid kernel object, the address can be validated
with a constant-time lookup in this table.
Drivers are a special case. All drivers are instances of :c:type:`struct
device`, but it is important to know what subsystem a driver belongs to so that
incorrect operations, such as calling a UART API on a sensor driver object, can
be prevented. When a device struct is found, its API pointer is examined to
determine what subsystem the driver belongs to.
The table itself maps kernel object memory addresses to instances of
:c:type:`struct _k_object`, which has all the metadata for that object. This
includes:
* A bitfield indicating permissions on that object. All threads have a
numerical ID assigned to them at build time, used to index the permission
bitfield for an object to see if that thread has permission on it. The size
of this bitfield is controlled by the :option:`CONFIG_MAX_THREAD_BYTES`
option and the build system will generate an error if this value is too low.
* A type field indicating what kind of object this is, which is some
instance of :cpp:enum:`k_objects`.
* A set of flags for that object. This is currently used to track
initialization state and whether an object is public or not.
* An extra data field. This is currently used for thread stack objects
to denote how large the stack is, and for thread objects to indicate
the thread's index in kernel object permission bitfields.
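A per-object permission bitfield indexed by thread ID can be modeled in a few lines of standalone C. This is a sketch under stated assumptions, not the kernel's code: ``MAX_THREAD_BYTES`` stands in for :option:`CONFIG_MAX_THREAD_BYTES`, and ``struct kobj_model`` and the ``perm_*()`` helpers are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_THREAD_BYTES 2 /* stand-in for CONFIG_MAX_THREAD_BYTES */
#define MAX_THREADS (MAX_THREAD_BYTES * 8)

/* Hypothetical metadata: one permission bit per thread index. */
struct kobj_model {
	uint8_t perms[MAX_THREAD_BYTES];
};

/* Grant permission; fails if the index does not fit in the bitfield. */
bool perm_set(struct kobj_model *ko, unsigned int idx)
{
	if (idx >= MAX_THREADS) {
		return false;
	}
	ko->perms[idx / 8] |= (uint8_t)(1u << (idx % 8));
	return true;
}

/* Drop permission for one thread index. */
void perm_clear(struct kobj_model *ko, unsigned int idx)
{
	if (idx < MAX_THREADS) {
		ko->perms[idx / 8] &= (uint8_t)~(1u << (idx % 8));
	}
}

/* Does the thread with this index have permission on the object? */
bool perm_test(const struct kobj_model *ko, unsigned int idx)
{
	return idx < MAX_THREADS &&
	       (ko->perms[idx / 8] & (1u << (idx % 8))) != 0;
}

/* True while at least one thread still holds a permission (a reference). */
bool perm_any(const struct kobj_model *ko)
{
	for (unsigned int i = 0; i < MAX_THREAD_BYTES; i++) {
		if (ko->perms[i] != 0) {
			return true;
		}
	}
	return false;
}
```

In this model, two bytes give room for 16 thread indexes. ``perm_any()`` returning false corresponds to the "no references left" condition described earlier for dynamically allocated objects.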
Dynamic objects allocated at runtime are tracked in a runtime red/black tree
which is used in parallel to the gperf table when validating object pointers.
Supervisor Thread Access Permission
***********************************
Supervisor threads can access any kernel object. However, permissions for
supervisor threads are still tracked for two reasons:
* If a supervisor thread calls :cpp:func:`k_thread_user_mode_enter()`, the
thread will then run in user mode with any permissions it had been granted
(in many cases, by itself) when it was a supervisor thread.
* If a supervisor thread creates a user thread with the
:c:macro:`K_INHERIT_PERMS` option, the child thread will be granted the
same permissions as the parent thread, except the parent thread object.
User Thread Access Permission
*****************************
By default, when a user thread is created, it will only have access permissions
on its own thread object. Other kernel objects by default are not usable.
Access to them needs to be explicitly or implicitly granted. There are several
ways to do this.
* If a thread is created with the :c:macro:`K_INHERIT_PERMS` option, that thread
will inherit all the permissions of the parent thread, except the parent
thread object.
* A thread that has permission on an object, or is running in supervisor mode,
may grant permission on that object to another thread via the
:c:func:`k_object_access_grant()` API. The convenience function
:c:func:`k_thread_access_grant()` may also be used, which accepts a
NULL-terminated list of kernel objects and calls
:c:func:`k_object_access_grant()` on each of them. The thread being granted
permission, or the object whose access is being granted, do not need to be in
an initialized state. If the caller is from user mode, the caller must have
permissions on both the kernel object and the target thread object.
* Supervisor threads may declare a particular kernel object to be a public
object, usable by all current and future threads with the
:c:func:`k_object_access_all_grant()` API. You must assume that any
untrusted or exploited code will then be able to access the object. Use
this API with caution!
* If a thread was declared statically with :c:macro:`K_THREAD_DEFINE()`,
then the :c:macro:`K_THREAD_ACCESS_GRANT()` macro may be used to grant that
thread access to a set of kernel objects at boot time.
Once a thread has been granted access to an object, such access may be
removed with the :c:func:`k_object_access_revoke()` API. This API is not
available to user threads; however, user threads may use
:c:func:`k_object_release()` to relinquish their own permissions on an
object.
API calls from supervisor mode to set permissions on kernel objects that are
not being tracked by the kernel will be no-ops. Doing the same from user mode
will result in a fatal error for the calling thread.
Objects allocated with :cpp:func:`k_object_alloc()` implicitly grant
permission on the allocated object to the calling thread.
Initialization State
********************
Most operations on kernel objects will fail if the object is considered to be
in an uninitialized state. The appropriate init function for the object must
be performed first.
Some objects will be implicitly initialized at boot:
* Kernel objects that were declared with static initialization macros
(such as :c:macro:`K_SEM_DEFINE` for semaphores) will be in an initialized
state at build time.
* Device driver objects are considered initialized after their init function
is run by the kernel early in the boot process.
If a kernel object is initialized with a private static initializer, the
object must have :c:func:`_k_object_init()` called on it at some point by a supervisor
thread, otherwise the kernel will consider the object uninitialized if accessed
by a user thread. This is very uncommon, typically only for kernel objects that
are embedded within some larger struct and initialized statically.
.. code-block:: c
struct foo {
struct k_sem sem;
...
};
struct foo my_foo = {
.sem = _K_SEM_INITIALIZER(my_foo.sem, 0, 1),
...
};
...
_k_object_init(&my_foo.sem);
...
Creating New Kernel Object Types
********************************
When implementing new kernel features or driver subsystems, it may be necessary
to define some new kernel object types. There are different steps needed
for creating core kernel objects and new driver subsystems.
Creating New Core Kernel Objects
================================
* In ``scripts/gen_kobject_list.py``, add the name of the struct to the
:py:data:`kobjects` list.
Instances of the new struct should now be tracked.
Creating New Driver Subsystem Kernel Objects
============================================
All driver instances are :c:type:`struct device`. They are differentiated by
what API struct they are set to.
* In ``scripts/gen_kobject_list.py``, add the name of the API struct for the
new subsystem to the :py:data:`subsystems` list.
Driver instances of the new subsystem should now be tracked.
Configuration Options
*********************
Related configuration options:
* :option:`CONFIG_USERSPACE`
* :option:`CONFIG_MAX_THREAD_BYTES`
API Reference
*************
.. doxygengroup:: usermode_apis
:project: Zephyr


@ -0,0 +1,177 @@
.. _memory_domain:
Memory Domain
#############
The memory domain APIs are used by unprivileged threads to share data with
the threads in the same memory domain and protect sensitive data from threads
outside their domain. Memory domains are not only used for improving security,
but are also useful for debugging (unexpected access would cause an exception).
Since architectures generally have constraints on how many partitions can be
defined, and the size/alignment of each partition, users may need to group
related data together using linker sections.
.. contents::
:local:
:depth: 2
Concepts
********
A memory domain contains some number of memory partitions.
A memory partition is a memory region (might be RAM, peripheral registers,
or flash, for example) with specific attributes (access permission, e.g.
privileged read/write, unprivileged read-only, or execute never).
Memory partitions are defined by a set of underlying MPU regions
or MMU tables. A thread belongs to a single memory domain at
any point in time but a memory domain may contain multiple threads.
Threads in the same memory domain have the same access permissions
to the memory partitions belonging to the memory domain. New threads
will inherit any memory domain configuration from the parent thread.
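The relationship between domains, partitions, and access checks can be sketched as a standalone model. The types and the ``ATTR_*`` flags below are hypothetical simplifications. Real partition attributes are architecture specific, and enforcement is done by the MPU or MMU rather than by a software loop.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ATTR_U_READ  0x1u /* hypothetical: unprivileged read allowed */
#define ATTR_U_WRITE 0x2u /* hypothetical: unprivileged write allowed */
#define MAX_PARTS 4       /* hypothetical partition limit */

struct part_model {
	uintptr_t start;
	size_t size;
	unsigned int attr;
};

struct domain_model {
	struct part_model parts[MAX_PARTS];
	size_t num_parts;
};

/* A range is allowed if one partition contains all of it and grants every
 * requested attribute. The subtraction form avoids overflow in addr + len.
 */
bool domain_user_access_ok(const struct domain_model *d, uintptr_t addr,
			   size_t len, unsigned int needed)
{
	for (size_t i = 0; i < d->num_parts; i++) {
		const struct part_model *p = &d->parts[i];
		bool inside = addr >= p->start && len <= p->size &&
			      addr - p->start <= p->size - len;

		if (inside && (p->attr & needed) == needed) {
			return true;
		}
	}
	return false;
}
```

Every thread that belongs to the domain would get the same answer from this check, which mirrors the statement that threads in one domain share the same access permissions. The model deliberately rejects ranges that span two partitions.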
Implementation
**************
Create a Memory Domain
======================
A memory domain is defined using a variable of type
:c:type:`struct k_mem_domain`. It must then be initialized by calling
:cpp:func:`k_mem_domain_init()`.
The following code defines and initializes an empty memory domain.
.. code-block:: c
struct k_mem_domain app0_domain;
k_mem_domain_init(&app0_domain, 0, NULL);
Add Memory Partitions into a Memory Domain
==========================================
There are two ways to add memory partitions into a memory domain.
This first code sample shows how to add memory partitions while creating
a memory domain.
.. code-block:: c
/* the start address of the MPU region needs to align with its size */
u8_t __aligned(32) app0_buf[32];
u8_t __aligned(32) app1_buf[32];
K_MEM_PARTITION_DEFINE(app0_part0, app0_buf, sizeof(app0_buf),
K_MEM_PARTITION_P_RW_U_RW);
K_MEM_PARTITION_DEFINE(app0_part1, app1_buf, sizeof(app1_buf),
K_MEM_PARTITION_P_RW_U_RO);
/* the array holds pointers, so take the address of each partition */
struct k_mem_partition *app0_parts[] = {
&app0_part0,
&app0_part1
};
k_mem_domain_init(&app0_domain, ARRAY_SIZE(app0_parts), app0_parts);
This second code sample shows how to add memory partitions into an initialized
memory domain one by one.
.. code-block:: c
/* the start address of the MPU region needs to align with its size */
u8_t __aligned(32) app0_buf[32];
u8_t __aligned(32) app1_buf[32];
K_MEM_PARTITION_DEFINE(app0_part0, app0_buf, sizeof(app0_buf),
K_MEM_PARTITION_P_RW_U_RW);
K_MEM_PARTITION_DEFINE(app0_part1, app1_buf, sizeof(app1_buf),
K_MEM_PARTITION_P_RW_U_RO);
k_mem_domain_add_partition(&app0_domain, &app0_part0);
k_mem_domain_add_partition(&app0_domain, &app0_part1);
.. note::
The maximum number of memory partitions is limited by the maximum
number of MPU regions or the maximum number of MMU tables.
Add Threads into a Memory Domain
================================
Adding threads into a memory domain grants threads permission to access
the memory partitions in the memory domain.
The following code shows how to add threads into a memory domain.
.. code-block:: c
k_mem_domain_add_thread(&app0_domain, app_thread_id);
Remove a Memory Partition from a Memory Domain
==============================================
The following code shows how to remove a memory partition from a memory
domain.
.. code-block:: c
k_mem_domain_remove_partition(&app0_domain, &app0_part1);
The :cpp:func:`k_mem_domain_remove_partition()` API finds the memory partition
that matches the given parameter and removes that partition from the
memory domain.
Remove a Thread from the Memory Domain
======================================
The following code shows how to remove a thread from the memory domain.
.. code-block:: c
k_mem_domain_remove_thread(app_thread_id);
Destroy a Memory Domain
=======================
The following code shows how to destroy a memory domain.
.. code-block:: c
k_mem_domain_destroy(&app0_domain);
Available Partition Attributes
==============================
When defining a partition, we need to set its access permission attributes.
Since the access control of memory partitions relies on either an MPU or MMU,
the available partition attributes are architecture dependent.
The complete list of available partition attributes for a specific architecture
is found in the architecture-specific include file
``include/arch/<arch name>/arch.h`` (for example, ``include/arch/arm/arch.h``).
Some examples of partition attributes are:
.. code-block:: c
/* Denote partition is privileged read/write, unprivileged read/write */
K_MEM_PARTITION_P_RW_U_RW
/* Denote partition is privileged read/write, unprivileged read-only */
K_MEM_PARTITION_P_RW_U_RO
Configuration Options
*********************
Related configuration options:
* :option:`CONFIG_MAX_DOMAIN_PARTITIONS`
API Reference
*************
The following memory domain APIs are provided by :zephyr_file:`include/kernel.h`:
.. doxygengroup:: mem_domain_apis
:project: Zephyr


@ -0,0 +1,65 @@
.. _mpu_stack_objects:
MPU Stack Objects
#################
Thread Stack Creation
*********************
Thread stacks are declared statically with :c:macro:`K_THREAD_STACK_DEFINE()`
or embedded within structures using :c:macro:`K_THREAD_STACK_MEMBER()`.
For architectures which utilize memory protection unit (MPU) hardware,
stacks are physically contiguous allocations. This contiguous allocation
has implications for the placement of stacks in memory, as well as the
implementation of other features such as stack protection and userspace. The
implications for placement are directly attributed to the alignment
requirements for MPU regions. This is discussed in the memory placement
section below.
Stack Guards
************
Stack protection mechanisms require hardware support that can restrict access
to memory. Memory protection units can provide this kind of support.
The MPU provides a fixed number of regions. Each region contains information
about the start, end, size, and access attributes to be enforced on that
particular region.
Stack guards are implemented by using a single MPU region and setting the
attributes for that region to not allow write access. If invalid accesses
occur, a fault ensues. The stack guard is defined at the bottom (the lowest
address) of the stack.
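The effect of a guard region placed at the lowest address of a stack can be shown with a small standalone model. ``struct stack_model`` and ``classify_write()`` are hypothetical names, and the guard size used with them is an assumption. On real hardware the MPU performs this check.

```c
#include <stddef.h>
#include <stdint.h>

enum write_result { WRITE_OK, WRITE_GUARD_FAULT, WRITE_OUTSIDE_STACK };

/* Hypothetical description of one stack buffer. */
struct stack_model {
	uintptr_t base;    /* lowest address of the stack buffer */
	size_t size;       /* total size, guard included */
	size_t guard_size; /* bytes at the bottom covered by the guard */
};

/* Classify a one-byte write by where it lands relative to the guard. */
enum write_result classify_write(const struct stack_model *s, uintptr_t addr)
{
	if (addr < s->base || addr - s->base >= s->size) {
		return WRITE_OUTSIDE_STACK;
	}
	if (addr - s->base < s->guard_size) {
		return WRITE_GUARD_FAULT;
	}
	return WRITE_OK;
}
```

A stack that grows downward reaches the guard before it leaves the buffer, so an overflow produces ``WRITE_GUARD_FAULT`` in this model. A write that jumps past the guard entirely lands outside the buffer, and the guard region alone does not catch it.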
Memory Placement
****************
During stack creation, a set of constraints is enforced on the allocation of
memory. These constraints include determining the alignment of the stack and
the correct sizing of the stack. During linking of the binary, these
constraints are used to place the stacks properly.
The main source of the memory constraints is the MPU design for the SoC. The
MPU design may require specific constraints on the region definition. These
can include alignment of beginning and end addresses, sizes of allocations,
or even interactions between overlapping regions.
Some MPUs require that each region be aligned to a power of two. These SoCs
will have :option:`CONFIG_MPU_REQUIRES_POWER_OF_TWO_ALIGNMENT` defined.
This means that a 1500 byte stack should be aligned to a 2kB boundary and the
stack size should also be adjusted to 2kB to ensure that nothing else is
placed in the remainder of the region. SoCs which include the unmodified
ARMv7-M MPU will have these constraints.
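The arithmetic behind the 1500 byte example can be sketched as follows. The helper names are hypothetical, and the 32 byte minimum region size is an assumption made for the sketch; the actual minimum depends on the MPU.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_REGION_SIZE 32u /* assumed smallest region the MPU supports */

/* Smallest power of two that is >= requested; 0 if it cannot be represented. */
size_t pow2_region_size(size_t requested)
{
	size_t size = MIN_REGION_SIZE;

	while (size < requested) {
		if (size > SIZE_MAX / 2) {
			return 0;
		}
		size <<= 1;
	}
	return size;
}

/* A region must start on a multiple of its own (power of two) size. */
bool region_start_ok(uintptr_t start, size_t pow2_size)
{
	return (start & (pow2_size - 1)) == 0;
}
```

With these helpers, ``pow2_region_size(1500)`` yields 2048, and a start address such as ``0x20000800`` satisfies the alignment requirement for that size, while ``0x20000400`` does not.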
Some ARM MPUs use start and end addresses to define MPU regions and both the
start and end addresses require 32 byte alignment. An example of this kind of
MPU is found in the NXP FRDM K64F.
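For MPUs of this start/end address kind, the adjustment is a round-up to the alignment granule rather than to a power of two. A minimal sketch, with a hypothetical helper name and the 32 byte granule taken from the paragraph above:

```c
#include <stddef.h>

#define MPU_GRANULE 32u /* start and end addresses must be multiples of this */

/* Round a size up to the next multiple of the granule. */
size_t round_up_to_granule(size_t n)
{
	return (n + (MPU_GRANULE - 1)) & ~(size_t)(MPU_GRANULE - 1);
}
```

Here a 1500 byte request grows only to 1504 bytes, compared with 2048 bytes under the power-of-two constraint.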
MPUs may have region priority mechanisms that use the highest priority region
that covers the memory access to determine the enforcement policy. Others may
logically OR regions to determine enforcement policy.
Size and alignment constraints may result in stack allocations being larger
than the requested size. Region priority mechanisms may result in
some added complexity when implementing stack guards.
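The difference between the two enforcement policies can be made concrete with a standalone model. The types, the ``PERM_*`` flags, and the rule that a higher array index means higher priority are all assumptions of this sketch, not a description of any specific MPU.

```c
#include <stddef.h>
#include <stdint.h>

#define PERM_READ  0x1u
#define PERM_WRITE 0x2u

/* Hypothetical MPU region descriptor. */
struct region_model {
	uintptr_t start;
	size_t size;
	unsigned int perms;
};

static int covers(const struct region_model *r, uintptr_t addr)
{
	return addr >= r->start && addr - r->start < r->size;
}

/* Priority policy: the matching region with the highest index wins. */
unsigned int perms_by_priority(const struct region_model *r, size_t n,
			       uintptr_t addr)
{
	unsigned int perms = 0;

	for (size_t i = 0; i < n; i++) {
		if (covers(&r[i], addr)) {
			perms = r[i].perms;
		}
	}
	return perms;
}

/* OR policy: permissions of every matching region are combined. */
unsigned int perms_by_or(const struct region_model *r, size_t n,
			 uintptr_t addr)
{
	unsigned int perms = 0;

	for (size_t i = 0; i < n; i++) {
		if (covers(&r[i], addr)) {
			perms |= r[i].perms;
		}
	}
	return perms;
}
```

Given a read/write RAM region overlaid by a small read-only guard region, the priority policy reports read-only access inside the guard. The OR policy still reports read/write there, so in this model an overlapping guard alone would not block writes. This is one reason the guard implementation has to account for the policy of the MPU in use.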


@ -0,0 +1,26 @@
.. _mpu_userspace:
MPU Backed Userspace
####################
The MPU backed userspace implementation requires the creation of a secondary
set of stacks. These stacks exist in a 1:1 relationship with each thread stack
defined in the system. The privileged stacks are created as a part of the
build process.
A post-build script ``gen_priv_stacks.py`` scans the generated
ELF file and finds all of the thread stack objects. A set of privileged
stacks, a lookup table, and a set of helper functions are created and added
to the image.
During the process of dropping a thread to user mode, the privileged stack
information is filled in and later used by the swap and system call
infrastructure to configure the MPU regions properly for the thread stack and
guard (if applicable).
During system calls, the user mode thread's access to the system call and the
passed-in parameters are all validated. The user mode thread is then elevated
to privileged mode, the stack is switched to use the privileged stack, and the
call is made to the specified kernel API. On return from the kernel API, the
thread is set back to user mode and the stack is restored to the user stack.
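The sequence described above (validate, elevate, switch stacks, call, restore) can be modeled as plain C. Everything here is a hypothetical simplification: the real transition is done by architecture-specific code, and a failed validation is modeled simply as the call being refused.

```c
#include <stdbool.h>
#include <stdint.h>

enum cpu_mode { MODE_USER, MODE_SUPERVISOR };

/* Hypothetical per-thread state relevant to a system call. */
struct thread_model {
	enum cpu_mode mode;
	uintptr_t sp;             /* current stack pointer */
	uintptr_t priv_stack_top; /* top of the thread's privileged stack */
};

typedef int (*kernel_api_t)(int arg);

/* Example kernel API used by the model. */
int demo_double(int x)
{
	return 2 * x;
}

/* Returns false, leaving the thread and *result untouched, if validation
 * of the call or its arguments failed.
 */
bool model_syscall(struct thread_model *t, bool args_valid,
		   kernel_api_t api, int arg, int *result)
{
	if (!args_valid) {
		return false;
	}

	uintptr_t user_sp = t->sp;   /* remember the user stack */

	t->mode = MODE_SUPERVISOR;   /* elevate privilege */
	t->sp = t->priv_stack_top;   /* run on the privileged stack */
	*result = api(arg);          /* call the kernel API */
	t->sp = user_sp;             /* restore the user stack */
	t->mode = MODE_USER;         /* drop back to user mode */
	return true;
}
```

The point of the separate privileged stack is visible in the model: while the kernel API runs, the stack pointer refers to memory the user thread cannot reach.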


@ -0,0 +1,514 @@
.. _syscalls:
System Calls
############
User threads run with a reduced set of privileges compared to supervisor
threads: certain CPU instructions may not be used, and they have access to
only a limited part of the memory map. System calls allow user threads to
perform some operations not directly available to them.
When defining system calls, it is very important to ensure that access to the
API's private data is done exclusively through system call interfaces.
Private kernel data should never be made available to user mode threads
directly. For example, the ``k_queue`` APIs were intentionally not made
available as they store bookkeeping information about the queue directly
in the queue buffers which are visible from user mode.
APIs that allow the user to register callback functions that run in
supervisor mode should never be exposed as system calls. Reserve these
for supervisor-mode access only.
This section describes how to declare new system calls and discusses a few
implementation details relevant to them.
Components
**********
All system calls have the following components:
* A **C prototype** for the API, declared in some header under ``include/`` and
prefixed with :c:macro:`__syscall`. This prototype is never implemented
manually; instead, it gets created by the ``scripts/gen_syscalls.py`` script.
What gets generated is an inline function which either calls the
implementation function directly (if called from supervisor mode) or goes
through privilege elevation and validation steps (if called from user
mode).
* An **implementation function**, which is the real implementation of the
system call. The implementation function may assume that all parameters
passed in have been validated if it was invoked from user mode.
* A **handler function**, which wraps the implementation function and does
validation of all the arguments passed in.
C Prototype
***********
The C prototype represents how the API is invoked from either user or
supervisor mode. For example, to initialize a semaphore:
.. code-block:: c
__syscall void k_sem_init(struct k_sem *sem, unsigned int initial_count,
unsigned int limit);
The :c:macro:`__syscall` attribute is very special. To the C compiler, it
simply expands to 'static inline'. However, to the post-build
``parse_syscalls.py`` script, it indicates that this API is a system call.
The ``parse_syscalls.py`` script does some parsing of the function prototype,
to determine the data types of its return value and arguments, and has some
limitations:
* Array arguments must be passed in as pointers, not arrays. For example,
``int foo[]`` or ``int foo[12]`` is not allowed, but should instead be
expressed as ``int *foo``.
* Function pointers horribly confuse the limited parser. The workaround is
to typedef them first, and then express in the argument list in terms
of that typedef.
* :c:macro:`__syscall` must be the first thing in the prototype.
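As a rough illustration of the first two limitations, the sketch below writes an array parameter as a pointer and hides a function pointer behind a typedef. The names ``foo_sum``, ``foo_for_each``, ``foo_count`` and ``foo_visit_t`` are hypothetical, and :c:macro:`__syscall` is defined by hand only so that the fragment builds on a host. In Zephyr the macro comes from the kernel headers and the function bodies come from the generated headers.

```c
/* Host stand-in: to the compiler, Zephyr's __syscall is 'static inline'. */
#define __syscall static inline

/* Function pointers are wrapped in a typedef before use (hypothetical). */
typedef void (*foo_visit_t)(int value, void *ctx);

/* Array arguments are written as pointers, and __syscall comes first. */
__syscall int foo_sum(const int *values, unsigned int count);
__syscall void foo_for_each(const int *values, unsigned int count,
                            foo_visit_t visit, void *ctx);

/* Hand-written bodies so the sketch links; Zephyr generates these. */
static inline int foo_sum(const int *values, unsigned int count)
{
    int total = 0;

    for (unsigned int i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

static inline void foo_for_each(const int *values, unsigned int count,
                                foo_visit_t visit, void *ctx)
{
    for (unsigned int i = 0; i < count; i++) {
        visit(values[i], ctx);
    }
}

/* Example visitor: counts how many elements were visited. */
static inline void foo_count(int value, void *ctx)
{
    (void)value;
    (*(int *)ctx)++;
}
```

Writing ``const int values[]`` or ``void (*visit)(int, void *)`` directly in the prototypes would be valid C, but per the limitations above the parser would not accept them.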
The preprocessor is intentionally not used when determining the set of
system calls to generate. However, any generated system calls that don't
actually have a handler function defined (because the related feature is not
enabled in the kernel configuration) will instead point to a special handler
for unimplemented system calls. Data type definitions for APIs should not
have conditional visibility to the compiler.
Any header file that declares system calls must include a special generated
header at the very bottom of the header file. This header follows the
naming convention ``syscalls/<name of header file>``. For example, at the
bottom of ``include/sensor.h``:
.. code-block:: c
#include <syscalls/sensor.h>
Invocation Context
==================
Source code that uses system call APIs can be made more efficient if it is
known that all the code inside a particular C file runs exclusively in
user mode, or exclusively in supervisor mode. The system will look for
the definition of macros :c:macro:`__ZEPHYR_SUPERVISOR__` or
:c:macro:`__ZEPHYR_USER__`; typically these are added to the compiler
flags in the build system for the related files.
* If :option:`CONFIG_USERSPACE` is not enabled, all APIs just directly call
the implementation function.
* Otherwise, the default case is to make a runtime check to see if the
processor is currently running in user mode, and either make the system call
or directly call the implementation function as appropriate.
* If :c:macro:`__ZEPHYR_SUPERVISOR__` is defined, then it is assumed that
all the code runs in supervisor mode and all APIs just directly call the
implementation function. If the code was actually running in user mode,
there will be a CPU exception as soon as it tries to do something it isn't
allowed to do.
* If :c:macro:`__ZEPHYR_USER__` is defined, then it is assumed that all the
code runs in user mode and system calls are unconditionally made.
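The selection logic above can be modeled on a host with a small sketch. Everything in it is a stand-in assumed for illustration: ``model_is_user_context()``, ``model_syscall_invoke1()``, the ``MODEL_*`` macros and the ``foo_double`` API are hypothetical, and the trap is simulated by an ordinary function call rather than a real privilege elevation.

```c
#include <stdbool.h>
#include <stdint.h>

static bool model_user_mode;     /* simulated CPU privilege state */
static unsigned int model_traps; /* counts simulated privilege elevations */

/* Implementation function: does the real work. */
static uint32_t model_impl_foo_double(uint32_t x)
{
    return x * 2U;
}

static bool model_is_user_context(void)
{
    return model_user_mode;
}

/* Stand-in for the architecture's one-argument system call entry. */
static uint32_t model_syscall_invoke1(uint32_t arg1)
{
    model_traps++;
    return model_impl_foo_double(arg1);
}

/* API wrapper, shaped like the cases described above. */
static inline uint32_t foo_double(uint32_t x)
{
#if defined(MODEL_SUPERVISOR_ONLY)
    return model_impl_foo_double(x);   /* always a direct call */
#elif defined(MODEL_USER_ONLY)
    return model_syscall_invoke1(x);   /* always a system call */
#else
    if (model_is_user_context()) {     /* default: runtime check */
        return model_syscall_invoke1(x);
    }
    return model_impl_foo_double(x);
#endif
}
```

Built without either ``MODEL_*`` macro, the wrapper takes the runtime-check path, so the same call site works from both contexts at the cost of one branch per call.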
Implementation Details
======================
Declaring an API with :c:macro:`__syscall` causes some code to be generated in
C and header files by ``scripts/gen_syscalls.py``, all of which can be found in
the project out directory under ``include/generated/``:
* The system call is added to the enumerated type of system call IDs,
which is expressed in ``include/generated/syscall_list.h``. It is the name
of the API in uppercase, prefixed with ``K_SYSCALL_``.
* A prototype for the handler function is also created in
``include/generated/syscall_list.h``.
* An entry for the system call is created in the dispatch table
``_k_syscall_table``, expressed in ``include/generated/syscall_dispatch.c``.
* A weak handler function is declared, which is just an alias of the
'unimplemented system call' handler. This is necessary since the real
handler function may or may not be built depending on the kernel
configuration. For example, if a user thread makes a sensor subsystem
API call, but the sensor subsystem is not enabled, the weak handler
will be invoked instead.
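The relationship between the ID enumeration, the dispatch table, and the fallback for unimplemented system calls can be sketched as a host-side model. This is not the generated code: the ``model_*`` names and ``MODEL_ENOSYS`` are hypothetical, the fallback is wired into the table by hand instead of through a weak alias, and it returns an error marker where a real kernel would take its own action.

```c
#include <stdint.h>

/* Hypothetical system call IDs, named like the generated enum. */
enum model_syscall_id {
    K_SYSCALL_FOO_GET,
    K_SYSCALL_FOO_SET,
    MODEL_SYSCALL_COUNT
};

#define MODEL_ENOSYS 0xFFFFFFFFu /* marker used only by this model */

typedef uint32_t (*model_handler_t)(uint32_t arg1);

static uint32_t model_foo_value = 7;

/* Fallback used when the real handler is not built. */
static uint32_t model_handler_unimplemented(uint32_t arg1)
{
    (void)arg1;
    return MODEL_ENOSYS;
}

static uint32_t model_handler_foo_get(uint32_t arg1)
{
    (void)arg1;
    return model_foo_value;
}

/* FOO_SET's feature is "disabled", so its slot holds the fallback. */
static const model_handler_t model_syscall_table[MODEL_SYSCALL_COUNT] = {
    [K_SYSCALL_FOO_GET] = model_handler_foo_get,
    [K_SYSCALL_FOO_SET] = model_handler_unimplemented,
};

/* The ID comes from user mode, so it is bounds-checked before use. */
static uint32_t model_dispatch(uint32_t id, uint32_t arg1)
{
    if (id >= MODEL_SYSCALL_COUNT) {
        return MODEL_ENOSYS;
    }
    return model_syscall_table[id](arg1);
}
```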
The body of the API is created in the generated system header. Using the
example of :c:func:`k_sem_init()`, this API is declared in
``include/kernel.h``. At the bottom of ``include/kernel.h`` is::
#include <syscalls/kernel.h>
Inside this header is the body of :c:func:`k_sem_init()`::
K_SYSCALL_DECLARE3_VOID(K_SYSCALL_K_SEM_INIT, k_sem_init, struct k_sem *,
sem, unsigned int, initial_count,
unsigned int, limit);
This generates an inline function that takes three arguments with void
return value. Depending on context it will either directly call the
implementation function or go through a system call elevation. A
prototype for the implementation function is also automatically generated.
In this example, the implementation of the :c:macro:`K_SYSCALL_DECLARE3_VOID()`
macro will be::
#if !defined(CONFIG_USERSPACE) || defined(__ZEPHYR_SUPERVISOR__)
#define K_SYSCALL_DECLARE3_VOID(id, name, t0, p0, t1, p1, t2, p2) \
extern void _impl_##name(t0 p0, t1 p1, t2 p2); \
static inline void name(t0 p0, t1 p1, t2 p2) \
{ \
_impl_##name(p0, p1, p2); \
}
#elif defined(__ZEPHYR_USER__)
#define K_SYSCALL_DECLARE3_VOID(id, name, t0, p0, t1, p1, t2, p2) \
static inline void name(t0 p0, t1 p1, t2 p2) \
{ \
_arch_syscall_invoke3((u32_t)p0, (u32_t)p1, (u32_t)p2, id); \
}
#else /* mixed kernel/user macros */
#define K_SYSCALL_DECLARE3_VOID(id, name, t0, p0, t1, p1, t2, p2) \
extern void _impl_##name(t0 p0, t1 p1, t2 p2); \
static inline void name(t0 p0, t1 p1, t2 p2) \
{ \
if (_is_user_context()) { \
_arch_syscall_invoke3((u32_t)p0, (u32_t)p1, (u32_t)p2, id); \
} else { \
compiler_barrier(); \
_impl_##name(p0, p1, p2); \
} \
}
#endif
The header containing :c:macro:`K_SYSCALL_DECLARE3_VOID()` is itself
generated due to its repetitive nature and can be found in
``include/generated/syscall_macros.h``. It is created by
``scripts/gen_syscall_header.py``.
The final layer is the invocation of the system call itself. All architectures
implementing system calls must implement the seven inline functions
:c:func:`_arch_syscall_invoke0` through :c:func:`_arch_syscall_invoke6`. These
functions marshal arguments into designated CPU registers and perform the
necessary privilege elevation. In this layer, all arguments are treated as an
unsigned 32-bit type. There is always a 32-bit unsigned return value, which
may or may not be used.
Some system calls may have more than six arguments. The number of arguments
passed via registers is fixed at six for all architectures. Additional
arguments will need to be passed in a struct, which needs to be treated as
untrusted memory in the handler function. This is done by the derived
functions :c:func:`_syscall_invoke7` through :c:func:`_syscall_invoke10`.
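A caller-side sketch of this packing, for a hypothetical seven-argument call, might look like the following. The ``model_*`` names are assumptions for illustration, and ``uintptr_t`` replaces the 32-bit register type so that the model also runs on 64-bit hosts.

```c
#include <stdint.h>

/* Register-sized value; Zephyr uses a 32-bit type at this layer. */
typedef uintptr_t model_reg_t;

/* Arguments beyond the fifth travel in memory, not in registers. */
struct model_7_args {
    model_reg_t arg6;
    model_reg_t arg7;
};

/* "Kernel side": six register values, the sixth being a pointer. */
static model_reg_t model_handler_sum7(model_reg_t a1, model_reg_t a2,
                                      model_reg_t a3, model_reg_t a4,
                                      model_reg_t a5, model_reg_t more)
{
    /* A real handler must first validate this pointer as untrusted. */
    const struct model_7_args *extra = (const struct model_7_args *)more;

    return a1 + a2 + a3 + a4 + a5 + extra->arg6 + extra->arg7;
}

/* "User side": pack arguments 6 and 7, then pass the struct's address. */
static inline model_reg_t model_invoke7(model_reg_t a1, model_reg_t a2,
                                        model_reg_t a3, model_reg_t a4,
                                        model_reg_t a5, model_reg_t a6,
                                        model_reg_t a7)
{
    struct model_7_args extra = { .arg6 = a6, .arg7 = a7 };

    return model_handler_sum7(a1, a2, a3, a4, a5, (model_reg_t)&extra);
}
```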
Some system calls may return a value that will not fit in a 32-bit register,
such as APIs that return a 64-bit value. In this scenario, the return value is
populated in a memory buffer that is passed in as an argument. For example,
see the implementation of :c:func:`_syscall_ret64_invoke0` and
:c:func:`_syscall_ret64_invoke1`.
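The caller side of this pattern can be sketched as a host model. The ``model_*`` names and the returned constant are hypothetical; the point is only that the wide result travels through caller-provided memory while the register return value is ignored.

```c
#include <stdint.h>

/* Register-sized value; uintptr_t keeps the model usable on 64-bit hosts. */
typedef uintptr_t model_reg_t;

/* "Kernel side": writes the 64-bit result through the pointer argument. */
static model_reg_t model_handler_uptime(model_reg_t ret_p)
{
    /* A real handler must first check that this buffer is writable. */
    uint64_t *ret = (uint64_t *)ret_p;

    *ret = 0x100000002ULL; /* deliberately too wide for 32 bits */
    return 0;              /* register return value is unused */
}

/* "User side": provide a buffer, invoke, then read the result back. */
static inline uint64_t model_ret64_invoke0(void)
{
    uint64_t ret = 0;

    (void)model_handler_uptime((model_reg_t)&ret);
    return ret;
}
```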
Implementation Function
***********************
The implementation function is what actually does the work for the API.
Zephyr normally does little to no error checking of arguments, or does this
kind of checking with assertions. When writing the implementation function,
validation of any parameters is optional and should be done with assertions.
All implementation functions must follow the naming convention, which is the
name of the API prefixed with ``_impl_``. Implementation functions may be
declared in the same header as the API as a static inline function or
declared in some C file. No prototype is needed for implementation
functions; these are automatically generated.
Handler Function
****************
The handler function runs on the kernel side when a user thread makes
a system call. When the user thread makes a software interrupt to elevate to
supervisor mode, the common system call entry point uses the system call ID
provided by the user to look up the appropriate handler function for that
system call and jump into it.
Handler functions only run when system call APIs are invoked from user mode.
If an API is invoked from supervisor mode, the implementation is simply called.
The purpose of the handler function is to validate all the arguments passed in.
This includes:
* Any kernel object pointers provided. For example, the semaphore APIs must
ensure that the semaphore object passed in is a valid semaphore and that
the calling thread has permission on it.
* Any memory buffers passed in from user mode. Checks must be made that the
calling thread has read or write permissions on the provided buffer.
* Any other arguments that have a limited range of valid values.
Handler functions involve a great deal of boilerplate code which has been
made simpler by some macros in ``kernel/include/syscall_handler.h``.
Handler functions should be declared using these macros.
Argument Validation
===================
Several macros exist to validate arguments:
* :c:macro:`Z_SYSCALL_OBJ()` Checks a memory address to assert that it is
a valid kernel object of the expected type, that the calling thread
has permissions on it, and that the object is initialized.
* :c:macro:`Z_SYSCALL_OBJ_INIT()` is the same as
:c:macro:`Z_SYSCALL_OBJ()`, except that the provided object may be
uninitialized. This is useful for handlers of object init functions.
* :c:macro:`Z_SYSCALL_OBJ_NEVER_INIT()` is the same as
:c:macro:`Z_SYSCALL_OBJ()`, except that the provided object must be
uninitialized. This is rarely needed; currently it is used only for
:c:func:`k_thread_create()`.
* :c:macro:`Z_SYSCALL_MEMORY_READ()` validates a memory buffer of a particular
size. The calling thread must have read permissions on the entire buffer.
* :c:macro:`Z_SYSCALL_MEMORY_WRITE()` is the same as
:c:macro:`Z_SYSCALL_MEMORY_READ()` but the calling thread must additionally
have write permissions.
* :c:macro:`Z_SYSCALL_MEMORY_ARRAY_READ()` validates an array whose total size
is expressed as separate arguments for the number of elements and the
element size. This macro correctly accounts for multiplication overflow
when computing the total size. The calling thread must have read permissions
on the total size.
* :c:macro:`Z_SYSCALL_MEMORY_ARRAY_WRITE()` is the same as
:c:macro:`Z_SYSCALL_MEMORY_ARRAY_READ()` but the calling thread must
additionally have write permissions.
* :c:macro:`Z_SYSCALL_VERIFY_MSG()` does a runtime check of some boolean
expression, which must evaluate to true; otherwise the check fails.
A variant :c:macro:`Z_SYSCALL_VERIFY` exists which does not take
a message parameter, instead printing the expression tested if it
fails. The latter should only be used for the most obvious of tests.
* :c:macro:`Z_SYSCALL_DRIVER_OP()` checks at runtime if a driver
instance is capable of performing a particular operation. While this
macro can be used by itself, it's mostly a building block for macros
that are automatically generated for every driver subsystem. For
instance, to validate the GPIO driver, one could use the
:c:macro:`Z_SYSCALL_DRIVER_GPIO()` macro.
* :c:macro:`Z_SYSCALL_SPECIFIC_DRIVER()` is a runtime check to verify that
a provided pointer is a valid instance of a specific device driver, that
the calling thread has permissions on it, and that the driver has been
initialized. It does this by checking the init function pointer that
is stored within the driver instance and ensuring that it matches the
provided value, which should be the address of the specific driver's
init function.
If any check fails, the macros will return a nonzero value. The macro
:c:macro:`Z_OOPS()` can be used to induce a kernel oops which will kill the
calling thread. This is done instead of returning some error condition to
keep the APIs the same when calling from supervisor mode.
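The multiplication-overflow concern mentioned for the array macros can be illustrated with a standalone sketch. This is not the implementation of :c:macro:`Z_SYSCALL_MEMORY_ARRAY_READ()`; ``model_array_size()`` is a hypothetical helper that only follows the same convention of returning nonzero when a check fails.

```c
#include <stdint.h>

/*
 * Compute nmemb * size without wrapping around.
 * Returns 0 and stores the product on success, nonzero on failure.
 */
static int model_array_size(uint32_t nmemb, uint32_t size, uint32_t *total)
{
    if (size != 0U && nmemb > UINT32_MAX / size) {
        return 1; /* the product would not fit in 32 bits */
    }
    *total = nmemb * size;
    return 0;
}
```

Without the division test, a caller could pass values such as ``0x10000`` elements of ``0x10000`` bytes, whose 32-bit product wraps to zero and would make a huge array look like an empty buffer to a later permission check.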
Handler Declaration
===================
All handler functions have the same prototype:
.. code-block:: c
u32_t _handler_<API name>(u32_t arg1, u32_t arg2, u32_t arg3,
u32_t arg4, u32_t arg5, u32_t arg6, void *ssf)
All handlers return a value. Handlers are passed exactly six arguments, which
were sent from user mode to the kernel via registers in the
architecture-specific system call implementation, plus an opaque context
pointer which indicates the system state when the system call was invoked from
user code.
To simplify the prototype, the variadic :c:macro:`Z_SYSCALL_HANDLER()` macro
should be used to declare the handler name and names of each argument. Type
information is not necessary since all arguments and the return value are
:c:type:`u32_t`. Using :c:func:`k_sem_init()` as an example:
.. code-block:: c
Z_SYSCALL_HANDLER(k_sem_init, sem, initial_count, limit)
{
...
}
After validating all the arguments, the handler function needs to then call
the implementation function. If the implementation function returns a value,
this needs to be returned by the handler, otherwise the handler should return
0.
.. note:: Do not forget that all the arguments to the handler are passed in as
unsigned 32-bit values. If checks are needed on parameters that are
actually signed values, casts may be needed in order for these checks to
be performed properly.
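A small host-side sketch shows why the cast matters. The functions and the valid range of -8 to 8 are hypothetical; the conversion from ``uint32_t`` to ``int32_t`` is assumed to behave as on common two's complement compilers.

```c
#include <stdint.h>

/* Wrong for negative inputs: -1 arrives here as 0xFFFFFFFF. */
static int model_offset_ok_raw(uint32_t offset)
{
    return offset <= 8U;
}

/* Recover the signed value first, then apply the range check. */
static int model_offset_ok(uint32_t offset)
{
    int32_t value = (int32_t)offset;

    return value >= -8 && value <= 8;
}
```

The raw comparison rejects every negative offset, including valid ones, because it only ever sees the unsigned bit pattern.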
Using :c:func:`k_sem_init()` as an example again, we need to enforce that the
semaphore object passed in is a valid semaphore object (but not necessarily
initialized), and that the limit parameter is nonzero:
.. code-block:: c
Z_SYSCALL_HANDLER(k_sem_init, sem, initial_count, limit)
{
Z_OOPS(Z_SYSCALL_OBJ_INIT(sem, K_OBJ_SEM));
Z_OOPS(Z_SYSCALL_VERIFY(limit != 0));
_impl_k_sem_init((struct k_sem *)sem, initial_count, limit);
return 0;
}
Simple Handler Declarations
---------------------------
Many kernel or driver APIs have very simple handler functions, where they
either accept no arguments, or take one object which is a kernel object
pointer of some specific type. Some special macros have been defined for
these simple cases, with variants depending on whether the API has a return
value:
* :c:macro:`Z_SYSCALL_HANDLER1_SIMPLE()` one kernel object argument, returns
a value
* :c:macro:`Z_SYSCALL_HANDLER1_SIMPLE_VOID()` one kernel object argument,
no return value
* :c:macro:`Z_SYSCALL_HANDLER0_SIMPLE()` no arguments, returns a value
* :c:macro:`Z_SYSCALL_HANDLER0_SIMPLE_VOID()` no arguments, no return value
For example, :c:func:`k_sem_count_get()` takes a semaphore object as its
only argument and returns a value, so its handler can be completely expressed
as:
.. code-block:: c
Z_SYSCALL_HANDLER1_SIMPLE(k_sem_count_get, K_OBJ_SEM, struct k_sem *);
System Calls With More Than 6 Arguments
=======================================
System calls may have more than six arguments; however, the number of arguments
passed in via registers when the privilege elevation is invoked is fixed at six
for all architectures. In this case, the sixth and subsequent arguments to the
system call are placed into a struct, and a pointer to that struct is passed to
the handler as its sixth argument.
See ``include/syscall.h`` to see how this is done; the struct passed in must be
validated like any other memory buffer. For example, for a system call
with nine arguments, arguments 6 through 9 will be passed in via struct, which
must be verified since memory pointers from user mode can be incorrect or
malicious:
.. code-block:: c
Z_SYSCALL_HANDLER(k_foo, arg1, arg2, arg3, arg4, arg5, more_args_ptr)
{
struct _syscall_9_args *margs = (struct _syscall_9_args *)more_args_ptr;
Z_OOPS(Z_SYSCALL_MEMORY_READ(margs, sizeof(*margs)));
...
}
It is also very important to note that arguments passed in this way can change
at any time due to concurrent access to the argument struct. If any parameters
are subject to enforcement checks, they need to be copied out of the struct and
only then checked. One way to ensure this isn't optimized out is to declare the
argument struct as ``volatile``, and copy values out of it into local variables
before checking. Using the previous example:
.. code-block:: c
Z_SYSCALL_HANDLER(k_foo, arg1, arg2, arg3, arg4, arg5, more_args_ptr)
{
volatile struct _syscall_9_args *margs =
(struct _syscall_9_args *)more_args_ptr;
int arg8;
Z_OOPS(Z_SYSCALL_MEMORY_READ(margs, sizeof(*margs)));
arg8 = margs->arg8;
Z_OOPS(Z_SYSCALL_VERIFY_MSG(arg8 < 12, "arg8 must be less than 12"));
_impl_k_foo(arg1, arg2, arg3, arg4, arg5, margs->arg6,
margs->arg7, arg8, margs->arg9);
return 0;
}
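The copy-then-check rule can also be shown in isolation with a host-side sketch. The structure, table and function below are hypothetical and stand in for any handler that uses a value fetched from user-controlled memory.

```c
#include <stdint.h>

/* Argument block that another thread could modify at any time. */
struct model_args {
    uint32_t index;
};

#define MODEL_TABLE_LEN 12

static const int model_table[MODEL_TABLE_LEN] = {
    0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110
};

/* Returns the table entry, or -1 if the index is rejected. */
static int model_lookup(volatile struct model_args *args)
{
    uint32_t index = args->index; /* fetch exactly once */

    if (index >= MODEL_TABLE_LEN) {
        return -1;
    }
    /* Use the checked copy; re-reading args->index would reopen the race. */
    return model_table[index];
}
```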
System Calls With 64-bit Return Value
=====================================
If a system call has a return value larger than 32 bits, the handler will not
return anything. Instead, a pointer to a sufficient memory region for the
return value will be passed in as an additional argument. As an example, we
have the system call for getting the current system uptime:
.. code-block:: c
__syscall s64_t k_uptime_get(void);
The handler function has the return area passed in as a pointer, which must
be validated as writable by the calling thread:
.. code-block:: c
Z_SYSCALL_HANDLER(k_uptime_get, ret_p)
{
s64_t *ret = (s64_t *)ret_p;
Z_OOPS(Z_SYSCALL_MEMORY_WRITE(ret, sizeof(*ret)));
*ret = _impl_k_uptime_get();
return 0;
}
Configuration Options
*********************
Related configuration options:
* :option:`CONFIG_USERSPACE`
APIs
****
Helper macros for creating system call handlers are provided in
:zephyr_file:`kernel/include/syscall_handler.h`:
* :c:macro:`Z_SYSCALL_HANDLER()`
* :c:macro:`Z_SYSCALL_HANDLER1_SIMPLE()`
* :c:macro:`Z_SYSCALL_HANDLER1_SIMPLE_VOID()`
* :c:macro:`Z_SYSCALL_HANDLER0_SIMPLE()`
* :c:macro:`Z_SYSCALL_HANDLER0_SIMPLE_VOID()`
* :c:macro:`Z_SYSCALL_OBJ()`
* :c:macro:`Z_SYSCALL_OBJ_INIT()`
* :c:macro:`Z_SYSCALL_OBJ_NEVER_INIT()`
* :c:macro:`Z_OOPS()`
* :c:macro:`Z_SYSCALL_MEMORY_READ()`
* :c:macro:`Z_SYSCALL_MEMORY_WRITE()`
* :c:macro:`Z_SYSCALL_MEMORY_ARRAY_READ()`
* :c:macro:`Z_SYSCALL_MEMORY_ARRAY_WRITE()`
* :c:macro:`Z_SYSCALL_VERIFY_MSG()`
* :c:macro:`Z_SYSCALL_VERIFY`
Functions for invoking system calls are defined in
:zephyr_file:`include/syscall.h`:
* :c:func:`_arch_syscall_invoke0`
* :c:func:`_arch_syscall_invoke1`
* :c:func:`_arch_syscall_invoke2`
* :c:func:`_arch_syscall_invoke3`
* :c:func:`_arch_syscall_invoke4`
* :c:func:`_arch_syscall_invoke5`
* :c:func:`_arch_syscall_invoke6`
* :c:func:`_syscall_invoke7`
* :c:func:`_syscall_invoke8`
* :c:func:`_syscall_invoke9`
* :c:func:`_syscall_invoke10`
* :c:func:`_syscall_ret64_invoke0`
* :c:func:`_syscall_ret64_invoke1`

@ -0,0 +1,75 @@
.. _usermode_sharedmem:
Application Shared Memory
#########################
.. note::
In this document, we will cover the basic usage of enabling shared
memory using a template around the app_memory subsystem.
Overview
********
Using the app_memory subsystem in userspace allows control of
shared memory between threads. The foundation of the implementation
consists of memory domains and partitions. Memory partitions are created
and used in the definition of variables to group them into a
common space. The memory partitions are linked to domains
that are then assigned to a thread. This process allows selective
access to memory from a thread, and sharing of memory between two
threads by assigning a partition to two different domains. By using
the shared memory template, code to protect memory can be used
on different platforms without the application needing to implement
specific handlers for each platform. Note that the developer should understand
the hardware limitations on the maximum number of memory
partitions available to a thread. Specifically, processors with MPUs
cannot support the same number of partitions as processors with an MMU.
This specific implementation adds a wrapper to simplify the programmer's
task of using the app_memory subsystem through the use of macros and
a Python script to generate the linker script. The linker script provides
the proper alignment for processors requiring power of two boundaries.
Without the wrapper, a developer is required to implement custom
linker scripts for each processor in the project.
The general usage is as follows. Include app_memory/app_memdomain.h
in the userspace source file. Mark the variable to be placed in
a memory partition. The two markers are for data and bss respectively:
K_APP_DMEM(id) and K_APP_BMEM(id). The id is used as the partition name.
The resulting section name can be seen in the linker.map as
"data_smem_id" and "data_smem_idb".
To create a k_mem_partition, call the macro K_APPMEM_PARTITION_DEFINE(part0)
where "part0" is the name then used to refer to that partition. The
standard memory domain APIs may be used to add it to domains; the declared
name is a k_mem_partition symbol.
Example:
.. code-block:: c
/* create partition at top of file outside functions */
K_APPMEM_PARTITION_DEFINE(part0);
/* create domain */
struct k_mem_domain dom0;
/* assign variables to the partition */
K_APP_DMEM(part0) int var1;
K_APP_BMEM(part0) static volatile int var2;
int main()
{
k_mem_domain_init(&dom0, 0, NULL);
k_mem_domain_add_partition(&dom0, &part0);
k_mem_domain_add_thread(&dom0, k_current_get());
...
}
If multiple partitions are being created, a variadic
preprocessor macro can be used as provided in
app_macro_support.h:
.. code-block:: c
FOR_EACH(K_APPMEM_PARTITION_DEFINE, part0, part1, part2);
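For readers unfamiliar with this kind of macro, the sketch below shows one common way a variadic "for each" macro can be built. It is not the implementation found in app_macro_support.h: ``MODEL_FOR_EACH``, ``MODEL_PARTITION_DEFINE`` and ``struct model_partition`` are hypothetical stand-ins, and this version handles only one to three arguments.

```c
/* Hypothetical stand-in for a partition object. */
struct model_partition {
    const char *name;
};

/* Hypothetical stand-in for K_APPMEM_PARTITION_DEFINE(). */
#define MODEL_PARTITION_DEFINE(name) \
    static struct model_partition name = { #name };

/* Select a helper according to how many arguments were supplied. */
#define MODEL_PICK(_1, _2, _3, helper, ...) helper
#define MODEL_FE_1(F, a) F(a)
#define MODEL_FE_2(F, a, ...) F(a) MODEL_FE_1(F, __VA_ARGS__)
#define MODEL_FE_3(F, a, ...) F(a) MODEL_FE_2(F, __VA_ARGS__)
#define MODEL_FOR_EACH(F, ...) \
    MODEL_PICK(__VA_ARGS__, MODEL_FE_3, MODEL_FE_2, MODEL_FE_1, 0) \
    (F, __VA_ARGS__)

/* Expands to three separate definitions: part0, part1 and part2. */
MODEL_FOR_EACH(MODEL_PARTITION_DEFINE, part0, part1, part2)
```

Each helper applies the given macro to its first argument and hands the rest to the next smaller helper, so the single invocation is equivalent to writing the define macro once per partition.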