zephyr/arch/x86/core/ia32/float.c

/*
* Copyright (c) 2010-2014 Wind River Systems, Inc.
*
* SPDX-License-Identifier: Apache-2.0
*/
/**
* @file
* @brief Floating point register sharing routines
*
* This module allows multiple preemptible threads to safely share the system's
* floating point registers, by allowing the system to save FPU state info
* in a thread's stack region when a preemptive context switch occurs.
*
* Note: If the kernel has been built without floating point register sharing
* support (CONFIG_FPU_SHARING), the floating point registers can still be used
* safely by one or more cooperative threads OR by a single preemptive thread,
* but not by both.
*
* This code is not necessary for systems with CONFIG_EAGER_FPU_SHARING, as
* the floating point context is unconditionally saved/restored with every
* context switch.
*
* The floating point register sharing mechanism is designed for minimal
* intrusiveness. Floating point state saving is only performed for threads
* that explicitly indicate they are using FPU registers, to avoid impacting
* the stack size requirements of all other threads. Also, the SSE registers
* are only saved for threads that actually used them. For those threads that
* do require floating point state saving, a "lazy save/restore" mechanism
* is employed so that the FPU's register sets are only switched in and out
* when absolutely necessary; this avoids wasting effort preserving them when
* there is no risk that they will be altered, or when there is no need to
* preserve their contents.
*
* WARNING
* The use of floating point instructions by ISRs is not supported by the
* kernel.
*
* INTERNAL
* The kernel sets CR0[TS] to 0 only for threads that require FP register
* sharing. All other threads have CR0[TS] set to 1 so that an attempt
* to perform an FP operation will cause an exception, allowing the kernel
* to enable FP register sharing on its behalf.
*/
#include <zephyr/kernel.h>
headers: Refactor kernel and arch headers. This commit refactors kernel and arch headers to establish a boundary between private and public interface headers. The refactoring strategy used in this commit is detailed in the issue This commit introduces the following major changes: 1. Establish a clear boundary between private and public headers by removing "kernel/include" and "arch/*/include" from the global include paths. Ideally, only kernel/ and arch/*/ source files should reference the headers in these directories. If these headers must be used by a component, these include paths shall be manually added to the CMakeLists.txt file of the component. This is intended to discourage applications from including private kernel and arch headers either knowingly and unknowingly. - kernel/include/ (PRIVATE) This directory contains the private headers that provide private kernel definitions which should not be visible outside the kernel and arch source code. All public kernel definitions must be added to an appropriate header located under include/. - arch/*/include/ (PRIVATE) This directory contains the private headers that provide private architecture-specific definitions which should not be visible outside the arch and kernel source code. All public architecture- specific definitions must be added to an appropriate header located under include/arch/*/. - include/ AND include/sys/ (PUBLIC) This directory contains the public headers that provide public kernel definitions which can be referenced by both kernel and application code. - include/arch/*/ (PUBLIC) This directory contains the public headers that provide public architecture-specific definitions which can be referenced by both kernel and application code. 2. Split arch_interface.h into "kernel-to-arch interface" and "public arch interface" divisions. - kernel/include/kernel_arch_interface.h * provides private "kernel-to-arch interface" definition. * includes arch/*/include/kernel_arch_func.h to ensure that the interface function implementations are always available. * includes sys/arch_interface.h so that public arch interface definitions are automatically included when including this file. - arch/*/include/kernel_arch_func.h * provides architecture-specific "kernel-to-arch interface" implementation. * only the functions that will be used in kernel and arch source files are defined here. - include/sys/arch_interface.h * provides "public arch interface" definition. * includes include/arch/arch_inlines.h to ensure that the architecture-specific public inline interface function implementations are always available. - include/arch/arch_inlines.h * includes architecture-specific arch_inlines.h in include/arch/*/arch_inline.h. - include/arch/*/arch_inline.h * provides architecture-specific "public arch interface" inline function implementation. * supersedes include/sys/arch_inline.h. 3. Refactor kernel and the existing architecture implementations. - Remove circular dependency of kernel and arch headers. The following general rules should be observed: * Never include any private headers from public headers * Never include kernel_internal.h in kernel_arch_data.h * Always include kernel_arch_data.h from kernel_arch_func.h * Never include kernel.h from kernel_struct.h either directly or indirectly. Only add the kernel structures that must be referenced from public arch headers in this file. - Relocate syscall_handler.h to include/ so it can be used in the public code. 
This is necessary because many user-mode public codes reference the functions defined in this header. - Relocate kernel_arch_thread.h to include/arch/*/thread.h. This is necessary to provide architecture-specific thread definition for 'struct k_thread' in kernel.h. - Remove any private header dependencies from public headers using the following methods: * If dependency is not required, simply omit * If dependency is required, - Relocate a portion of the required dependencies from the private header to an appropriate public header OR - Relocate the required private header to make it public. This commit supersedes #20047, addresses #19666, and fixes #3056. Signed-off-by: Stephanos Ioannidis <root@stephanos.io>
2019-10-24 17:08:21 +02:00
#include <kernel_internal.h>
/* SSE control/status register default value (used by assembler code) */
extern uint32_t _sse_mxcsr_default_value;
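
/*
 * Usage sketch (illustrative only, not compiled): a preemptible thread opts
 * into FP register sharing either statically, by passing K_FP_REGS (or
 * K_SSE_REGS) to k_thread_create(), or lazily, by simply executing an FP
 * instruction and letting the "device not available" exception handler at
 * the bottom of this file enable sharing on its behalf. The stack size,
 * priority, and entry point below are placeholder assumptions.
 */
#if 0
#define FP_STACK_SIZE 1024
#define FP_PRIORITY   5

K_THREAD_STACK_DEFINE(fp_stack, FP_STACK_SIZE);
static struct k_thread fp_thread;

static void fp_entry(void *p1, void *p2, void *p3)
{
	volatile float f = 1.5f;

	f *= 2.0f;	/* this thread's FP context is preserved across swaps */
}

void start_fp_thread(void)
{
	k_thread_create(&fp_thread, fp_stack, FP_STACK_SIZE,
			fp_entry, NULL, NULL, NULL,
			FP_PRIORITY, K_FP_REGS, K_NO_WAIT);
}
#endif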
/**
* @brief Disallow use of floating point capabilities
*
* This routine sets CR0[TS] to 1, which disallows the use of FP instructions
* by the currently executing thread.
*/
static inline void z_FpAccessDisable(void)
{
	void *tempReg;

	__asm__ volatile(
		"movl %%cr0, %0;\n\t"
		"orl $0x8, %0;\n\t"	/* CR0[TS] is bit 3 */
		"movl %0, %%cr0;\n\t"
		: "=r"(tempReg)
		:
		: "memory");
}
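
/*
 * For reference, the complementary "allow FP access" operation is a single
 * CLTS instruction (clear CR0[TS]); z_float_enable() below inlines it rather
 * than using a helper. A hypothetical wrapper would look like this sketch:
 */
#if 0 /* illustrative sketch, not part of the build */
static inline void z_FpAccessEnable(void)
{
	__asm__ volatile("clts\n\t");
}
#endif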
/**
 * @brief Save x87 FPU and MMX context information
 *
 * This routine saves the system's "live" x87 FPU and MMX context into the
 * specified area using the FNSAVE instruction, which also reinitializes the
 * FPU as a side effect. SSE registers are not saved here. Invoked by
 * FpCtxSave() for threads that do not use SSE.
 */
static inline void z_do_fp_regs_save(void *preemp_float_reg)
{
	__asm__ volatile("fnsave (%0);\n\t"
			 :
			 : "r"(preemp_float_reg)
			 : "memory");
}
/**
 * @brief Save x87 FPU, MMX, and SSEx context information
 *
 * This routine saves the system's "live" x87 FPU, MMX, and SSEx context into
 * the specified area using the FXSAVE instruction. Invoked by FpCtxSave()
 * for threads that use SSE.
 */
static inline void z_do_fp_and_sse_regs_save(void *preemp_float_reg)
{
	/* FXSAVE requires a 16-byte aligned save area */
	__asm__ volatile("fxsave (%0);\n\t"
			 :
			 : "r"(preemp_float_reg)
			 : "memory");
}
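
/*
 * For reference, the restore counterparts of the save helpers above are the
 * FRSTOR and FXRSTOR instructions, which this port issues from its context
 * switch assembly rather than from C. Hypothetical C wrappers mirroring the
 * save routines would look like this sketch:
 */
#if 0 /* illustrative sketch, not part of the build */
static inline void z_do_fp_regs_restore(void *preemp_float_reg)
{
	__asm__ volatile("frstor (%0);\n\t"
			 :
			 : "r"(preemp_float_reg)
			 : "memory");
}

static inline void z_do_fp_and_sse_regs_restore(void *preemp_float_reg)
{
	__asm__ volatile("fxrstor (%0);\n\t"
			 :
			 : "r"(preemp_float_reg)
			 : "memory");
}
#endif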
/**
* @brief Initialize floating point register context information.
*
* This routine initializes the system's "live" floating point registers.
*/
static inline void z_do_fp_regs_init(void)
{
	__asm__ volatile("fninit\n\t");
}
/**
* @brief Initialize SSE register context information.
*
* This routine initializes the system's "live" SSE registers.
*/
static inline void z_do_sse_regs_init(void)
{
	__asm__ volatile("ldmxcsr _sse_mxcsr_default_value\n\t");
}
/*
* Save a thread's floating point context information.
*
* This routine saves the system's "live" floating point context into the
* specified thread control block. The SSE registers are saved only if the
* thread is actually using them.
*/
static void FpCtxSave(struct k_thread *thread)
{
#ifdef CONFIG_X86_SSE
	if ((thread->base.user_options & K_SSE_REGS) != 0) {
		z_do_fp_and_sse_regs_save(&thread->arch.preempFloatReg);
		return;
	}
#endif
	z_do_fp_regs_save(&thread->arch.preempFloatReg);
}
/*
* Initialize a thread's floating point context information.
*
* This routine initializes the system's "live" floating point context.
* The SSE registers are initialized only if the thread is actually using them.
*/
static inline void FpCtxInit(struct k_thread *thread)
{
	z_do_fp_regs_init();
#ifdef CONFIG_X86_SSE
	if ((thread->base.user_options & K_SSE_REGS) != 0) {
		z_do_sse_regs_init();
	}
#endif
}
/*
* Enable preservation of floating point context information.
*
* The transition from "non-FP supporting" to "FP supporting" must be done
* atomically to avoid confusing the floating point logic used by z_swap(), so
* this routine locks interrupts to ensure that a context switch does not occur.
* The locking isn't really needed when the routine is called by a cooperative
* thread (since context switching can't occur), but it is harmless.
*/
void z_float_enable(struct k_thread *thread, unsigned int options)
{
	unsigned int imask;
	struct k_thread *fp_owner;

	if (!thread) {
		return;
	}

	/* Ensure a preemptive context switch does not occur */
	imask = irq_lock();

	/* Indicate thread requires floating point context saving */
	thread->base.user_options |= (uint8_t)options;

	/*
	 * The current thread might not allow FP instructions, so clear CR0[TS]
	 * so we can use them. (CR0[TS] gets restored later on, if necessary.)
	 */
	__asm__ volatile("clts\n\t");

	/*
	 * Save existing floating point context (since it is about to change),
	 * but only if the FPU is "owned" by an FP-capable task that is
	 * currently handling an interrupt or exception (meaning its FP context
	 * must be preserved).
	 */
	fp_owner = _kernel.current_fp;
	if (fp_owner != NULL) {
		if ((fp_owner->arch.flags & X86_THREAD_FLAG_ALL) != 0) {
			FpCtxSave(fp_owner);
		}
	}

	/* Now create a virgin FP context */
	FpCtxInit(thread);

	/* Associate the new FP context with the specified thread */
	if (thread == _current) {
		/*
		 * When enabling FP support for the current thread, just claim
		 * ownership of the FPU and leave CR0[TS] unset.
		 *
		 * (The FP context is "live" in hardware, not saved in TCS.)
		 */
		_kernel.current_fp = thread;
	} else {
		/*
		 * When enabling FP support for someone else, assign ownership
		 * of the FPU to them (unless we need it ourselves).
		 */
		if ((_current->base.user_options & _FP_USER_MASK) == 0) {
			/*
			 * We are not FP-capable, so mark FPU as owned by the
			 * thread we've just enabled FP support for, then
			 * disable our own FP access by setting CR0[TS] back
			 * to its original state.
			 */
			_kernel.current_fp = thread;
			z_FpAccessDisable();
		} else {
			/*
			 * We are FP-capable (and thus had FPU ownership on
			 * entry), so save the new FP context in their TCS,
			 * leave FPU ownership with self, and leave CR0[TS]
			 * unset.
			 *
			 * The saved FP context is needed in case the thread
			 * we enabled FP support for is currently pre-empted,
			 * since z_swap() uses it to restore FP context when
			 * the thread re-activates.
			 *
			 * Saving the FP context reinits the FPU, and thus
			 * our own FP context, but that's OK since it didn't
			 * need to be preserved. (i.e. We aren't currently
			 * handling an interrupt or exception.)
			 */
			FpCtxSave(thread);
		}
	}

	irq_unlock(imask);
}
/**
* Disable preservation of floating point context information.
*
* The transition from "FP supporting" to "non-FP supporting" must be done
* atomically to avoid confusing the floating point logic used by z_swap(), so
* this routine locks interrupts to ensure that a context switch does not occur.
* The locking isn't really needed when the routine is called by a cooperative
* thread (since context switching can't occur), but it is harmless.
*/
int z_float_disable(struct k_thread *thread)
{
	unsigned int imask;

	/* Ensure a preemptive context switch does not occur */
	imask = irq_lock();

	/* Disable all floating point capabilities for the thread */
	thread->base.user_options &= ~_FP_USER_MASK;

	if (thread == _current) {
		z_FpAccessDisable();
		_kernel.current_fp = (struct k_thread *)0;
	} else {
		if (_kernel.current_fp == thread) {
			_kernel.current_fp = (struct k_thread *)0;
		}
	}

	irq_unlock(imask);

	return 0;
}
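
/*
 * Runtime usage sketch (illustrative only, not compiled): FP capability can
 * also be granted and revoked dynamically through the public
 * k_float_enable()/k_float_disable() wrappers for the routines above.
 */
#if 0
void compute_with_fpu(void)
{
	k_float_enable(k_current_get(), K_FP_REGS);

	/* ... x87/MMX math here (request K_SSE_REGS as well for SSE) ... */

	k_float_disable(k_current_get());
}
#endif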
/*
* Handler for "device not available" exception.
*
* This routine is registered to handle the "device not available" exception
* (vector = 7).
*
* The processor will generate this exception if any x87 FPU, MMX, or SSEx
* instruction is executed while CR0[TS]=1. The handler then enables the
* current thread to use all supported floating point registers.
*/
void _FpNotAvailableExcHandler(z_arch_esf_t *pEsf)
{
	ARG_UNUSED(pEsf);

	/*
	 * Assume the exception did not occur in an ISR.
	 * (In other words, CPU cycles will not be consumed to perform
	 * error checking to ensure the exception was not generated in an ISR.)
	 */
	/* Enable highest level of FP capability configured into the kernel */
	k_float_enable(_current, _FP_USER_MASK);
}
_EXCEPTION_CONNECT_NOCODE(_FpNotAvailableExcHandler,
			  IV_DEVICE_NOT_AVAILABLE, 0);