This fixes a subtle race condition in the thread timeout expiration
handler z_thread_timeout(). There was a small window of opportunity
between when sys_clock_announce() unlocked interrupts and when the
handler re-locked them, during which one or more higher priority
interrupts (or threads running on another CPU in an SMP environment)
could abort the thread's timeout.
The fix has two parts. Part one ensures that _sched_spinlock is held
in every location before a thread's timeout can be canceled. Of the
various locations, only z_unpend_thread() was found to need updating.
Part two updates the timeout handler z_thread_timeout() to bail early
if the thread's timeout has been found to be canceled (or re-used)
during that aforementioned window.
Fixes #106653
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
This partially reverts commit 9cef0da05c ("kernel: avoid recursive
scheduler lock in k_heap_free path"), keeping only the sched.c changes
(z_unpend_all_locked / z_unpend_all refactoring).
The _sched_locked variants of k_free, k_heap_free, k_msgq_cleanup,
k_stack_cleanup and the sched_locked parameter plumbing through
unref_check were an overcomplicated approach. A simpler fix follows.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
IAR emits diagnostic Go004 ("function cannot be inlined") for
every ALWAYS_INLINE function when optimisation is disabled, e.g.
in debug builds. The previous workaround wrapped each affected
function in per-function preprocessor guard pairs:
    #ifdef IAR_SUPPRESS_ALWAYS_INLINE_WARNING_FLAG
    TOOLCHAIN_DISABLE_WARNING(TOOLCHAIN_WARNING_ALWAYS_INLINE)
    #endif
    static ALWAYS_INLINE void foo(...) { ... }
    #ifdef IAR_SUPPRESS_ALWAYS_INLINE_WARNING_FLAG
    TOOLCHAIN_ENABLE_WARNING(TOOLCHAIN_WARNING_ALWAYS_INLINE)
    #endif
This pattern is highly intrusive, scatters toolchain-specific
knowledge across generic source files, and requires a guard pair
every time a new ALWAYS_INLINE function is added for IAR.
Replace it with a single override of ALWAYS_INLINE inside
iccarm.h, using the C99 _Pragma operator to embed the diagnostic
suppression in the macro itself.
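A minimal sketch of the single-override idea, not the actual iccarm.h
contents. The exact pragma string in the IAR branch is an assumption
derived from the diagnostic name above, and the non-IAR branch exists
only so the sketch builds with GCC or Clang. add_one() is a
hypothetical function used for illustration.

```c
#include <assert.h>

#if defined(__ICCARM__)
/* Assumed spelling: the suppression travels with the macro itself. */
#define ALWAYS_INLINE \
	_Pragma("diag_suppress=Go004") inline __attribute__((always_inline))
#else
/* Fallback so the sketch compiles with GCC or Clang. */
#define ALWAYS_INLINE inline __attribute__((always_inline))
#endif

/* No per-function guard pair is needed around the definition. */
static ALWAYS_INLINE int add_one(int x)
{
	return x + 1;
}
```

With this shape, generic source files only ever mention ALWAYS_INLINE
and carry no toolchain-specific preprocessor guards.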
Assisted-by: GitHub Copilot:claude-sonnet-4.6
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
gen_offset.h is an architecture-specific header, not a kernel one.
Move it under the arch tree where it belongs.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Migrate scheduler API implementations (k_sched_lock/unlock,
z_reschedule, z_yield_current, etc.) and their private declarations
from ksched.h/sched.c into scheduler.c and scheduler.h.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Move time-slice related declarations from ksched.h into the
dedicated kernel/include/timeslicing.h header.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Migrate z_add_thread_to_ready_q(), z_remove_thread_from_ready_q(),
and related helpers from sched.c to scheduler.c.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Move z_sched_init() to scheduler.c and simplify the implementation by
getting rid of the single-use init_ready_q().
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Move run-queue management functions (add/remove/peek thread,
choose_next_thread) from sched.c into the new
kernel/include/run_q.h header.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Move meta-IRQ (highest-priority cooperative queue) scheduling
functions from sched.c into a new kernel/include/metairq.h header
to reduce sched.c size and group related logic.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Relocate k_thread_start(), k_thread_abort(), k_thread_suspend(), and
k_thread_resume() from sched.c to thread.c alongside related thread
lifecycle code.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
When halt_thread() calls k_thread_perms_all_clear() under
_sched_spinlock, the permission cleanup can trigger k_free() on
dynamic objects. k_heap_free() then calls z_unpend_all() which
attempts to take _sched_spinlock again, causing a recursive lock.
Fix this by introducing k_heap_free_sched_locked() and
k_free_sched_locked() variants that use z_unpend_all_locked()
to operate on the wait queue without re-acquiring the scheduler
lock. The existing z_unpend_all() becomes a wrapper that takes
the lock and delegates to z_unpend_all_locked().
unref_check() gains a sched_locked parameter: the abort path
(clear_perms_cb) passes true to use the locked free variant,
while k_thread_perms_clear() passes false for the normal path.
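The locked/unlocked split described above can be sketched with a toy
non-recursive lock. This is an illustration only: toy_lock,
unpend_all(), unpend_all_locked() and abort_path_free() are simplified
stand-ins, not the kernel's implementations.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock: taking it twice trips the assertion. */
struct toy_lock {
	bool held;
};

static struct toy_lock sched_lock;
static int pended = 3;	/* stand-in for threads on a wait queue */

static void toy_lock_take(struct toy_lock *l)
{
	assert(!l->held);	/* recursive acquisition is a bug */
	l->held = true;
}

static void toy_lock_give(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Caller must already hold sched_lock. */
static int unpend_all_locked(void)
{
	assert(sched_lock.held);
	int woken = pended;

	pended = 0;
	return woken;
}

/* Wrapper for the normal path: take the lock, then delegate. */
static int unpend_all(void)
{
	toy_lock_take(&sched_lock);
	int woken = unpend_all_locked();

	toy_lock_give(&sched_lock);
	return woken;
}

/* Models the abort path, which already holds the lock while freeing. */
static int abort_path_free(void)
{
	toy_lock_take(&sched_lock);
	int woken = unpend_all_locked();	/* unpend_all() would recurse */

	toy_lock_give(&sched_lock);
	return woken;
}
```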
Fixes #106659
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
The routine sys_clock_announce() removes the timeout from the timeout
list and unlocks the timeout spinlock before invoking the timeout's
handler. This creates a window where another ISR (or a thread running
on another CPU) can abort or reuse the timeout before the handler
executes. When this happens, the timeout handler should bail early.
Use the dticks field to carry this state: set it to
TIMEOUT_DTICKS_ANNOUNCING after remove_timeout() (which needs
dticks = 0 to propagate remaining ticks) and before calling the
handler. In z_abort_timeout(), set TIMEOUT_DTICKS_ABORTED when the
timeout is either linked (existing behavior) or in the announcing
state (new). The z_add_timeout() path naturally overwrites dticks
with a real tick value, so re-use is also detected.
Provide z_is_timeout_handler_canceled() for handlers to check if
they should bail. This avoids adding a flags field to struct _timeout,
keeping the struct size unchanged.
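The state encoding above can be sketched as follows. This is a toy
model: the sentinel values, the 'linked' flag and every toy_* name are
assumptions made for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sentinels; real tick counts are never negative. */
#define TIMEOUT_DTICKS_ABORTED    (INT64_MIN)
#define TIMEOUT_DTICKS_ANNOUNCING (INT64_MIN + 1)

struct toy_timeout {
	int64_t dticks;
	bool linked;	/* stand-in for "is on the timeout list" */
};

static void toy_add_timeout(struct toy_timeout *to, int64_t ticks)
{
	to->dticks = ticks;	/* a real tick value overwrites any sentinel */
	to->linked = true;
}

static void toy_abort_timeout(struct toy_timeout *to)
{
	if (to->linked || to->dticks == TIMEOUT_DTICKS_ANNOUNCING) {
		to->linked = false;
		to->dticks = TIMEOUT_DTICKS_ABORTED;
	}
}

/* Announce step: unlink, then mark before the handler is invoked. */
static void toy_announce_prepare(struct toy_timeout *to)
{
	to->linked = false;
	to->dticks = TIMEOUT_DTICKS_ANNOUNCING;
}

/* Handler-side check: anything other than "announcing" means bail. */
static bool toy_is_handler_canceled(const struct toy_timeout *to)
{
	return to->dticks != TIMEOUT_DTICKS_ANNOUNCING;
}
```

Both an abort and a re-add during the window leave dticks different
from the announcing sentinel, so a single comparison covers both cases.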
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
z_get_next_switch_handle() is a clean API, but implementing it as a
(comparatively large) callable function requires significant
entry/exit boilerplate and hides the very common "no switch needed"
early exit condition from the enclosing C code that calls it. (Most
architectures call this from assembly though and don't notice).
Provide an unwrapped version for the specific needs of non-SMP builds.
It's compatible in all other ways.
Slightly ugly, but the gains are significant (like a dozen cycles or
so).
Signed-off-by: Andy Ross <andyross@google.com>
Pick some low hanging fruit on non-SMP code paths:
+ The scheduler spinlock is always taken, but as we're already in an
irqlocked state that's a noop. But the optimizer can't tell, because
arch_irq_lock() involves an asm block it can't see inside. Elide
the call when possible.
+ The z_swap_next_thread() function evaluates to just a single load of
_kernel.ready_q.cache when !SMP, but wasn't being inlined because of
function location. Move that test up into do_swap() so it's always
done correctly.
Signed-off-by: Andy Ross <andyross@google.com>
When CONFIG_WAITQ_SCALABLE=y, wake up all threads from a post-waitq-walk
callback which is invoked while the scheduler spinlock is still held. This
solves the race condition that was worked around via `no_wake_in_timeout`
flag in k_thread and `is_timeout` parameter of z_sched_wake_thread_locked()
which can now both be dropped.
Signed-off-by: Mathieu Choplain <mathieu.choplain-ext@st.com>
Modify z_sched_waitq_walk() to accept an optional callback invoked after
the walk while still holding the scheduler spinlock. This can be used to
perform post-walk operations "atomically". Update all callers to work with
this new function signature.
While at it, create dedicated (private) typedefs for the callbacks and
clean up/improve the routine and callbacks' documentation.
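A rough sketch of a walk that takes an optional post-walk callback,
assuming a simple array in place of the wait queue and a boolean in
place of the scheduler spinlock. The typedef names and signatures here
are hypothetical and do not reproduce the real private typedefs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback types, for illustration only. */
typedef void (*toy_walk_cb_t)(int thread_id, void *data);
typedef void (*toy_post_walk_cb_t)(void *data);

static bool toy_lock_held;	/* stand-in for the scheduler spinlock */

/* Visit every entry, then run the optional callback before unlocking. */
static void toy_waitq_walk(const int *queue, size_t len, toy_walk_cb_t walk,
			   toy_post_walk_cb_t post, void *data)
{
	toy_lock_held = true;
	for (size_t i = 0; i < len; i++) {
		walk(queue[i], data);
	}
	if (post != NULL) {
		post(data);	/* still "atomic" with respect to the walk */
	}
	toy_lock_held = false;
}

struct toy_stats {
	int visited;
	bool post_ran_locked;
};

static void toy_count_cb(int thread_id, void *data)
{
	(void)thread_id;
	((struct toy_stats *)data)->visited++;
}

static void toy_post_cb(void *data)
{
	((struct toy_stats *)data)->post_ran_locked = toy_lock_held;
}
```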
Signed-off-by: Mathieu Choplain <mathieu.choplain-ext@st.com>
z_sched_waitq_walk() used _WAIT_Q_FOR_EACH, a wrapper around the
"unsafe" SYS_DLIST_FOR_EACH_CONTAINER which does not allow detaching
elements from the list during the walk. As a result, attempting to
detach threads from the wait queue as part of the callback provided
to z_sched_waitq_walk() would result in breakage.
Introduce new _WAIT_Q_FOR_EACH_SAFE macro as wrapper around the "safe"
SYS_DLIST_FOR_EACH_CONTAINER_SAFE which allows detaching nodes from
the list during the walk, and use it inside z_sched_waitq_walk().
While at it:
- add documentation on the _WAIT_Q_FOR_EACH macro, including a warning
about detaching elements as part of the loop not being allowed
- add note to documentation of z_sched_waitq_walk() indicating that
the callback can safely remove the thread from wait queue as this
will no longer break the FOR_EACH loop
- add _WAIT_Q_FOR_EACH_SAFE to the list of ForEachMacros in .clang-format
NOTE: this new "safe removal inside callback" behavior is only available
when CONFIG_WAITQ_SCALABLE=n. When the option is 'y', red-black trees are
used instead of doubly-linked lists, and these prevent mutation of the
queue while it is being walked. This limitation is explicitly documented.
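The difference between the two iteration styles can be sketched with a
small stand-alone list. The toy_* names and macros below are
illustrative only and are not the SYS_DLIST or _WAIT_Q macros.

```c
#include <assert.h>
#include <stddef.h>

struct toy_node {
	struct toy_node *next;
	struct toy_node *prev;
	int prio;
};

/* Circular list with a sentinel head. */
static void toy_list_init(struct toy_node *head)
{
	head->next = head;
	head->prev = head;
}

static void toy_list_append(struct toy_node *head, struct toy_node *n)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void toy_list_remove(struct toy_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = NULL;
	n->prev = NULL;
}

/* Unsafe form: reads n->next after the body, so the body must not detach n. */
#define TOY_FOR_EACH(head, n) \
	for ((n) = (head)->next; (n) != (head); (n) = (n)->next)

/* Safe form: the successor is cached before the body runs. */
#define TOY_FOR_EACH_SAFE(head, n, tmp) \
	for ((n) = (head)->next, (tmp) = (n)->next; (n) != (head); \
	     (n) = (tmp), (tmp) = (n)->next)

/* Detach every node whose prio is below limit; return how many remain. */
static int toy_remove_below(struct toy_node *head, int limit)
{
	struct toy_node *n, *tmp;
	int remaining = 0;

	TOY_FOR_EACH_SAFE(head, n, tmp) {
		if (n->prio < limit) {
			toy_list_remove(n);
		}
	}
	TOY_FOR_EACH(head, n) {
		remaining++;
	}
	return remaining;
}
```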
Signed-off-by: Mathieu Choplain <mathieu.choplain-ext@st.com>
Don't acquire the _sched_spinlock in z_sched_wake_thread(). This allows
calling the function from callbacks which already own the spinlock. The
function is renamed to z_sched_wake_thread_locked() to reflect this new
behavior, and all existing callers are updated to ensure they hold the
_sched_spinlock as is now required.
Signed-off-by: Mathieu Choplain <mathieu.choplain-ext@st.com>
The IAR compiler may emit Error[Go004]: Could not inline function
when handling functions marked as always_inline or inline=forced,
especially in complex kernel code.
Signed-off-by: Thinh Le Cong <thinh.le.xr@bp.renesas.com>
Within z_sched_ipi() there is no need for the thread_is_sliceable()
test as z_time_slice() performs that check. As a result,
thread_is_sliceable() is now only used within timeslicing.c, so the
'static' keyword is applied to it.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
All instances of the internal routine move_thread_to_end_of_prio_q()
use the current thread. Rename it to move_current_to_end_of_prio_q()
to reflect that.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
The routine z_move_thread_to_end_of_prio_q() has been renamed to
z_yield_testing_only() as it was both only used by test code
and always operated on the current thread.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Add stack trace support for the dummy thread, which is used to run
the early system initialization code before the kernel switches
to the main thread.
On RISC-V, the dummy thread will be running temporarily on the
interrupt stack, but currently we do not initialize the stack
info for the dummy thread, hence check the address against the
interrupt stack.
Signed-off-by: Yong Cong Sin <ycsin@meta.com>
Signed-off-by: Yong Cong Sin <yongcong.sin@gmail.com>
Since these helper functions are read-only, mark the `thread`
arg as `const` so that we can pass a const thread to them without
triggering warnings.
Signed-off-by: Yong Cong Sin <ycsin@meta.com>
Signed-off-by: Yong Cong Sin <yongcong.sin@gmail.com>
This patch moves `is_aborting()` and `is_halting()`
from `kernel/sched.c` to `kernel/include/kthread.h`
and renames them to `z_is_thread_aborting()` and `z_is_thread_halting()`,
for consistency with other internal kernel APIs.
It replaces the previous inline function definitions in `sched.c`
with calls to the new header functions. Additionally, direct bitwise
checks like `(thread->base.thread_state & _THREAD_DEAD) != 0U`
are updated to use the new `z_is_thread_dead()` helper function.
This enhances code readability and maintainability.
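As a rough illustration of the helper style, assuming made-up bit
values and a reduced thread structure (the toy_* names and the
TOY_THREAD_* bits are hypothetical):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. */
#define TOY_THREAD_DEAD     (1U << 3)
#define TOY_THREAD_ABORTING (1U << 5)

struct toy_thread {
	uint8_t thread_state;
};

/* Read-only predicates take a const pointer. */
static inline bool toy_is_thread_dead(const struct toy_thread *thread)
{
	return (thread->thread_state & TOY_THREAD_DEAD) != 0U;
}

static inline bool toy_is_thread_aborting(const struct toy_thread *thread)
{
	return (thread->thread_state & TOY_THREAD_ABORTING) != 0U;
}
```

Call sites then read as `if (toy_is_thread_dead(t))` instead of
repeating the mask and comparison.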
Signed-off-by: TaiJu Wu <tjwu1217@gmail.com>
Do not directly include and use APIs from ksched.h outside of the
kernel. For now do this using more suitable (ipi.h and
kernel_internal.h) internal APIs until more cleanup is done.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Move device model syscalls to device.c and decouple the kernel header from
device-related routines. Clean up init to have only what is needed.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Clean up init.c code and move early boot code into arch/, making it
accessible outside of the boot process/kernel.
All of this code is not related to the 'kernel' and is mostly used
within the architecture boot / setup process.
The way it was done, some soc code was including kernel_internal.h
directly, which shouldn't be done.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
XIP is not really a kernel feature; it belongs more to the architecture,
which is reflected in how XIP is enabled and tested. Move it to
architecture code, where much of the 'implementation' and usage is.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
The do_swap() routine used when CONFIG_USE_SWITCH=y asserts that the calling
thread does not hold any spinlock when CONFIG_SPIN_VALIDATE is enabled.
However, there is no similar check in place when CONFIG_USE_SWITCH=n.
Copy this assertion into the USE_SWITCH=n implementation of z_swap_irqlock().
Signed-off-by: Mathieu Choplain <mathieu.choplain@st.com>
When lazy HiFi context switching is enabled, the system starts with
the HiFi coprocessor disabled. Should the thread use that coprocessor,
it will generate an exception which in turn will enable the coprocessor
and save/restore the HiFi registers as appropriate. When switching
to a new thread, the HiFi coprocessor is again disabled.
For simplicity, there are no restrictions as to which thread is allowed
to use the coprocessor.
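The lazy scheme can be modeled roughly as below. This is a toy state
machine, not the Xtensa implementation: the owner tracking used to
decide when a save is needed, and all toy_* names, are assumptions.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NO_OWNER (-1)

struct toy_cpu {
	int current;	/* id of the running thread */
	int cp_owner;	/* thread whose registers are live in the coprocessor */
	bool cp_enabled;
	int saves;	/* register saves performed, for illustration */
};

/* Context switch: only disable the coprocessor, defer save/restore. */
static void toy_switch_to(struct toy_cpu *cpu, int next)
{
	cpu->current = next;
	cpu->cp_enabled = false;
}

/* Models the exception taken on first use while disabled. */
static void toy_cp_exception(struct toy_cpu *cpu)
{
	if (cpu->cp_owner != cpu->current) {
		if (cpu->cp_owner != TOY_NO_OWNER) {
			cpu->saves++;	/* save the previous owner's registers */
		}
		cpu->cp_owner = cpu->current;	/* restore ours (not modeled) */
	}
	cpu->cp_enabled = true;
}

/* A thread touching the coprocessor traps only while it is disabled. */
static void toy_use_coprocessor(struct toy_cpu *cpu)
{
	if (!cpu->cp_enabled) {
		toy_cp_exception(cpu);
	}
}
```

Threads that never touch the coprocessor pay only for the disable on
each switch, which is the point of the lazy approach.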
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
The intent of arch_coprocessors_disable() is to replace
arch_float_disable() in halt_thread(), because the FPU will not
always be the only coprocessor that needs to be disabled.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
z_sched_lock() has exactly one user in tree which is
k_sched_lock(). So combine them to make it easier to
follow (or not, since there is no jumping to another
file anymore).
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Add Kconfig option to dump only a portion of stack from the
current stack pointer to the stack end. This is enough to
let gdb reconstruct the stack trace and can significantly
reduce the dump size. This is crucial if the core dump needs
to be sent over radio.
Additionally, add another option to set the limit for the
dumped stack portion.
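The range selection can be sketched as below, assuming a descending
stack, a limit of 0 meaning "no limit", and a fallback to the whole
stack when the stack pointer lies outside it. Those conventions and the
toy_* names are assumptions, not taken from the actual Kconfig options.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_region {
	uintptr_t start;
	uintptr_t end;
};

/*
 * Pick the part of a descending stack to dump: from the current stack
 * pointer up to the stack end, capped at limit bytes (0 means no cap).
 */
static struct toy_region toy_stack_dump_region(uintptr_t stack_start,
					       uintptr_t stack_end,
					       uintptr_t sp, size_t limit)
{
	struct toy_region r = { .start = stack_start, .end = stack_end };

	/* Use sp as the lower bound only if it lies inside the stack. */
	if (sp >= stack_start && sp < stack_end) {
		r.start = sp;
	}
	if (limit != 0 && (r.end - r.start) > limit) {
		r.end = r.start + limit;
	}
	return r;
}
```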
Signed-off-by: Damian Krolik <damian.krolik@nordicsemi.no>
Add a return value to z_add_timeout(). It returns the system tick at
which the timeout will expire.
Signed-off-by: Krzysztof Chruściński <krzysztof.chruscinski@nordicsemi.no>
Accessing system timer registers can be costly and should be avoided
if possible. When a thread is woken up in z_tick_sleep(), it may be
because the timeout expired or because the thread was woken up before
the sleeping period had passed.
Add a function to detect whether a timeout was aborted (before it
expired). Use it in the sleep function and avoid reading system ticks
if the timeout was not aborted.
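A minimal sketch of the idea, using a simulated tick counter and a
read counter in place of real timer registers. The toy_* names are
hypothetical, and 'expiry' stands for the absolute expiry tick
recorded when the timeout was added.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static int64_t toy_now;		/* simulated tick counter */
static int toy_clock_reads;	/* counts the "costly" accesses */

static int64_t toy_read_ticks(void)
{
	toy_clock_reads++;
	return toy_now;
}

/* Return the ticks left to sleep after a wake-up. */
static int64_t toy_sleep_remaining(int64_t expiry, bool aborted)
{
	if (!aborted) {
		return 0;	/* expired normally: no clock read needed */
	}

	int64_t left = expiry - toy_read_ticks();

	return left > 0 ? left : 0;
}
```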
Signed-off-by: Krzysztof Chruściński <krzysztof.chruscinski@nordicsemi.no>
When k_malloc() is expressed in terms of k_aligned_alloc() it invokes a
longer aligned allocation code path with an extra runtime overhead even
though no alignment is necessary.
Let's reference and invoke the aligned allocation code path only when an
actual aligned allocation is requested. This opens the possibility for
the linker to garbage-collect the aligning code otherwise.
Also bypass k_heap_malloc() and friends given they're invoked with
K_NO_WAIT. Go directly to sys_heap_*() instead to cut some more unneeded
overhead.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Improve naming of the scheduler and call it what it is: simple. Using
'dumb' for the default scheduler algorithm in Zephyr is a bad idea.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
The current_fp field in the z_kernel structure is only used
by 32-bit x86 (which does not support SMP). As such, it should
reside in the arch-specific section of _kernel.cpus[0].
This also changes the name of 'current_fp' to 'fpu_owner' to
be more consistent with other architectures.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
The 'order_key' field in the thread structure '_thread_base' is only
required when CONFIG_SCHED_SCALABLE and/or CONFIG_WAITQ_SCALABLE are
enabled (neither of which is a default setting). Making the existence
of this field conditional slightly reduces the size of the k_thread
structure when neither of those Kconfig options is selected.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>