Commit graph

452 commits

Author SHA1 Message Date
Peter Mitsis
1b8c7a3038 kernel: thread timeout: Fix race condition
This fixes a subtle race condition in the thread timeout expiration
handler z_thread_timeout(). There was a small window of opportunity
between when sys_clock_announce() unlocked interrupts and when the
handler re-locked them, during which one or more higher priority
interrupts (or threads running on another CPU in an SMP environment)
could abort the thread's timeout.

The fix has two parts. Part one ensures that _sched_spinlock is held
in every location before a thread's timeout can be canceled. Of the
various locations, only z_unpend_thread() was found to need updating.
Part two updates the timeout handler z_thread_timeout() to bail early
if the thread's timeout has been found to be canceled (or re-used)
during that aforementioned window.
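
The two parts of the fix can be pictured with a small single-threaded model. This is only a sketch: every name below (sim_thread, the timeout_active flag, the lock stubs, sim_timeout_handler) is hypothetical and stands in for kernel internals that this log does not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a thread with a pending timeout. */
struct sim_thread {
    bool timeout_active; /* set when armed, cleared when canceled */
    bool woken;          /* set when the expiry handler readies the thread */
};

/* Stub for the scheduler lock; a real kernel would use a spinlock. */
static int sim_sched_lock_depth;

static void sim_sched_lock(void)
{
    assert(sim_sched_lock_depth == 0); /* the lock is not recursive */
    sim_sched_lock_depth++;
}

static void sim_sched_unlock(void)
{
    assert(sim_sched_lock_depth == 1);
    sim_sched_lock_depth--;
}

/* Part one: the cancel path only touches the timeout under the lock. */
static void sim_cancel_timeout(struct sim_thread *t)
{
    sim_sched_lock();
    t->timeout_active = false;
    sim_sched_unlock();
}

/* Part two: the expiry handler re-checks under the lock and bails
 * early if the timeout was canceled before the handler got the lock.
 */
static bool sim_timeout_handler(struct sim_thread *t)
{
    bool handled = false;

    sim_sched_lock();
    if (t->timeout_active) {
        t->timeout_active = false;
        t->woken = true;
        handled = true;
    }
    sim_sched_unlock();

    return handled;
}
```

In this model a handler that runs after sim_cancel_timeout() returns false and leaves the thread untouched, which is the "bail early" behavior described above.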

Fixes #106653

Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
2026-05-11 04:04:02 +02:00
Nicolas Pitre
01b3821fd9 kernel: spinlock: provide z_spin_is_locked() for UP builds
Extend z_spin_is_locked() to non-SMP configurations so assertions like
the one in z_unpend_all_locked() can validate lock ownership in UP
builds too. In UP a spinlock reduces to an IRQ lock, so the check
samples the current IRQ state via arch_irq_lock() / arch_irq_unlock().
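
A minimal sketch of that sampling idea, assuming a simulated interrupt-enable flag in place of the real arch_irq_lock() / arch_irq_unlock() pair. The sim_* names and the key encoding (1 means interrupts were enabled) are assumptions made for illustration, not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated interrupt-enable flag; hypothetical, for illustration only. */
static bool sim_irqs_enabled = true;

/* Disable interrupts and return the previous state as the key. */
static unsigned int sim_irq_lock(void)
{
    unsigned int key = sim_irqs_enabled ? 1U : 0U;

    sim_irqs_enabled = false;
    return key;
}

/* Restore the interrupt state captured by sim_irq_lock(). */
static void sim_irq_unlock(unsigned int key)
{
    assert(key <= 1U);
    sim_irqs_enabled = (key != 0U);
}

/* On a uniprocessor, "spinlock held" reduces to "interrupts were
 * already disabled": lock, inspect the key, then restore the state.
 */
static bool sim_spin_is_locked(void)
{
    unsigned int key = sim_irq_lock();
    bool was_locked = (key == 0U);

    sim_irq_unlock(key);
    return was_locked;
}
```

The check leaves the interrupt state exactly as it found it, so it is safe to use inside an assertion.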

Drop the now-unnecessary CONFIG_SMP guard around the sched spinlock
assertion in z_unpend_all_locked().

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2026-05-01 11:18:04 -05:00
Anas Nashif
26e88cee76 toolchain: iar: suppress Go004 via ALWAYS_INLINE override
IAR emits diagnostic Go004 ("function cannot be inlined") for
every ALWAYS_INLINE function when optimisation is disabled, e.g.
in debug builds.  The previous workaround wrapped each affected
function in per-function preprocessor guard pairs:

  #ifdef IAR_SUPPRESS_ALWAYS_INLINE_WARNING_FLAG
  TOOLCHAIN_DISABLE_WARNING(TOOLCHAIN_WARNING_ALWAYS_INLINE)
  #endif
  static ALWAYS_INLINE void foo(...) { ... }
  #ifdef IAR_SUPPRESS_ALWAYS_INLINE_WARNING_FLAG
  TOOLCHAIN_ENABLE_WARNING(TOOLCHAIN_WARNING_ALWAYS_INLINE)
  #endif

This pattern is highly intrusive, scatters toolchain-specific
knowledge across generic source files, and requires a guard pair
every time a new ALWAYS_INLINE function is added for IAR.

Replace it with a single override of ALWAYS_INLINE inside
iccarm.h, using the C99 _Pragma operator to embed the diagnostic
suppression in the macro itself.
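
The shape of such an override might look like the sketch below. The macro name SKETCH_ALWAYS_INLINE, the helper function, and the exact pragma text in the IAR branch are assumptions; the actual contents of iccarm.h are not reproduced in this log. Only the GCC/Clang and fallback branches have a well-known form.

```c
#include <assert.h>

/* Sketch only: macro name and IAR pragma text are assumptions. */
#if defined(__ICCARM__)
/* _Pragma lets a macro carry a pragma, which "#pragma" inside a
 * #define cannot do.
 */
#define SKETCH_ALWAYS_INLINE _Pragma("diag_suppress=Go004") inline
#elif defined(__GNUC__)
#define SKETCH_ALWAYS_INLINE inline __attribute__((always_inline))
#else
#define SKETCH_ALWAYS_INLINE inline
#endif

/* Callers use the macro as before; no per-function guard pair needed. */
static SKETCH_ALWAYS_INLINE int sketch_wrap_index(int i, int n)
{
    assert(n > 0);
    return i % n;
}
```

The point of the design is that generic source files only ever see the macro, so toolchain-specific knowledge stays in one header.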

Assisted-by: GitHub Copilot:claude-sonnet-4.6
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-29 10:00:10 +02:00
Anas Nashif
7edd8834f6 kernel: sched.c: remove useless return on void function
Remove useless return on void function.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-24 15:39:20 -04:00
Anas Nashif
d94fbad890 kernel: sched: rename z_reset_time_slice() to z_time_slice_reset()
Align the function name with the z_<subsystem>_<verb> convention
used elsewhere in the kernel.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-24 15:39:20 -04:00
Anas Nashif
d24cfd96f5 kernel: sched: move public scheduler API to scheduler.c/scheduler.h
Migrate scheduler API implementations (k_sched_lock/unlock,
z_reschedule, z_yield_current, etc.) and their private declarations
from ksched.h/sched.c into scheduler.c and scheduler.h.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-24 15:39:20 -04:00
Anas Nashif
05cbc7c98a kernel: sched: extract timeslice declarations to timeslicing.h
Move time-slice related declarations from ksched.h into the
dedicated kernel/include/timeslicing.h header.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-24 15:39:20 -04:00
Anas Nashif
28f157beae kernel: sched: move core schedule/deschedule functions to scheduler.c
Migrate z_add_thread_to_ready_q(), z_remove_thread_from_ready_q(),
and related helpers from sched.c to scheduler.c.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-24 15:39:20 -04:00
Anas Nashif
bde2ba8901 kernel: sched: simplify z_sched_init using run_q.h helpers
Move z_sched_init to scheduler.c and simplify the implementation, getting rid
of the single-use init_ready_q.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-24 15:39:20 -04:00
Anas Nashif
78993d7020 kernel: sched: reorder z_unready_thread before its callers in sched.c
Move z_unready_thread next to related functions.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-24 15:39:20 -04:00
Anas Nashif
cf3f8331e5 kernel: sched: group z_ready_thread and z_unready_thread in sched.c
Reorder so that z_ready_thread and z_unready_thread are adjacent,
improving code locality for related queue operations.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-24 15:39:20 -04:00
Anas Nashif
a564d82a04 kernel: sched: extract run-queue helpers to run_q.h
Move run-queue management functions (add/remove/peek thread,
choose_next_thread) from sched.c into the new
kernel/include/run_q.h header.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-24 15:39:20 -04:00
Anas Nashif
4b525273bf kernel: sched: simplify thread_runq()
Simplify code and make it more readable.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-24 15:39:20 -04:00
Anas Nashif
fc11a9166e kernel: sched: extract meta-IRQ handling to metairq.h
Move meta-IRQ (highest-priority cooperative queue) scheduling
functions from sched.c into a new kernel/include/metairq.h header
to reduce sched.c size and group related logic.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-24 15:39:20 -04:00
Anas Nashif
2ea5924943 kernel: k_yield: move code to thread.c
Move k_yield() from sched.c to thread.c alongside other thread
lifecycle calls.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-24 15:39:20 -04:00
Anas Nashif
222eba03d7 kernel: sleep: move sleep code into own file
Reduce complexity of sched.c by encapsulating sleep handling code
(k_sleep, k_usleep, k_msleep) into its own file.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-24 15:39:20 -04:00
Anas Nashif
908169d9df kernel: deadline: move deadline handling to own file
Move deadline scheduling to deadline.c, reducing complexity and
clutter in sched.c.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-24 15:39:20 -04:00
Anas Nashif
0659dc18b3 kernel: sched: move thread lifecycle calls to thread.c
Relocate k_thread_start(), k_thread_abort(), k_thread_suspend(), and
k_thread_resume() from sched.c to thread.c alongside related thread
lifecycle code.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-24 15:39:20 -04:00
Nicolas Pitre
184b5a3804 kernel: assert scheduler lock is held in z_unpend_all_locked()
Add a runtime assertion in z_unpend_all_locked() to verify that
_sched_spinlock is actually held by the caller. This catches misuse
early given the function call depth involved.

Extend the availability of z_spin_is_locked() from CONFIG_SMP &&
CONFIG_TEST to also include CONFIG_ASSERT, so the check can be
used in __ASSERT() outside of test builds.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2026-04-07 08:40:28 -05:00
Nicolas Pitre
9cef0da05c kernel: avoid recursive scheduler lock in k_heap_free path
When halt_thread() calls k_thread_perms_all_clear() under
_sched_spinlock, the permission cleanup can trigger k_free() on
dynamic objects. k_heap_free() then calls z_unpend_all() which
attempts to take _sched_spinlock again, causing a recursive lock.

Fix this by introducing k_heap_free_sched_locked() and
k_free_sched_locked() variants that use z_unpend_all_locked()
to operate on the wait queue without re-acquiring the scheduler
lock. The existing z_unpend_all() becomes a wrapper that takes
the lock and delegates to z_unpend_all_locked().
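
The wrapper/locked-variant split can be sketched with a toy non-recursive lock. All sim_* names and the counter-based wait queue are hypothetical; the sketch only illustrates the calling convention, not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical non-recursive lock: taking it twice is a bug. */
static bool sim_lock_held;

static void sim_lock(void)
{
    assert(!sim_lock_held); /* a recursive acquire would deadlock */
    sim_lock_held = true;
}

static void sim_unlock(void)
{
    assert(sim_lock_held);
    sim_lock_held = false;
}

/* Hypothetical wait queue: just a count of pended waiters. */
struct sim_waitq {
    size_t waiters;
};

/* "_locked" variant: the caller must already hold the lock. */
static size_t sim_unpend_all_locked(struct sim_waitq *wq)
{
    size_t woken = wq->waiters;

    assert(sim_lock_held);
    wq->waiters = 0;
    return woken;
}

/* Plain variant: a thin wrapper that takes the lock and delegates. */
static size_t sim_unpend_all(struct sim_waitq *wq)
{
    size_t woken;

    sim_lock();
    woken = sim_unpend_all_locked(wq);
    sim_unlock();
    return woken;
}
```

A caller that already holds the lock, like the abort path described above, calls the _locked variant directly; everyone else keeps using the wrapper.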

unref_check() gains a sched_locked parameter: the abort path
(clear_perms_cb) passes true to use the locked free variant,
while k_thread_perms_clear() passes false for the normal path.

Fixes #106659

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2026-04-07 08:40:28 -05:00
Andrew Bresticker
1666066082 kernel/sched: fix race in consuming self-directed IPIs
Move signal_pending_ipi() inside the K_SPINLOCK block in
z_get_next_switch_handle(). Calling it after the lock release creates a
window where a CPU can consume its own pending IPI bit via atomic_clear
in signal_pending_ipi(), then silently drop it in
arch_sched_directed_ipi() which skips the calling CPU (i == id).

In configurations where secondary CPUs have a single pinned thread and
take no timer or external interrupts, this can lead to a permanent hang:
the idle CPU can only be woken by IPIs, but no IPIs are pending and no
timeslicing IPIs will be generated since the idle thread is not sliceable.
This was reproduced when running under QEMU with the following sequence
of events observed:

  CPU 0                                  CPU 1
  ─────                                  ─────

                                         Thread calls k_poll(K_MSEC(1))
                                           z_pend_curr():
                                             mark thread PENDING
                                             z_add_timeout(1ms)
                                             do_swap() to idle thread
                                         WFI

  Timer tick fires
  sys_clock_announce():
    slice_timeout(cpu1):
      flag_ipi(BIT(1))
    signal_pending_ipi():
      MSIP[cpu1] = 1

                                         CPU1 wakes from WFI
                                         z_get_next_switch_handle():
                                           acquire _sched_spinlock
                                           next_up() → idle
                                             (thread still PENDING,
                                              timeout hasn't fired yet)
                                           release _sched_spinlock

  Timer tick fires
  sys_clock_announce():
    z_thread_timeout(thread):
      z_unpend_thread(thread)
      z_ready_thread(thread):
        flag_ipi(BIT(1))

                                         signal_pending_ipi():
                                           atomic_clear(pending_ipi)
                                             returns BIT(1)
                                           arch_sched_directed_ipi(BIT(1))
                                             skips self, IPI silently lost
                                         return to idle thread
                                           WFI
                                             thread still on ready queue

Such an interleaving of events is, of course, likely only reproducible in
practice in virtualized environments where (v)CPUs can be descheduled.

With signal_pending_ipi() inside the lock, next_up() and the IPI
dispatch are atomic. Either the concurrent flag_ipi lands before the
lock is acquired (and next_up sees the thread), or it lands after the
lock is released (and the caller dispatches the IPI). There is no
window where a CPU can consume its own bit for a thread it hasn't seen.
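
The lost-IPI mechanism rests on two behaviors named above: the pending bits are consumed atomically, and the directed dispatch skips the calling CPU. A small model of just those two steps, with hypothetical sim_* names and C11 atomics standing in for the kernel's atomic API:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SIM_NUM_CPUS 2U

static atomic_uint sim_pending_ipi;                  /* one bit per target CPU */
static unsigned int sim_ipi_delivered[SIM_NUM_CPUS]; /* IPIs actually sent */

/* Record that the CPUs in the mask need an IPI. */
static void sim_flag_ipi(uint32_t mask)
{
    atomic_fetch_or(&sim_pending_ipi, mask);
}

/* Deliver IPIs to every CPU in the mask except the caller. */
static void sim_directed_ipi(unsigned int self, uint32_t mask)
{
    for (unsigned int i = 0; i < SIM_NUM_CPUS; i++) {
        if ((mask & (1U << i)) != 0U && i != self) {
            sim_ipi_delivered[i]++;
        }
    }
}

/* Consume all pending bits and dispatch them; returns the consumed mask. */
static uint32_t sim_signal_pending_ipi(unsigned int self)
{
    uint32_t mask = atomic_exchange(&sim_pending_ipi, 0U);

    assert(self < SIM_NUM_CPUS);
    sim_directed_ipi(self, mask);
    return mask;
}
```

If CPU 1 calls sim_signal_pending_ipi(1) while its own bit is set, the bit is cleared and nothing is delivered, which is the silent drop in the trace above. The model does not include the lock; it only shows why the consume step must not be separated from the next-thread decision.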

Similar races exist in reschedule() and z_reschedule_irqlock() as well.
Although they won't cause the same permanent hang described above, they
can result in unnecessary rescheduling latency. Fix reschedule(), and
add a TODO to z_reschedule_irqlock(); it does not currently take
the sched spinlock.

Signed-off-by: Andrew Bresticker <abrestic@meta.com>
2026-04-04 10:57:11 -05:00
Tharaka Jayasena
f6141e5ccf doc: kernel: fix incorrect Doxygen @retval/@return usage
Fix several incorrect uses of the Doxygen @retval and @return commands in
kernel sources.

- Convert @return to structured @retval where functions return
  discrete values.
- Replace incorrect @retval usage with @return for non-discrete
  return types.

Signed-off-by: Tharaka Jayasena <9dmpires2k17.tuj@gmail.com>
2026-03-17 18:24:33 -04:00
Cheng-Yang Chou
d67038c7e7 kernel: fix z_tick_sleep unsigned comparison when !CONFIG_TIMEOUT_64BIT
When CONFIG_TIMEOUT_64BIT is not set, k_ticks_t is uint32_t. The previous
code cast left_ticks through int32_t but then stored the result back in
k_ticks_t (uint32_t), losing the sign. The subsequent ticks > 0 check was
therefore an unsigned comparison, causing a past-due wakeup (where the
subtraction wraps to a large uint32_t) to be misread as a large positive
remainder and propagated up through k_sleep() as INT_MAX ms.

Fix by retaining the signed intermediate and comparing it directly as
int32_t so negative remainders (past-due) correctly fall through to
return 0.
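
The difference between the two comparisons can be reproduced in isolation. The function names and the demo values below are hypothetical; the sketch assumes a 32-bit tick type and the usual two's-complement behavior when converting to int32_t (that conversion is implementation-defined in C).

```c
#include <assert.h>
#include <stdint.h>

/* Old shape: the signed value is stored back into an unsigned
 * variable, so "ticks > 0" is true for every non-zero bit pattern.
 */
static uint32_t sim_remaining_buggy(uint32_t expected_wakeup, uint32_t now)
{
    uint32_t ticks = (uint32_t)(int32_t)(expected_wakeup - now);

    return (ticks > 0U) ? ticks : 0U;
}

/* Fixed shape: keep the signed intermediate and compare it as int32_t,
 * so a past-due wakeup (negative remainder) falls through to 0.
 */
static uint32_t sim_remaining_fixed(uint32_t expected_wakeup, uint32_t now)
{
    int32_t left = (int32_t)(expected_wakeup - now);

    return (left > 0) ? (uint32_t)left : 0U;
}

/* Usage: a wakeup expected at tick 100, observed at tick 105. */
static void sim_remaining_demo(void)
{
    assert(sim_remaining_buggy(100U, 105U) == 0xFFFFFFFBU); /* wrapped */
    assert(sim_remaining_fixed(100U, 105U) == 0U);          /* past due */
    assert(sim_remaining_fixed(105U, 100U) == 5U);          /* 5 ticks left */
}
```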

Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com>
2026-03-17 18:18:28 -04:00
Andy Ross
a04a30c6d1 kernel/sched: Move reschedule under lock in k_sched_unlock()
This function was a little clumsy, taking the scheduler lock,
releasing it, and then calling z_reschedule_unlocked() instead of the
normal locked variant of reschedule.  Don't take the lock twice.

Mostly this is a code size and hygiene win.  Obviously the sched lock
is not normally a performance path, but I happened to have picked this
API for my own microbenchmark in tests/benchmarks/swap and so noticed
the double-lock while staring at disassembly.

Signed-off-by: Andy Ross <andyross@google.com>
2026-03-10 17:24:10 +01:00
Andy Ross
2bbcece6ee kernel/sched: Refactor reschedule to permit better code generation
z_reschedule() is the basic kernel entry point for context switch,
wrapping z_swap(), and thence arch_switch().  It's currently defined
as a first class function for entry from other files in the kernel and
elsewhere (e.g. IPC library code).

But in practice it's actually a very thin wrapper without a lot of
logic of its own, and the context switch layers of some of the more
obnoxiously clever architectures are designed to interoperate with the
compiler's own spill/fill logic to avoid double saving.  And with a
small z_reschedule() there's not a lot to work with.

Make reschedule() an inlinable static, so the compiler has more
options.

Signed-off-by: Andy Ross <andyross@google.com>
2026-03-10 17:24:10 +01:00
Andy Ross
d535d17cbc kernel: Minor optimizations to z_swap()
Pick some low hanging fruit on non-SMP code paths:

+ The scheduler spinlock is always taken, but as we're already in an
  irqlocked state that's a noop.  But the optimizer can't tell, because
  arch_irq_lock() involves an asm block it can't see inside.  Elide
  the call when possible.

+ The z_swap_next_thread() function evaluates to just a single load of
  _kernel.ready_q.cache when !SMP, but wasn't being inlined because of
  function location.  Move that test up into do_swap() so it's always
  done correctly.

Signed-off-by: Andy Ross <andyross@google.com>
2026-03-10 17:24:10 +01:00
Mathieu Choplain
5a0f73f045 kernel: events: wake threads atomically using waitq post-walk callback
When CONFIG_WAITQ_SCALABLE=y, wake up all threads from a post-waitq-walk
callback which is invoked while the scheduler spinlock is still held. This
solves the race condition that was worked around via `no_wake_in_timeout`
flag in k_thread and `is_timeout` parameter of z_sched_wake_thread_locked()
which can now both be dropped.

Signed-off-by: Mathieu Choplain <mathieu.choplain-ext@st.com>
2026-02-18 14:43:10 +00:00
Mathieu Choplain
73feef8b20 kernel: sched: add post-walk callback argument to z_sched_waitq_walk()
Modify z_sched_waitq_walk() to accept an optional callback invoked after
the walk while still holding the scheduler spinlock. This can be used to
perform post-walk operations "atomically". Update all callers to work with
this new function signature.
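
One way to picture a walk with an optional post-walk callback is the sketch below. The signature, the typedef names, and the array-based queue are all assumptions made for illustration; they are not the signature of z_sched_waitq_walk().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback types; a non-zero walk result stops the walk. */
typedef int (*sim_walk_cb_t)(int *item, void *data);
typedef void (*sim_post_walk_cb_t)(int status, void *data);

static bool sim_walk_lock_held; /* stands in for the scheduler spinlock */

static int sim_waitq_walk(int *items, size_t count, sim_walk_cb_t walk_cb,
                          sim_post_walk_cb_t post_cb, void *data)
{
    int status = 0;

    assert(walk_cb != NULL);
    sim_walk_lock_held = true;
    for (size_t i = 0; i < count && status == 0; i++) {
        status = walk_cb(&items[i], data);
    }
    if (post_cb != NULL) {
        post_cb(status, data); /* runs while the lock is still held */
    }
    sim_walk_lock_held = false;
    return status;
}

/* Example callbacks: sum the items, then record the lock state. */
struct sim_walk_ctx {
    int sum;
    bool post_ran_locked;
};

static int sim_sum_cb(int *item, void *data)
{
    struct sim_walk_ctx *ctx = data;

    ctx->sum += *item;
    return 0;
}

static void sim_post_cb(int status, void *data)
{
    struct sim_walk_ctx *ctx = data;

    ctx->post_ran_locked = sim_walk_lock_held && (status == 0);
}
```

Because the post-walk callback runs before the lock is dropped, anything it does is observed together with the walk's own effects.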

While at it, create dedicated (private) typedefs for the callbacks and
clean up/improve the routine and callbacks' documentation.

Signed-off-by: Mathieu Choplain <mathieu.choplain-ext@st.com>
2026-02-18 14:43:10 +00:00
Mathieu Choplain
e725225489 kernel: sched: perform safe waitq walk inside z_sched_waitq_walk()
z_sched_waitq_walk() used _WAIT_Q_FOR_EACH, a wrapper around the
"unsafe" SYS_DLIST_FOR_EACH_CONTAINER which does not allow detaching
elements from the list during the walk. As a result, attempting to
detach threads from the wait queue as part of the callback provided
to z_sched_waitq_walk() would result in breakage.

Introduce new _WAIT_Q_FOR_EACH_SAFE macro as wrapper around the "safe"
SYS_DLIST_FOR_EACH_CONTAINER_SAFE which allows detaching nodes from
the list during the walk, and use it inside z_sched_waitq_walk().
While at it:
- add documentation on the _WAIT_Q_FOR_EACH macro, including a warning
  about detaching elements as part of the loop not being allowed
- add note to documentation of z_sched_waitq_walk() indicating that
  the callback can safely remove the thread from wait queue as this
  will no longer break the FOR_EACH loop
- add _WAIT_Q_FOR_EACH_SAFE to the list of ForEachMacros in .clang-format

NOTE: this new "safe removal inside callback" behavior is only available
when CONFIG_WAITQ_SCALABLE=n. When the option is 'y', red-black trees are
used instead of doubly-linked lists which prevent mutation of the list
while it is being walked. This limitation is explicitly documented.
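
The difference between the "unsafe" and "safe" walks comes down to when the successor pointer is read. The sketch below uses a hand-rolled circular doubly-linked list with hypothetical sim_* names rather than the sys_dlist API, and covers only the list-backed case.

```c
#include <assert.h>
#include <stddef.h>

struct sim_node {
    struct sim_node *next;
    struct sim_node *prev;
    int value;
};

/* An empty circular list is a head that points at itself. */
static void sim_list_init(struct sim_node *head)
{
    head->next = head;
    head->prev = head;
}

static void sim_list_append(struct sim_node *head, struct sim_node *n)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

static void sim_list_remove(struct sim_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = NULL; /* an "unsafe" walk would now follow a NULL next */
    n->prev = NULL;
}

/* "Safe" walk: the successor is saved before the body runs, so the
 * body may unlink the current node.
 */
#define SIM_LIST_FOR_EACH_SAFE(head, cur, nxt)                       \
    for ((cur) = (head)->next, (nxt) = (cur)->next; (cur) != (head); \
         (cur) = (nxt), (nxt) = (cur)->next)

/* Remove every even-valued node during the walk; return the count. */
static int sim_remove_even(struct sim_node *head)
{
    struct sim_node *cur;
    struct sim_node *nxt;
    int removed = 0;

    assert(head != NULL);
    SIM_LIST_FOR_EACH_SAFE(head, cur, nxt) {
        if ((cur->value % 2) == 0) {
            sim_list_remove(cur);
            removed++;
        }
    }
    return removed;
}
```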

Signed-off-by: Mathieu Choplain <mathieu.choplain-ext@st.com>
2026-02-18 14:43:10 +00:00
Mathieu Choplain
eac6c7cb24 kernel: sched: don't acquire scheduler spinlock in z_sched_wake_thread()
Don't acquire the _sched_spinlock in z_sched_wake_thread(). This allows
calling the function from callbacks which already own the spinlock. The
function is renamed to z_sched_wake_thread_locked() to reflect this new
behavior, and all existing callers are updated to ensure they hold the
_sched_spinlock as is now required.

Signed-off-by: Mathieu Choplain <mathieu.choplain-ext@st.com>
2026-02-18 14:43:10 +00:00
Yong Cong Sin
e843931194 kernel: check if interrupt is disabled in k_can_yield
`k_yield()` can't be called when interrupt is disabled, update
`k_can_yield()` to reflect that.

Signed-off-by: Yong Cong Sin <ycsin@meta.com>
Signed-off-by: Yong Cong Sin <yongcong.sin@gmail.com>
2026-02-16 00:13:38 +00:00
Benjamin Cabé
a87520bd04 kernel: use proper essential type to initialize boolean variables
As per Zephyr coding guideline #59, "operands shall not be of an
inappropriate essential type". This makes sure boolean variables are
initialized with true/false, not 1/0.

Signed-off-by: Benjamin Cabé <benjamin@zephyrproject.org>
2026-02-04 13:52:38 +01:00
Peter Mitsis
669a8d0704 kernel: O(1) search for threads among CPUs
Instead of performing a linear search to determine if a given
thread is running on another CPU, or if it is marked as being
preempted by a metaIRQ on any CPU, do this in O(1) time.

On SMP systems, Zephyr already tracks the CPU on which a thread
executes (or last executed). This information is leveraged to
do the search in O(1) time.
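
The idea can be sketched by comparing a linear scan with a direct lookup through the recorded CPU index. The structures and names below are hypothetical, and the sketch assumes the invariant that a thread's cpu field is updated whenever it becomes current on a CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_NUM_CPUS 4U

struct sim_thread {
    unsigned int cpu; /* CPU the thread runs on, or last ran on */
};

struct sim_cpu {
    struct sim_thread *current; /* thread currently running here */
};

static struct sim_cpu sim_cpus[SIM_NUM_CPUS];

/* Linear form: inspect every CPU. */
static bool sim_is_running_linear(const struct sim_thread *t)
{
    for (unsigned int i = 0; i < SIM_NUM_CPUS; i++) {
        if (sim_cpus[i].current == t) {
            return true;
        }
    }
    return false;
}

/* O(1) form: only the recorded CPU can be running the thread. */
static bool sim_is_running_o1(const struct sim_thread *t)
{
    assert(t->cpu < SIM_NUM_CPUS);
    return sim_cpus[t->cpu].current == t;
}
```

Both forms agree as long as the invariant holds; the second one does a single array access regardless of the CPU count.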

Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
2026-01-08 17:34:14 -06:00
Thinh Le Cong
15cdf90bee include: zephyr: toolchain: suppress Go004 warning for inline functions
The IAR compiler may emit Error[Go004]: Could not inline function
when handling functions marked as always_inline or inline=forced,
especially in complex kernel code.

Signed-off-by: Thinh Le Cong <thinh.le.xr@bp.renesas.com>
2026-01-08 12:00:29 +00:00
Peter Mitsis
3a8c9797ca kernel: Re-instate metaIRQ z_is_thread_ready() check
Re-instate a z_is_thread_ready() check on the preempted metaIRQ
thread before selecting it as the preferred next thread to
schedule. This code exists because of a corner case where the
thread that was recorded as being preempted by a meta-IRQ thread
can be marked as not 'ready to run' when the meta-IRQ thread(s)
complete.

Such a scenario may occur if an interrupt ...
  1. suspends the interrupted thread, then
  2. readies a meta-IRQ thread, then
  3. exits
The resulting reschedule can result in the suspended interrupted
thread being recorded as being interrupted by a meta-IRQ thread.
There may be other scenarios too.

Fixes #101296

Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
2025-12-22 22:33:18 +01:00
Peter Mitsis
5dd36854fd kernel: Update clearing metairq_preempted record
If the thread being aborted or suspended was preempted by a metaIRQ
thread then clear the metairq_preempted record. In the case of
aborting a thread, this prevents a re-used thread from being
mistaken for a preempted thread. Furthermore, it removes the need
to test the recorded thread for readiness in next_up().

Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
2025-12-10 10:32:50 +00:00
Peter Mitsis
36d195b717 kernel: MetaIRQ on SMP fix
When a cooperative thread (temporary or otherwise) is preempted by a
metaIRQ thread on SMP, it is no longer re-inserted into the readyQ.
This prevents it from being scheduled by another CPU while the
preempting metaIRQ thread runs.

Fixes #95081

Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
2025-12-10 10:32:50 +00:00
Peter Mitsis
653731d2e1 kernel: Adjust metairq preemption tracking bounds
Adjust the bounds for tracking metairq preemption to include the
case where the number of metairq threads matches the number of
cooperative threads. This is needed as a thread that is schedule
locked through k_sched_lock() is documented to be treated as a
cooperative thread. This implies that if such a thread is preempted
by a metairq thread that execution control must return to that
thread after the metairq thread finishes its work.

Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
2025-12-10 10:32:50 +00:00
Daniel Leung
169304813a cache: move arch_mem_coherent() into cache subsys
arch_mem_coherent() is cache related so it is better to move it
under cache subsys. It is renamed to sys_cache_is_mem_coherent()
to reflect this change.

The only user of arch_mem_coherent() is Xtensa. However, it is
not an architecture feature. That's why it is moved to the cache
subsys.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2025-12-09 09:25:33 +01:00
Peter Mitsis
ce6c26a927 kernel: Simplify move_current_to_end_of_prio_q()
It is now more obvious that the move_current_to_end_of_prio_q() logic
is supposed to match that of k_yield() (without the schedule point).

Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
2025-11-25 17:37:52 +00:00
Peter Mitsis
77ad7111e1 kernel: Rename move_thread_to_end_of_prio_q()
All instances of the internal routine move_thread_to_end_of_prio_q()
use the current thread. Renaming it to move_current_to_end_of_prio_q()
to reflect that.

Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
2025-11-25 17:37:52 +00:00
Peter Mitsis
ffc6c8839b kernel: Rename z_move_thread_to_end_of_prio_q()
The routine z_move_thread_to_end_of_prio_q() has been renamed to
z_yield_testing_only() as it was only used by test code and always
operated on the current thread.

Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
2025-11-25 17:37:52 +00:00
Nicolas Pitre
af7ae5d61f kernel: sched: plug assertion race in z_get_next_switch_handle()
Commit d4d51dc062 ("kernel:  Replace redundant switch_handle assignment
with assertion") introduced an assertion check that may be triggered
as follows by tests/kernel/smp_abort:

CPU0              CPU1              CPU2
----              ----              ----
* [thread A]      * [thread B]      * [thread C]
* irq_offload()   * irq_offload()   * irq_offload()
* k_thread_abort(thread B)
                  * k_thread_abort(thread C)
                                    * k_thread_abort(thread A)
* thread_halt_spin()
* z_is_thread_halting(_current) is false
* while (z_is_thread_halting(thread B));
                  * thread_halt_spin()
                  * z_is_thread_halting(_current) is true
                  * halt_thread(_current...);
                  * z_dummy_thread_init()
                    - dummy_thread->switch_handle = NULL;
                    - _current = dummy_thread;
                  * while (z_is_thread_halting(thread C));
* z_get_next_switch_handle()
* z_arm64_context_switch()
* [thread A is dead]
                                    * thread_halt_spin()
                                    * z_is_thread_halting(_current) is true
                                    * halt_thread(_current...);
                                    * z_dummy_thread_init()
                                      - dummy_thread->switch_handle = NULL;
                                      - _current = dummy_thread;
                                    * while(z_is_thread_halting(thread A));
                  * z_get_next_switch_handle()
                    - old_thread == dummy_thread
                    - __ASSERT(old_thread->switch_handle == NULL) OK
                  * z_arm64_context_switch()
                    - str x1, [x1, #___thread_t_switch_handle_OFFSET]
                  * [thread B is dead]
                  * %%% dummy_thread->switch_handle no longer NULL %%%
                                    * z_get_next_switch_handle()
                                      - old_thread == dummy_thread
                                      - __ASSERT(old_thread->
                                             switch_handle == NULL) FAIL

This needs at least 3 CPUs and the perfect timing for the race to work as
sometimes CPUs 1 and 2 may be close enough in their execution paths for
the assertion to pass. For example, QEMU is OK while FVP is not.
Also adding sufficient debug traces can make the issue go away.

This happens because the dummy thread is shared among concurrent CPUs.
It could be argued that a per-CPU dummy thread structure would be the
proper solution to this problem. However the purpose of a dummy thread
structure is to provide a dumping ground for the scheduler code to work
while the original thread structure might already be reused and
therefore can't be clobbered as demonstrated above. But the dummy
structure _can_ be clobbered to some extent and it is not worth the
additional memory footprint implied by per-CPU instances. We just have
to ignore some validity tests when the dummy thread is concerned.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2025-11-04 13:45:24 -05:00
Nicolas Pitre
1c8f1c8647 kernel: sched: use clearly invalid value for halting thread switch_handle
When a thread halts and dummifies, set its switch_handle to (void *)1
instead of the thread pointer itself. This maintains the non-NULL value
required to prevent deadlock in k_thread_join() while making it obvious
that this value is not meant to be dereferenced or used.

The switch_handle should be an opaque architecture-specific value and
not be assumed to be a thread pointer in generic code. Using 1 makes
the intent clearer.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2025-11-04 13:45:24 -05:00
TaiJu Wu
d4d51dc062 kernel: Replace redundant switch_handle assignment with assertion
The switch_handle for the outgoing thread is expected to be NULL
at the start of a context switch.
The previous code performed a redundant assignment to NULL.

This change replaces the assignment with an __ASSERT(). This makes the
code more robust by explicitly enforcing this precondition, helping to
catch potential scheduler bugs earlier.

Also, the switch_handle pointer is used to check a thread's state during a
context switch. For dummy threads, this pointer was left uninitialized,
potentially holding an unexpected value.

Set the handle to NULL during initialization to ensure these threads are
handled safely and predictably.

Signed-off-by: TaiJu Wu <tjwu1217@gmail.com>
2025-10-25 15:59:29 +03:00
TaiJu Wu
91f1acbb85 kernel: Add more debug info and thread checking in run queue
1. There is debug info within k_sched_unlock, so we should add the
   same debug info to k_sched_lock.

2. A thread in the run queue should be a normal or metairq thread; we
   should check that it is not a dummy thread.

Signed-off-by: TaiJu Wu <tjwu1217@gmail.com>
2025-10-24 13:26:15 -04:00
Fabio Baltieri
700a1a5a28 lib, kernel: use single evaluation min/max/clamp
Replace all in-function instances of MIN/MAX/CLAMP with the single
evaluation version min/max/clamp.

There are probably no race conditions in these files, but the single
evaluation versions save a couple of instructions each, so they should
save a few code bytes and potentially perform better, and should be
preferred in general.
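
The double-evaluation cost of a classic MIN-style macro is easy to show with an argument that has a side effect. The names below are hypothetical, and the single-evaluation form is written as a plain inline function for portability; it is not meant to reproduce how the project's min() is defined.

```c
#include <assert.h>

/* Classic macro: the winning argument is evaluated twice. */
#define SIM_MIN(a, b) (((a) < (b)) ? (a) : (b))

/* Single-evaluation alternative: arguments are evaluated once, at the call. */
static inline int sim_min(int a, int b)
{
    return (a < b) ? a : b;
}

static int sim_calls; /* counts evaluations of sim_next_value() */

static int sim_next_value(void)
{
    sim_calls++;
    return 3;
}

/* Usage: count how often the argument expression runs in each form. */
static void sim_min_demo(void)
{
    int r;

    sim_calls = 0;
    r = SIM_MIN(sim_next_value(), 10);
    assert(r == 3);
    assert(sim_calls == 2); /* once in the comparison, once in the result */

    sim_calls = 0;
    r = sim_min(sim_next_value(), 10);
    assert(r == 3);
    assert(sim_calls == 1);
}
```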

Signed-off-by: Fabio Baltieri <fabiobaltieri@google.com>
2025-10-24 01:10:40 +03:00
TaiJu Wu
623d8fa540 kernel: cleanup thread state checks and unnecessary CONFIG check
This commit replaces negative thread state checks with a new,
more descriptive positive check.
The expression `!z_is_thread_prevented_from_running()`
is updated to `z_is_thread_ready()` where appropriate, making
the code's intent clearer.

It also removes a redundant `IS_ENABLED(CONFIG_SMP)` check that is
already enclosed in an #ifdef.

Finally, this patch adds the missing `#endif` directive.

Signed-off-by: TaiJu Wu <tjwu1217@gmail.com>
2025-09-24 09:43:30 +02:00
TaiJu Wu
e069ce242c kernel: Consolidate thread state checking functions
This patch moves `is_aborting()` and `is_halting()`
from `kernel/sched.c` to `kernel/include/kthread.h`
and renames them to `z_is_thread_aborting()` and `z_is_thread_halting()`,
for consistency with other internal kernel APIs.

It replaces the previous inline function definitions in `sched.c`
with calls to the new header functions. Additionally, direct bitwise
checks like `(thread->base.thread_state & _THREAD_DEAD) != 0U`
are updated to use the new `z_is_thread_dead()` helper function.
This enhances code readability and maintainability.

Signed-off-by: TaiJu Wu <tjwu1217@gmail.com>
2025-09-24 09:43:30 +02:00
Marcin Szkudlinski
91d17f6931 kernel: add k_thread_absolute_deadline_set call
k_thread_absolute_deadline_set is similar to the existing
k_thread_deadline_set. The difference is that k_thread_deadline_set
takes a deadline as a time delta from the current time, while
k_thread_absolute_deadline_set expects a timestamp
in the same units used by k_cycle_get_32().

This makes it possible to calculate deadlines for several threads and
set them in a deterministic way, using a common timestamp as
a "now" time base.
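
The benefit of a common time base can be seen in a toy model where the clock advances on every read. Everything here is hypothetical (the simulated clock, the struct, and both setter functions); the sketch does not reproduce the signatures of the real API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t sim_cycle_now; /* simulated cycle counter */

/* Each read costs 10 cycles, so two reads never return the same value. */
static uint32_t sim_cycle_get(void)
{
    sim_cycle_now += 10U;
    return sim_cycle_now;
}

struct sim_thread {
    uint32_t deadline;
};

/* Relative form: samples the clock internally on every call. */
static void sim_deadline_set(struct sim_thread *t, uint32_t delta)
{
    assert(t != NULL);
    t->deadline = sim_cycle_get() + delta;
}

/* Absolute form: the caller supplies the timestamp. */
static void sim_absolute_deadline_set(struct sim_thread *t, uint32_t deadline)
{
    assert(t != NULL);
    t->deadline = deadline;
}
```

With the relative form, two threads given the same delta end up with deadlines that differ by the time between the calls. With the absolute form, the caller reads the clock once and derives every deadline from that single value.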

Signed-off-by: Marcin Szkudlinski <marcin.szkudlinski@intel.com>
2025-09-11 14:18:16 +01:00