3,401 commits

Author SHA1 Message Date
Anas Nashif
4b525273bf kernel: sched: simplify thread_runq()
Simplify code and make it more readable.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-24 15:39:20 -04:00
Anas Nashif
fc11a9166e kernel: sched: extract meta-IRQ handling to metairq.h
Move meta-IRQ (highest-priority cooperative queue) scheduling
functions from sched.c into a new kernel/include/metairq.h header
to reduce sched.c size and group related logic.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-24 15:39:20 -04:00
Anas Nashif
2ea5924943 kernel: k_yield: move code to thread.c
Move k_yield() from sched.c to thread.c alongside other thread
lifecycle calls.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-24 15:39:20 -04:00
Anas Nashif
222eba03d7 kernel: sleep: move sleep code into own file
Reduce complexity of sched.c by encapsulating sleep handling code
(k_sleep, k_usleep, k_msleep) into its own file.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-24 15:39:20 -04:00
Anas Nashif
908169d9df kernel: deadline: move deadline handling to own file
Move deadline scheduling to deadline.c, reducing complexity and
clutter in sched.c.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-24 15:39:20 -04:00
Anas Nashif
0659dc18b3 kernel: sched: move thread lifecycle calls to thread.c
Relocate k_thread_start(), k_thread_abort(), k_thread_suspend(), and
k_thread_resume() from sched.c to thread.c alongside related thread
lifecycle code.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-24 15:39:20 -04:00
Anas Nashif
03263c5213 kernel: heap: add BLOCKING trace and fix EXIT ordering in k_heap_realloc
SYS_PORT_TRACING_OBJ_FUNC_EXIT fired while the spinlock was still
held.  The standard pattern across all other heap/kernel-object
functions is to release the lock first, then emit the EXIT trace.
Swap the two lines so the unlock precedes the tracing call.

Assisted-by: GitHub Copilot:claude-opus-4.6
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-21 18:39:55 -04:00
Anas Nashif
2f58de0626 kernel: stack: move BLOCKING trace to after K_NO_WAIT check in k_stack_pop
In z_impl_k_stack_pop, SYS_PORT_TRACING_OBJ_FUNC_BLOCKING was emitted
before the K_NO_WAIT timeout check.  When the stack is empty and
timeout == K_NO_WAIT, the function emitted the BLOCKING trace and then
immediately returned -EBUSY without ever blocking.  Tracing consumers
that expect a thread block to follow each BLOCKING event would observe
a spurious BLOCKING with no corresponding suspend.

Move the BLOCKING trace to after the K_NO_WAIT early-return so it is
only emitted when the thread is actually about to pend.
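
A minimal sketch of the reordered flow, written as a self-contained model rather than the actual kernel code. The names demo_stack, demo_pop() and blocking_events are hypothetical; the counter stands in for SYS_PORT_TRACING_OBJ_FUNC_BLOCKING, and the -EAGAIN return models a wait that timed out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a stack object; not the kernel's struct k_stack. */
struct demo_stack {
    int items[4];
    int count;
};

/* Counts emitted BLOCKING events (stand-in for the tracing macro). */
static int blocking_events;

static int demo_pop(struct demo_stack *s, int *out, bool no_wait)
{
    if (s->count > 0) {
        *out = s->items[--s->count];
        return 0;
    }

    if (no_wait) {
        /* Early return: the caller never blocks, so no BLOCKING event. */
        return -EBUSY;
    }

    /* Only now is the caller about to pend, so emit BLOCKING here. */
    blocking_events++;

    /* The model cannot pend; pretend the wait timed out. */
    return -EAGAIN;
}
```

With an empty stack, a no-wait pop leaves blocking_events untouched, while a waiting pop increments it exactly once.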

Assisted-by: GitHub Copilot:claude-opus-4.6
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-21 18:39:55 -04:00
Anas Nashif
8dbe977090 kernel: mailbox: emit tracing EXIT on async-send matched-receiver path
In mbox_message_put, when an async (dummy thread) sender matches a
waiting receiver, the function calls z_reschedule() and returns 0
without emitting SYS_PORT_TRACING_OBJ_FUNC_EXIT.  Every other return
path from mbox_message_put emits the EXIT trace before returning.
This missing trace leaves a dangling ENTER event for tracing consumers
(e.g. Percepio TraceRecorder) that expect matched ENTER/EXIT pairs.

Add the missing SYS_PORT_TRACING_OBJ_FUNC_EXIT call after
z_reschedule() on the async early-return path, matching the result
value 0 used by all other successful paths.

Assisted-by: GitHub Copilot:claude-opus-4.6
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-21 18:39:55 -04:00
Anas Nashif
d01ea12e78 kernel: queue: remove spurious BLOCKING trace in queue_insert
queue_insert emitted SYS_PORT_TRACING_OBJ_FUNC_BLOCKING twice:

  1. When a pending thread is found and woken (correct).
  2. Unconditionally before sys_sflist_insert when no thread is
     pending and the item is placed directly on the list (incorrect).

The second emission is wrong: no blocking occurs in that path — the
caller's data is simply enqueued and the function returns.  Emitting
a BLOCKING event there misrepresents the operation to tracing
consumers and is likely a copy-paste error from the first branch.

Remove the second SYS_PORT_TRACING_OBJ_FUNC_BLOCKING call.

Assisted-by: GitHub Copilot:claude-opus-4.6
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-21 18:39:55 -04:00
Christoph Busold
5ca3c912b2 kernel: userspace: Add k_object_access_revoke_others
This can be used to revoke access from all but the current
thread, which is useful when reassigning an object without having
to worry about previous permissions.
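
As a rough illustration of the semantics (not the kernel implementation), assume an object's permissions are a bitmask with one bit per thread index. The function name and the 32-thread limit below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model: bit N of `perms` grants access to thread index N.
 * Assumes current_idx < 32.
 */
static uint32_t demo_revoke_others(uint32_t perms, unsigned int current_idx)
{
    /* Keep only the caller's bit; every other thread loses access. */
    return perms & (UINT32_C(1) << current_idx);
}
```

Note that in this model the call never grants anything: if the current thread had no access beforehand, the result is an empty mask.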

Signed-off-by: Christoph Busold <cbusold@qti.qualcomm.com>
2026-04-21 10:20:39 +01:00
Anas Nashif
243012c33c kernel: move thread_entry from lib/os to kernel
Not really library code; this is a core component that is part of the core
OS/kernel.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-14 22:31:16 -04:00
Anas Nashif
c60e0e9436 kernel: move userspace sem into kernel/sys
This is a kernel primitive for use with userspace, so move it under
kernel.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-14 22:31:16 -04:00
Anas Nashif
b572cb23fc kernel: userspace: move mutex/user_work to userspace
Move userspace code out of lib/os into userspace folder under kernel.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-14 22:31:16 -04:00
Anas Nashif
85ca9bb992 kernel: move smp code into smp/
Isolate SMP code into own folder.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-14 22:31:16 -04:00
Anas Nashif
974dbbf2c0 kernel: move userspace kconfigs into own file
Move userspace Kconfig under userspace/.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-14 22:31:16 -04:00
Anas Nashif
d8a1960c8b kernel: reorg mem domain kconfig
Reorganize memory domain Kconfig and move it under userspace/.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-14 22:31:16 -04:00
Anas Nashif
eb294b7a1e kernel: move userspace code to own folder
Isolate userspace code into userspace/.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-14 22:31:16 -04:00
Anas Nashif
07fa9eabfe kernel: fix name of scheduler/wait queue: Dumb -> Simple
Rename leftover in kernel headers: Dumb -> Simple.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-04-13 11:09:25 -05:00
Peter Mitsis
083629e520 kernel: timer: Fix k_timer re-use in its handler
This fixes a subtle race-condition in the k_timer expiration
handler z_timer_expiration_handler(). There was a small window
of opportunity between when sys_clock_announce() unlocked
interrupts and that handler re-locked them that one or more
higher priority interrupts (or threads running on another CPU
if in an SMP environment) could not only abort the ktimer's
timeout, but restart it as well. Both of these situations are
now detectable in the handler (resulting in an immediate return
from the handler).

To make this work, every case where the ktimer internals either
add or abort the timeout is now encapsulated by the ktimer lock.
Thus, when the handler tests if the timeout handler has been
canceled with only the ktimer lock being held, we know that no
other thread or ISR can be modifying the ktimer's timeout.

Fixes #106654

Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
2026-04-11 10:17:20 -04:00
Nicolas Pitre
32b1399669 kernel/timeout: introduce sys_clock_lock() and sys_clock_announce_locked()
On SMP systems with tickless kernels, a race condition exists between
timer driver ISRs and the kernel's tick accounting. The driver updates
its hardware cycle baseline under a private lock, then calls
sys_clock_announce() which updates curr_tick under the separate
timeout_lock. In the gap between these two lock releases, any kernel
code calling sys_clock_elapsed() sees the new driver baseline but the
old curr_tick, producing inconsistent time values that can go backwards.

This affects every code path using the internal elapsed() helper:
uptime queries, timeout scheduling, timeout cancellation, remaining
time queries, and next-expiry calculations.

The root cause is two separate locks protecting state that must be
mutually consistent. Fix this by exposing the kernel's timeout_lock
to timer drivers via sys_clock_lock()/sys_clock_unlock(), and
providing sys_clock_announce_locked() which assumes the lock is
already held.

Timer drivers can now acquire the single lock, update their hardware
state, and announce ticks all under the same lock — eliminating the
race window entirely. The key is passed to sys_clock_announce_locked()
which consumes it (releasing the lock when it returns).

The existing sys_clock_announce() becomes a backward-compatible wrapper,
allowing incremental driver migration with no flag day.

Document that sys_clock_set_timeout(), sys_clock_elapsed(), and
sys_clock_idle_exit() are called by the kernel with the timer lock
held. Update the timer driver guide in clocks.rst accordingly.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2026-04-07 11:40:49 -05:00
Nicolas Pitre
184b5a3804 kernel: assert scheduler lock is held in z_unpend_all_locked()
Add a runtime assertion in z_unpend_all_locked() to verify that
_sched_spinlock is actually held by the caller. This catches misuse
early given the function call depth involved.

Extend the availability of z_spin_is_locked() from CONFIG_SMP &&
CONFIG_TEST to also include CONFIG_ASSERT, so the check can be
used in __ASSERT() outside of test builds.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2026-04-07 08:40:28 -05:00
Nicolas Pitre
9cef0da05c kernel: avoid recursive scheduler lock in k_heap_free path
When halt_thread() calls k_thread_perms_all_clear() under
_sched_spinlock, the permission cleanup can trigger k_free() on
dynamic objects. k_heap_free() then calls z_unpend_all() which
attempts to take _sched_spinlock again, causing a recursive lock.

Fix this by introducing k_heap_free_sched_locked() and
k_free_sched_locked() variants that use z_unpend_all_locked()
to operate on the wait queue without re-acquiring the scheduler
lock. The existing z_unpend_all() becomes a wrapper that takes
the lock and delegates to z_unpend_all_locked().

unref_check() gains a sched_locked parameter: the abort path
(clear_perms_cb) passes true to use the locked free variant,
while k_thread_perms_clear() passes false for the normal path.
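
The wrapper/locked-variant split described above can be sketched with a small model. Everything here is hypothetical (the demo_ names, the boolean standing in for _sched_spinlock, the waiter counter); it only shows why the abort path must call the _locked variant.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical non-recursive lock: taking it twice trips the assert. */
static bool demo_sched_locked;
static int demo_waiters;

static void demo_sched_lock(void)
{
    assert(!demo_sched_locked);
    demo_sched_locked = true;
}

static void demo_sched_unlock(void)
{
    assert(demo_sched_locked);
    demo_sched_locked = false;
}

/* Caller must already hold the lock. */
static int demo_unpend_all_locked(void)
{
    int woken = demo_waiters;

    assert(demo_sched_locked);
    demo_waiters = 0;
    return woken;
}

/* Wrapper for callers that do not hold the lock. */
static int demo_unpend_all(void)
{
    int woken;

    demo_sched_lock();
    woken = demo_unpend_all_locked();
    demo_sched_unlock();
    return woken;
}

/* Models the abort path: the lock is already held around the free. */
static int demo_abort_path(void)
{
    int woken;

    demo_sched_lock();
    /* Calling demo_unpend_all() here would try to take the lock again. */
    woken = demo_unpend_all_locked();
    demo_sched_unlock();
    return woken;
}
```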

Fixes #106659

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2026-04-07 08:40:28 -05:00
Andrew Bresticker
1666066082 kernel/sched: fix race in consuming self-directed IPIs
Move signal_pending_ipi() inside the K_SPINLOCK block in
z_get_next_switch_handle(). Calling it after the lock release creates a
window where a CPU can consume its own pending IPI bit via atomic_clear
in signal_pending_ipi(), then silently drop it in
arch_sched_directed_ipi() which skips the calling CPU (i == id).

In configurations where secondary CPUs have a single pinned thread and
take no timer or external interrupts, this can lead to a permanent hang:
the idle CPU can only be woken by IPIs, but no IPIs are pending and no
timeslicing IPIs will be generated since the idle thread is not sliceable.
This was reproduced when running under QEMU with the following sequence
of events observed:

  CPU 0                                  CPU 1
  ─────                                  ─────

                                         Thread calls k_poll(K_MSEC(1))
                                           z_pend_curr():
                                             mark thread PENDING
                                             z_add_timeout(1ms)
                                             do_swap() to idle thread
                                         WFI

  Timer tick fires
  sys_clock_announce():
    slice_timeout(cpu1):
      flag_ipi(BIT(1))
    signal_pending_ipi():
      MSIP[cpu1] = 1

                                         CPU1 wakes from WFI
                                         z_get_next_switch_handle():
                                           acquire _sched_spinlock
                                           next_up() → idle
                                             (thread still PENDING,
                                              timeout hasn't fired yet)
                                           release _sched_spinlock

  Timer tick fires
  sys_clock_announce():
    z_thread_timeout(thread):
      z_unpend_thread(thread)
      z_ready_thread(thread):
        flag_ipi(BIT(1))

                                         signal_pending_ipi():
                                           atomic_clear(pending_ipi)
                                             returns BIT(1)
                                           arch_sched_directed_ipi(BIT(1))
                                             skips self, IPI silently lost
                                         return to idle thread
                                           WFI
                                             thread still on ready queue

Such an interleaving of events is, of course, likely only reproducible in
practice in virtualized environments where (v)CPUs can be descheduled.

With signal_pending_ipi() inside the lock, next_up() and the IPI
dispatch are atomic. Either the concurrent flag_ipi lands before the
lock is acquired (and next_up sees the thread), or it lands after the
lock is released (and the caller dispatches the IPI). There is no
window where a CPU can consume its own bit for a thread it hasn't seen.

Similar races exist in reschedule() and z_reschedule_irqlock() as well.
Although they won't cause the same permanent hang described above, they
can result in unnecessary rescheduling latency. Fix reschedule(), and
add a TODO to z_reschedule_irqlock(); it does not currently take
the sched spinlock.

Signed-off-by: Andrew Bresticker <abrestic@meta.com>
2026-04-04 10:57:11 -05:00
Fengming Ye
f75db68d03 kernel: workq: not yield when current workq is empty
The workq optionally yields after every work handler to avoid starving
other threads.
When the current workq is empty after a work handler returns, the
thread goes to sleep on the next loop iteration anyway, so yielding is
unnecessary and only adds the cost of one more scheduling pass.

Signed-off-by: Fengming Ye <frank.ye@nxp.com>
2026-04-03 23:15:04 +09:00
Peter Mitsis
df630e09ae kernel: Fix timeout handler for delayable work
Between the points in time when sys_clock_announce() calls the
timeout handler for delayable work and when that handler wins
the work queue spinlock, another thread or ISR could have called
k_work_reschedule_for_queue(). Should this occur, the timeout
that the handler is trying to process becomes stale and the
handler should not proceed any further with it.

As the workqueue spinlock is the controlling lock (it is always
held before either aborting or adding a timeout), it is safe
for the handler to call z_is_timeout_handler_canceled() once
it holds the workqueue spinlock.

Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
2026-04-03 23:13:23 +09:00
Peter Mitsis
f9376ddde5 kernel: Fix gap in workqueue work timeout
The workqueue work timeout feature is supposed to abort the work
queue thread if the time to execute a work item exceeds the work
queue's configured threshold. The work thread may race against the
timeout handler responsible for aborting the thread when the two
are running on separate CPUs--particularly since the timeout handler
only locks the workqueue spinlock for part of its duration.

To get around this, two separate flags must be checked: a 'finished'
flag indicating that the thread has finished processing the work
item, and the timeout's flag indicating whether it has been removed
while processing its timeout handler. Should either be found to be true
within the timeout handler, the thread is deemed to have completed
in time and the timeout handler proceeds no further.

Otherwise the timeout handler is deemed to have won the race and the
workqueue thread is aborted. Should the workqueue thread detect this,
it goes to sleep until it can be aborted to prevent it from handling
any more work items.

Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
2026-04-03 23:13:23 +09:00
Nicolas Pitre
5a3c601e71 kernel: track announcing state in timeout dticks field
The routine sys_clock_announce() removes the timeout from the timeout
list and unlocks the timeout spinlock before invoking the timeout's
handler. This creates a window where another ISR (or a thread running
on another CPU) can abort or reuse the timeout before the handler
executes. When this happens, the timeout handler should bail early.

Use the dticks field to carry this state: set it to
TIMEOUT_DTICKS_ANNOUNCING after remove_timeout() (which needs
dticks = 0 to propagate remaining ticks) and before calling the
handler. In z_abort_timeout(), set TIMEOUT_DTICKS_ABORTED when the
timeout is either linked (existing behavior) or in the announcing
state (new). The z_add_timeout() path naturally overwrites dticks
with a real tick value, so re-use is also detected.

Provide z_is_timeout_handler_canceled() for handlers to check if
they should bail. This avoids adding a flags field to struct _timeout,
keeping the struct size unchanged.
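
A minimal model of the state tracking described above. The sentinel values, the `linked` flag and all demo_ names are assumptions made for illustration; the real TIMEOUT_DTICKS_* definitions and list handling may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sentinel values; the real definitions may differ. */
#define DEMO_DTICKS_ANNOUNCING INT32_MIN
#define DEMO_DTICKS_ABORTED    (INT32_MIN + 1)

struct demo_timeout {
    int32_t dticks;
    bool linked; /* stands in for "is on the timeout list" */
};

static void demo_add_timeout(struct demo_timeout *t, int32_t ticks)
{
    /* A real tick value overwrites any sentinel, so re-use is visible. */
    t->dticks = ticks;
    t->linked = true;
}

/* Models the announce path just before the handler is invoked. */
static void demo_begin_announce(struct demo_timeout *t)
{
    t->linked = false;
    t->dticks = DEMO_DTICKS_ANNOUNCING;
}

static void demo_abort_timeout(struct demo_timeout *t)
{
    if (t->linked || t->dticks == DEMO_DTICKS_ANNOUNCING) {
        t->linked = false;
        t->dticks = DEMO_DTICKS_ABORTED;
    }
}

/* The handler should bail unless the announcing marker is still intact. */
static bool demo_handler_canceled(const struct demo_timeout *t)
{
    return t->dticks != DEMO_DTICKS_ANNOUNCING;
}
```

Both an abort and a re-add between demo_begin_announce() and the handler change dticks away from the announcing marker, so a single comparison covers both cases without an extra flags field.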

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2026-04-03 23:13:23 +09:00
Daniel Leung
23054a97f4 kernel: dynamic stack to cached area if coherence
With kernel coherence enabled, it is possible that the stack has
been allocated in an uncached area. This has implications for
performance, as memory access is not cached.

This adds a kconfig to force the indicated stack pointer of
the allocated thread stack object to be in a cached area.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2026-03-31 11:45:30 -04:00
Cheng-Yang Chou
10c974c9d5 kernel: futex: fix TOCTOU race in k_futex_wait
Move the futex value validation inside the spinlock critical section
in z_impl_k_futex_wait().

Previously, a time-of-check to time-of-use (TOCTOU) race condition
existed because the futex value was evaluated before acquiring
futex_data->lock. This created a vulnerability window:

    Thread A (waiter)                 Thread B (waker)
    ─────────────────────────         ────────────────────────
    atomic_get() == expected
                                      atomic_set(new_val)
                                      k_futex_wake() -> no waiters yet
    k_spin_lock()
    z_pend_curr()
    [waits forever, wake lost]

If the waker updates the futex value and signals between the waiter's
value check and lock acquisition, the wake signal is lost. This causes
the waiting thread to block indefinitely.

Holding the lock during the evaluation ensures the value check and the
subsequent wait-queue operations are atomic relative to concurrent
wakeups. A concurrent wake must now either complete before the waiter
acquires the lock (waiter sees the updated value and returns -EAGAIN)
or arrive after (waiter is safely in the wait queue and gets woken).
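
The check-under-lock ordering can be sketched as a single-threaded model. This is not the kernel code: demo_futex, its boolean lock and the waiter counter are hypothetical, and "pending" is reduced to incrementing a counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a futex plus its wait queue. */
struct demo_futex {
    int val;
    bool locked; /* stands in for the futex spinlock */
    int waiters;
};

static void demo_lock(struct demo_futex *f)
{
    assert(!f->locked);
    f->locked = true;
}

static void demo_unlock(struct demo_futex *f)
{
    assert(f->locked);
    f->locked = false;
}

static int demo_futex_wait(struct demo_futex *f, int expected)
{
    demo_lock(f);
    /* The value check happens with the lock held. */
    if (f->val != expected) {
        demo_unlock(f);
        return -EAGAIN;
    }
    /* Real code would pend here; the model only records the waiter. */
    f->waiters++;
    demo_unlock(f);
    return 0;
}

static int demo_futex_wake(struct demo_futex *f)
{
    int woken;

    demo_lock(f);
    woken = f->waiters;
    f->waiters = 0;
    demo_unlock(f);
    return woken;
}
```

If the waker changes the value and calls wake first, the later wait sees the new value under the lock and returns -EAGAIN instead of queueing itself.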

Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com>
2026-03-23 12:34:58 -05:00
Joel Holdsworth
76def70bed arch: Added initial OpenRISC architecture port
This patch adds support for the OpenRISC 1000 (or1k) architecture: a
MIPS-like open hardware ISA which was first introduced in 2000.

The thread switching implementation uses the modern Zephyr thread "switch"
architecture.

Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
2026-03-21 07:50:57 -05:00
Christoph Busold
11f89f73eb kernel: userspace: Add k_object_access_check syscall
This allows user threads to test if they have permission to access
an object before attempting to perform an operation on it and fail
gracefully if not.

Signed-off-by: Christoph Busold <cbusold@qti.qualcomm.com>
2026-03-19 14:49:23 -05:00
Tharaka Jayasena
f6141e5ccf doc: kernel: fix incorrect Doxygen @retval/@return usage
Fix several incorrect uses of the Doxygen `@retval` and `@return` commands
in kernel sources.

- Convert @return to structured @retval where functions return
  discrete values.
- Replace incorrect @retval usage with @return for non-discrete
  return types.

Signed-off-by: Tharaka Jayasena <9dmpires2k17.tuj@gmail.com>
2026-03-17 18:24:33 -04:00
Cheng-Yang Chou
d67038c7e7 kernel: fix z_tick_sleep unsigned comparison when !CONFIG_TIMEOUT_64BIT
When CONFIG_TIMEOUT_64BIT is not set, k_ticks_t is uint32_t. The previous
code cast left_ticks through int32_t but then stored the result back in
k_ticks_t (uint32_t), losing the sign. The subsequent ticks > 0 check was
therefore an unsigned comparison, causing a past-due wakeup (where the
subtraction wraps to a large uint32_t) to be misread as a large positive
remainder and propagated up through k_sleep() as INT_MAX ms.

Fix by retaining the signed intermediate and comparing it directly as
int32_t so negative remainders (past-due) correctly fall through to
return 0.
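
The effect can be reproduced outside the kernel with plain 32-bit arithmetic. The function names below are hypothetical and the bodies are a simplified sketch of the comparison only, not the actual z_tick_sleep() code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: models k_ticks_t when CONFIG_TIMEOUT_64BIT is not set. */
typedef uint32_t ticks32_t;

/* Old pattern: the signed value is stored back into an unsigned type. */
static uint32_t demo_remaining_buggy(uint32_t expected_wakeup, uint32_t now)
{
    /*
     * The int32_t conversion yields a negative value on the usual
     * two's-complement targets, but the sign is lost again on assignment.
     */
    ticks32_t left = (ticks32_t)(int32_t)(expected_wakeup - now);

    if (left > 0) { /* unsigned comparison: true for any non-zero value */
        return left;
    }
    return 0;
}

/* Fixed pattern: keep the signed intermediate and compare it directly. */
static uint32_t demo_remaining_fixed(uint32_t expected_wakeup, uint32_t now)
{
    int32_t left = (int32_t)(expected_wakeup - now);

    if (left > 0) {
        return (uint32_t)left;
    }
    return 0; /* past-due wakeups fall through to here */
}
```

For a wakeup that is 5 ticks overdue, the old pattern reports a remainder of 0xFFFFFFFB ticks, while the fixed one reports 0.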

Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com>
2026-03-17 18:18:28 -04:00
Andy Ross
a04a30c6d1 kernel/sched: Move reschedule under lock in k_sched_unlock()
This function was a little clumsy, taking the scheduler lock,
releasing it, and then calling z_reschedule_unlocked() instead of the
normal locked variant of reschedule.  Don't take the lock twice.

Mostly this is a code size and hygiene win.  Obviously the sched lock
is not normally a performance path, but I happened to have picked this
API for my own microbenchmark in tests/benchmarks/swap and so noticed
the double-lock while staring at disassembly.

Signed-off-by: Andy Ross <andyross@google.com>
2026-03-10 17:24:10 +01:00
Andy Ross
2bbcece6ee kernel/sched: Refactor reschedule to permit better code generation
z_reschedule() is the basic kernel entry point for context switch,
wrapping z_swap(), and thence arch_switch().  It's currently defined
as a first class function for entry from other files in the kernel and
elsewhere (e.g. IPC library code).

But in practice it's actually a very thin wrapper without a lot of
logic of its own, and the context switch layers of some of the more
obnoxiously clever architectures are designed to interoperate with the
compiler's own spill/fill logic to avoid double saving.  And with a
small z_reschedule() there's not a lot to work with.

Make reschedule() an inlinable static, so the compiler has more
options.

Signed-off-by: Andy Ross <andyross@google.com>
2026-03-10 17:24:10 +01:00
Andy Ross
8638ed12f5 kernel/sched: Add optimized next switch handle wrapper
z_get_next_switch_handle() is a clean API, but implementing it as a
(comparatively large) callable function requires significant
entry/exit boilerplate and hides the very common "no switch needed"
early exit condition from the enclosing C code that calls it.  (Most
architectures call this from assembly though and don't notice).

Provide an unwrapped version for the specific needs of non-SMP builds.
It's compatible in all other ways.

Slightly ugly, but the gains are significant (like a dozen cycles or
so).

Signed-off-by: Andy Ross <andyross@google.com>
2026-03-10 17:24:10 +01:00
Andy Ross
d535d17cbc kernel: Minor optimizations to z_swap()
Pick some low hanging fruit on non-SMP code paths:

+ The scheduler spinlock is always taken, but as we're already in an
  irqlocked state that's a noop.  But the optimizer can't tell, because
  arch_irq_lock() involves an asm block it can't see inside.  Elide
  the call when possible.

+ The z_swap_next_thread() function evaluates to just a single load of
  _kernel.ready_q.cache when !SMP, but wasn't being inlined because of
  function location.  Move that test up into do_swap() so it's always
  done correctly.

Signed-off-by: Andy Ross <andyross@google.com>
2026-03-10 17:24:10 +01:00
Andy Ross
e2e5542d14 arch/arm: Platform integration for new Cortex M arch_switch()
Integrate the new context layer, allowing it to be selected via the
pre-existing CONFIG_USE_SWITCH.  Not a lot of changes, but notable
ones:

+ There was code in the MPU layer to adjust PSP on exception exit at a
  stack overflow so that it remained inside the defined stack bounds.
  With the new context layer though, exception exit will rewrite the
  stack frame in a larger format, and needs PSP to be adjusted to make
  room.

+ There was no such treatment in the PSPLIM case (the hardware prevents
  the SP from going that low), so I had to add similar code to
  validate PSP at exit from fault handling.

+ The various return paths for fault/svc assembly handlers need to
  call out to the switch code to do the needed scheduler work.  Really
  almost all of these can be replaced with C now, only userspace
  syscall entry (which has to "return" into the privileged stack)
  needs special treatment.

+ There is a gcc bug that prevents the arch_switch() inline assembly
  from building when frame pointers are enabled (which they almost
  never are on ARM): it disallows you from touching r7 (the thumb
  frame pointer) entirely.  But it's a context switch, we need to!
  Worked around by enforcing -fomit-frame-pointer even in the two
  scheduler files that can swap when NO_OPTIMIZATIONS=y.

Signed-off-by: Andy Ross <andyross@google.com>
Signed-off-by: Sudan Landge <sudan.landge@arm.com>
2026-03-10 17:24:10 +01:00
Anas Nashif
b84319c460 kernel: drop deprecated options SCHED_DUMB and WAITQ_DUMB
Those kconfig options were deprecated in 4.2. Now they are removed.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2026-03-09 15:09:04 -05:00
Henrik Brix Andersen
fb350217cc kernel: Kconfig.device: fix CONFIG_DEVICE_DEINIT_SUPPORT help text
Drivers supporting device deinitialization should not select
CONFIG_DEVICE_DEINIT_SUPPORT. Enabling deinit should be left up to the
application configuration.

Signed-off-by: Henrik Brix Andersen <henrik@brixandersen.dk>
2026-03-08 16:36:39 +01:00
Peter Mitsis
fa558229af kernel: mem_slab: Change loop variable type
Changes the loop variable type from 'int' to 'uint32_t' in the
create_free_list() routine to match the type of the 'num_blocks'
field. Otherwise, if a very large number of blocks is specified,
the conversion from 'uint32_t' to 'int' could have resulted in
a negative number. The result of this improper conversion would
be an empty free list.

Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
2026-03-05 04:41:02 +01:00
Flavio Ceolin
2517803d98 kernel: msgq: Check possible overflow in put/get
Check not only that write_ptr is smaller than buffer_end but also
that write_ptr + msg_size is smaller than buffer_end, to
avoid overflow when copying data.
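
One way to express such a check, as a sketch rather than the code in this commit: compare msg_size against the space remaining, so that write_ptr + msg_size is never computed. The helper names and the 16-byte demo buffer are hypothetical, and treating a message that ends exactly at buffer_end as fitting is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static bool demo_slot_fits(const char *buffer_start, const char *buffer_end,
                           const char *write_ptr, size_t msg_size)
{
    /* The write pointer itself must lie inside the buffer. */
    if (write_ptr < buffer_start || write_ptr >= buffer_end) {
        return false;
    }

    /* Compare against the remaining space instead of adding to the pointer. */
    return msg_size <= (size_t)(buffer_end - write_ptr);
}

/* Small demo buffer so the check can be exercised with plain offsets. */
static char demo_buf[16];

static bool demo_put_fits(size_t write_off, size_t msg_size)
{
    return demo_slot_fits(demo_buf, demo_buf + sizeof(demo_buf),
                          demo_buf + write_off, msg_size);
}
```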

Signed-off-by: Flavio Ceolin <flavio@hubblenetwork.com>
2026-02-27 07:59:02 +01:00
Flavio Ceolin
41fbeea3a8 kernel: msgq: Check overflow when initing queue
Check for possible overflow in k_msgq_init.
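
The commit text does not spell out the check. A plausible form, shown here purely as an assumption, is to verify that msg_size * max_msgs does not overflow size_t before the product is used as the buffer length. The function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if msg_size * max_msgs would not fit in size_t. */
static bool demo_msgq_size_overflows(size_t msg_size, uint32_t max_msgs)
{
    if (max_msgs == 0U) {
        return false; /* the product is 0 */
    }

    /* Division-based test avoids performing the overflowing multiply. */
    return msg_size > SIZE_MAX / max_msgs;
}
```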

Signed-off-by: Flavio Ceolin <flavio@hubblenetwork.com>
2026-02-27 07:59:02 +01:00
Flavio Ceolin
26bd97edbc kernel: mem_slab: Check block size equals 0
Borrowing Peter Mitsis's rationale in #104283:

If someone passes a 0 block_size, then the buffer size must also be 0.
However, we iterate through the loop below num_blocks times, writing a
pointer to the buffer address. If the buffer is truly zero-sized, then
we are overwriting something else. If it is not truly zero-sized, then
we are creating a corrupted linked list as the pointer never actually
changes. This can cause problems later on when attempting to allocate
a slab because k_mem_slab_alloc() will only ever "allocate" the first
zero-sized block and act as though it was never truly consumed because
of the corrupted linked list.

Signed-off-by: Flavio Ceolin <flavio@hubblenetwork.com>
2026-02-27 07:59:02 +01:00
Flavio Ceolin
5688bcc10a kernel: mem_slab: Check possible overflow on init
Check possible overflow when initializing a memory block.

Signed-off-by: Flavio Ceolin <flavio@hubblenetwork.com>
2026-02-27 07:59:02 +01:00
Peter Mitsis
143076008b kernel: msgq: Fix __ASSERT_NO_MSG() checks
The message queue 'buffer_end' field points to the next address AFTER
the end of the buffer. When the buffer goes to the last addressable
byte, the next byte is 0x0. To ensure proper evaluation of the bounds
the __ASSERT_NO_MSG() checks must not use "< buffer_end", but
"<= buffer_end - 1".

Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
2026-02-24 15:36:04 +01:00
Daniel Leung
c1ff64599b kernel: mem_domain: support memory domain de-initialization
This adds the ability to de-initialize a memory domain.
This requires support in the architecture layer. One usage of
this is to release the resources associated with the domain.
For example, we can release allocated page tables so they can
go back to the pool of page tables to be allocated later.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2026-02-24 10:39:59 +01:00
Zhaoxiang Jin
cddedf9e3d kernel: busy_wait: handle runtime system timer frequency updates
When CONFIG_SYSTEM_CLOCK_HW_CYCLES_PER_SEC_RUNTIME_UPDATE is enabled,
the system timer frequency can change at runtime. Some timer drivers
(e.g. Cortex-M SysTick) rescale the cycle counter when the frequency
changes, which can break k_busy_wait() if the frequency changes during
the wait period.

Update k_busy_wait() to handle runtime frequency changes:
- Add busy_wait_us_to_cyc_ceil32() helper to convert microseconds to
  cycles with a given frequency (rounds up to avoid returning early)
- Implement a frequency-aware busy wait loop that:
  - Samples the frequency before and after reading the cycle counter to
    detect concurrent frequency changes
  - Rescales the start_cycles reference point when the frequency changes
    to keep it in the same scale as the cycle counter
  - Recomputes cycles_to_wait with the new frequency to preserve the
    requested duration
  - Retries sampling if a frequency change is detected mid-read

The original implementation is preserved when
CONFIG_SYSTEM_CLOCK_HW_CYCLES_PER_SEC_RUNTIME_UPDATE is not enabled.
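
The round-up conversion can be sketched as follows. This is an illustrative guess at what a helper like busy_wait_us_to_cyc_ceil32() computes, not its actual body; the demo_ name and the saturation to UINT32_MAX are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert microseconds to timer cycles at frequency `hz`, rounding up so
 * that a busy wait never returns early. Saturates at UINT32_MAX.
 */
static uint32_t demo_us_to_cyc_ceil32(uint32_t us, uint32_t hz)
{
    /* 64-bit intermediate: the product of two 32-bit values cannot overflow. */
    uint64_t cyc = ((uint64_t)us * hz + 999999U) / 1000000U;

    return (cyc > UINT32_MAX) ? UINT32_MAX : (uint32_t)cyc;
}
```

For example, 3 us at 500 kHz is 1.5 cycles and rounds up to 2, and 1 us at 32768 Hz rounds up to a full cycle rather than truncating to 0.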

Signed-off-by: Zhaoxiang Jin <Zhaoxiang.Jin_1@nxp.com>
2026-02-20 13:31:07 +01:00
Zhaoxiang Jin
60197bf514 kernel: refactor sys_clock_hw_cycles_per_sec runtime support
- Move the variable declaration and related code from kernel/timeout.c
  to a new kernel/sys_clock_hw_cycles.c file. The motivation is that
  both functions are part of the system clock frequency plumbing
  (runtime query / update) and don’t naturally fit the responsibilities
  of timeout.c, which is otherwise focused on timeout queue management
  and tick announcement logic.

- Make sys_clock_hw_cycles_per_sec_runtime_get() (and its
  z_impl_sys_clock_hw_cycles_per_sec_runtime_get() implementation)
  visible under CONFIG_SYSTEM_CLOCK_HW_CYCLES_PER_SEC_RUNTIME_UPDATE
  as well, not only under CONFIG_TIMER_READS_ITS_FREQUENCY_AT_RUNTIME.
  This allows callers and time unit conversion helpers to retrieve the
  current system timer frequency after runtime clock changes even when
  the timer driver does not discover the rate by querying hardware.

Signed-off-by: Zhaoxiang Jin <Zhaoxiang.Jin_1@nxp.com>
2026-02-20 13:31:07 +01:00