Add a Kconfig option to dump only a portion of the stack, from the
current stack pointer to the stack end. This is enough to
let gdb reconstruct the stack trace and can significantly
reduce the dump size. This is crucial if the core dump needs
to be sent over radio.
Additionally, add another option to set the limit for the
dumped stack portion.
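Roughly the idea, as a sketch (the option name, SP source and helper
variables below are illustrative assumptions, not necessarily what this
commit adds):

    /* Dump only the active part of the stack: [sp, stack_end) */
    uintptr_t sp = fault_sp;  /* hypothetical: SP taken from the fault context */
    uintptr_t end = thread->stack_info.start + thread->stack_info.size;
    size_t len = end - sp;

    #ifdef CONFIG_DEBUG_COREDUMP_STACK_PORTION_LIMIT  /* hypothetical option */
    len = MIN(len, CONFIG_DEBUG_COREDUMP_STACK_PORTION_LIMIT);
    #endif

    coredump_memory_dump(sp, sp + len);
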
Signed-off-by: Damian Krolik <damian.krolik@nordicsemi.no>
Add a return value to z_add_timeout(). It returns the system tick at
which the timeout will expire.
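A sketch of the resulting prototype (the exact return type used by the
patch may differ):

    /* Returns the absolute system tick at which the timeout will expire. */
    k_ticks_t z_add_timeout(struct _timeout *to, _timeout_func_t fn,
                            k_timeout_t timeout);
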
Signed-off-by: Krzysztof Chruściński <krzysztof.chruscinski@nordicsemi.no>
Accessing system timer registers can be costly and should be avoided
if possible. When a thread is woken up in z_tick_sleep(), it may be
because the timeout expired or because the thread was woken up before
the sleeping period passed.
Add a function to detect whether a timeout was aborted (i.e. before it
expired). Use it in the sleep function and avoid reading the system
ticks if the timeout was not aborted.
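Roughly the intent (a sketch; the helper's actual name is an assumption):

    /* In z_tick_sleep(), after the thread has been switched back in: */
    if (!z_is_aborted_timeout(&_current->base.timeout)) {  /* hypothetical */
        /* The timeout expired normally: the full sleep elapsed, so no
         * system timer read is needed to compute the remaining time. */
        return 0;
    }
    /* Woken up early: only now pay for the timer register access. */
    return expected_wakeup_ticks - sys_clock_tick_get_32();
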
Signed-off-by: Krzysztof Chruściński <krzysztof.chruscinski@nordicsemi.no>
When k_malloc() is expressed in terms of k_aligned_alloc() it invokes a
longer aligned allocation code path with an extra runtime overhead even
though no alignment is necessary.
Let's reference and invoke the aligned allocation code path only when an
actual aligned allocation is requested. This opens the possibility for
the linker to garbage-collect the aligning code otherwise.
Also bypass k_heap_malloc() and friends given they're invoked with
K_NO_WAIT. Go directly to sys_heap_*() instead to cut some more unneeded
overhead.
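Conceptually, the unaligned path becomes something like this (simplified
sketch; the actual heap symbol and bookkeeping in the patch differ):

    void *k_malloc(size_t size)
    {
        /* No alignment requested and effectively K_NO_WAIT, so skip the
         * aligned-allocation path and the k_heap wait-queue logic and go
         * straight to the low-level heap. */
        k_spinlock_key_t key = k_spin_lock(&_system_heap.lock);
        void *ret = sys_heap_alloc(&_system_heap.heap, size);

        k_spin_unlock(&_system_heap.lock, key);
        return ret;
    }
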
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Improve naming of the scheduler and call it what it is: simple. Using
'dumb' for the default scheduler algorithm in Zephyr is a bad idea.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
The current_fp field in the z_kernel structure is only used
by 32-bit x86 (which does not support SMP). As such, it should
reside in the arch-specific section of _kernel.cpus[0].
This also changes the name of 'current_fp' to 'fpu_owner' to
be more consistent with other architectures.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
The 'order_key' field in the thread structure '_thread_base' is only
required when CONFIG_SCHED_SCALABLE and/or CONFIG_WAITQ_SCALABLE are
enabled (neither of which is a default setting). Making the existence
of this field conditional slightly reduces the size of the k_thread
structure when neither of those Kconfig options is selected.
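The field then becomes conditional along these lines (sketch):

    struct _thread_base {
        /* ... */
    #if defined(CONFIG_SCHED_SCALABLE) || defined(CONFIG_WAITQ_SCALABLE)
        /* ordering key used only by the red/black tree based queues */
        uint32_t order_key;
    #endif
        /* ... */
    };
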
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Changes the return type of z_handle_obj_poll_events() so that it
returns true if there were polling events to handle (false
otherwise).
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Fix a void* to k_thread_entry_t conversion (that is silent in GCC but
not so in some other tools) in _is_valid_prio().
Signed-off-by: Björn Bergman <bjorn.bergman@iar.com>
The check for an active timeout in z_is_thread_ready() was originally
added to cover the case of a sleeping thread. However, since there is
now a bit in the thread state that indicates if the thread is sleeping
we can drop that superfluous check.
Making this change necessitates moving k_wakeup()'s call to
z_abort_thread_timeout() so that it is within the locked
_sched_spinlock section to ensure that we do not end up with
a stray thread timeout in the timeout list.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Removes an unnecessary clearing of the current CPU's swap_ok field
in do_swap() as that clearing is already done at the end of next_up()
which was just called by z_swap_next_thread() a little earlier.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
This function is getting quite involved and it has also gained more
callers lately. It is not performance critical, so uninline it to save
on binary size.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Define the generic _current directly and get rid of the generic
arch_current_get().
The SMP default implementation is now known as z_smp_current_get().
It is no longer inlined which saves significant binary size (about 10%
for some random test case I checked).
Introduce z_current_thread_set() and use it in place of
arch_current_thread_set() for updating the current thread pointer
given this is not necessarily an architecture specific operation.
The architecture specific optimization, when enabled, should only care
about its own things and not have to also update the generic
_current_cpu->current copy.
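In rough terms, the setter now looks like this (simplified sketch):

    static inline void z_current_thread_set(struct k_thread *thread)
    {
        /* generic per-CPU bookkeeping the kernel always needs */
        _current_cpu->current = thread;
    #ifdef CONFIG_ARCH_HAS_CUSTOM_CURRENT_IMPL
        /* arch-specific fast copy, kept purely arch-internal */
        arch_current_thread_set(thread);
    #endif
    }
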
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Mostly a revert of commit b1def7145f ("arch: deprecate `_current`").
This commit was part of PR #80716 whose initial purpose was about providing
an architecture specific optimization for _current. The actual deprecation
was sneaked in later on without proper discussion.
The Zephyr core always used _current before and that was fine. It is quite
prevalent as well and the alternative is proving rather verbose.
Furthermore, as a concept, the "current thread" is not something that is
necessarily architecture specific. Therefore the primary abstraction
should not carry the arch_ prefix.
Hence this revert.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Even though calculating the priority queue index in the priority
multiq is quick, caching it allows us to extract an extra 2% in
terms of performance as measured by the thread_metric cooperative
benchmark.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Adds customized yield implementations based upon the selected
scheduler (dumb, multiq or scalable). Although each follows the
same broad outline, some of them allow for additional tweaking
to extract maximal performance. For example, the multiq variant
improves the performance of k_yield() by about 20%.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Dequeuing from a doubly linked list is similar to removing an item
except that it does not re-initialize the dequeued node.
This comes in handy when sorting a doubly linked list (where the
node gets removed and re-added). In that circumstance, re-initializing
the node is superfluous. Furthermore, the compiler does not always
'understand' this and optimize the redundant work away. Thus, when
performance is critical, dequeuing may be preferred to removing.
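A hedged sketch of the distinction (the helpers and field names mirror
Zephyr's sys_dlist but are illustrative, not a verbatim copy of the patch):

    /* remove: unlink and re-initialize, so the node reads as "not linked" */
    static inline void sys_dlist_remove(sys_dnode_t *node)
    {
        node->prev->next = node->next;
        node->next->prev = node->prev;
        sys_dnode_init(node);            /* the extra work */
    }

    /* dequeue: unlink only; enough when the node is re-added right away */
    static inline void sys_dlist_dequeue(sys_dnode_t *node)
    {
        node->prev->next = node->next;
        node->next->prev = node->prev;
    }
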
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Minor cleanups include ...
1. Eliminating unnecessary if-defs and forward declarations
2. Co-locating routines of the same queue type
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
The `lock` arg is used multiple times in the function, making the
`ARG_UNUSED(lock);` redundant; remove it.
Signed-off-by: Yong Cong Sin <ycsin@meta.com>
Signed-off-by: Yong Cong Sin <yongcong.sin@gmail.com>
Sleeping and suspended are now orthogonal states. That is, a thread
may be both sleeping and suspended and the two do not interact. One
repercussion of this is that suspending a thread will no longer
abort its timeout.
Threads are now created in the 'sleeping' state instead of a
'suspended' state. This dovetails nicely with the start delay that
can be given to a newly created thread--it is as though the very
first operation performed by a thread with a start delay is a sleep.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
At the present time, Zephyr has overlap between sleeping and
suspending. Not only should sleeping and suspended be orthogonal
states, but we should ensure users always employ the correct API.
For example, to wake a sleeping thread, k_wakeup() should be used,
and to resume a suspended thread, k_thread_resume() should be used.
However, at the present time k_thread_resume() can be used on a
thread that called k_sleep(K_FOREVER). Sleeping should have nothing
to do with suspension.
This commit introduces the new _THREAD_SLEEPING thread state along
with some prep-work to facilitate the decoupling of the sleeping and
suspended thread states.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Moves the arch_swap() declaration out of kernel_arch_interface.h
and into the various architectures' kernel_arch_func.h. This
permits arch_swap() to be inlined on ARM, but extern'd on
the other architectures that still implement arch_swap().
Inlining this function on ARM has shown at least a +5% performance
boost according to the thread_metric benchmark on the disco_l475_iot1
board.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Traditionally threads have been initialized with a PRESTART flag set,
which gets cleared when the thread runs for the first time via either
its timeout or the k_thread_start() API.
But if you think about it, this is no different, semantically, than
SUSPENDED: the thread is prevented from running until the flag is
cleared.
So unify the two. Start threads in the SUSPENDED state, point
everyone looking at the PRESTART bit to the SUSPENDED flag, and make
k_thread_start() be a synonym for k_thread_resume().
There is some mild code size savings from the eliminated duplication,
but the real win here is that we make space in the thread flags byte,
which had run out.
Signed-off-by: Andy Ross <andyross@google.com>
`_current` is now functionally equal to `arch_curr_thread()`. Remove
its in-tree usage and deprecate it instead of removing it outright,
as it has been with us since forever.
Signed-off-by: Yong Cong Sin <ycsin@meta.com>
Signed-off-by: Yong Cong Sin <yongcong.sin@gmail.com>
Add the following arch-specific APIs:
- arch_curr_thread()
- arch_set_curr_thread()
which allow SMP architectures to implement a faster "get current
thread pointer" than the default provided by the kernel. The 'set'
function is required for the 'get' to work, more on that later.
When `CONFIG_ARCH_HAS_CUSTOM_CURRENT_IMPL` is selected, calls to
`_current` & `k_sched_current_thread_query()` will be redirected to
`arch_curr_thread()`, which ideally should translate into a single
instruction read, avoiding the current
"lock > read CPU > read current thread > unlock" path in SMP
architectures and thus greatly improves the read performance.
However, since the kernel relies on a copy of the "current thread"s on
every CPU for certain operations (i.e. to compare the priority of the
currently scheduled thread on another CPU to determine if IPI should be
sent), we can't eliminate the copy of "current thread" (`current`) from
the `struct _cpu` and therefore the kernel now has to invoke
`arch_set_curr_thread()` in addition to what it has been doing. This
means that it will take slightly longer (most likely one instruction
write) to change the current thread pointer on the current
CPU.
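Schematically, the read path ends up like this (simplified sketch, not
the literal macro definitions):

    #ifdef CONFIG_ARCH_HAS_CUSTOM_CURRENT_IMPL
    /* ideally a single instruction, e.g. a dedicated register read */
    #define _current arch_curr_thread()
    #else
    /* generic path via the per-CPU structure (in SMP this is the
     * lock > read CPU > read current thread > unlock sequence) */
    #define _current _current_cpu->current
    #endif
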
Signed-off-by: Yong Cong Sin <ycsin@meta.com>
Signed-off-by: Yong Cong Sin <yongcong.sin@gmail.com>
Without multithreading, only two stacks are present: ISR and main.
Like any other stack they can overflow, so it makes sense to add a
stack guard for them as well.
Remove the stack guard's dependency on multithreading and mark
`Z_RISCV_STACK_GUARD_SIZE` bytes at the beginning of each stack as a
read-only region with a PMP entry.
Signed-off-by: Volodymyr Fialko <vfialko@marvell.com>
Move the initialization of the run queue's priority q out of sched.c to
remove one more ifdef from sched.c. No change in functionality, but this
better matches the rest of sched.c and priority_q.h such that the
ifdefry needed is done in priority_q.h.
Signed-off-by: Tom Burdick <thomas.burdick@intel.com>
In a uniprocessor system, _sched_spinlock may not need to be
held in all the same cases that it does in a multiprocessor
system. Removing those unnecessary usages can lead to better
performance on UP systems. In the case of uncontested taking
and giving of a semaphore, this can be as much as a +14%
performance gain.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Inlining z_unpend_first_thread() has been observed to give a
+8% and +16% performance boost to the thread_metric benchmark's
message processing and synchronization tests respectively.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Re-orders the checks in should_preempt() so that the
z_is_thread_timeout_active() check is done last.
This change has been observed to give a +7% performance boost on
the thread_metric benchmark's preemptive scheduling test.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Removes the routine z_ready_thread_locked() as it is not
used anywhere. It was a leftover artefact from development
that previously escaped cleanup.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
This adds an interface to allow coredump to dump the privileged
stack, which is defined in an architecture-specific way.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Fill the memory of all CPUs' IRQ stacks with 0xAA on init, so
that `z_stack_space_get` can calculate the remaining space
correctly.
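Roughly (sketch; the actual loop in the patch may differ):

    /* Paint each CPU's IRQ stack at boot so unused space can later be
     * measured by finding the first byte that is no longer 0xAA. */
    for (int i = 0; i < CONFIG_MP_MAX_NUM_CPUS; i++) {
        memset(z_interrupt_stacks[i], 0xAA,
               K_KERNEL_STACK_SIZEOF(z_interrupt_stacks[i]));
    }
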
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Yong Cong Sin <ycsin@meta.com>
Signed-off-by: Yong Cong Sin <yongcong.sin@gmail.com>
This adds a new arch_thread_priv_stack_space_get() interface for
each architecture to report privileged stack space usage. Each
architecture will need to implement this function, as each arch
has its own way of defining privileged stacks.
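The expected shape of the new interface (sketch; parameter names are
illustrative):

    /* Returns 0 on success and reports, for the given user thread, the
     * size of its privileged stack and how much of it is still unused. */
    int arch_thread_priv_stack_space_get(const struct k_thread *thread,
                                         size_t *stack_size,
                                         size_t *unused_ptr);
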
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
The intention of this API is to allow setting the POSIX thread
name equal to the Zephyr thread name.
By defining it as an arch interface, the implementation becomes
generic.
Signed-off-by: Rubin Gerritsen <rubin.gerritsen@nordicsemi.no>
Utilize a code spell-checking tool to scan for and correct spelling errors
in all files within the `kernel` directory.
Signed-off-by: Pisit Sawangvonganan <pisit@ndrsolution.com>
This is part of a series of moving memory management related
stuff out of the Z_ namespace and into its own namespace.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This is part of a series of moving memory management related
stuff out of the Z_ namespace and into its own namespace.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Also any demand paging and page frame related bits are
renamed.
This is part of a series to move memory management related
stuff out of the Z_ namespace into its own namespace.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This is part of a series to move memory management related
stuff from Z_ namespace into its own namespace.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This is part of a series to move memory management related
stuff from Z_ namespace into its own namespace.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This is part of a series to move memory management related
stuff from Z_ namespace into its own namespace.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>