Commit graph

353 commits

Author SHA1 Message Date
Jeremy Bettis 1e0a36c655 build: Remove unused functions
Removed unused functions, or moved them inside #ifdefs.

This allows using -Werror=unused-function on the clang compiler. Tested
by building the ChromeOS EC on all supported platforms with
-Werror=unused-function.

Signed-off-by: Jeremy Bettis <jbettis@google.com>
2021-12-13 15:49:08 -05:00
Andy Ross 410f911018 kernel/sched: Separate idle from app thread stats in THREAD_USAGE
It turns out that we have a sample (though not a test) that really
does want to use "k_thread_runtime_stats_all_get()" to measure system
uptime.

Instead of breaking this needlessly, separate the accounting for idle
and non-idle threads.  The legacy API can report their sum, and the
more useful value is available via the kernel struct for future
analysis.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-11-08 21:32:20 -05:00
Andy Ross 52351458f4 kernel/sched: Add timing.h support to thread_usage
The runtime stats feature has always supported this, so use the same
kconfig to indirect the timing source in the same way.

(Personally I'm not a fan of the "timing" API, which really doesn't do
anything that the existing core "cycles" API does not except add a
bunch of code due to the separate implementation of frequency
management and conversion routines.  It comes from an era where
"cycles" were fixed to a MHz frequency clock on platforms like x86 yet
we had benchmarks that wanted to use the TSC.  Those days are behind
us and "cycles" can be fast everywhere.)

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-11-08 21:32:20 -05:00
Andy Ross b62d6e17a4 kernel/sched: Add an optional "all" counter for thread_usage
Tally the runtime of all non-idle threads.  Make it optional via
kconfig to avoid overhead.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-11-08 21:32:20 -05:00
Andy Ross 4ae3250301 sched: Hook SCHED_USAGE from existing tracing hook
On older architectures, we don't have the
architecture-independent/scheduler-internal hooks (which require
USE_SWITCH) but there is a hook shared by the tracing layer we can use.

This is sort of a layering violation (stat tracking is a core feature,
tracing is supposed to be optional), but simple and lightweight.  And
eventually it will go away as these architectures migrate.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-11-08 21:32:20 -05:00
Andy Ross 40d12c142d kernel/sched: Add "thread_usage" API for thread runtime cycle monitoring
This is an alternate backend that does what THREAD_RUNTIME_STATS is
doing currently, but with a few advantages:

* Correctly synchronized: you can't race against a running thread
  (potentially on another CPU!) while querying its usage.

* Realtime results: you get the right answer always, up to timer
  precision, even if a thread has been running for a while
  uninterrupted and hasn't updated its total.

* Portable, no need for per-architecture code at all for the simple
  case. (It leverages the USE_SWITCH layer to do this, so won't work
  on older architectures)

* Faster/smaller: minimizes use of 64 bit math; lower overhead in
  thread struct (keeps the scratch "started" time in the CPU struct
  instead).  One 64 bit counter per thread and a 32 bit scratch
  register in the CPU struct.

* Standalone.  It's a core (but optional) scheduler feature, no
  dependence on para-kernel configuration like the tracing
  infrastructure.

* More precise: allows architectures to optionally call a trivial
  zero-argument/no-result cdecl function out of interrupt entry to
  avoid accounting for ISR runtime in thread totals.  No configuration
  needed here, if it's called then you get proper ISR accounting, and
  if not you don't.

For right now, pending unification, it's added side-by-side with the
older API and left as a z_*() internal symbol.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-11-08 21:32:20 -05:00
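
A minimal standalone sketch of the accounting scheme described above; the
names and types here are illustrative only, not the kernel's actual
z_thread_usage code:

    #include <stdint.h>

    /* One 64-bit cycle total per thread, one 32-bit switch-in timestamp
     * per CPU, both updated under the scheduler lock at context switch.
     */
    struct cpu_usage    { uint32_t usage0; };   /* cycle count at switch-in */
    struct thread_usage { uint64_t cycles; };   /* accumulated runtime */

    static void usage_switch_in(struct cpu_usage *cpu, uint32_t now)
    {
        cpu->usage0 = now;
    }

    static void usage_switch_out(struct thread_usage *t,
                                 struct cpu_usage *cpu, uint32_t now)
    {
        t->cycles += (uint64_t)(now - cpu->usage0);
    }

    /* Querying a running thread adds (now - usage0) on top of t->cycles,
     * which is why results stay correct up to timer precision. */
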
Andy Ross b11e796c36 kernel/sched: Add CONFIG_CPU_MASK_PIN_ONLY
Some SMP applications have threading designs where every thread
created is always assigned to a specific CPU, and they never want
threads scheduled symmetrically across CPUs under any circumstance.

In this situation, it's possible to optimize the run queue design a
bit to put a separate queue in each CPU struct instead of having a
single global one.  This is probably good for a few cycles per
scheduling event (maybe a bit more on architectures where cache
locality can be exploited) in circumstances where there is more than
one runnable thread.  It's a mild optimization, but a basically simple
one.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-09-28 20:15:05 -04:00
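
A rough sketch of the run-queue selection this option enables; the names
below are illustrative stand-ins, not the actual scheduler code:

    struct run_queue { int placeholder; };     /* stand-in for the priority queue */

    struct cpu_data {
        struct run_queue ready_q;              /* per-CPU queue, pinned threads only */
    };

    extern struct run_queue global_ready_q;    /* single shared queue otherwise */
    extern struct cpu_data *this_cpu(void);

    static struct run_queue *curr_runq(void)
    {
    #ifdef CONFIG_CPU_MASK_PIN_ONLY
        return &this_cpu()->ready_q;           /* no cross-CPU queue contention */
    #else
        return &global_ready_q;
    #endif
    }
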
Andy Ross b155d06712 kernel/sched: Factor out ready_q initialization
Split "init_ready_q()" into a separate function that operates on the
queue pointer and not the global kernel object.  Pure refactoring.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-09-28 20:15:05 -04:00
Andy Ross 387fdd2e53 kernel/sched: Refactor/simplify run queue accessors
Similar to the previous patch, the various _priq_run_*() functions are
always passed a first argument that is the singleton system run queue
(this is because the same backend functions are used by wait queues).

Refactor into a simpler API that places the access to the run queue in
just a single spot.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-09-28 20:15:05 -04:00
Andy Ross c230fb3580 kernel/sched: Simplify de/queue_thread()
Pure refactoring.  For historical reasons these two functions took a
first argument (a pointer to the run queue) that was always the same.
Eliminate it.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-09-28 20:15:05 -04:00
Chen Peng1 0f63d1135c cmsis_rtos_v1: fix thread instance management.
Add a bit array to struct osThreadDef_t to indicate whether each thread
instance is in use. When creating a new thread, we find the first
available instance by searching this array, and when terminating a
thread we mark its entry free again.

Signed-off-by: Chen Peng1 <peng1.chen@intel.com>
2021-09-09 12:01:06 -04:00
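
An illustrative sketch of the bit-array bookkeeping described above (not
the actual CMSIS adaptation code):

    #include <stdint.h>

    #define INSTANCE_COUNT 8
    static uint32_t instance_used;              /* bit i set => instance i in use */

    static int alloc_instance(void)             /* called on thread create */
    {
        for (int i = 0; i < INSTANCE_COUNT; i++) {
            if (!(instance_used & (1u << i))) {
                instance_used |= (1u << i);
                return i;                       /* first available instance */
            }
        }
        return -1;                              /* pool exhausted */
    }

    static void free_instance(int i)            /* called on thread terminate */
    {
        instance_used &= ~(1u << i);
    }
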
Andy Ross 0d763e0a10 cmake/compiler/xcc: sched: Support XCC inlining semantics
Cadence XCC is based on a very old gcc 4.2 compiler, which didn't
perfectly support C99 "inline" semantics with respect to
cross-translation-unit inline linkage (which Zephyr does not use; our
inlines are static only) and declaration order.

Fix the one spot where we were calling an inline before its
ALWAYS_INLINE definition, and add a flag to suppress the warning so
CIs trying to build with XCC and -Werror don't flip out.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-09-08 09:28:31 -04:00
Andrew Boie f07df42d49 kernel: make k_current_get() work without syscall
We cache the current thread ID in a thread-local variable
at thread entry, and have k_current_get() return that,
eliminating system call overhead for this API.

DL: changed _current to use z_current_get() as it is
    being used during boot where TLS is not available.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2021-07-30 20:16:47 -04:00
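
A sketch of the caching idea, assuming a GCC-style __thread variable; the
real implementation lives in the kernel's TLS setup and k_current_get():

    struct k_thread;

    /* Set once at thread entry, before the thread body runs. */
    static __thread struct k_thread *tls_current_thread;

    static inline struct k_thread *current_get_fast(void)
    {
        return tls_current_thread;   /* no system call needed */
    }
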
Anas Nashif 8b3f36c656 kernel: move internal headers into include/kernel
Move 2 headers that are internal to the kernel into include/kernel.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2021-06-16 20:38:55 -04:00
Maksim Masalski 78ba2ec830 coding guidelines: use prototype form with named parameters
Function types shall be in prototype form with named parameters.

Found as a coding guideline violation (MISRA R8.2) by a static
code scanning tool.

Signed-off-by: Maksim Masalski <maksim.masalski@intel.com>
2021-06-04 16:20:06 -05:00
Lauren Murphy 4c85b4606b kernel: k_sleep: fix return value for absolute timeout
Fixes calculation of remaining ticks returned from z_tick_sleep
so that it takes absolute timeouts into account.

Fixes #32506

Signed-off-by: Lauren Murphy <lauren.murphy@intel.com>
2021-05-26 18:11:52 -05:00
Maksim Masalski 970820e92d sched: create unique function name
The file include/kernel/thread.h declares a "struct _thread_base"
member called "_wait_q_t *pended_on", while kernel/sched.c defines a
function called "static _wait_q_t *pended_on()".

The code scanning tool reports a violation (MISRA R5.9) because the
static identifier is reused: thread.h is included in sched.c.

I think we can rename the function to avoid misreading in the future.

Signed-off-by: Maksim Masalski <maksim.masalski@intel.com>
2021-05-25 19:06:21 -04:00
Andy Ross 851d14afc8 kernel/sched: Remove "cooperative scheduling only" special cases
The scheduler has historically had an API where an application can
inform the kernel that it will never create a thread that can be
preempted, and the kernel and architecture layer would use that as an
optimization hint to eliminate some code paths.

Those optimizations have dwindled to almost nothing at this point, and
they're now objectively a smaller impact than the special casing that
was required to handle the idle thread (which, obviously, must always
be preemptible).

Fix this by eliminating the idea of "cooperative only" and ensuring
that there will always be at least one preemptible priority with value
>=0.  CONFIG_NUM_PREEMPT_PRIORITIES now specifies the number of
user-accessible priorities other than the idle thread.

The only remaining workaround is that some older architectures (and
also SPARC) use the CONFIG_PREEMPT_ENABLED=n state as a hint to skip
thread switching on interrupt exit.  So detect exactly those platforms
and implement a minimal workaround in the idle loop (basically "just
call swap()") instead, with a big explanation.

Note that this also fixes a bug in one of the philosophers samples,
where it would ask for 6 cooperative priorities but then use values -7
through -2.  It was assuming the kernel would magically create a
cooperative priority for its idle thread, which wasn't correct even
before.

Fixes #34584

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-05-24 23:38:16 -04:00
Torbjörn Leksell f17144349b Tracing: Thread tracing
Add thread tracing hooks, default hooks, and documentation.

Signed-off-by: Torbjörn Leksell <torbjorn.leksell@percepio.com>
2021-05-07 22:10:21 -04:00
Anas Nashif 6df4405cca doc: fix typos
Fix various typos in the docs.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2021-04-30 16:03:08 -04:00
Krzysztof Chruscinski 7dcff6ecfe kernel: Move _kernel from sched to init
_kernel struct can be used when multithreading is disabled.
In that case sched.c may not be compiled.

Signed-off-by: Krzysztof Chruscinski <krzysztof.chruscinski@nordicsemi.no>
2021-04-29 14:50:35 +02:00
Anas Nashif 3f4f3f6c43 kernel: make tests of a value against zero explicit
Tests of a value against zero should be made explicit, unless the
operand is effectively Boolean. This is based on MISRA rule 14.4.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2021-04-01 05:34:17 -04:00
Anas Nashif 25c87db860 kernel/arch: cleanup function definitions
Make identifiers used in the declaration and definition identical. This
is based on MISRA rule 8.3.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2021-04-01 05:34:17 -04:00
Anas Nashif bbbc38ba8f kernel: Make both operands of operators of same essential type category
Add a 'U' suffix to values when computing and comparing against
unsigned variables, along with other related fixes for the same MISRA
rule (10.4).

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2021-04-01 05:34:17 -04:00
Anas Nashif 5c90ceb105 clock: rename z_tick_get_32 -> sys_clock_tick_get_32
Do not use z_ for internal APIs, z_ is for private APIs within one
subsystem only.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2021-03-19 11:22:17 -04:00
Anas Nashif 9c1efe6b4b clock: remove z_ from semi-public APIs
The clock/timer APIs are not application-facing APIs; however, similar
to arch_ and a few other APIs, they are available for implementing
drivers and adding support for new hardware, and they are documented
and available to be used outside of the clock/kernel subsystems.

Remove the leading z_ and provide them as clock_* APIs for someone
writing a new timer driver to use.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2021-03-19 11:22:17 -04:00
Lauren Murphy d88ce65463 kernel/sched: only send IPI to abort thread if hardware supports it
Wrap arch_sched_ipi() call in z_thread_abort() with ifdef checking for
hardware support of IPI.

Fixes #32723

Signed-off-by: Lauren Murphy <lauren.murphy@intel.com>
2021-03-10 14:27:33 -05:00
Spoorthy Priya Yerabolu 4118ed1d4d kernel: sched: removing dead code
Due to the recent changes to the scheduler, z_find_first_thread_to_unpend
and z_remove_thread_from_ready_q are not used anymore, so remove the
dead code.

fixes: #32691

Signed-off-by: Spoorthy Priya Yerabolu <spoorthy.priya.yerabolu@intel.com>
2021-03-05 11:05:25 +03:00
Peter Bigot 0259c864df kernel: add private scheduler APIs
These functions are a subset of proposed public APIs to clean up
several issues related to safely handling waking of threads.  They
have been made private as their interface may change, but their use
will simplify the reimplementation of the k_work functionality.

See: https://github.com/zephyrproject-rtos/zephyr/pull/29668

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Signed-off-by: Peter Bigot <peter.bigot@nordicsemi.no>
2021-03-03 20:06:00 -05:00
James Harris 6543e06914 kernel: sched: avoid unnecessary lock in z_impl_k_yield
`z_impl_k_yield` unlocked sched_spinlock, only to lock it again
immediately, do a little bit more work, then unlock it again.
This causes performance issues on SMP, where `sched_spinlock`
is often fairly highly contended and cores often end up spinning
for quite a while waiting to retake the lock in `z_swap_unlocked`.

Instead directly pass the spinlock key to `z_swap` and avoid the
extra lock+unlock.

Signed-off-by: James Harris <james.harris@intel.com>
2021-03-02 14:35:21 -05:00
James Harris 2cd0f66515 kernel: sched: change to 3-way thread priority comparison
`z_is_t1_higher_prio_than_t2` was being called twice in both the
context-switch fastpath and in `z_priq_rb_lessthan`, just to
deal with priority ties. In addition, the API was error-prone
(and too much in the fastpath to be able to assert its invariants)
- see also #32710 for a previous example of this API breaking
and returning a>b but also b>a.

Replacing this with a direct 3-way comparison `z_cmp_t1_prio_with_t2`
sidesteps most of these issues. There is still a concern that
`sgn(z_cmp_t1_prio_with_t2(a,b)) != -sgn(z_cmp_t1_prio_with_t2(b,a))`
but I don't see any way to alleviate this aside from adding an
assert to the fastpath.

Signed-off-by: James Harris <james.harris@intel.com>
2021-03-02 14:27:14 -05:00
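
A sketch of the 3-way comparison idea with illustrative names; the real
internal comparison also has to handle the kernel's tie-breaking rules:

    /* Lower numeric priority value means higher scheduling priority. */
    static int cmp_prio(int prio_a, int prio_b)
    {
        return prio_b - prio_a;   /* >0: a wins, 0: tie, <0: b wins */
    }

    /* sgn(cmp_prio(a,b)) == -sgn(cmp_prio(b,a)) holds by construction,
     * which is the property the boolean predicate could not guarantee. */
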
James Harris 3330ab12d8 kernel: fix yielding between tasks with same deadline
Previously two tasks with the same deadline and priority would
always have `z_is_t1_higher_prio_than_t2` `true` in both directions.

This is logically inconsistent, and results in `k_yield` not actually
yielding between identical threads.

Signed-off-by: James Harris <james.harris@intel.com>
2021-02-27 10:25:47 +01:00
Andy Ross 6fb6d3cfbe kernel: Add new k_thread_abort()/k_thread_join()
Add a newer, much smaller and simpler implementation of abort and
join.  No need to involve the idle thread.  No need for a special code
path for self-abort.  Joining a thread and waiting for an aborting one
to terminate elsewhere share an implementation.  All work in both
calls happens under a single locked path with no unexpected
synchronization points.

This fixes a bug with the current implementation where the action of
z_sched_single_abort() was nonatomic, releasing the lock internally at
a point where the thread to be aborted could self-abort and confuse
the state such that it failed to abort at all.

Note that the arm32 and native_posix architectures, which have their
own thread abort implementations, now see a much simplified
"z_thread_abort()" internal API.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-02-24 16:39:15 -05:00
Andy Ross c0c8cb0e97 kernel: Remove abort and join implementation (UNBISECTABLE)
THIS COMMIT DELIBERATELY BREAKS BISECTABILITY FOR EASE OF REVIEW.
SKIP IF YOU LAND HERE.

Remove the existing implementation of k_thread_abort(),
k_thread_join(), and the attendant facilities in the thread subsystem
and idle thread that support them.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-02-24 16:39:15 -05:00
Andy Ross 419f37043b kernel/sched: Clamp minimum timeslice when TICKLESS
When the kernel is TICKLESS, timeouts are set as needed, and drivers
all have some minimum amount of time before which they can reliably
schedule an interrupt.  When a requested timeout falls inside that
window, drivers will push the interrupt out by one tick.  This means
that it's not
reliably possible to get a timeout set for "one tick in the
future"[1].

And attempting to do that is dangerous anyway.  If the driver will
delay a one-tick interrupt, then code that repeatedly tries to
schedule an imminent interrupt may end up in a state where it is
constantly pushing the interrupt out into the future, and timer
interrupts stop arriving!  The timeout layer actually has protection
against this case.

Finally getting to the point: in recent changes, the timeslice layer
lost its integration with the "imminent" test in the timeout code, so
it's now able to run into this situation: very rapidly context
switching code (or rapidly arriving interrupts) will have the effect
of infinitely[2] delaying timeouts and stalling the whole timeout
subsystem.

Don't try to be fancy.  Just clamp timeslice duration such that a
slice is 2 ticks at minimum and we'll never hit the problem.  Adjust
the two tests that were explicitly requesting very short slice rates.

[1] Of course, the tradeoff is that the tick rate can be 100x higher
or more, so on balance tickless is a huge win.

[2] Actually it only lasts until a 31 bit signed rollover in the HPET
cycle count in practice.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-02-24 16:39:15 -05:00
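
A sketch of the clamp itself (illustrative helper; the actual change
adjusts the timeslice configuration path):

    static int clamp_slice_ticks(int requested_ticks)
    {
        /* Never program a slice shorter than two ticks, so a "one tick
         * in the future" timeout can't be pushed out indefinitely.
         */
        return (requested_ticks < 2) ? 2 : requested_ticks;
    }
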
Andy Ross a202670c18 kernel/sched: Remove now-spurious SWAP_NONATOMIC workaround
Recent work to normalize use of the thread QUEUED state bit means that
we never attempt to remove unqueued threads from the low-level run
queue.  So the old workaround for SWAP_NONATOMIC that was trying to
detect this condition isn't necessary anymore.

Which is serendipitous, because it was written to encode some very
specific logic about the circumstances where _current could be
dequeued that I'd like to be able to break.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-02-24 16:39:15 -05:00
Andy Ross 05c468f594 kernel/sched: Make z_ready_thread() safe vs. already-running threads
This is part of the scheduler API, and was always just a synchronized
wrapper around the internal ready_thread() function.  But where the
internal users seem to be careful not to call it on threads that are
not known to be already queued or running, the general users in the
IPC code seem to be less strict.

Add a simple test to detect the case where a thread is already
running.  Right now this just loops over the array of CPUs, so is O(N)
in the CPU count even though N is never more than four for us
currently.  But this is possible without modifying data structures.  A
more scalable way to do this if we ever need to run on very parallel
systems would be to use another state bit for RUNNING, or to keep a
backpointer in the thread struct to the CPU it's running on, etc...

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-02-24 16:39:15 -05:00
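
A sketch of the O(N)-over-CPUs check described above, with illustrative
types and names rather than the kernel's internal structures:

    struct k_thread;

    struct cpu_slot { struct k_thread *current; };

    #define MAX_CPUS 4
    extern struct cpu_slot cpus[MAX_CPUS];

    static int thread_is_running(struct k_thread *thread)
    {
        for (int i = 0; i < MAX_CPUS; i++) {
            if (cpus[i].current == thread) {
                return 1;   /* already running: don't re-ready it */
            }
        }
        return 0;
    }
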
Andy Ross 6b84ab3830 kernel/sched: Adjust locking in z_swap()
Swap was originally written to use the scheduler lock just to select a
new thread, but it would be nice to be able to rely on scheduler
atomicity later in the process (in particular it would be nice if the
assignment to cpu.current could be seen atomically).  Rework the code
a bit so that swap takes the lock itself and holds it until just
before the call to arch_switch().

Note that the local interrupt mask has always been required to be held
across the swap, so extending the lock here has no effect on latency
at all on uniprocessor setups, and even on SMP only affects average
latency and not worst case.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-02-24 16:39:15 -05:00
Andy Ross 37866336f9 kernel/sched: Fix race between thread wakeup timeout and abort
Aborted threads will cancel their timeouts, but the timeout subsystem
isn't protected under the same lock so it's possible for a timeout to
fire just as a thread is being aborted and wake it up unexpectedly.
Check the state before blowing anything up.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-02-24 16:39:15 -05:00
Andrei Emeltchenko 377456c5af kernel: Move LOCKED() macro to kernel_internal.h
Remove duplication in the code by moving macro LOCKED() to the correct
kernel_internal.h header.

Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
2021-02-22 14:56:37 -05:00
Andy Ross 4ff457113e kernel/sched: Fix rare SMP deadlock
It was possible with pathological timing (see below) for the scheduler
to pick a cycle of threads on each CPU and enter the context switch
path on all of them simultaneously.

Example:
   * CPU0 is idle, CPU1 is running thread A
   * CPU1 makes high priority thread B runnable
   * CPU1 reaches a schedule point (or returns from an interrupt) and
     decides to run thread B instead
   * CPU0 simultaneously takes its IPI and returns, selecting thread A

Now both CPUs enter wait_for_switch() to spin, waiting for the context
switch code on the other thread to finish and mark the thread
runnable.  So we have a deadlock, each CPU is spinning waiting for the
other!

Actually, in practice this seems not to happen on existing hardware
platforms, it's only exercisable in emulation.  The reason is that the
hardware IPI time is much faster than the software paths required to
reach a schedule point or interrupt exit, so CPU1 always selects the
newly scheduled thread and no deadlock appears.  I tried for a bit to
make this happen with a cycle of three threads, but it's complicated
to get right and I still couldn't get the timing to hit correctly.  In
qemu, though, the IPI is implemented as a Unix signal sent to the
thread running the other CPU, which is far slower and opens the window
to see this happen.

The solution is simple enough: don't store the _current thread in the
run queue until we are on the tail end of the context switch path,
after wait_for_switch(), where we are guaranteed to reach the end.

Note that this requires changing a little logic to handle the yield
case: because we can no longer rely on _current's position in the run
queue to suppress it, we need to do the priority comparison directly
based on the existing "swap_ok" flag (which has always meant
"yielded", and maybe should be renamed).

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-02-14 16:22:45 -05:00
Andy Ross 91946ef21c kernel/sched: Refactor, unify management of QUEUED state
The QUEUED state flag was managed separately from the run queue
insertion/deletion, and the logic (while AFAICT perfectly correct) was
tangled in a few places trying to keep them in sync.  Put the
management of both behind a queue_thread()/dequeue_thread() API for
clarity.  The ALWAYS_INLINE usage seems to be working to get the
compiler to condense the resulting multiple assignments.  No behavior
change.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-02-14 16:22:45 -05:00
Andy Ross dd43221540 kernel/sched: Fix race with switch handle
The "null out the switch handle and put it back" code in the swap
implementation is a holdover from some defensive coding (not wanting
to break the case where we picked our current thread), but it hides a
subtle SMP race: when that field goes NULL, another CPU that may have
selected that thread (which is to say, our current thread) as its next
to run will be spinning on that to detect when the field goes
non-NULL.  So it will get the signal to move on when we revert the
value, when clearly we are still running on the stack!

In practice this was found on x86 which poisons the switch context
such that it crashes instantly.

Instead, be firm about state and always set the switch handle of a
currently running thread to NULL immediately before it starts running:
right before entering arch_switch() and symmetrically on the interrupt
exit path.

Fixes #28105

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-02-14 16:22:45 -05:00
Andy Ross 1ba7414029 kernel/sched: Correct coherence assert
Some legacy spots in our IPC layer (legally) pass a NULL wait queue to
pend().  Allow this in the coherence assertion.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-02-11 14:47:40 -05:00
Andy Ross 604f0f44b6 kernel/sched: Add missing lock around waitq unpend calls
The two calls to unpend a thread from a wait queue were inexplicably*
unsynchronized, as James Harris discovered.  Rework them to call the
lowest-level primitives so we can wrap the process inside the scheduler
lock.

Fixes #32136

* I took a brief look.  What seems to have happened here is that these
  were originally synchronized via an implicit lock taken by an outer
  caller (remember the original uniprocessor irq_lock() API is a
  recursive lock), and they were mostly implemented in terms of
  middle-level calls that were themselves locked.  So those got ported
  over to the newer spinlock, but the outer wrapper layer got forgotten.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-02-10 07:43:18 -05:00
Anas Nashif 39f632e7f0 kernel: fix usage of KERNEL_COHERENCE macro
Add missing CONFIG_ to KERNEL_COHERENCE usage in code.

Fixes #30380

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2021-02-03 10:42:04 -05:00
Enjia Mai 53ca709828 tests: coverage: exclude the CODE UNREACHABLE of code coverage
1. Exclude the CODE UNREACHABLE line while generating the coverage report.
2. Exclude the deprecated memory domain API when calculating code
coverage.

Signed-off-by: Enjia Mai <enjiax.mai@intel.com>
2021-01-15 12:42:00 -05:00
Andy Ross ef626571b2 kernel/sched: Optimize deadline comparison
Needing to check the current cycle time (which involves a spinlock and
register read on most architectures) is wasteful in the scheduler
priority predicate, which is a hot path.  If we "burn" one bit of
precision (and document the rule), we can do the comparison without
knowing the current time.

2^31 cycles is still far longer than a live deadline thread in any
legitimate realtime app should ever live before being scheduled.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-01-15 11:35:50 -05:00
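
A sketch of the wraparound-safe ordering this enables, assuming both
deadlines lie within 2^31 cycles of each other (illustrative helper):

    #include <stdint.h>

    static int deadline_earlier(uint32_t d1, uint32_t d2)
    {
        /* The signed difference orders the deadlines without reading
         * the current cycle counter; the "burned" bit is the sign bit.
         */
        return (int32_t)(d1 - d2) < 0;
    }
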
Andy Ross e956639dd6 kernel: Remove CONFIG_LEGACY_TIMEOUT_API
This was a fallback for an API change several versions ago.  It's time
for it to go.

Fixes: #30893

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-01-14 21:33:16 -05:00
Marcin Niestroj 11cb1cf336 kernel: sched: fix legacy timeout calculation in z_tick_sleep
Ticks should be assigned directly to timeout value in case of
CONFIG_LEGACY_TIMEOUT_API=y, just as they were before referenced patch.

Fixes: 7a815d5d99 ("kernel: sched: Use k_ticks_t in z_tick_sleep")
Signed-off-by: Marcin Niestroj <m.niestroj@grinn-global.com>
2020-12-18 14:03:25 -05:00
Anas Nashif 87ddddae52 Revert "kernel: fix usage of KERNEL_COHERENCE macro"
This reverts commit 67c5e6b0c0.

This is causing build issues on some platforms. Revert for now.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2020-12-08 14:27:27 -05:00
Anas Nashif 67c5e6b0c0 kernel: fix usage of KERNEL_COHERENCE macro
Add missing CONFIG_ to KERNEL_COHERENCE usage in code.

Fixes #30380

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2020-12-08 09:30:02 -05:00
Flavio Ceolin 9a16097fd8 kernel: sched: Change variable name in z_tick_sleep
Change a variable name to avoid confusion between time and ticks.

Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
2020-12-04 23:05:57 -05:00
Flavio Ceolin 7a815d5d99 kernel: sched: Use k_ticks_t in z_tick_sleep
z_tick_sleep was using int32_t, which could cause a possible overflow
when converting from k_ticks_t.

Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
2020-12-04 23:05:57 -05:00
Krzysztof Chruscinski 3ed8083dc1 kernel: Cleanup logger setup in kernel files
Most of the kernel files were declaring the os log module without
providing a log level. Because of that, the default log level was used
instead of CONFIG_KERNEL_LOG_LEVEL.

Signed-off-by: Krzysztof Chruscinski <krzysztof.chruscinski@nordicsemi.no>
2020-11-27 09:56:34 -05:00
Daniel Leung 11e6b43090 tracing: roll thread switch in/out into thread stats functions
Since the tracing of threads being switched in/out has the same
instrumentation points, we can roll the tracing function calls
into the thread stats gathering functions.
This avoids duplicating code to call another function.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2020-11-11 23:55:49 -05:00
Andrew Boie 933b420235 kernel: add context pointer to thread->fn_abort
For compatibility layers like CMSIS where thread objects
are drawn from a pool, provide a context pointer to the
exited thread object so it may be freed.

This is somewhat obscure and has no supporting APIs or
overview documentation and should be considered a private
kernel feature. Applications should really be using
k_thread_join() instead.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-10-22 23:32:37 -04:00
Anas Nashif d2c71796af kernel: document k_sleep with K_FOREVER
When calling k_sleep with K_FOREVER as the timeout value, we consider
this a suspend call.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2020-10-22 07:00:15 -04:00
Anas Nashif 081605ee23 kernel: do not queue a thread that is already queued
Do not add a thread to the run queue if it was already added.

Fixes #29244

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2020-10-22 07:00:15 -04:00
Anas Nashif bf69afcdae kernel: only resume suspended threads
Do not try to resume a thread that was not suspended.

Fixes #28694

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2020-10-22 07:00:15 -04:00
Andy Ross f6d32ab0a4 kernel: Add cache coherence management framework
Zephyr SMP kernels need to be able to run on architectures with
incoherent caches.  Naive implementation of synchronization on such
architectures requires extensive cache flushing (e.g. flush+invalidate
everything on every spin lock operation, flush on every unlock!) and
is a performance problem.

Instead, many of these systems will have access to separate "coherent"
(usually uncached) and "incoherent" regions of memory.  Where this is
available, place all writable data sections by default into the
coherent region.  An "__incoherent" attribute flag is defined for data
regions that are known to be CPU-local and which should use the cache.
By default, this is used for stack memory.

Stack memory will be incoherent by default, as by definition it is
local to its current thread.  This requires special cache management
on context switch, so an arch API has been added for that.

Also, when enabled, add assertions to strategic places to ensure that
shared kernel data is indeed coherent.  We check thread objects, the
_kernel struct, waitq's, timeouts and spinlocks.  In practice almost
all kernel synchronization is built on top of these structures, and
any shared data structs will contain at least one of them.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2020-10-21 06:38:53 -04:00
Andrew Boie b5a71f74a8 userspace: remove threads from domain on abort
When threads exited we were leaving dangling references to
them in the domain's mem_domain_q.

z_thread_single_abort() now calls into the memory domain
code via z_mem_domain_exit_thread() to take it off.

The thread setup code now invokes z_mem_domain_init_thread(),
avoiding extra checks in k_mem_domain_add_thread(); we know
the object isn't currently a member of a domain.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-10-20 09:37:49 -07:00
Andrew Boie e0ca403f4c kernel: add assert for mis-used k_thread_create()
k_thread_create() works as expected on both uninitialized memory,
or threads that have completely exited.

However, horrible and difficult to comprehend things can happen if a
thread object is already being used by the kernel and
k_thread_create() is called on it.

Historically this has been a problem with test cases trying to be
parsimonious with thread objects and not properly cleaning up
after themselves. Add an assertion for this which should catch
both the illegal creation of a thread already active, or threads
racing to create the same thread object.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-09-30 14:11:59 -04:00
Andrew Boie f5a7e1a108 kernel: handle thread self-aborts on idle thread
Fixes races where threads on another CPU are joining the
exiting thread, since it could still be running when
the joiners wake up on a different CPU.

Fixes problems where the thread object is still being
used by the kernel when the fn_abort() function is called,
preventing the thread object from being recycled or
freed back to a slab pool.

Fixes a race where a thread is aborted from one CPU while
it self-aborts on another CPU, that was currently worked
around with a busy-wait.

Precedent for doing this comes from FreeRTOS, which also
performs final thread cleanup in the idle thread.

Some logic in z_thread_single_abort() has been rearranged such that
when we release sched_spinlock, the thread object pointer
is never dereferenced by the kernel again; join waiters
or fn_abort() logic may free it immediately.

An assertion has been added to z_thread_single_abort() to ensure
it never gets called with thread == _current outside of an ISR.

Some logic has been added to ensure z_thread_single_abort()
tasks don't run more than once.

Fixes: #26486
Related to: #23063 #23062

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-09-30 14:11:59 -04:00
Watson Zeng 37f75d2d1f kernel: sched: bug fix for trace and monitor
sys_trace_thread_abort() and z_thread_monitor_exit() in
z_thread_single_abort() also need to be protected by sched_spinlock;
otherwise, if there is a pending interrupt right after the spinlock is
released, it will cause a reschedule on interrupt exit and those trace
and monitor functions will never be reached.

Signed-off-by: Watson Zeng <zhiwei@synopsys.com>
2020-09-17 09:30:22 +02:00
Andrew Boie a8775ab8cb sched: don't use local lock in z_tick_sleep()
We're modifying thread_state. Use sched_spinlock instead.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-09-16 13:11:12 -05:00
Andrew Boie 8e0f6a5936 sched: hold spinlock in z_time_slice()
We are checking thread->base members like thread_state and prio
and making decisions based on them; hold the sched_spinlock to
avoid potential concurrency problems if these members are modified
on another CPU or by a nested interrupt.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-09-16 13:11:12 -05:00
Andrew Boie 83d7770de4 sched: check if runnable in sliceable()
We need to check if a thread is runnable at all before we
contemplate putting it on the end of the priority queue; it
might not be on the queue at all if it was suspended.

Replaces the less comprehensive check to see if the thread
was pending a timeout.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-09-16 13:11:12 -05:00
Andrew Boie ffc5bdffbb sched: hold spinlock in z_thread_timeout()
We are checking and modifying members of thread->base
(in particular its waitq and thread_state) which are
nominally protected by sched_spinlock. Hold it while
doing this to avoid concurrent changes on another CPU
or ISR preemption.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-09-16 13:11:12 -05:00
Watson Zeng 1dddbecb35 tracing: swap: bug fix and enhancement for ARC
* Move switched_in into the arch context switch assembly code,
  which will correctly record the switched_in information.

* Add switched_in/switched_out for context switch in irq exit.

Signed-off-by: Watson Zeng <zhiwei@synopsys.com>
2020-09-03 21:54:15 +02:00
Andrew Boie 3425c32328 kernel: move stuff into z_thread_single_abort()
The same code was being copy-pasted in k_thread_abort()
implementations; just move it into z_thread_single_abort().

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-09-02 15:02:06 -07:00
Anas Nashif 5c31d00a6a tracing: trace k_sleep
Trace when k_sleep is called.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2020-08-24 13:21:12 +02:00
Anas Nashif 379b93f0d3 kernel: do not call swap if next ready thread is current thread
Check if the next ready thread is the same as the current thread before
calling z_swap().
This avoids calling swap just to go back to the original thread.

Original code: thread 0x20000118 switches out and then in again...

>> 0x20000118 gives semaphore(signal): 0x20000104 (count: 0)
>> thread ready: 0x20000118
>> 0x20000118 switched out
>> 0x20000118 switched in
>> end call to k_sem_give
>> 0x20000118 takes semaphore(wait): 0x200000f4 (count: 0)
>> thread pend: 0x20000118
>> 0x20000118 switched out
>> 0x200001d0 switched in

with this patch:

>> 0x200001d0 gives semaphore(signal): 0x200000f4 (count: 0)
>> thread ready: 0x200001d0
>> end call to k_sem_give
>> 0x200001d0 takes semaphore(wait): 0x20000104 (count: 0)
>> thread pend: 0x200001d0
>> 0x200001d0 switched out
>> 0x20000118 switched in
>> end call to k_sem_take

The above is output from tracing with a custom format used for
debugging.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2020-08-12 15:32:29 -04:00
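
A sketch of the short-circuit added here, with illustrative names:

    struct k_thread;

    extern struct k_thread *next_ready_thread(void);
    extern struct k_thread *current_thread(void);
    extern void swap_to(struct k_thread *thread);

    static void maybe_swap(void)
    {
        struct k_thread *next = next_ready_thread();

        if (next != current_thread()) {
            swap_to(next);   /* only swap when it changes anything */
        }
    }
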
Enjia Mai 7ac40aabc0 tests: adding test cases for arch-dependent SMP function
Add another test case covering both the arch_curr_cpu() and
arch_sched_ipi() architecture layer interfaces.

Signed-off-by: Enjia Mai <enjiax.mai@intel.com>
2020-07-02 08:42:53 -04:00
Anas Nashif 2c5d40437b kernel: logging: convert K_DEBUG to LOG_DBG
Move K_DEBUG to use LOG_DBG instead of plain printk.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2020-06-25 16:12:36 -05:00
Kumar Gala a1b77fd589 zephyr: replace zephyr integer types with C99 types
git grep -l 'u\(8\|16\|32\|64\)_t' | \
		xargs sed -i "s/u\(8\|16\|32\|64\)_t/uint\1_t/g"
	git grep -l 's\(8\|16\|32\|64\)_t' | \
		xargs sed -i "s/s\(8\|16\|32\|64\)_t/int\1_t/g"

Signed-off-by: Kumar Gala <kumar.gala@linaro.org>
2020-06-08 08:23:57 -05:00
Andrew Boie f1b5d9db8e kernel: fix issue with k_thread_join() timeouts
If k_thread_join() was passed an actual timeout value,
and not K_FOREVER, the blocking thread was not being properly
woken up when the target thread exited. The timeout itself
was never aborted, causing the joining thread to remain
un-scheduled until the timeout expired.

Amend the k_thread_join() test cases to check that the join
completed before the provided timeout period expired.

Fixes: #24744

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-05-05 11:43:08 -07:00
Andy Ross cfeb07eded kernel/timeout: Enable 64 bit timeout precision
Add a CONFIG_TIMEOUT_64BIT kconfig that, when selected, makes the
k_ticks_t used in timeout computations pervasively 64 bit.  This will
allow much longer timeouts and much faster (i.e. more precise) tick
rates.  It also enables the use of absolute (not delta) timeouts in an
upcoming commit.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2020-03-31 19:40:47 -04:00
Andy Ross 7832738ae9 kernel/timeout: Make timeout arguments an opaque type
Add a k_timeout_t type, and use it everywhere that kernel API
functions were accepting a millisecond timeout argument.  Instead of
forcing milliseconds everywhere (which are often not integrally
representable as system ticks), do the conversion to ticks at the
point where the timeout is created.  This avoids an extra unit
conversion in some application code, and allows us to express the
timeout in units other than milliseconds to achieve greater precision.

The existing K_MSEC() et. al. macros now return initializers for a
k_timeout_t.

The K_NO_WAIT and K_FOREVER constants have now become k_timeout_t
values, which means they cannot be operated on as integers.
Applications which have their own APIs that need to inspect these
vs. user-provided timeouts can now use a K_TIMEOUT_EQ() predicate to
test for equality.

Timer drivers, which receive an integer tick count in their
z_clock_set_timeout() functions, now use the integer-valued
K_TICKS_FOREVER constant instead of K_FOREVER.

For the initial release, to preserve source compatibility, a
CONFIG_LEGACY_TIMEOUT_API kconfig is provided.  When true, the
k_timeout_t will remain a compatible 32 bit value that will work with
any legacy Zephyr application.

Some subsystems present timeout (or timeout-like) values to their own
users as APIs that would re-use the kernel's own constants and
conventions.  These will require some minor design work to adapt to
the new scheme (in most cases just using k_timeout_t directly in their
own API), and they have not been changed in this patch, instead
selecting CONFIG_LEGACY_TIMEOUT_API via kconfig.  These subsystems
include: CAN Bus, the Microbit display driver, I2S, LoRa modem
drivers, the UART Async API, Video hardware drivers, the console
subsystem, and the network buffer abstraction.

k_sleep() now takes a k_timeout_t argument, with a k_msleep() variant
provided that works identically to the original API.

Most of the changes here are just type/configuration management and
documentation, but there are logic changes in mempool, where a loop
that used a timeout numerically has been reworked using a new
z_timeout_end_calc() predicate.  Also in queue.c (when POLL was
enabled), a similar loop was needlessly used to retry the
k_poll() call after a spurious failure.  But k_poll() does not fail
spuriously, so the loop was removed.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2020-03-31 19:40:47 -04:00
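
A brief usage sketch of the API shape described above, based on this
commit's description (the header path shown is the modern one and may
differ on older trees):

    #include <zephyr/kernel.h>   /* older trees: <zephyr.h> */

    void timeout_examples(void)
    {
        k_sleep(K_MSEC(100));    /* timeouts are k_timeout_t values */
        k_msleep(100);           /* millisecond variant, same behavior */

        k_timeout_t t = K_NO_WAIT;

        /* K_NO_WAIT/K_FOREVER are no longer plain integers; compare
         * against them with the K_TIMEOUT_EQ() predicate.
         */
        if (K_TIMEOUT_EQ(t, K_FOREVER)) {
            /* caller asked to wait indefinitely */
        }
    }
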
Oleg Zhurakivskyy b1e1f64d14 global: Replace BUILD_ASSERT_MSG() with BUILD_ASSERT()
Replace all occurrences of BUILD_ASSERT_MSG() with BUILD_ASSERT()
as a result of merging BUILD_ASSERT() and BUILD_ASSERT_MSG().

Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
2020-03-31 07:18:06 +02:00
Andrew Boie a4c9190649 kernel: fix oops policy for k_thread_abort()
Don't generate a Z_OOPS() if k_thread_abort() is called on a
thread that isn't running. Just return to the caller instead,
much like how k_thread_join() functions.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-03-25 10:23:12 -07:00
Carles Cufi 4b37a8f3a4 Revert "global: Replace BUILD_ASSERT_MSG() with BUILD_ASSERT()"
This reverts commit 8739517107.

Pull Request #23437 was merged by mistake with an invalid manifest.

Signed-off-by: Carles Cufi <carles.cufi@nordicsemi.no>
2020-03-19 18:45:13 +01:00
Oleg Zhurakivskyy 8739517107 global: Replace BUILD_ASSERT_MSG() with BUILD_ASSERT()
Replace all occurrences of BUILD_ASSERT_MSG() with BUILD_ASSERT()
as a result of merging BUILD_ASSERT() and BUILD_ASSERT_MSG().

Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
2020-03-19 15:47:53 +01:00
Andrew Boie 2dc2ecfb60 kernel: rename struct _k_object
Private type, internal to the kernel, not directly associated
with any k_object_* APIs. Is the return value of z_object_find().
Rename to struct z_object.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-03-17 20:11:27 +02:00
Andrew Boie 322816eada kernel: add k_thread_join()
Callers will go to sleep until the thread exits, either normally
or crashing out.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-03-13 08:42:43 -04:00
Andrew Boie 6cf496f324 kernel: use sched lock for k_thread_suspend/resume
This logic should be using the sched_lock and not its own
separate lock for these two functions.

Some simplifications were made; z_thread_single_resume and
z_thread_single_suspend were only used in one place, and there was
some redundant logic for whether to reschedule in the suspend case.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-03-10 09:57:58 -04:00
Andrew Boie 896e32b414 kernel: remove problematic pend() assertion
This assertion, if built in, allows user threads to crash
the kernel in a critical section by passing a negative timeout
value, creating a DoS attack vector.

Remove this assertion; immediately below it there's a check
which just resets the value to 0 anyway.

Fixes: #22999

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-02-21 08:57:07 -08:00
Andy Ross eefd3daa81 kernel/smp: arch/x86_64: Address race with CPU migration
Use of the _current_cpu pointer cannot be done safely in a preemptible
context.  If a thread is preempted and migrates to another CPU, the
old CPU record will be wrong.

Add a validation assert to the expression that catches incorrect
usages, and fix up the spots where it was wrong (most important being
a few uses of _current outside of locks, and the arch_is_in_isr()
implementation).

Note that the resulting _current expression now requires locking and
is going to be somewhat slower.  Longer term it's going to be better
to augment the arch API to allow SMP architectures to implement a
faster "get current thread pointer" action than this default.

Note also that this change means that "_current" is no longer
expressible as an lvalue (long ago, it was just a static variable), so
the places where it gets assigned now assign to _current_cpu->current
instead.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2020-02-08 08:51:04 -05:00
Andy Ross 5737b5c843 kernel/sched: Re-add IPI calls on k_wakeup() and k_thread_priority_set()
These got dropped by an earlier patch, but are required on SMP systems
to synchronously notify other CPUs of changed scheduler state.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2020-02-04 17:50:11 -05:00
Andy Ross c44d566aee kernel/sched: Re-fix SMP wait-for-switch on interrupt exit
This got clobbered by commit adac4cbafa in what I think was a rebase
mistake.  Without it, on SMP systems it's possible to select a new
_current thread and try to return into it before another CPU has
actually finished switching away from it.

Interestingly: the frequency with which this bug got caught once it
was reintroduced was much, much higher than it was when it was fixed
the first time due to the instruction pointer poisoning introduced in
the interim.  Incompletely saved threads now have deliberately broken
state when assertions are enabled and will panic synchronously.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2020-02-04 17:50:11 -05:00
Andy Ross 96ccc46e03 kernel/sched: Put k_thread_start() under a single lock
Similar to the suspend refactoring earlier, this really needs to be
done in an atomic block.  There were two confirmable races here,
though it's not completely clear either was being hit in practice:

1. The bit operations in z_mark_thread_as_started() aren't atomic so
   it needs to be protected.

2. The intermediate state in z_ready_thread() could result in a dead
   or suspended thread being added to the ready queue if another
   context tried a simultaneous abort or suspend.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2020-02-03 09:31:56 -05:00
Andy Ross ed6b4fb21c kernel/sched: Properly synchronize pend()
Kernel wait_q's and the thread pended_on backpointer are scheduler
state and need to be modified under the scheduler lock.  There was one
spot in pend() where they were not.

Also unpack z_remove_thread_from_ready_q() into an unsynchronized
utility so that it can be called by this process in a single lock
block.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2020-02-03 09:31:56 -05:00
Daniel Leung adac4cbafa sched: smp: fix thread marked dead but still running
Under SMP, when a thread is marked aborting, this thread may still
be running on another CPU. However, if there is only one thread
available to run, this thread may be selected to run again due to
next_up() not checking for the aborting state. Moreover, when
there is no IPI to signal to other CPUs that k_thread_abort() is being
called, the k_thread_abort() target thread is marked dead after a new
thread is selected to run. This causes the original thread calling
k_thread_abort() to mistakenly conclude that the target thread is no
longer running, and it returns.

Note that, with working IPI, z_sched_ipi() is called as an ISR
to mark the target thread dead. A new thread is then selected to
run, so that the target thread would not be selected due to it
being dead.

This moves the code to mark thread dead into next_up(), where
the next best thread is selected, and the current thread being
swapped out. z_sched_ipi() now becomes an empty function, and
calls to it are removed.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2020-01-31 11:46:35 -05:00
Andy Ross 3235451880 kernel/swap: Add SMP "wait for switch" synchronization
On SMP, there is an inherent race when swapping: the old thread adds
itself back to the run queue before calling into the arch layer to do
the context switch.  The former is properly synchronized under the
scheduler lock, and the latter operates with interrupts locally
disabled.  But until somewhere in the middle of arch_switch(), the old
thread (that is in the run queue!) does not have complete saved state
that can be restored.

So it's possible for another CPU to grab a thread before it is saved
and try to restore its unsaved register contents (which are garbage --
typically whatever state it had at the last interrupt).

Fix this by leveraging the "swapped_from" pointer already passed to
arch_switch() as a synchronization primitive.  When the switch
implementation writes the new handle value, we know the switch is
complete.  Then we can wait for that in z_swap() and at interrupt
exit.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2020-01-21 14:47:52 -08:00
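
A standalone sketch of the handshake described above: the incoming CPU
spins until the outgoing CPU's switch code publishes the saved handle
(illustrative types; not the actual z_swap/arch_switch code):

    #include <stdatomic.h>
    #include <stddef.h>

    struct thread_slot {
        _Atomic(void *) switch_handle;   /* written once state is fully saved */
    };

    static void wait_for_switch(struct thread_slot *t)
    {
        while (atomic_load(&t->switch_handle) == NULL) {
            /* spin: the other CPU hasn't finished saving this thread */
        }
    }
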
Andy Ross e06ba702d5 kernel/sched: Address thread abort termination delay issue on SMP
It's possible for a thread to abort itself simultaneously with an
external abort from another thread.  In fact in our test suite this is
a common thing, as ztest will abort its own spawned threads at the end
of a test, as they tend to be exiting on their own.

When that happens, the thread marks itself DEAD and does all its
scheduler bookkeeping, but it is STILL RUNNING on its own stack until
it makes its way to its final swap.  The external context would see
that "dead" metadata and return from k_thread_abort(), allowing the
next test to reuse and spawn the same thread struct while the old
context was still running.  Obviously that's bad.

Unfortunately, this is impossible to address completely without
modifying every SMP architecture to add a API-visible hook to every
swap that signals completion.  In practice the best we can do is add a
delay.  But note the optimization: almost always, the scheduler IPI
catches the running thread and kills it from interrupt context
(i.e. on a different stack).  When that happens, we know that the
interrupted thread will never be resumed (because it's dead) and can
elide the delay.  We only pay the cost when we actually detect a race.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2020-01-21 14:47:52 -08:00
Andy Ross 60247ca149 kernel/sched: Correct IPI usage
These two spots were calling z_sched_ipi() (the IPI handler run under
the ISR, which is a noop here because obviously the current thread
isn't DEAD) and not arch_sched_ipi() (which triggers an IPI on other
CPUs to inform them of scheduling state changes), presumably because
of a typo.

Apparently we don't have tests for k_wakeup() and
k_thread_priority_set() that are sensitive to latency in SMP
contexts...

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2020-01-21 14:47:52 -08:00
Anas Nashif 0ad67650f2 tracing: better positioning of tracing points
Improve the positioning of tracing calls. Avoid multiple calls and
missing events because of complex logic. Trace the event where things
really happen.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2020-01-09 11:21:19 -05:00
Andy Ross 8bdabcc46b kernel/sched: Move thread suspend and abort under the scheduler lock
Historically, these routines were placed in thread.c and would use the
scheduler via exported, synchronized functions (e.g. "remove from
ready queue").  But those steps were very fine grained, and there were
races where the thread could be seen by other contexts (in particular
under SMP) in an intermediate state.  It's not completely clear to me
that any of these were fatal bugs, but it's very hard to prove they
weren't.

At best, this is fragile.  Move the z_thread_single_suspend/abort()
functions into the scheduler and do the scheduler logic in a single
critical section.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2020-01-08 14:21:10 +01:00
Anas Nashif 9e3e7f6dda kernel: use 'thread' for thread variable consistently
We have been using thread, th and t for thread variables making the code
less readable, especially when we use t for timeouts and other time
related variables. Just use thread where possible and keep things
consistent.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2019-12-21 19:57:57 -05:00
Danny Oerndrup c9d78401cc spinlock: Make SPIN_VALIDATE a Kconfig option.
SPIN_VALIDATE is, as it was previously, enabled by default when there
are fewer than 4 CPUs and either no flash or a flash size greater than
32 kB.

Small targets, which need to have asserts enabled, can choose whether
to enable spinlock validation and thereby decide whether the
overhead added is acceptable or not.

Signed-off-by: Danny Oerndrup <daor@demant.com>
2019-12-20 19:51:16 -05:00
Peter Bigot 8162e586e3 kernel: sched: assert when k_sleep invoked from interrupt context
Fix a gap where k_sleep(K_FOREVER) could execute a code path that
would not verify that the call was not from interrupt context.

Signed-off-by: Peter Bigot <peter.bigot@nordicsemi.no>
2019-12-13 15:47:43 -05:00
Andy Ross 11a050b2c3 kernel/sched: Fix edge case in MetaIRQ preemption of cooperative threads
When a MetaIRQ preempts a cooperative thread, that thread would be
added back to the generic run queue.  When the MetaIRQ is done, the
highest priority thread will be selected to run, which may obviously
be a cooperative thread of a higher priority than the one that was
preempted.

But that's wrong, because the original thread was promised that it
would NOT be preempted until it reached a scheduling point on its own
(that's the whole point of a cooperative thread, of course).

We need to track the thread that got preempted (one per CPU) and
return to it instead of whatever else the scheduler might have found.

Fixes #20255

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-11-15 13:09:02 +01:00
Krzysztof Chruscinski f831929cb5 kernel: Add assert to detect negative timeouts
Add an assert when a negative value (except K_FOREVER) is passed as a
timeout. Also correct negative timeouts to 0.

Signed-off-by: Krzysztof Chruscinski <krzysztof.chruscinski@nordicsemi.no>
2019-11-08 16:03:05 -08:00
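
A sketch of the described behaviour using plain integer ticks
(illustrative only; the real check operates on the kernel's timeout
representation):

    #include <assert.h>

    #define TICKS_FOREVER (-1)   /* stand-in for the K_FOREVER encoding */

    static int sanitize_timeout(int ticks)
    {
        if (ticks != TICKS_FOREVER) {
            assert(ticks >= 0);  /* catch bogus negative waits */
            if (ticks < 0) {
                ticks = 0;       /* correct negative timeout to 0 */
            }
        }
        return ticks;
    }
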
Andrew Boie d2b8922fa3 kernel: allow threads to sleep forever
Previously, passing K_FOREVER to k_sleep() would return
immediately.

Forever is a long time. Even if woken up at some point,
we still had forever to sleep, so return K_FOREVER in this
case.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2019-11-09 00:36:34 +01:00
Andy Ross 8892406c1d kernel/sys_clock.h: Deprecate and convert uses of old conversions
Mark the old time conversion APIs deprecated, leave compatibility
macros in place, and replace all usage with the new API.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-11-08 11:08:58 +01:00
Andrew Boie 4f77c2ad53 kernel: rename z_arch_ to arch_
Promote the private z_arch_* namespace, which specifies
the interface between the core kernel and the
architecture code, to a new top-level namespace named
arch_*.

This allows our documentation generation to create
online documentation for this set of interfaces,
and this set of interfaces is worth treating in a
more formal way anyway.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2019-11-07 15:21:46 -08:00
Wayne Ren b1fbe85156 kernel: need to release spinlock before busy_wait
We need to release the spinlock before busy_wait;
otherwise other cores cannot get the spinlock while the holder is
busy waiting.

Signed-off-by: Wayne Ren <wei.ren@synopsys.com>
2019-11-07 16:54:56 -05:00
Andrew Boie 8f0bb6afe6 tracing: simplify idle thread detection
We now define z_is_idle_thread_object() in ksched.h,
and the repeated definitions of a function that does
the same thing are now changed to just use the common
definition.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2019-09-30 15:25:55 -04:00
Andrew Boie fe031611fd kernel: rename main/idle thread/stacks
The main and idle threads, and their associated stacks,
were being referenced in various parts of the kernel
with no central definition. Expose these in kernel_internal.h
and namespace with z_ appropriately.

The main and idle threads were being defined statically,
with another variable exposed to contain their pointer
value. This wastes a bit of memory and isn't accessible
to user threads anyway; just expose the actual thread
objects.

The redundant MAIN_STACK_SIZE and IDLE_STACK_SIZE defines
in init.c are removed; just use the Kconfigs they derive
from.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2019-09-30 15:25:55 -04:00
Andrew Boie e1ec59f9c2 kernel: renamespace z_is_in_isr()
This is part of the core kernel -> architecture interface
and is appropriately renamed z_arch_is_in_isr().

References from test cases changed to k_is_in_isr().

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2019-09-30 15:25:55 -04:00
Anas Nashif 4abbd54cd5 tracing: remove useless ifdefing for CONFIG_TRACING
Tracing functions are no-ops if CONFIG_TRACING is disabled.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2019-09-30 10:49:37 -04:00
Andy Ross d82f76a0bb kernel/sched: Don't make an IPI if we don't need it
If an architecture declares support for IPI, we still want to use it
only when running in SMP mode.

(This also fixes a build failure on ARC, which declares
CONFIG_SCHED_IPI_SUPPORTED but doesn't actually implement
z_arch_sched_ipi() yet).

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-09-26 16:54:06 -04:00
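A hedged sketch of the guard described above (the call site is
simplified here):

	static void flag_ipi(void)
	{
	#if defined(CONFIG_SMP) && defined(CONFIG_SCHED_IPI_SUPPORTED)
		/* Declared IPI support is not enough; only poke the other
		 * CPUs when the kernel is actually built for SMP.
		 */
		z_arch_sched_ipi();
	#endif
	}
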
Andy Ross 11bd67db53 kernel/idle: Use normal idle in SMP when IPI is available
Now that we have a working IPI framework, there's no reason for the
default spin loop for the SMP idle thread.  Just use the default
platform idle and send an IPI when a new thread is readied.

Long term, this can be optimized if necessary (e.g. only send the IPI
to idling CPUs, or check priorities, etc...), but for a 2-cpu system
this is a very reasonable default.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-09-26 16:54:06 -04:00
Andy Ross cb3964f04f kernel/sched: Reset time slice on swap in SMP
In uniprocessor mode, the kernel knows when a context switch "is
coming" because of the cache optimization and can use that to do
things like update time slice state.  But on SMP the scheduler state
may be updated on the other CPU at any time, so we don't know that a
switch is going to happen until the last minute.

Expose reset_time_slice() as a public function and call it when needed
out of z_swap().

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-09-26 16:54:06 -04:00
Andy Ross d442927667 kernel/sched: Add missing SMP thread abort case
The loop in thread abort on SMP where we wait for the results on an
IPI correctly handled the case where a thread running on another CPU
gets its interrupt and self-aborts, but it missed the case where the
other thread pends before receiving the interrupt.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-09-26 16:54:06 -04:00
Andy Ross b0158cc81f kernel/sched: Fix reschedule points in SMP
There were two related bugs when in SMP mode:

1. Underneath z_reschedule(), the code was inexplicably checking the
   swap_ok flag on the current CPU to see if it was OK to preempt the
   current thread, but reschedule is the DEFINITION of a schedule
   point and we always want to swap, even if the current thread is
   non-preemptible.

2. With similar symptoms: in k_yield() a previous fix corrected the
   queue handling for SMP, but it missed the case where a thread of
   the SAME priority as _current was on the queue and would fail to
   swap.  Yielding must always add the current thread to the back of
   the current priority.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-09-26 16:54:06 -04:00
Andy Ross 075c94f6e2 kernel: Port remaining syscalls to new API
These calls are not accessible in CI test, nor do they get built on
common platforms (in at least one case I found a typo which proved the
code was truly unused).  These changes are blind, so live in a
separate commit.  But the nature of the port is mechanical, all other
syscalls in the system work fine, and any errors should be easily
corrected.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-09-12 11:31:50 +08:00
Andy Ross 6564974bae userspace: Support for split 64 bit arguments
System call arguments, at the arch layer, are single words.  So
passing wider values requires splitting them into two registers at
call time.  This gets even more complicated for values (e.g
k_timeout_t) that may have different sizes depending on configuration.
This patch adds a feature to gen_syscalls.py to detect functions with
wide arguments and automatically generates code to split/unsplit them.

Unfortunately the current scheme of Z_SYSCALL_DECLARE_* macros won't
work with functions like this, because for N arguments (our current
maximum N is 10) there are 2^N possible configurations of argument
widths.  So this generates the complete functions for each handler and
wrapper, effectively doing in python what was originally done in the
preprocessor.

Another complexity is that traditionally the z_hdlr_*() function for a
system call has taken the raw list of word arguments, which does not
work when some of those arguments must be 64 bit types.  So instead of
using a single Z_SYSCALL_HANDLER macro, this splits the job of
z_hdlr_*() into two steps: An automatically-generated unmarshalling
function, z_mrsh_*(), which then calls a user-supplied verification
function z_vrfy_*().  The verification function is typesafe, and is a
simple C function with exactly the same argument and return signature
as the syscall impl function.  It is also not responsible for
validating the pointers to the extra parameter array or a wide return
value; that code gets automatically generated.

This commit includes new vrfy/msrh handling for all syscalls invoked
during CI runs.  Future commits will port the less testable code.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-09-12 11:31:50 +08:00
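A hedged sketch of the resulting pair for a hypothetical syscall taking
a 64-bit argument; the names k_foo, z_impl_k_foo, z_mrsh_k_foo and
z_vrfy_k_foo are illustrative, and the real generated code carries
extra plumbing:

	/* hand-written, typesafe verification with the same signature as
	 * the z_impl_k_foo() implementation function
	 */
	static int z_vrfy_k_foo(u64_t when)
	{
		/* validate arguments here, then forward */
		return z_impl_k_foo(when);
	}

	/* generated by gen_syscalls.py: reassemble the 64-bit value from
	 * the two 32-bit words it was split into at call time
	 */
	static uintptr_t z_mrsh_k_foo(uintptr_t arg0, uintptr_t arg1)
	{
		u64_t when = ((u64_t)arg1 << 32) | (u64_t)arg0;

		return (uintptr_t)z_vrfy_k_foo(when);
	}
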
Andy Ross 6f13980fc7 kernel/mutex: Fix locking to be SMP-safe
The mutex locking was written to use k_sched_lock(), which doesn't
work as a synchronization primitive if there is another CPU running
(it prevents the current CPU from preempting the thread, it says
nothing about what the others are doing).

Use the pre-existing spinlock for all synchronization.  One wrinkle is
that the priority code needed to call z_thread_priority_set(),
which is a rescheduling call that cannot be made with a lock held.
So that got split out with a low level utility that can update the
schedule state but allow the caller to defer yielding until later.

Fixes #17584

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-08-22 17:58:16 -04:00
Yasushi SHOJI 20d072465d kernel: sched: Do not force preempt when k_sched_unlock()
The scheduler lock is a nestable lock.  Unlocking a lock that is still
nested (i.e. still held at an outer level) shouldn't preempt the
current thread.

	k_sched_lock();
	k_sched_lock();
	k_sched_unlock();  /* <--- this shouldn't be a scheduling point */
	k_sched_unlock();  /* <--- this is a scheduling point */

This commit changes the preempt_ok argument from 1 to 0.  This lets
should_preempt() check whether it should preempt at that point or not.

This fixes #17869.

Signed-off-by: Yasushi SHOJI <y-shoji@ispace-inc.com>
2019-08-06 10:19:50 +02:00
Wentong Wu 2463ded4c8 kernel: timeout: do not activate time slicing if idle thread ready
Zero slice_ticks when time slicing cannot be used, so that next_timeout
will ignore the slice_ticks of _current_cpu and the system can stay in
a low power state for longer.

Fixes: #17368.

Signed-off-by: Wentong Wu <wentong.wu@intel.com>
2019-07-24 14:02:23 -07:00
Andy Ross 4d8e1f223b kernel/sched: Fix k_thread_priority_set() on SMP
On SMP systems, currently scheduled threads are not in the run queue
and can't be unconditionally removed/added.

Fixes #17170

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-12 14:09:16 -07:00
Andy Ross ed7d86310f kernel/sched: Interpret zero timeslice time correctly
The scheduler API has always allowed setting a zero slice size as a
way to disable timeslicing.  But the workaround introduced for
CONFIG_SWAP_NONATOMIC forgot that convention, and was calling
reset_time_slice() with that zero value (i.e. requesting an immediate
interrupt) in circumstances where z_swap() had been interrupted
nonatomically.

In practice, this never happened.  And if it did, it was a single
spurious no-op interrupt that no one cared about.  Until it did,
anyway...

Now that ticks on nRF devices are at full 32 kHz speed, we can get
into a situation where the rapidly triggering timeslice interrupts are
interrupting z_swap() calls, and the process feeds back on itself and
becomes self-sustaining.

Put that test into the time slice code itself to prevent this kind of
mistake in the future.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-02 22:52:29 -04:00
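A hedged sketch of moving that convention into the slicing code itself
(the variable and helper names approximate the scheduler internals of
the time):

	void z_reset_time_slice(void)
	{
		/* A slice size of zero means "timeslicing disabled"; never
		 * arm a timeout for it, even from the SWAP_NONATOMIC
		 * workaround path.
		 */
		if (slice_time != 0) {
			z_set_timeout_expiry(slice_time, false);
		}
	}
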
Anas Nashif 68c389c1f8 include: move system timer headers to include/drivers/timer/
Move internal and architecture specific headers from include/drivers to
subfolder for timer:

   include/drivers/timer

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2019-06-25 15:27:00 -04:00
Andrew Boie 3f974243be kernel: allow k_sleep(K_FOREVER)
Threads that are sleeping forever may be woken up with
k_wakeup(); this shouldn't fail assertions.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2019-06-18 09:08:01 -04:00
Andy Ross 312b43f145 kernel/sched: Don't reschedule inside a nested lock
The internal "reschedule" API has always understood the idea that it
might run in a ISR context where it can't swap.  But it has always
been happy to swap away when in thread mode, even when the environment
contains an outer lock that would NOT be expecting to swap!  As it
happened, the way irq locks are implemented (they store flag state
that can be restored without context) this would "work" even though it
was completely breaking the synchronization promise made by the outer
lock.

But now, with spinlocks, the error gets detected (albeit in a clumsy
way) in debug builds.  The unexpected swap triggers SPIN_VALIDATE
failures in later threads (this gets reported as a "recursive" lock,
but what actually happened is that another thread got to run before
the lock was released and tried to grab the same lock).

Fix this so that swap can only be called in a situation where the irq
lock key it was passed would have the effect of unmasking interrupts.
Note that this is a real behavioral change that affects when swaps
occur: it's not impossible that there is code out there that actually
relies on this "lock breaking reschedule" for correct behavior.  But
our previous implementation was irredeemably broken and I don't know
how to address that.

Fixes #16273

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-06-03 12:03:48 -07:00
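A hedged sketch of the new rule; the helper name
irq_key_unmasks_interrupts() is a placeholder for the check described
above, not a real kernel API:

	static void reschedule_sketch(struct k_spinlock *lock,
				      k_spinlock_key_t key)
	{
		/* Only context switch if restoring this key would actually
		 * unmask interrupts (hypothetical helper); otherwise the
		 * caller still holds an outer lock and swapping would break
		 * its guarantees.
		 */
		if (irq_key_unmasks_interrupts(key.key) && !k_is_in_isr()) {
			(void)z_swap(lock, key);
		} else {
			k_spin_unlock(lock, key);
		}
	}
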
Charles E. Youse a567831bed kernel/sched.c: add k_usleep() API function
Add k_usleep() API, analogous to k_sleep(), excepting that the argument
is in microseconds rather than milliseconds.

Signed-off-by: Charles E. Youse <charles.youse@intel.com>
2019-05-21 23:09:16 -04:00
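Usage is analogous to k_sleep(), just with a finer unit (values are
illustrative):

	k_sleep(5);      /* sleep for 5 milliseconds */
	k_usleep(500);   /* sleep for 500 microseconds */
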
Charles E. Youse b186303cb6 kernel/sched.c: refactor k_sleep() implementation for varied timescales
The current z_impl_k_sleep() does double duty, converting between units
specified by the API and ticks, as well as implementing the sleeping
mechanism itself. This patch separates the API from the mechanism,
so that sleeps need not be tied to millisecond timescales.

Signed-off-by: Charles E. Youse <charles.youse@intel.com>
2019-05-21 23:09:16 -04:00
Andrew Boie ae0d1b2b79 kernel: sched: move stack sentinel check earlier
Checking the stack sentinel may abort the current thread,
so make this check before we determine what the next thread
to run is.

Fixes: #15037

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2019-03-29 22:13:40 -04:00
Patrik Flykt 21358baa72 all: Update unsigned 'U' suffix due to multiplication
As the multiplication rule is updated, new unsigned suffixes
are added in the code.

Signed-off-by: Patrik Flykt <patrik.flykt@intel.com>
2019-03-28 17:15:58 -05:00
Patrik Flykt 24d71431e9 all: Add 'U' suffix when using unsigned variables
Add a 'U' suffix to values when computing and comparing against
unsigned variables.

Signed-off-by: Patrik Flykt <patrik.flykt@intel.com>
2019-03-28 17:15:58 -05:00
Flavio Ceolin 2df02cc8db kernel: Make if/iteration evaluate boolean operands
Controlling expression of if and iteration statements must have a
boolean type.

MISRA-C rule 14.4

Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
2019-03-26 22:06:45 -04:00
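For example (illustrative):

	int count = 10;

	/* before: the controlling expression is an int */
	while (count) {
		count--;
	}

	/* after: an explicit comparison yields a boolean (MISRA-C 14.4) */
	while (count != 0) {
		count--;
	}
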
Flavio Ceolin a996203739 kernel: Use macro BIT for shift operations
The BIT macro uses an unsigned int, avoiding implementation-defined
behavior when shifting signed types.

MISRA-C rule 10.1

Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
2019-03-26 14:31:29 -04:00
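For example (illustrative):

	u32_t flags = 0U;

	/* before: shifting a signed integer literal */
	flags |= (1 << 9);

	/* after: BIT() expands to an unsigned shift (e.g. 1UL << 9),
	 * avoiding implementation-defined behavior on signed types
	 */
	flags |= BIT(9);
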
Andy Ross 4521e0c111 kernel/sched: Mark sleeping threads suspended
On SMP, there was a bug where the logic that re-adds _current to the
run queue at swap time would accidentally reschedule threads that had
just gone to sleep, because the is_thread_prevented_from_running()
predicate only tests for threads that are "suspended" or "pending" and
not sleeping.

Overload _THREAD_SUSPENDED to indicate "sleeping" also.  Simple fix
for an immediate bug, though long term we really want to unify all the
blocked conditions to prevent this kind of state bug.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-03-23 19:28:15 -04:00
Andy Ross 3dea408405 kernel/sched: Flag DEAD on correct thread in cross-CPU abort
Daniel Leung caught a good one: In the (SMP) case where we were
aborting a thread that was not currently scheduled, we were flagging
the DEAD state on _current and not the thread we were aborting!  This
wasn't as fatal as it seems, as the thread that called z_sched_abort()
would effectively go on living (as a zombie?) in a state where it
would always be preempted, but would otherwise remain schedulable.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-03-19 13:39:24 -05:00
Andy Ross 722aeead91 kernel/sched: Nonatomic swap workaround update for qemu behavior
The workaround for nonatomic swap had yet another edge case: it would
save off the _current pointer when pending a thread so that the next
time slice interrupt could test it to see if the swap had actually
happened before assuming that _current could be rescheduled (if it
just pended itself, that's impossible).  Then it would clear the
pending_current pointer so future interrupts wouldn't be confused.

BUT: it turns out that qemu, when faced with really rapid timer rates
that exceed its (host-based) timing accuracy, is perfectly willing to
"stack up" timer interrupts such the one goes pending before the
previous one is finished executing.  In that case, we can enter the
SECOND timer interrupt, to try timeslicing a SECOND time, STILL before
the PendSV exception has run to actually effect the context switch.
Except this time pending_current has been cleared and we try to
reschedule the pended _current thread incorrectly.  In theory real
hardware could do this too, though it would involve absolutely crazy
interrupt latency problems.

Work around this by moving the clear to the thread itself: immediately
after it wakes up from the pend call, it retakes a lock and clears
pending_current if it still matches _current.  That is not a perfect
fix: there remains a 2-3 instruction race at that moment where we
return from pend and before we can lock interrupts again where a timer
interrupt will see an incorrect pointer.  But I hammered at this and
couldn't make qemu do that (i.e. return from a timer interrupt but
flag a new one in just a cycle or two).

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-03-15 05:50:43 +01:00
Andy Ross ea1c99b11b kernel/sched: Fix k_yield() in SMP
This was always doing a remove/add of the _current thread to the run
queue, which is wrong because in SMP _current isn't in the queue to
remove.  But it went undetected until the recent dlist changes.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-03-13 19:15:20 +01:00
Andy Ross 8c1bdda33c kernel/sched: Fix spinlock validation glitch in SMP
In SMP, we are setting the _current pointer while holding the
scheduler spinlock locally, which means that when we try to release it
the validation layer (not the spinlock per se) will scream at us
because the thread that took the lock doesn't match the one releasing
it.

Special case this when validation is enabled.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-03-13 19:15:20 +01:00
Andy Ross b18685bcf1 kernel/sched: Clean up tracing hooks
The tracing fixes in commit e87193896a ("subsys: debug: tracing: Fix
thread tracing") were... not a readability win.  The point appears to
have been to put a tracing hook immediately before and after the
assignment to the _current pointer.  So do that in an abstracted
function and clean up _get_next_switch_handle() (which is a subtle and
important function already polluted with some unavoidable preprocessor
testing!)

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-03-13 19:15:20 +01:00
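A hedged sketch of that abstraction; the hook names approximate the
tracing API of the time:

	/* Bracket the _current assignment with the two tracing hooks so
	 * the switch-handle code no longer carries them inline.
	 */
	static inline void set_current(struct k_thread *new_thread)
	{
		sys_trace_thread_switched_out();
		_current = new_thread;
		sys_trace_thread_switched_in();
	}
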
Andy Ross 42ed12a387 kernel/sched: arch/x86_64: Support synchronous k_thread_abort() in SMP
Currently thread abort doesn't work if a thread is currently scheduled
on a different CPU, because we have no way of delivering an interrupt
to the other CPU to force the issue.  This patch adds a simple
framework for an architecture to provide such an IPI, implements it
for x86_64, and uses it to implement a spin loop in abort for the case
where a thread is currently scheduled elsewhere.

On SMP architectures (xtensa) where no such IPI is implemented, we
fall back to waiting on an arbitrary interrupt to occur.  This "works"
for typical code (and all current tests), but of course it cannot be
guaranteed on such an architecture that k_thread_abort() will return
in finite time (e.g. the other thread on the other CPU might have
taken a spinlock and entered an infinite loop, so it will never
receive an interrupt to terminate itself)!

On non-SMP architectures this patch changes no code paths at all.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-03-13 19:15:20 +01:00
Andy Ross aed8288196 kernel/sched: Handle aborting _current correctly in SMP
In SMP, _current is not "queued".  (The run queue only stores
unscheduled threads because we can't rely on the head of the list
being _current).  We weren't updating the cache choice, which would
flag swap_ok, so calling k_thread_abort(_current) (for example, when a
thread exits from its entry function) would try to switch back into
the thread and then run off the end of the function.

Amusingly this was more benign than you'd think.  Stumbled on it by
accident.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-03-13 19:15:20 +01:00
Patrik Flykt 4344e27c26 all: Update reserved function names
Update reserved function names starting with one underscore, replacing
them as follows:
   '_k_' with 'z_'
   '_K_' with 'Z_'
   '_handler_' with 'z_handl_'
   '_Cstart' with 'z_cstart'
   '_Swap' with 'z_swap'

This renaming is done on both global and those static function names
in kernel/include and include/. Other static function names in kernel/
are renamed by removing the leading underscore. Other function names
not starting with any prefix listed above are renamed starting with
a 'z_' or 'Z_' prefix.

Function names starting with two or three leading underscores are not
automatically renamed since these names will collide with the variants
with two or three leading underscores.

Various generator scripts have also been updated as well as perf,
linker and usb files. These are
   drivers/serial/uart_handlers.c
   include/linker/kobject-text.ld
   kernel/include/syscall_handler.h
   scripts/gen_kobject_list.py
   scripts/gen_syscall_header.py

Signed-off-by: Patrik Flykt <patrik.flykt@intel.com>
2019-03-11 13:48:42 -04:00
Patrik Flykt cf2d57952e kernel/sched: Rename scheduler spinlock
Rename scheduler spinlock sched_lock to sched_spinlock as it will
collide with the cleanup of the reserved function name _sched_lock(),
which will also be called sched_lock().

Signed-off-by: Patrik Flykt <patrik.flykt@intel.com>
2019-03-11 13:48:42 -04:00
Andy Ross dff6b71450 kernel/sched: More nonatomic swap fixes
Nonatomic swap strikes again.  These issues are all longstanding, but
were unmasked by the dlist work in commit d40b8ce1fb ("sys: dlist:
Add sys_dnode_is_linked") where list node pointers become nulls on
removal.

The previous fix was for a specific case where a timeslicing interrupt
would try to slice out the "wrong" current thread because the thread
has "just" pended itself.  That was incomplete, because the parallel
code in k_sleep() didn't flag itself the same way.

And beyond that, it turns out to be basically impossible (now that I'm
thinking about it correctly) to prevent interrupt code from calling
into the scheduler to suspend a "just pended but not quite" current
and/or preempt away to another thread.  In any of these cases, the
scheduler modifications to the state bits remain correct but the queue
nodes may be corrupt because the thread was already removed from the
ready queue.  So we have to test and correct this at the lowest level,
where a thread is being removed from a priq: check that it's (1) the
ready queue and not a waitq, (2) the current thread, and (3) already
marked suspended and thus not in the queue.

There are lots of existing issues filed in the last few months all
pointing to odd instability on ARM platforms.  I'm reasonably certain
this is the root cause for most or all of them.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-02-27 12:07:34 -08:00
Andy Ross 1202810119 kernel/sched: _thread_priority_set needs to be sched_lock aware
This API doesn't use the normal thread priority comparison itself, so
doesn't get the magic that thread_base.prio provides.  If called when
another thread should be run, this would preempt the current thread
always, even if the scheduler lock was taken.

That was benign until the recent spinlockification exposed it: a mutex in
the philosophers test run in preempt_only mode would swap away while
holding a spinlock (which used to work with irq locks) and fail later
with a "recursive" spinlock assert.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-02-08 14:49:39 -05:00
Andy Ross d27d4e6af2 kernel/sched: Remove remaining irq_lock use
The k_sleep() locking was actually to protect the _current state from
preemption before the context switch, so document that and replace
with a spinlock.  Should probably unify this with the rather cleaner
logic in pend_curr(), but right now "sleeping" and "pended" are
needlessly distinct states.

And we can remove the locking entirely from k_wakeup().  There's no
reason for any of that to need to be synchronized.  Even if we're
racing with other thread modifiations, the state on exit will be a
runnable thread without a timeout, or whatever timeout/pend state the
other side was requesting (i.e. it's a bug, but not one solved by
synhronization).

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-02-08 14:49:39 -05:00
Andy Ross 1bf9bd04b1 kernel: Add _unlocked() variant to context switch primitives
These functions, for good design reason, take a locking key to
atomically release along with the context switch.  But there's still a
common pattern in code to do a switch unconditionally by passing
irq_lock() directly.  On SMP that's a little hurtful as it spams the
global lock.  Provide an _unlocked() variant for
_Swap/_reschedule/_pend_curr for simplicity and efficiency.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-02-08 14:49:39 -05:00
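A hedged sketch of one such variant, using the underscore-era names
referenced in the neighboring commits:

	/* Take the irq lock and immediately hand the key to the locking
	 * variant, so callers don't have to spell irq_lock() inline at
	 * every unconditional switch site.
	 */
	static inline void _reschedule_unlocked(void)
	{
		_reschedule_irqlock(irq_lock());
	}
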
Andy Ross ec554f44d9 kernel: Split reschedule & pend into irq/spin lock versions
Just like with _Swap(), we need two variants of these utilities which
can atomically release a lock and context switch.  The naming shifts
(for byte count reasons) to _reschedule/_pend_curr, and both have an
_irqlock variant which takes the traditional locking.

Just refactoring.  No logic changes.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-02-08 14:49:39 -05:00
Andy Ross aa6e21c24c kernel: Split _Swap() API into irqlock and spinlock variants
We want a _Swap() variant that can atomically release/restore a
spinlock state in addition to the legacy irqlock.  The function as it
was is now named "_Swap_irqlock()", while _Swap() now refers to a
spinlock and takes two arguments.  The former will be going away once
existing users (not that many!  Swap() is an internal API, and the
long port away from legacy irqlocking is going to be happening mostly
in drivers) are ported to spinlocks.

Obviously on uniprocessor setups, these produce identical code.  But
SMP requires that the correct API be used to maintain the global lock.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-02-08 14:49:39 -05:00
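A hedged sketch of the resulting pair of entry points (argument types
as of that era; the exact prototypes may differ):

	/* legacy: atomically restore an irq lock key across the switch */
	int _Swap_irqlock(unsigned int key);

	/* new: atomically release a spinlock (and restore its key) while
	 * switching; on SMP only this form maintains the global lock
	 * correctly
	 */
	int _Swap(struct k_spinlock *lock, k_spinlock_key_t key);
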
Andy Ross dc0713a706 kernel: Cleanup. Remove redundant test when calling _Swap()
_Swap() must already handle the case where _get_next_ready_thread() is
the same as _current.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-02-08 14:49:39 -05:00