Commit graph

399 commits

Author SHA1 Message Date
Andy Ross
b2791b0ac8 kernel/sched: Force inlining of some routines within the scheduler guts
GCC 6.2.0 is making frustratingly poor inlining decisions with some of
these routines, resulting in an awful lot of runtime calls for code
that is only ever expanded once or twice within the file.

Treat with targetted ALWAYS_INLINE's to force the issue.  The
scheduler code is a hot path.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-02-01 15:57:21 -05:00
Andy Ross
eda4c027da misc/dlist: Swap insertion API for a faster one
The sys_dlist_insert_*() functions had a behavior where a NULL
argument for the insertion position to sys_dlist_insert_after/before()
was interpreted as "the end of the list".  We never used that
convention (except in one spot internal to dlist.h which was not
itself used anywhere), and of course already have an API for appending
and prepending to a list.

In practice this was a performance disaster.  The NULL check is
virtually never provable statically by the compiler, so that test and
branch is present always.  And worse, the check and call to another
function was pushing this beyond the complexity limit for gcc to
inline a function (at -Os optimization anyway), forcing us to use
function calls for what should be a ~8 instruction sequence.  The
upshot is that dlist insertions were 2-3x slower than they needed to
be.

Deprecate these older APIs and introduce a new sys_dlist_insert() call
which can be much better optimized.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-02-01 15:57:21 -05:00
Peter A. Bigot
b4ece0ad44 kernel: timeout: detect inactive timeouts using dnode linked state
Whether a timeout is linked into the timeout queue can be determined
from the corresponding sys_dnode_t linked state.  This removes the need
to use a special flag value in dticks to determine that the timeout is
inactive.

Update _abort_timeout to return an error code, rather than the flag
value, when the timeout to be aborted was not active.

Remove the _INACTIVE flag value, and replace its external uses with an
internal API function that checks whether a timeout is inactive.

Signed-off-by: Peter A. Bigot <pab@pabigot.com>
2019-01-23 20:46:49 +01:00
Peter A. Bigot
692e1033e7 kernel: sched: fix empty list detection
CONTAINER_OF() on a NULL pointer returns some offset around NULL and not
another NULL pointer.  We have to check for that ourselves.

This only worked because the dnode happened to be at the start of the
struct.

Signed-off-by: Peter A. Bigot <pab@pabigot.com>
2019-01-23 20:46:49 +01:00
Andy Ross
7fb8eb57e8 kernel/sched: SWAP_NONATOMIC workaround for timeslicing
Timeslicing works by removing the _current thread from the run queue
and re-adding it at the end of its priority.  On systems with a
_Swap() that can be preempted by a timer interrupt, that means it's
possible for the timeslice to try to slice out a thread that had
already pended itself!

This behavior used to be benign (or at least undetectable) as the
duplicated list operations were idempotent.  But now the dlist code is
stricter about correctness and has exposed the bug -- it will blow up
if you try to remove an already-removed list node.

Fix (on affected platforms) by stashing the _current pointer in
_pend_current_thread() that is checked and cleared in the timer
interrupt.  If we discover we're trying to interrupt a thread that's
already interrupted itself, we can safely exit z_time_slice() as a
noop.  The timeslicing bookeeping was already done for us underneath
the pend code.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-01-15 13:06:35 +01:00
Andy Ross
23c5a63aa8 kernel/sched: Predicate SWAP_NONATOMIC workaround properly
This is a refactoring of the fix in commit 6c95dafd82 to limit its
application to affected platforms now that the root cause is
understood.

Note that the bug that fix was addressing was rare and seen only on
after multi-hour sessions on Michael Scott's test rig.  So if
something regresses, this is where to look!

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-01-15 13:06:35 +01:00
Andy Ross
bb86f2019c kernel/sched: Remove stale comment
The recent change that added a locked z_set_timeout_expiry() API
obsoleted the subtle note about synchronization above
reset_time_slice().  None of that matters any more, the API is
synchronized internally in a conventional way.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-01-03 12:29:02 -05:00
Flavio Ceolin
118715c62d misra: Fixes for MISRA-C rule 8.3
MISRA-C says all declarations of an object or function must use the
same name and qualifiers.

MISRA-C rule 8.3

Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
2018-12-07 09:06:34 -05:00
Flavio Ceolin
26be3355ac kernel: sched: Fix undefined behavior
The order of evaluation of function calls in the arguments of a
function. This is undefined (32)/ unspecified(15-18) in C99.

MISRA-C rule 13.2 does not allow that a value of an expression and its
side effects happens in not deterministic order to avoid these
undefined behaviors.

Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
2018-12-07 09:06:34 -05:00
Flavio Ceolin
80418602ed kernel: sched: Make boolean functions return bool
MISRA-C rule 14.4

Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
2018-11-30 08:05:11 -08:00
Pawel Dunaj
baea22407d kernel: Always set clock expiry with sync with timeout module
System must not set the clock expiry via backdoor as it may
effect in unbound time drift of all scheduled timeouts.

Fixes: #11502

Signed-off-by: Pawel Dunaj <pawel.dunaj@nordicsemi.no>
2018-11-26 12:24:59 +01:00
Andy Ross
02165d76a0 kernel/timeout: Fix race with clock timeout setting
The call to z_clock_set_timeout() was being made outside the timeout
lock, which can race against other contexts setting sooner-expiring
timeouts.

Also add a long comment to one spot (timeslicing) where this call is
made outside the timeout spinlock (inside the scheduler lock) and why
this is OK.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2018-11-21 12:52:49 +01:00
Andy Ross
1c3051459b kernel/sched: Fix race in k_sched_time_slice_set()
If this function is itself interrupted by a timeslice event, the
slicing state can be corrupted.  Just re-use the scheduler lock
instead of using a new spinlock; this is a low-latency function that
won't deadlock.  Found by inspection.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2018-11-13 17:10:07 -05:00
Flavio Ceolin
a406b88fca kernel: Remove duplicated identifier
There was an struct and a variable called _kernel. This is error prone
and a MISRA-C violation. It is changing the struct to have a unique
identifier.

MISRA-C rule 5.8

Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
2018-11-04 11:37:24 -05:00
Piotr Zięcik
7700eb2a15 kernel: sched: Make k_sleep() similar to POSIX equivalent
This commit introduces k_sleep() return value, which provides
information about actual sleep time. If the returned value is
not-zero, the thread slept shorter than requested, which is
only possible if the thread has been woken up by k_wakeup() call.

Signed-off-by: Piotr Zięcik <piotr.ziecik@nordicsemi.no>
2018-10-30 18:27:31 +01:00
Marek Pieta
e87193896a subsys: debug: tracing: Fix thread tracing
Change fixes issue with thread execution tracing.

Signed-off-by: Marek Pieta <Marek.Pieta@nordicsemi.no>
2018-10-29 22:09:12 -04:00
Spoorthi K
b6cd192fa5 kernel: sched: Fix compiler warning
Ignore return value of _Swap() as it is not
used anywhere.

Signed-off-by: Spoorthi K <spoorthi.k@intel.com>
2018-10-24 09:48:17 +01:00
Adithya Baglody
6176692f4b kernel: ksched.h: Incorrect argument type in _pend_current_thread
In _pend_current_thread the argument key is always a unsigned
interger type and this function forces it to become a signed
interger. This is a dangerous behavior and cant be trusted to
work as expected.

Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
2018-10-17 12:17:58 -04:00
Adithya Baglody
1424561252 kernel: sched: Fixed incorrect argument type of _reschedule()
This API shouldn't take a int type but instead it should take
u32_t. This argument has to be similar to irq_lock() and
irq_unlock().

Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
2018-10-17 07:59:51 -04:00
Andy Ross
7a035c0dc7 kernel/sched: Fix timeslice accounting for already-elapsed ticks
In tickless mode, not all elapsed ticks may have been announced yet,
so future z_time_slice() calls will include "extra" ticks that we have
to account for when setting up the slice count.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2018-10-16 15:03:10 -04:00
Andy Ross
1129ea9394 kernel/sched: Fix timeslicing predicate
It's possible to interrupt a thread that has already scheduled a
timeout.  Really this is a race against the usage of
_add_thread_timeout() and needs some design work to provide proper
locking (which is a distinct requirement from the scheduler lock and
timeout lock!), as the users of that API are spread around the kernel.
But existing usage always schedules the timeouts first, so this is
safe.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2018-10-16 15:03:10 -04:00
Andy Ross
2dd9e2cad4 kernel/sched: Remove spurious locking
The timeout APIs are properly synchronized now.  This irq_lock() (and
the comment explaining it) isn't needed anymore.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2018-10-16 15:03:10 -04:00
Andy Ross
987c0e5fc1 kernel: New timeout implementation
Now that the API has been fixed up, replace the existing timeout queue
with a much smaller version.  The basic algorithm is unchanged:
timeouts are stored in a sorted dlist with each node nolding a delta
time from the previous node in the list; the announce call just walks
this list pulling off the heads as needed.  Advantages:

* Properly spinlocked and SMP-aware.  The earlier timer implementation
  relied on only CPU 0 doing timeout work, and on an irq_lock() being
  taken before entry (something that was violated in a few spots).
  Now any CPU can wake up for an event (or all of them) and everything
  works correctly.

* The *_thread_timeout() API is now expressible as a clean wrapping
  (just one liners) around the lower-level interface based on function
  pointer callbacks.  As a result the timeout objects no longer need
  to store backpointers to the thread and wait_q and have shrunk by
  33%.

* MUCH smaller, to the tune of hundreds of lines of code removed.

* Future proof, in that all operations on the queue are now fronted by
  just two entry points (_add_timeout() and z_clock_announce()) which
  can easily be augmented with fancier data structures.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2018-10-16 15:03:10 -04:00
Andy Ross
5d203523b6 kernel/timeout: Eliminate wait_q parameters from API
Now that this is known to be an unused value, remove it from the API.
Note that this caught a few spots where we were passing values (a
non-NULL wait_q with a NULL thread handle) that were always being
ignored before.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2018-10-16 15:03:10 -04:00
Andy Ross
15d520819d kernel/timeout: Prepare unification of timeout/thread wait_q fields
The existing timeout API wants to store a wait_q on which the thread
is waiting, but it only uses that value in one spot (and there only as
a boolean flag indicating "this thread is waiting on a wait_q).

As it happens threads can already store their own backpointers to a
wait_q (needed for the SCALABLE scheduler backend), so we should use
that instead.

This patch doesn't actually perform that unification yet.  It
reorgnizes things such that the pended_on field is always set at the
point of timeout interaction, and adds a bunch of asserts to make 100%
sure the logic is correct.  The next patch will modify the API.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2018-10-16 15:03:10 -04:00
Andy Ross
9098a45c84 kernel: New timeslicing implementation
Instead of checking every time we hit the low-level context switch
path to see if the new thread has a "partner" with which it needs to
share time, just run the slice timer always and reset it from the
scheduler at the points where it has already decided a switch needs to
happen.  In TICKLESS_KERNEL situations, we pay the cost of extra timer
interrupts at ~10Hz or whatever, which is low (note also that this
kind of regular wakeup architecture is required on SMP anyway so the
scheduler can "notice" threads scheduled by other CPUs).  Advantages:

1. Much simpler logic.  Significantly smaller code.  No variance or
   dependence on tickless modes or timer driver (beyond setting a
   simple timeout).

2. No arch-specific assembly integration with _Swap() needed

3. Better performance on many workloads, as the accounting now happens
   at most once per timer interrupt (~5 Hz) and true rescheduling and
   not on every unrelated context switch and interrupt return.

4. It's SMP-safe.  The previous scheme kept the slice ticks as a
   global variable, which was an unnoticed bug.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2018-10-16 15:03:10 -04:00
Andy Ross
1c08aefe56 kernel/timeoutq: Uninline the timeout methods
There was no good reason to have these rather large functions in a
header.  Put them into sys_clock.c for now, pending rework to the
system.

Now the API is clearly visible in a small header.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2018-10-16 15:03:10 -04:00
Andy Ross
722a888ef7 timer: Clean up hairy tickless APIs
The tickless driver had a bunch of "hairy" APIs which forced the timer
drivers to do needless low-level accounting for the benefit of the
kernel, all of which then proceeded to implement them via cut and
paste.  Specifically the "program_time" calls forced the driver to
expose to the kernel exactly when the next interrupt was due and how
much time had elapsed, in a parallel API to the existing "what time is
it" and "announce a tick" interrupts that carry the same information.

Remove these from the kernel, replacing them with synthesized logic
written in terms of the simpler APIs.

In some cases there will be a performance impact due to the use of the
64 bit uptime call, but that will go away soon.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2018-10-16 15:03:10 -04:00
Andy Ross
ab488277bc drivers/timer: Unify timeout setting APIs
The existing API had two almost identical functions: _set_time() and
_timer_idle_enter().  Both simply instruct the timer driver to set the
next timer interrupt expiration appropriately so that the call to
z_clock_announce() will be made at the requested number of ticks.  On
most/all hardware, these should be implementable identically.

Unfortunately because they are specified differently, existing drivers
have implemented them in parallel.

Specify a new, unified, z_clock_set_timeout().  Document it clearly
for implementors.  And provide a shim layer for legacy drivers that
will continue to use the old functions.

Note that this patch fixes an existing bug found by inspection: the
old call to _set_time() out of z_clock_announce() failed to test for
the "wait forever" case in the situation where clock_always_on is
true, meaning that a system that reached this point and then never set
another timeout would freeze its uptime clock incorrectly.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2018-10-16 15:03:10 -04:00
Flavio Ceolin
02ed85bd82 kernel: sched: Change boolean APIs to return bool
Change APIs that essentially return a boolean expression  - 0 for
false and 1 for true - to return a bool.

MISRA-C rule 14.4

Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
2018-09-28 06:28:41 +05:30
Flavio Ceolin
4218d5f8f0 kernel: Make If statement have essentially Boolean type
Make if statement using pointers explicitly check whether the value is
NULL or not.

The C standard does not say that the null pointer is the same as the
pointer to memory address 0 and because of this is a good practice
always compare with the macro NULL.

Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
2018-09-18 13:57:15 -04:00
Flavio Ceolin
8f72f245bd kernel: Explicitly check _abort_thread_timemout
A lot of times this API is called during some cleanup even if the
timeout was not set to make the code simpler. In these cases it's not
necessary checking the return. Adding a cast to acknowledge it.

Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
2018-09-14 16:55:37 -04:00
Flavio Ceolin
98c64b6d92 kernel: Change _reschedule signature
_reschedule return's value is not used anywhere, except erroneously by
pthread_barrier_wait.

Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
2018-09-14 16:55:37 -04:00
Flavio Ceolin
5884c7f54b kernel: Explicitly ignoring _Swap return
Ignoring _Swap return where there is no treatment or nothing to do.

Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
2018-09-14 16:55:37 -04:00
Anas Nashif
a9f32d66cf tracing: remove stray event_logger code
Remove obsolete kernel event logger code.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2018-09-05 16:05:08 -04:00
Andy Ross
9ecc4ead68 sched: Properly account for timeslicing in tickless mode
When adding a new runnable thread in tickless mode, we need to detect
whether it will timeslice with the running thread and reset the timer,
otherwise it won't get any CPU time until the next interrupt fires at
some indeterminate time in the future.

This fixes the specific bug discussed in #7193, but the broader
problem of tickless and timeslicing interacting badly remains.  The
code as it exists needs some rework to avoid all the #ifdef mess.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2018-08-29 10:01:41 -04:00
Anas Nashif
0e07f8e97a Revert "sched: Properly account for timeslicing in tickless mode"
This reverts commit bc6fb65c81.

Causes MPU faults on multiple platforms.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2018-08-27 18:39:51 -04:00
Andy Ross
bc6fb65c81 sched: Properly account for timeslicing in tickless mode
When adding a new runnable thread in tickless mode, we need to detect
whether it will timeslice with the runnable thread and reset the
timer, otherwise it won't get any CPU time until the next interrupt
fires at some indeterminate time in the future.

This fixes the specific bug discussed in #7193, but the broader
problem of tickless and timeslicing interacting badly remains.  The
code as it exists needs some rework to avoid all the #ifdef mess.

Note that the patch also moves _ready_thread() from a ksched.h inline
to sched.c.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2018-08-27 13:19:29 -04:00
Anas Nashif
b6304e66f6 tracing: support generic tracing hooks
Define generic interface and hooks for tracing to replace
kernel_event_logger and existing tracing facilities with something more
common.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2018-08-21 05:45:47 -07:00
Flavio Ceolin
0866d18d03 irq: Fix irq_lock api usage
irq_lock returns an unsigned int, though, several places was using
signed int. This commit fix this behaviour.

In order to avoid this error happens again, a coccinelle script was
added and can be used to check violations.

Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
2018-08-16 19:47:41 -07:00
Piotr Zięcik
2a26576b03 kernel: sched: Use ticks as time unit in time slicing.
The time slicing settings was kept in milliseconds while all related
operations was based on ticks. Continuous back and forth conversion
between ticks and milliseconds introduced an accumulating error due
to rounding in _ms_to_ticks() and __ticks_to_ms(). As result
configured time slice duration was not achieved.

This commit removes excessive ticks <-> ms conversion by using ticks
as time unit for all operations related to time slicing.

Also, it fixes #8896 as well as #8897.

Signed-off-by: Piotr Zięcik <piotr.ziecik@nordicsemi.no>
2018-08-14 07:18:44 -07:00
Piotr Zięcik
e670135fdc kernel: sched: Fix comparsion in _update_time_slice_before_swap()
The _update_time_slice_before_swap() function directly compared
_time_slice_duration (expressed in ms) with value returned by
_get_remaining_program_time() which used ticks as a time unit.

Moreover, the _time_slice_duration was also used as an argument
for _set_time(), which expects time expressed in ticks.

This commit ensures that the same unit (ticks) is used in
comparsion and timer adjustments.

Signed-off-by: Piotr Zięcik <piotr.ziecik@nordicsemi.no>
2018-08-14 07:18:44 -07:00
Piotr Zięcik
4a39b9ea64 kernel: sched: Use ticks as time unit in time slicing.
The time slicing settings was kept in milliseconds while all related
operations was based on ticks. Continuous back and forth conversion
between ticks and milliseconds introduced an accumulating error due
to rounding in _ms_to_ticks() and __ticks_to_ms(). As result
configured time slice duration was not achieved.

This commit removes excessive ticks <-> ms conversion by using ticks
as time unit for all operations related to time slicing.

Also, it fixes #8896 as well as #8897.

Signed-off-by: Piotr Zięcik <piotr.ziecik@nordicsemi.no>
2018-08-13 07:13:22 -07:00
Piotr Zięcik
ee9a0615a4 kernel: sched: Fix comparsion in _update_time_slice_before_swap()
The _update_time_slice_before_swap() function directly compared
_time_slice_duration (expressed in ms) with value returned by
_get_remaining_program_time() which used ticks as a time unit.

Moreover, the _time_slice_duration was also used as an argument
for _set_time(), which expects time expressed in ticks.

This commit ensures that the same unit (ticks) is used in
comparsion and timer adjustments.

Signed-off-by: Piotr Zięcik <piotr.ziecik@nordicsemi.no>
2018-08-13 07:13:22 -07:00
Piotr Zięcik
fe2ac39bf2 kernel: Cleanup _ms_to_ticks().
This commit moves all implementations of the _ms_to_ticks() into
single file. Also, the function is now inline even if
_NEED_PRECISE_TICK_MS_CONVERSION is defined.

Signed-off-by: Piotr Zięcik <piotr.ziecik@nordicsemi.no>
2018-07-03 22:46:39 -04:00
Andy Ross
9f06a35450 kernel: Add the old "multi queue" scheduler algorithm as an option
Zephyr 1.12 removed the old scheduler and replaced it with the choice
of a "dumb" list or a balanced tree.  But the old multi-queue
algorithm is still useful in the space between these two (applications
with large-ish numbers of runnable threads, but that don't need fancy
features like EDF or SMP affinity).  So add it as a
CONFIG_SCHED_MULTIQ option.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2018-07-03 17:09:15 -04:00
Andy Ross
225c74bbdf kernel/Kconfig: Reorgnize wait_q and sched algorithm choices
Make these "choice" items instead of a single boolean that implies the
element unset.

Also renames WAITQ_FAST to WAITQ_SCALABLE, as the rbtree is really
only "fast" for large queue sizes (it's constant factor overhead is
bigger than a list's!)

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2018-07-03 17:09:15 -04:00
Anas Nashif
80e6a978a6 kernel/drivers: fix compile warnings
Uncovered by clang we have some functions being only used conditionally,
so gaurd them to make them only available when those conditions are met.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2018-07-01 22:58:23 +02:00
Michael Scott
6c95dafd82 kernel: sched: use _is_thread_ready() in should_preempt()
We are using _is_thread_prevented_from_running() to see if the
_current thread can be preempted in should_preempt().  The idea
being that even if the _current thread is a high priority coop
thread, we can still preempt it when it's pending, suspended,
etc.

This does not take into account if the thread is sleeping.

k_sleep() merely removes the thread from the ready_q and calls
Swap().  The scheduler will swap away from the thread temporarily
and then on the next cycle get stuck to the sleeping thread for
however long the sleep timeout is, doing exactly nothing because
other functions like _ready_thread() use _is_thread_ready() as a
check before proceeding.

We should use !_is_thread_ready() to take into account when threads
are waiting on a timer, and let other threads run in the meantime.

Signed-off-by: Michael Scott <michael@opensourcefoundries.com>
2018-06-04 08:21:47 -04:00
Andy Ross
43553da9b2 kernel/sched: Fix preemption logic
The should_preempt() code was catching some of the "unrunnable" cases
but not all of them, opening the possibility of failing to preempt a
just-pended thread and thus waking it up synchronously.  There are
reports of this causing spin loops over k_poll() in the network stack
work queues (see #8049).

Note that the previous _is_dummy() call is folded into (the somewhat
verbosely named) _is_thread_prevented_from_running(), and that the
order of tests has been changed/optimized to hopefully catch common
cases earlier.

Suggested-by: Michael Scott <michael@opensourcefoundries.com>
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2018-05-31 16:46:14 -04:00