kernel/sched: Fix thread selection misordering with aborted threads

When a running thread gets aborted asynchronously (this only happens
in SMP contexts, obviously) it gets flagged "aborting", but the actual
abort needs to happen in the thread's own context.  For convenience,
this was done in the next_up() routine that selects the next thread to
run at interrupt exit time.

But this check was being done AFTER the next candidate thread was
selected from the run queue.  Thread abort can wake up threads blocked
in k_thread_join(), so those newly woken threads weren't seen as
runnable candidates, even if they should have been.

Executive summary: if you killed a thread running on another CPU, and
there was another thread joined to the killed thread that should have
run on that CPU, it wouldn't (until it received an interrupt or
otherwise reached a schedule point).

Move the abort check above the run queue inspection and into the
end-of-interrupt processing in z_get_next_switch_handle() (so it's
actually a mild performance boost as it's no longer part of the
cooperative context switch path).  Simple fix, subtle bug.
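
To illustrate the ordering problem outside the kernel, here is a
minimal sketch in plain C.  It is a toy model, not Zephyr code: every
name in it (toy_thread, toy_end_thread(), toy_best(),
toy_joiner_selected()) is hypothetical, and the "run queue" is assumed
to be a simple array scan.  It only shows why picking a candidate
before finishing the abort misses the joiner.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; these are NOT Zephyr's real structures. */
struct toy_thread {
	int prio;                  /* lower value = higher priority */
	bool aborting;             /* flagged for abort by another CPU */
	bool ready;                /* eligible to be picked to run */
	struct toy_thread *joiner; /* thread blocked joining this one */
};

/* Finish the abort and wake the joiner, if any. */
static void toy_end_thread(struct toy_thread *t)
{
	t->aborting = false;
	t->ready = false;
	if (t->joiner != NULL) {
		t->joiner->ready = true;
		t->joiner = NULL;
	}
}

/* Return the highest-priority ready thread, or NULL if none is ready. */
static struct toy_thread *toy_best(struct toy_thread **all, size_t n)
{
	struct toy_thread *best = NULL;

	assert(all != NULL);
	for (size_t i = 0; i < n; i++) {
		if (all[i]->ready &&
		    (best == NULL || all[i]->prio < best->prio)) {
			best = all[i];
		}
	}
	return best;
}

/*
 * Run one scheduling decision for a CPU whose current thread ("victim")
 * has been flagged aborting.  Returns true if the joiner gets picked.
 * abort_first == false models the old ordering, true the new one.
 */
bool toy_joiner_selected(bool abort_first)
{
	struct toy_thread idle = { .prio = 15, .ready = true };
	struct toy_thread joiner = { .prio = 1 };
	struct toy_thread victim = {
		.prio = 5, .aborting = true, .joiner = &joiner,
	};
	struct toy_thread *all[] = { &idle, &joiner, &victim };
	size_t n = sizeof(all) / sizeof(all[0]);
	struct toy_thread *next;

	if (abort_first) {
		if (victim.aborting) {
			toy_end_thread(&victim);
		}
		next = toy_best(all, n);
	} else {
		next = toy_best(all, n); /* joiner is not ready yet */
		if (victim.aborting) {
			toy_end_thread(&victim);
		}
	}
	return next == &joiner;
}
```

In this model, toy_joiner_selected(false) picks the idle thread even
though the joiner ends up ready, while toy_joiner_selected(true) picks
the joiner.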

Fixes #58040

Signed-off-by: Andy Ross <andyross@google.com>
Andy Ross 2023-05-19 08:47:48 -07:00 committed by Fabio Baltieri
commit d537267fc3

@@ -353,10 +353,6 @@ static ALWAYS_INLINE struct k_thread *next_up(void)
 	 * "ready", it means "is _current already added back to the
 	 * queue such that we don't want to re-add it".
 	 */
-	if (is_aborting(_current)) {
-		end_thread(_current);
-	}
 	bool queued = z_is_thread_queued(_current);
 	bool active = !z_is_thread_prevented_from_running(_current);
@@ -1086,6 +1082,10 @@ void *z_get_next_switch_handle(void *interrupted)
 	LOCKED(&sched_spinlock) {
 		struct k_thread *old_thread = _current, *new_thread;
+		if (is_aborting(_current)) {
+			end_thread(_current);
+		}
 		if (IS_ENABLED(CONFIG_SMP)) {
 			old_thread->switch_handle = NULL;
 		}