kernel/sched: Fix thread selection misordering with aborted threads

When a running thread gets aborted asynchronously (this only happens in SMP contexts, obviously) it gets flagged "aborting", but the actual abort needs to happen in the thread's own context. For convenience, this was done in the next_up() routine that selects the next thread to run at interrupt exit time.

But this check was being done AFTER the next candidate thread was selected from the run queue. Thread abort can wake up threads blocked in k_thread_join(), and therefore these weren't seen as runnable threads, even if they should have been.

Executive summary: if you killed a thread running on another CPU, and there was another thread joined to the killed thread that should have run on that CPU, it wouldn't (until it received an interrupt or otherwise reached a schedule point).

Move the abort check above the run queue inspection and into the end-of-interrupt processing in z_get_next_switch_handle() (so it's actually a mild performance boost as it's no longer part of the cooperative context switch path).

Simple fix, subtle bug.

Fixes #58040

Signed-off-by: Andy Ross <andyross@google.com>
2023-05-19 08:47:48 -07:00
commit d537267fc3
parent 51108a9ce4
1 changed file with 4 additions and 4 deletions
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -353,10 +353,6 @@ static ALWAYS_INLINE struct k_thread *next_up(void)
 	 * "ready", it means "is _current already added back to the
 	 * queue such that we don't want to re-add it".
 	 */
-	if (is_aborting(_current)) {
-		end_thread(_current);
-	}
-
 	bool queued = z_is_thread_queued(_current);
 	bool active = !z_is_thread_prevented_from_running(_current);

@@ -1086,6 +1082,10 @@ void *z_get_next_switch_handle(void *interrupted)
 	LOCKED(&sched_spinlock) {
 		struct k_thread *old_thread = _current, *new_thread;

+		if (is_aborting(_current)) {
+			end_thread(_current);
+		}
+
 		if (IS_ENABLED(CONFIG_SMP)) {
 			old_thread->switch_handle = NULL;
 		}