kernel: don't unlock and then lock immediately in idle loop

Inside the idle loop, in some configurations, IRQs are unlocked and
then immediately locked again. This has a side effect:

1. IRQs are unlocked in the middle of the loop.
2. Another thread (A) can now run, so the idle thread is
   un-scheduled.
3. Thread A runs to its end and goes through the thread
   self-abort path.
4. The idle thread is scheduled again and continues with
   the rest of the loop, eventually calling k_cpu_idle().
   The "pending abort" path is not executed for thread A
   at this point.
5. Now thread A is suspended, and the CPU is idle, waiting
   for interrupts (e.g. timeouts).
6. Thread B is waiting to join on thread A. Since thread A has
   not been terminated yet, thread B keeps waiting until
   the idle thread runs again and starts executing from
   the beginning of the while loop.
7. Depending on how many threads are running and how active
   the platform is, the idle thread may not run again for a
   while, resulting in thread B appearing to be stuck.
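
The sequence above can be modelled in a few lines of plain C. This is
only a sketch for illustration: struct sim_cpu, sim_irq_lock(),
sim_irq_unlock() and old_idle_pass() are hypothetical stand-ins, not
Zephyr code, and thread A's whole run is collapsed into a flag that is
set when IRQs are unlocked.

```c
#include <stdbool.h>

/* Hypothetical model state; these are not Zephyr's real structures. */
struct sim_cpu {
	bool irq_locked;
	bool wake_pending;             /* an interrupt that makes thread A runnable */
	bool pending_abort;            /* models cpu->pending_abort being non-NULL */
	bool thread_a_terminated;
	bool slept_with_pending_abort; /* the problem state from steps 4 and 5 */
};

static void sim_irq_lock(struct sim_cpu *cpu)
{
	cpu->irq_locked = true;
}

/* Unlocking lets a pending wake-up in: thread A preempts the idle thread,
 * runs to its end, and leaves its abort for the idle thread to finish.
 */
static void sim_irq_unlock(struct sim_cpu *cpu)
{
	cpu->irq_locked = false;
	if (cpu->wake_pending) {
		cpu->wake_pending = false;
		cpu->pending_abort = true;
	}
}

/* One pass of the loop body as it looked before this change. */
static void old_idle_pass(struct sim_cpu *cpu)
{
	sim_irq_lock(cpu);
	if (cpu->pending_abort) {
		cpu->pending_abort = false;
		sim_irq_unlock(cpu);
		cpu->thread_a_terminated = true; /* abort is completed here */
		return;                          /* "continue" in the real loop */
	}
	sim_irq_unlock(cpu);                     /* step 1: thread A can run now */
	sim_irq_lock(cpu);
	/* k_cpu_idle() would be called here; record what it sleeps on. */
	cpu->slept_with_pending_abort = cpu->pending_abort;
}
```

In this model, a first pass with wake_pending set ends with the CPU
going idle while the abort is still pending, and thread A is only
terminated on the next pass, whenever that happens.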

To avoid this situation, the unlock/lock pair in the middle of
the loop is removed so that no rescheduling can happen mid-loop.
When there is no thread abort pending, it simply locks IRQs
and calls k_cpu_idle(). This is almost identical to the idle
loop before the thread abort code was introduced (except
for the check of cpu->pending_abort).
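
A minimal model of the loop shape after the change, under the same
kind of assumptions: struct sim_state, sim_cpu_idle() and
fixed_idle_pass() are hypothetical names, and the model assumes that
going idle is the only point where a wake-up can be taken, with
interrupts re-enabled as part of entering idle.

```c
#include <stdbool.h>

/* Hypothetical model state; these are not Zephyr's real structures. */
struct sim_state {
	bool irq_locked;
	bool wake_pending;        /* an interrupt that makes thread A runnable */
	bool pending_abort;       /* models cpu->pending_abort being non-NULL */
	bool thread_a_terminated;
	int idle_entries;         /* how many times the CPU was put to sleep */
};

/* Models k_cpu_idle(): the wake-up is taken while idle, thread A runs
 * and self-aborts, and the idle thread then resumes at the loop top.
 */
static void sim_cpu_idle(struct sim_state *s)
{
	s->irq_locked = false;
	s->idle_entries++;
	if (s->wake_pending) {
		s->wake_pending = false;
		s->pending_abort = true;
	}
}

/* One pass of the loop body as described after this change. */
static void fixed_idle_pass(struct sim_state *s)
{
	s->irq_locked = true;                  /* arch_irq_lock() */
	if (s->pending_abort) {
		s->pending_abort = false;
		s->irq_locked = false;         /* arch_irq_unlock(key) */
		s->thread_a_terminated = true; /* abort is completed here */
		return;                        /* "continue" in the real loop */
	}
	/* No unlock/lock pair: nothing can preempt the idle thread
	 * between the check above and going idle.
	 */
	sim_cpu_idle(s);
}
```

Here the pass that follows the wake-up handles the pending abort
before the CPU is put to sleep again, so a joiner such as thread B
is not left waiting on a sleeping CPU.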

Fixes #30573

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Daniel Leung 2020-12-14 14:31:11 -08:00 committed by Anas Nashif
commit 8ad5ad2214

@@ -171,12 +171,14 @@ void idle(void *p1, void *unused2, void *unused3)
 z_reschedule_unlocked();
 continue;
 }
-arch_irq_unlock(key);
 #if SMP_FALLBACK
+arch_irq_unlock(key);
 k_busy_wait(100);
 k_yield();
 #else
-(void)arch_irq_lock();
 #ifdef CONFIG_SYS_CLOCK_EXISTS
 int32_t ticks = z_get_next_timeout_expiry();