kernel/swap: Add assertion to catch lock-breaking context switches

Our z_swap() API takes a key returned from arch_irq_lock() and
releases it atomically with the context switch.  Make sure that the
action of the unlocking is to unmask interrupts globally.  If
interrupts would still be masked then that means there is an OUTER
interrupt lock still held, and the code that locked it surely doesn't
expect the thread to be suspended and interrupts unmasked while it's
held!

Unfortunately, this kind of mistake is very easy to make.  We should
catch that with a simple assertion.  This is essentially a crude
Zephyr equivalent of the extremely common "BUG: scheduling while
atomic" error in Linux drivers (just google it).

The one exception made is the circumstance where a thread has already
aborted itself.  At that stage, whatever upthread lock state might
have existed will have already been messed up, so there's no value in
our asserting here.  We can't catch all bugs, and this can actually
happen in error handling and/or test frameworks.

Fixes #33319

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
commit bd077561d0
Andy Ross, 2021-05-13 16:22:03 -07:00, committed by Anas Nashif

@@ -65,6 +65,27 @@ static ALWAYS_INLINE unsigned int do_swap(unsigned int key,
 	ARG_UNUSED(lock);
 	struct k_thread *new_thread, *old_thread;
 
+#ifdef CONFIG_SPIN_VALIDATE
+	/* Make sure the key acts to unmask interrupts, if it doesn't,
+	 * then we are context switching out of a nested lock
+	 * (i.e. breaking the lock of someone up the stack) which is
+	 * forbidden! The sole exception are dummy threads used
+	 * during initialization (where we start with interrupts
+	 * masked and switch away to begin scheduling) and the case of
+	 * a dead current thread that was just aborted (where the
+	 * damage was already done by the abort anyway).
+	 *
+	 * (Note that this is disabled on ARM64, where system calls
+	 * can sometimes run with interrupts masked in ways that don't
+	 * represent lock state. See #35307)
+	 */
+# ifndef CONFIG_ARM64
+	__ASSERT(arch_irq_unlocked(key) ||
+		 _current->base.thread_state & (_THREAD_DUMMY | _THREAD_DEAD),
+		 "Context switching while holding lock!");
+# endif
+#endif
+
 	old_thread = _current;
 
 	z_check_stack_sentinel();