arch: tests: Document interrupt delivery behavior after arch_irq_disable()

Upcoming changes from Andrew that add a global timeout to the kernel
broke because of some voodoo behavior in the kernel/context test.  The
test calls arch_irq_disable() on the timer interrupt directly to
prevent interrupts and measure timekeeping in their absence.  But some
architectures[1] don't reliably deliver interrupts that arrive while
the line is disabled, which means that a timeout that is running
across this period will result in a corrupt timeout queue.

Document that rule for architectures, move the offending test to the
end of the test suite (to minimize the chance of interacting with
other test code) and put a giant warning about the situation on it.
Long term, we may want to rework this test to do its job in other
ways.

[1] On x86, the interrupt disable happens at the IO-APIC level, while
interrupt latching and delivery happen downstream in each CPU's Local
APIC.  An interrupt masked at the IO-APIC is completely invisible to
the Local APIC and can never be delivered once the line goes low.

Fixes #31333

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Andy Ross 2021-01-21 12:12:59 -08:00 committed by Anas Nashif
commit e932a1537c
2 changed files with 20 additions and 3 deletions

@@ -250,6 +250,15 @@ static inline bool arch_irq_unlocked(unsigned int key);
 /**
  * Disable the specified interrupt line
  *
+ * @note: The behavior of interrupts that arrive after this call
+ * returns and before the corresponding call to arch_irq_enable() is
+ * undefined. The hardware is not required to latch and deliver such
+ * an interrupt, though on some architectures that may work. Other
+ * architectures will simply lose such an interrupt and never deliver
+ * it. Many drivers and subsystems are not tolerant of such dropped
+ * interrupts and it is the job of the application layer to ensure
+ * that behavior remains correct.
+ *
  * @see irq_disable()
  */
 void arch_irq_disable(unsigned int irq);
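
The note on arch_irq_disable() separates hardware that latches an interrupt arriving while the line is disabled from hardware that drops it. The sketch below is one way to picture that difference in plain C. It is a toy model only, not the arch layer and not any real interrupt controller; every name in it (`irq_line_model`, `model_raise`, `deliveries_for_masked_event`, and so on) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a single interrupt line (illustration only). */
struct irq_line_model {
	bool latches_while_masked; /* true: a masked event is remembered */
	bool masked;               /* true between disable and enable */
	bool pending;              /* latched event awaiting delivery */
	unsigned int delivered;    /* how many times the handler would run */
};

void model_init(struct irq_line_model *l, bool latches)
{
	l->latches_while_masked = latches;
	l->masked = false;
	l->pending = false;
	l->delivered = 0;
}

void model_disable(struct irq_line_model *l)
{
	l->masked = true;
}

/* Unmask the line and deliver a latched event, if there is one. */
void model_enable(struct irq_line_model *l)
{
	l->masked = false;
	if (l->pending) {
		l->pending = false;
		l->delivered++;
	}
}

/* The device signals one event on the line. */
void model_raise(struct irq_line_model *l)
{
	if (!l->masked) {
		l->delivered++;
	} else if (l->latches_while_masked) {
		l->pending = true;
	}
	/* Masked and not latching: the event is silently dropped. */
}

/* One event raised inside a disable/enable window; returns deliveries. */
unsigned int deliveries_for_masked_event(bool latches)
{
	struct irq_line_model l;

	model_init(&l, latches);
	model_disable(&l);
	model_raise(&l);
	model_enable(&l);
	return l.delivered;
}

/* Usage: both outcomes are permitted by the documented contract. */
void model_self_check(void)
{
	assert(deliveries_for_masked_event(true) == 1);  /* latched */
	assert(deliveries_for_masked_event(false) == 0); /* lost */
}
```

Portable code has to be correct under both outcomes, which is why the note pushes the responsibility up to the caller.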

@@ -609,7 +609,14 @@ static void test_kernel_interrupts(void)
  * failure.
  *
  * Assumptions and Constraints:
- * - N/A
+ * - Note that this test works by disabling the timer interrupt
+ *   directly, without any interaction with the timer driver or
+ *   timeout subsystem. NOT ALL ARCHITECTURES will latch and deliver
+ *   a timer interrupt that arrives while the interrupt is disabled,
+ *   which means that the timeout list will become corrupted (because
+ *   it contains items that should have expired in the past). Any use
+ *   of kernel timeouts after completion of this test is disallowed.
+ *   RUN THIS TEST LAST IN THE SUITE.
  *
  * @see irq_disable(), irq_enable()
  */
@@ -1161,16 +1168,17 @@ void test_main(void)
 	kernel_init_objects();
+	/* The timer_interrupts test MUST BE LAST, see note above */
 	ztest_test_suite(context,
 			 ztest_unit_test(test_kernel_interrupts),
-			 ztest_1cpu_unit_test(test_kernel_timer_interrupts),
 			 ztest_unit_test(test_kernel_ctx_thread),
 			 ztest_1cpu_unit_test(test_busy_wait),
 			 ztest_1cpu_unit_test(test_k_sleep),
 			 ztest_unit_test(test_kernel_cpu_idle_atomic),
 			 ztest_unit_test(test_kernel_cpu_idle),
 			 ztest_1cpu_unit_test(test_k_yield),
-			 ztest_1cpu_unit_test(test_kernel_thread)
+			 ztest_1cpu_unit_test(test_kernel_thread),
+			 ztest_1cpu_unit_test(test_kernel_timer_interrupts)
 			 );
 	ztest_run_test_suite(context);
 }
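
The test comment says a lost timer interrupt leaves the timeout list holding items that should already have expired. The sketch below is a rough illustration of how that can happen. It assumes a one-shot timer that is re-armed only from its own interrupt handler. This is a simplified, hypothetical model and not the kernel's timeout implementation; all names (`timeout_model`, `tm_advance`, `stale_after_window`, and so on) are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_TIMEOUTS 4

/* Hypothetical timeout queue driven by a one-shot timer. */
struct timeout_model {
	uint64_t now;
	uint64_t expiry[MAX_TIMEOUTS]; /* absolute times, sorted ascending */
	int count;
	bool timer_armed; /* one-shot is programmed for expiry[0] */
	bool irq_masked;  /* timer interrupt disabled, events not latched */
};

void tm_init(struct timeout_model *m)
{
	m->now = 0;
	m->count = 0;
	m->timer_armed = false;
	m->irq_masked = false;
}

/* Insert an expiry time in sorted order and arm the timer. */
void tm_add(struct timeout_model *m, uint64_t expiry)
{
	assert(m->count < MAX_TIMEOUTS);
	int i = m->count++;

	while (i > 0 && m->expiry[i - 1] > expiry) {
		m->expiry[i] = m->expiry[i - 1];
		i--;
	}
	m->expiry[i] = expiry;
	m->timer_armed = true;
}

/* Timer handler: remove every due entry, then re-arm if any remain. */
void tm_isr(struct timeout_model *m)
{
	int due = 0;

	while (due < m->count && m->expiry[due] <= m->now) {
		due++;
	}
	for (int i = due; i < m->count; i++) {
		m->expiry[i - due] = m->expiry[i];
	}
	m->count -= due;
	m->timer_armed = (m->count > 0);
}

/* Advance time to `to`, firing the one-shot whenever it comes due. */
void tm_advance(struct timeout_model *m, uint64_t to)
{
	assert(to >= m->now);
	while (m->timer_armed && m->expiry[0] <= to) {
		m->now = m->expiry[0];
		m->timer_armed = false; /* the one-shot has fired */
		if (!m->irq_masked) {
			tm_isr(m);
		}
		/* Masked: the event is lost and nothing re-arms the timer. */
	}
	m->now = to;
}

/* Count entries whose expiry lies in the past. */
int tm_stale(const struct timeout_model *m)
{
	int n = 0;

	for (int i = 0; i < m->count; i++) {
		if (m->expiry[i] < m->now) {
			n++;
		}
	}
	return n;
}

/* One timeout at t=10, with an optional masked window from t=5 to t=20. */
int stale_after_window(bool mask_window)
{
	struct timeout_model m;

	tm_init(&m);
	tm_add(&m, 10);
	tm_advance(&m, 5);
	m.irq_masked = mask_window;
	tm_advance(&m, 20);
	m.irq_masked = false;
	tm_advance(&m, 100);
	return tm_stale(&m);
}
```

In this model `stale_after_window(true)` leaves one entry stranded in the queue, while `stale_after_window(false)` leaves none. Under these assumptions, that is the kind of state the comment describes as corrupt, and it is why no later test should rely on kernel timeouts.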