kernel/work: Fix race with delayed work item cancellation

The call to unschedule_locked() would return true ("successfully
unscheduled") even when the underlying z_abort_timeout() failed. That
failure happens when the callback has already been unpended and is
in progress, complete, or about to run (remember that timeout
callbacks are unsynchronized). Reporting success in that case led to
state bugs and races against the callback behavior.

Correctly detect that case and propagate the error to the caller.

Fixes #51872

Signed-off-by: Andy Ross <andyross@google.com>
Andy Ross 2023-02-09 06:46:36 -08:00 committed by Stephanos Ioannidis
commit d00f9b594b

@@ -917,10 +917,13 @@ static inline bool unschedule_locked(struct k_work_delayable *dwork)
 	bool ret = false;
 	struct k_work *work = &dwork->work;
-	/* If scheduled, try to cancel. */
+	/* If scheduled, try to cancel.  If it fails, that means the
+	 * callback has been dequeued and will inevitably run (or has
+	 * already run), so treat that as "undelayed" and return
+	 * false.
+	 */
 	if (flag_test_and_clear(&work->flags, K_WORK_DELAYED_BIT)) {
-		z_abort_timeout(&dwork->timeout);
-		ret = true;
+		ret = z_abort_timeout(&dwork->timeout) == 0;
 	}
 	return ret;
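
The behavioral difference can be reproduced outside the kernel with a small stand-alone sketch. Everything prefixed with `mock_` or `MOCK_` below is a hypothetical stand-in written for illustration, not the real Zephyr code. The only assumption carried over from the diff is that the abort call returns 0 on success and a nonzero value when the timeout is no longer queued. The mock uses -EINVAL for the failure case, but the exact value is not taken from this commit.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; not the Zephyr definitions. */
#define MOCK_DELAYED_BIT 0

struct mock_timeout {
	bool pending; /* true while the timeout is still queued */
};

struct mock_dwork {
	uint32_t flags;
	struct mock_timeout timeout;
};

/* Assumed contract: 0 if the timeout was dequeued here, nonzero otherwise. */
static int mock_abort_timeout(struct mock_timeout *to)
{
	if (!to->pending) {
		return -EINVAL; /* callback already dequeued; it will run or has run */
	}
	to->pending = false;
	return 0;
}

/* Clears the given flag bit and reports whether it was set beforehand. */
static bool mock_flag_test_and_clear(uint32_t *flags, int bit)
{
	bool was_set = (*flags & (1U << bit)) != 0U;

	*flags &= ~(1U << bit);
	return was_set;
}

/* Pre-fix logic: reports success without checking the abort result. */
static bool mock_unschedule_old(struct mock_dwork *dwork)
{
	bool ret = false;

	if (mock_flag_test_and_clear(&dwork->flags, MOCK_DELAYED_BIT)) {
		(void)mock_abort_timeout(&dwork->timeout);
		ret = true;
	}
	return ret;
}

/* Post-fix logic: success only if the abort actually dequeued the timeout. */
static bool mock_unschedule_fixed(struct mock_dwork *dwork)
{
	bool ret = false;

	if (mock_flag_test_and_clear(&dwork->flags, MOCK_DELAYED_BIT)) {
		ret = mock_abort_timeout(&dwork->timeout) == 0;
	}
	return ret;
}

/* Race window: the flag is still set, but the timeout has been dequeued. */
static void mock_race_demo(void)
{
	struct mock_dwork racing = {
		.flags = 1U << MOCK_DELAYED_BIT,
		.timeout = { .pending = false },
	};

	assert(mock_unschedule_old(&racing));    /* wrongly claims success */

	racing.flags = 1U << MOCK_DELAYED_BIT;   /* recreate the same window */
	assert(!mock_unschedule_fixed(&racing)); /* failure is propagated */
}
```

Calling `mock_race_demo()` exercises the window described in the commit message. When the timeout is still queued, both variants return true, so the fix only changes the outcome when the abort fails.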