The kernel has a lot of instances of cond_resched() where it is used
as an alternative to spinning in a tight loop while waiting to retry
an operation, or while waiting for a device state to change.

Unfortunately, because the scheduler is unlikely to have an
interminable supply of runnable tasks on the runqueue, this just
amounts to spinning in a tight loop with a cond_resched().
(When running in a fully preemptible kernel, cond_resched() calls are
stubbed out, so it amounts to even less.)

In sum, cond_resched() in error-handling/retry contexts might be
useful for avoiding softlockup splats, but it is not very good at
error handling. Ideally, these calls should be replaced with some
kind of timed or event wait.

For now, add cond_resched_stall(), which tries to schedule if
possible and, failing that, executes a cpu_relax().

Signed-off-by: Ankur Arora <ankur.a.arora@xxxxxxxxxx>
---
 include/linux/sched.h | 6 ++++++
 kernel/sched/core.c   | 12 ++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6ba4371761c4..199f8f7211f2 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2100,6 +2100,7 @@ static inline int _cond_resched(void) { return 0; }
 extern int __cond_resched_lock(spinlock_t *lock);
 extern int __cond_resched_rwlock_read(rwlock_t *lock);
 extern int __cond_resched_rwlock_write(rwlock_t *lock);
+extern int __cond_resched_stall(void);
 
 #define MIGHT_RESCHED_RCU_SHIFT		8
 #define MIGHT_RESCHED_PREEMPT_MASK	((1U << MIGHT_RESCHED_RCU_SHIFT) - 1)
@@ -2135,6 +2136,11 @@ extern int __cond_resched_rwlock_write(rwlock_t *lock);
 	__cond_resched_rwlock_write(lock);				\
 })
 
+#define cond_resched_stall() ({						\
+	__might_resched(__FILE__, __LINE__, 0);				\
+	__cond_resched_stall();						\
+})
+
 static inline void cond_resched_rcu(void)
 {
 #if defined(CONFIG_DEBUG_ATOMIC_SLEEP) || !defined(CONFIG_PREEMPT_RCU)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e1b0759ed3ab..ea00e8489ebb 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8652,6 +8652,18 @@ int __cond_resched_rwlock_write(rwlock_t *lock)
 }
 EXPORT_SYMBOL(__cond_resched_rwlock_write);
 
+int __cond_resched_stall(void)
+{
+	if (tif_need_resched(RESCHED_eager)) {
+		__preempt_schedule();
+		return 1;
+	} else {
+		cpu_relax();
+		return 0;
+	}
+}
+EXPORT_SYMBOL(__cond_resched_stall);
+
 /**
  * yield - yield the current processor to other threads.
  *
-- 
2.31.1