Re: [RFC PATCH 66/86] treewide: kernel: remove cond_resched()

Luis Chamberlain <mcgrof@xxxxxxxxxx> · Fri, 17 Nov 2023 10:14:33 -0800

On Tue, Nov 07, 2023 at 03:08:02PM -0800, Ankur Arora wrote:
> There are broadly three sets of uses of cond_resched():
> 
> 1.  Calls to cond_resched() out of the goodness of our heart,
>     otherwise known as avoiding lockup splats.
> 
> 2.  Open coded variants of cond_resched_lock() which call
>     cond_resched().
> 
> 3.  Retry or error handling loops, where cond_resched() is used as a
>     quick alternative to spinning in a tight-loop.
> 
> When running under a full preemption model, the cond_resched() reduces
> to a NOP (not even a barrier) so removing it obviously cannot matter.
> 
> But considering only voluntary preemption models (for say code that
> has been mostly tested under those), for set-1 and set-2 the
> scheduler can now preempt kernel tasks running beyond their time
> quanta anywhere they are preemptible() [1], which removes any need
> for these explicitly placed scheduling points.
> 
> The cond_resched() calls in set-3 are a little more difficult.
> To start with, given its NOP character under full preemption, it
> never actually saved us from a tight loop.
> With voluntary preemption, it's not a NOP, but it might as well be --
> for most workloads the scheduler does not have an interminable supply
> of runnable tasks on the runqueue.
> 
> So, cond_resched() is useful to not get softlockup splats, but not
> terribly good for error handling. Ideally, these should be replaced
> with some kind of timed or event wait.
> For now we use cond_resched_stall(), which tries to schedule if
> possible, and executes a cpu_relax() if not.
> 
> All of these are from set-1 except for the retry loops in
> task_function_call() or the mutex testing logic.
> 
> Replace these with cond_resched_stall(). The others can be removed.
> 
> [1] https://lore.kernel.org/lkml/20231107215742.363031-1-ankur.a.arora@xxxxxxxxxx/
> 
> Cc: Tejun Heo <tj@xxxxxxxxxx> 
> Cc: Zefan Li <lizefan.x@xxxxxxxxxxxxx> 
> Cc: Johannes Weiner <hannes@xxxxxxxxxxx> 
> Cc: Peter Oberparleiter <oberpar@xxxxxxxxxxxxx> 
> Cc: Eric Biederman <ebiederm@xxxxxxxxxxxx> 
> Cc: Will Deacon <will@xxxxxxxxxx> 
> Cc: Luis Chamberlain <mcgrof@xxxxxxxxxx> 
> Cc: Oleg Nesterov <oleg@xxxxxxxxxx> 
> Cc: Juri Lelli <juri.lelli@xxxxxxxxxx> 
> Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx> 
> Signed-off-by: Ankur Arora <ankur.a.arora@xxxxxxxxxx>

Sounds like the sort of change that should be put into linux-next to get
test coverage right away, to see what really blows up.
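
For anyone skimming the thread, the set-3 behaviour described above
("tries to schedule if possible, and executes a cpu_relax() if not")
can be illustrated with a rough userspace sketch. This is not the
kernel implementation of cond_resched_stall(). The names
resched_or_stall(), relax_cpu(), try_op() and struct op_state are
made up for illustration, the can_yield flag stands in for whatever
check the kernel does, and sched_yield() stands in for a real
scheduling point. The inline asm barrier assumes GCC or Clang.

```c
#include <sched.h>
#include <stdbool.h>

/* Userspace stand-in for cpu_relax(): only a compiler barrier here. */
static inline void relax_cpu(void)
{
	__asm__ __volatile__("" ::: "memory");
}

/*
 * Hypothetical analogue of the described behaviour: yield the CPU if
 * the caller says that is allowed, otherwise just relax.
 * Returns true if it yielded.
 */
static bool resched_or_stall(bool can_yield)
{
	if (can_yield) {
		sched_yield();
		return true;
	}
	relax_cpu();
	return false;
}

/* Placeholder operation that succeeds on the succeed_on-th attempt. */
struct op_state {
	int attempts;
	int succeed_on;
};

static bool try_op(struct op_state *st)
{
	return ++st->attempts >= st->succeed_on;
}

/*
 * Set-3 style retry loop: keep retrying, stalling between attempts.
 * Returns how many times the loop stalled before the op succeeded.
 */
static int retry_until_done(struct op_state *st, bool can_yield)
{
	int stalls = 0;

	while (!try_op(st)) {
		resched_or_stall(can_yield);
		stalls++;
	}
	return stalls;
}
```

Calling retry_until_done() with succeed_on = 3 stalls twice whether or
not yielding is allowed. The loop still spins either way, which is why
a timed or event wait is the better long-term replacement.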

 Luis