On Wed, Nov 8, 2023 at 12:09 AM Ankur Arora <ankur.a.arora@xxxxxxxxxx> wrote:
>
> There are broadly three sets of uses of cond_resched():
>
> 1.  Calls to cond_resched() out of the goodness of our heart,
>     otherwise known as avoiding lockup splats.
>
> 2.  Open coded variants of cond_resched_lock() which call
>     cond_resched().
>
> 3.  Retry or error handling loops, where cond_resched() is used as a
>     quick alternative to spinning in a tight-loop.
>
> When running under a full preemption model, the cond_resched() reduces
> to a NOP (not even a barrier) so removing it obviously cannot matter.
>
> But considering only voluntary preemption models (for say code that
> has been mostly tested under those), for set-1 and set-2 the
> scheduler can now preempt kernel tasks running beyond their time
> quanta anywhere they are preemptible() [1]. Which removes any need
> for these explicitly placed scheduling points.

What about RCU callbacks? cond_resched() was helping a bit there.

>
> The cond_resched() calls in set-3 are a little more difficult.
> To start with, given it's NOP character under full preemption, it
> never actually saved us from a tight loop.
> With voluntary preemption, it's not a NOP, but it might as well be --
> for most workloads the scheduler does not have an interminable supply
> of runnable tasks on the runqueue.
>
> So, cond_resched() is useful to not get softlockup splats, but not
> terribly good for error handling. Ideally, these should be replaced
> with some kind of timed or event wait.
> For now we use cond_resched_stall(), which tries to schedule if
> possible, and executes a cpu_relax() if not.
>
> Most of the uses here are in set-1 (some right after we give up a
> lock or enable bottom-halves, causing an explicit preemption check.)
>
> We can remove all of them.

A series of 86 patches is not reasonable.

    596 files changed, 881 insertions(+), 2813 deletions(-)

If cond_resched() really becomes a nop (nice!), make that change at the
definition of cond_resched(), and add some nice debugging there.

Whoever needs to call a "real" cond_resched() could call a
cond_resched_for_real(). (Please change the name, this is only to make
a point.)

Then let the removal happen whenever each maintainer decides, 6 months
later, without polluting lkml.

Imagine we have to revert this series in one month: how painful would
that be, had we removed ~1400 cond_resched() calls all over the place,
with many conflicts?

Thanks.
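
To make the suggestion above concrete, here is a rough userspace sketch
of what "nop at the definition, plus debugging" could look like. It is
not kernel code and not taken from the series. The counter and the
report helper are hypothetical names invented for illustration,
cond_resched_for_real() is the placeholder name from this mail, and
sched_yield() only stands in for a real scheduling point.

```c
/*
 * Userspace illustration only. Everything except the names
 * cond_resched() and cond_resched_for_real() is an assumption.
 */
#include <sched.h>
#include <stdio.h>

/* Hypothetical debug counter: how often legacy call sites were reached. */
static unsigned long cond_resched_nop_calls;

/*
 * Legacy entry point. It no longer schedules; it only records the call
 * so that leftover call sites stay visible. Returns 0, meaning
 * "did not reschedule".
 */
static inline int cond_resched(void)
{
	cond_resched_nop_calls++;
	return 0;
}

/*
 * Opt-in variant for the few callers that still want to give up the
 * CPU. Returns 1 if the yield request succeeded, 0 otherwise.
 */
static inline int cond_resched_for_real(void)
{
	return sched_yield() == 0;
}

/* Hypothetical helper: dump the counter for debugging. */
static inline void cond_resched_debug_report(void)
{
	fprintf(stderr, "cond_resched() no-op calls: %lu\n",
		cond_resched_nop_calls);
}
```

With something along these lines, existing call sites compile unchanged
and can be removed by each maintainer on their own schedule, and a
revert only touches the definition.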