Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:

> On Sun, Sep 10, 2023 at 11:32:32AM -0700, Linus Torvalds wrote:
>
>> I was hoping that we'd have some generic way to deal with this where
>> we could just say "this thing is reschedulable", and get rid of - or
>> at least not increasingly add to - the cond_resched() mess.
>
> Isn't that called PREEMPT=y ? That tracks precisely all the constraints
> required to know when/if we can preempt.
>
> The whole voluntary preempt model is basically the traditional
> co-operative preemption model and that fully relies on manual yields.

Yeah, but as Linus says, this means a lot of code is just full of
cond_resched(). For instance, a loop in process_huge_page() uses
this pattern:

   for (...) {
       cond_resched();
       clear_page(i);

       cond_resched();
       clear_page(j);
   }

> The problem with the REP prefix (and Xen hypercalls) is that
> they're long running instructions and it becomes fundamentally
> impossible to put a cond_resched() in.
>
>> Yes. I'm starting to think that the only sane solution is to
>> limit cases that can do this a lot, and the "instruction pointer
>> region" approach would certainly work.
>
> From a code locality / I-cache POV, I think a sorted list of
> (non overlapping) ranges might be best.

Yeah, agreed. There are a few problems with doing that though.

I was thinking of using a check of this kind to schedule out when
it is executing in this "reschedulable" section:

        !preempt_count() && in_resched_function(regs->rip);

For preemption=full, this should mostly work. For preemption=voluntary,
though, this'll only work with out-of-line locks, not if the lock is
inlined. (Both should have problems with __this_cpu_* and the like, but
maybe we can handwave that away with sparse/objtool etc.)

How expensive would it be to always have PREEMPT_COUNT=y?

--
ankur
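
To make the range check above concrete, here is a rough userspace sketch of what in_resched_function() could look like over a sorted list of non-overlapping ranges. Everything except the name in_resched_function() is an assumption for illustration: the struct, the table contents, the addresses, and the helper names are hypothetical, and preempt_count and regs->rip are modeled as plain arguments rather than the kernel's accessors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: half-open [start, end) address range of a "reschedulable" region. */
struct resched_range {
	uintptr_t start;
	uintptr_t end;
};

/*
 * Hypothetical table with made-up addresses. The lookup below relies on
 * it being sorted by start address with no overlapping entries.
 */
static const struct resched_range resched_ranges[] = {
	{ 0x1000, 0x1040 },
	{ 0x2000, 0x2100 },
	{ 0x8000, 0x8010 },
};

#define NR_RESCHED_RANGES (sizeof(resched_ranges) / sizeof(resched_ranges[0]))

/* Assert the invariant the binary search depends on: sorted, non-empty, non-overlapping. */
static void check_resched_ranges(void)
{
	for (size_t i = 0; i < NR_RESCHED_RANGES; i++) {
		assert(resched_ranges[i].start < resched_ranges[i].end);
		if (i > 0)
			assert(resched_ranges[i - 1].end <= resched_ranges[i].start);
	}
}

/* Binary search: is ip inside one of the reschedulable ranges? */
static bool in_resched_function(uintptr_t ip)
{
	size_t lo = 0, hi = NR_RESCHED_RANGES;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ip < resched_ranges[mid].start)
			hi = mid;
		else if (ip >= resched_ranges[mid].end)
			lo = mid + 1;
		else
			return true;
	}
	return false;
}

/*
 * Models "!preempt_count() && in_resched_function(regs->rip)" with both
 * inputs passed in explicitly, since this is not kernel code.
 */
static bool can_resched_at(unsigned int preempt_count, uintptr_t ip)
{
	return !preempt_count && in_resched_function(ip);
}
```

The lookup is O(log n) in the number of ranges and touches only a small contiguous table. Note that the sketch only answers "is this IP in a marked region"; it says nothing about the inlined-lock or __this_cpu_* problems mentioned above.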