On Sat, 9 Sep 2023 21:35:54 -0700
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Sat, 9 Sept 2023 at 20:49, Ankur Arora <ankur.a.arora@xxxxxxxxxx> wrote:
> >
> > I think we can keep these checks, but with this fixed definition of
> > resched_allowed(). This might be better:
> >
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -2260,7 +2260,8 @@ static inline void disallow_resched(void)
> >
> >  static __always_inline bool resched_allowed(void)
> >  {
> > -       return unlikely(test_tsk_thread_flag(current, TIF_RESCHED_ALLOW));
> > +       return unlikely(!preempt_count() &&
> > +                        test_tsk_thread_flag(current, TIF_RESCHED_ALLOW));
> >  }
>
> I'm not convinced (at all) that the preempt count is the right thing.
>
> It works for interrupts, yes, because interrupts will increment the
> preempt count even on non-preempt kernels (since the preempt count is
> also the interrupt context level).
>
> But what about any synchronous trap handling?
>
> In other words, just something like a page fault? A page fault doesn't
> increment the preemption count (and in fact many page faults _will_
> obviously re-schedule as part of waiting for IO).

I wonder if we should make it a rule to not allow page faults when
RESCHED_ALLOW is set? Yeah, we can preempt in page faults, but that's not
what the allow_resched() is about. Since the main purpose of that function,
according to the change log, is for kernel threads. Do kernel threads page
fault? (perhaps for vmalloc? but do we take locks in those cases?).

That is, treat allow_resched() like preempt_disable(). If we page fault
with "preempt_disable()" we usually complain about that (unless we do some
magic with *_nofault() functions).

Then we could just add checks in the page fault handlers to see if
allow_resched() is set, and if so, complain about it like we do with
preempt_disable in the might_fault() function.

-- Steve
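
For illustration only, here is a rough userspace model of the rule suggested
above: a fault complains if it happens with preemption disabled or with
RESCHED_ALLOW set, unless the access is a *_nofault() style one. This is not
kernel code. struct task_model, might_fault_model() and fault_warnings are
hypothetical names made up for this sketch, and the real might_fault() path
is structured differently.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Userspace model only. None of these are the kernel's definitions;
 * every name here is hypothetical and exists just for this sketch.
 */
struct task_model {
	int  preempt_count;	/* stands in for preempt_count() */
	bool resched_allow;	/* stands in for TIF_RESCHED_ALLOW */
	bool nofault;		/* stands in for a *_nofault() style access */
};

/* Counts complaints, in place of a kernel warning. */
static int fault_warnings;

/*
 * Models the proposed check: treat RESCHED_ALLOW like preempt_disable()
 * when a page fault is about to be taken.
 * Returns true if faulting is acceptable, false if it would complain.
 */
static bool might_fault_model(const struct task_model *t)
{
	/* *_nofault() style accesses handle the failure themselves. */
	if (t->nofault)
		return true;

	if (t->preempt_count != 0 || t->resched_allow) {
		fault_warnings++;
		fprintf(stderr,
			"fault in bad context: preempt_count=%d resched_allow=%d\n",
			t->preempt_count, t->resched_allow);
		return false;
	}

	return true;
}
```

With this model, a task that has only RESCHED_ALLOW set is flagged the same
way as one that has preemption disabled, which is the whole point of the
rule: the preempt count alone does not have to cover the page fault case.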