Sean Christopherson <seanjc@xxxxxxxxxx> writes:

> Use preempt_model_preemptible() to detect a preemptible kernel when
> deciding whether or not to reschedule in order to drop a contended
> spinlock or rwlock. Because PREEMPT_DYNAMIC selects PREEMPTION, kernels
> built with PREEMPT_DYNAMIC=y will yield contended locks even if the live
> preemption model is "none" or "voluntary". In short, make kernels with
> dynamically selected models behave the same as kernels with statically
> selected models.

Agreed. This behaviour makes sense, and it should also be useful for
PREEMPT_AUTO.

The only thing that gives me pause is that there is now an extra call+ret
even when we don't yield the lock. But maybe that could be addressed
separately by converting preempt_model_* to use a static key or similar.
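Something along the lines of the rough, untested sketch below (purely
illustrative; preempt_model_full_key and sched_dynamic_update_model() are
made-up names, and the real switch point would be wherever the dynamic
model is actually updated):

  /* Sketch only: make preempt_model_preemptible() a patched branch. */
  #include <linux/jump_label.h>

  /* Would be DECLARE'd in a header and DEFINE'd in kernel/sched/core.c. */
  DEFINE_STATIC_KEY_FALSE(preempt_model_full_key);

  /* Called whenever the dynamic preemption model is switched. */
  static void sched_dynamic_update_model(bool full)
  {
          if (full)
                  static_branch_enable(&preempt_model_full_key);
          else
                  static_branch_disable(&preempt_model_full_key);
  }

  static __always_inline bool preempt_model_preemptible(void)
  {
          if (IS_ENABLED(CONFIG_PREEMPT_DYNAMIC))
                  return static_branch_unlikely(&preempt_model_full_key);

          return IS_ENABLED(CONFIG_PREEMPT) || IS_ENABLED(CONFIG_PREEMPT_RT);
  }

That should get spin_needbreak()/rwlock_needbreak() back to a patched
branch plus the contention check on PREEMPT_DYNAMIC kernels, instead of an
out-of-line call.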
> Somewhat counter-intuitively, NOT yielding a lock can provide better
> latency for the relevant tasks/processes. E.g. KVM x86's mmu_lock, a
> rwlock, is often contended between an invalidation event (takes mmu_lock
> for write) and a vCPU servicing a guest page fault (takes mmu_lock for
> read). For _some_ setups, letting the invalidation task complete even
> if there is mmu_lock contention provides lower latency for *all* tasks,
> i.e. the invalidation completes sooner *and* the vCPU services the guest
> page fault sooner.
>
> But even KVM's mmu_lock behavior isn't uniform, e.g. the "best" behavior
> can vary depending on the host VMM, the guest workload, the number of
> vCPUs, the number of pCPUs in the host, why there is lock contention, etc.
>
> In other words, simply deleting the CONFIG_PREEMPTION guard (or doing the
> opposite and removing contention yielding entirely) needs to come with a
> big pile of data proving that changing the status quo is a net positive.
>
> Opportunistically document this side effect of preempt=full, as yielding
> contended spinlocks can have significant, user-visible impact.
>
> Fixes: c597bfddc9e9 ("sched: Provide Kconfig support for default dynamic preempt mode")
> Link: https://lore.kernel.org/kvm/ef81ff36-64bb-4cfe-ae9b-e3acf47bff24@xxxxxxxxxxx
> Cc: Valentin Schneider <valentin.schneider@xxxxxxx>
> Cc: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> Cc: Marco Elver <elver@xxxxxxxxxx>
> Cc: Frederic Weisbecker <frederic@xxxxxxxxxx>
> Cc: David Matlack <dmatlack@xxxxxxxxxx>
> Cc: Friedrich Weber <f.weber@xxxxxxxxxxx>
> Cc: Ankur Arora <ankur.a.arora@xxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>

Reviewed-by: Ankur Arora <ankur.a.arora@xxxxxxxxxx>

> ---
>  Documentation/admin-guide/kernel-parameters.txt |  4 +++-
>  include/linux/spinlock.h                        | 14 ++++++--------
>  2 files changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 825398d66c69..fdeddb066439 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4689,7 +4689,9 @@
>               none - Limited to cond_resched() calls
>               voluntary - Limited to cond_resched() and might_sleep() calls
>               full - Any section that isn't explicitly preempt disabled
> -                    can be preempted anytime.
> +                    can be preempted anytime. Tasks will also yield
> +                    contended spinlocks (if the critical section isn't
> +                    explicitly preempt disabled beyond the lock itself).

This seems to read a bit better:

+                    can be preempted anytime. Tasks will also yield
+                    contended spinlocks (unless the critical section is
+                    explicitly preempt disabled beyond the lock itself).

Ankur

>
>       print-fatal-signals=
>               [KNL] debug: print fatal signals
> diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h
> index 3fcd20de6ca8..63dd8cf3c3c2 100644
> --- a/include/linux/spinlock.h
> +++ b/include/linux/spinlock.h
> @@ -462,11 +462,10 @@ static __always_inline int spin_is_contended(spinlock_t *lock)
>   */
>  static inline int spin_needbreak(spinlock_t *lock)
>  {
> -#ifdef CONFIG_PREEMPTION
> +	if (!preempt_model_preemptible())
> +		return 0;
> +
>  	return spin_is_contended(lock);
> -#else
> -	return 0;
> -#endif
>  }
>
>  /*
> @@ -479,11 +478,10 @@ static inline int spin_needbreak(spinlock_t *lock)
>   */
>  static inline int rwlock_needbreak(rwlock_t *lock)
>  {
> -#ifdef CONFIG_PREEMPTION
> +	if (!preempt_model_preemptible())
> +		return 0;
> +
>  	return rwlock_is_contended(lock);
> -#else
> -	return 0;
> -#endif
>  }
>
>  /*
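FWIW, the caller-side pattern this affects looks roughly like the
hypothetical sketch below (struct my_table and the loop are made up purely
for illustration; real users typically go through helpers such as
cond_resched_lock(), which check spin_needbreak() internally):

  #include <linux/sched.h>
  #include <linux/spinlock.h>

  struct my_table {
          spinlock_t lock;
          int nr_entries;
          /* ... */
  };

  /* Long-running walk that offers to yield a contended lock. */
  static void scrub_table(struct my_table *tbl)
  {
          int i;

          spin_lock(&tbl->lock);
          for (i = 0; i < tbl->nr_entries; i++) {
                  /* ... per-entry work under the lock ... */

                  if (spin_needbreak(&tbl->lock)) {
                          /* Drop the lock so the contending CPU can take it. */
                          spin_unlock(&tbl->lock);
                          cond_resched();
                          spin_lock(&tbl->lock);
                  }
          }
          spin_unlock(&tbl->lock);
  }

With this patch the spin_needbreak() check only fires when the live
preemption model is preemptible, matching statically built kernels.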