On 2024-05-23 at 09:30:59 -0700, Dave Hansen wrote:
> On 5/16/24 06:02, Chen Yu wrote:
> > Performance drop is reported when running encode/decode workload and
> > BenchSEE cache sub-workload.
> > Bisect points to commit ce0a1b608bfc ("x86/paravirt: Silence unused
> > native_pv_lock_init() function warning"). When CONFIG_PARAVIRT_SPINLOCKS
> > is disabled the virt_spin_lock_key is set to true on bare-metal.
> > The qspinlock degenerates to test-and-set spinlock, which decrease the
> > performance on bare-metal.
> >
> > Fix this by disabling virt_spin_lock_key if CONFIG_PARAVIRT_SPINLOCKS
> > is not set, or it is on bare-metal.
>
> This is missing some background:
>
> The kernel can change spinlock behavior when running as a guest. But
> this guest-friendly behavior causes performance problems on bare metal.
> So there's a 'virt_spin_lock_key' static key to switch between the two
> modes.
>
> The static key is always enabled by default (run in guest mode) and
> should be disabled for bare metal (and in some guests that want native
> behavior).
>
> ... then describe the regression and the fix
>

Thanks Juergen for your review. And thanks Dave for the write-up, I'll
refine the log according to your suggestion.

> > diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
> > index 5358d43886ad..ee51c0949ed8 100644
> > --- a/arch/x86/kernel/paravirt.c
> > +++ b/arch/x86/kernel/paravirt.c
> > @@ -55,7 +55,7 @@ DEFINE_STATIC_KEY_TRUE(virt_spin_lock_key);
> >
> >  void __init native_pv_lock_init(void)
> >  {
> > -	if (IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS) &&
> > +	if (!IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS) ||
> >  	    !boot_cpu_has(X86_FEATURE_HYPERVISOR))
> >  		static_branch_disable(&virt_spin_lock_key);
> >  }
>
> This gets used at a single site:
>
>	if (pv_enabled())
>		goto pv_queue;
>
>	if (virt_spin_lock(lock))
>		return;
>
> which is logically:
>
>	if (IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS))
>		goto ...; // don't look at virt_spin_lock_key
>
>	if (virt_spin_lock_key)
>		return; // On virt, but non-paravirt. Did Test-and-Set
>			// spinlock.
>

Thanks for the detailed description. My original change might break the
"X86_FEATURE_HYPERVISOR + CONFIG_PARAVIRT_SPINLOCKS=n" case: with the key
disabled, such a guest could no longer fall back to test-and-set.

> So I _think_ Arnd was trying to optimize native_pv_lock_init() away when
> it's going to get skipped over anyway by the 'goto'.
>
> But this took me at least 30 minutes of scratching my head and trying to
> untangle the whole thing. It's all far too subtle for my taste, and all
> of that to save a few bytes of init text in a configuration that's
> probably not even used very often (PARAVIRT=y, but PARAVIRT_SPINLOCKS=n).
>
> Let's just keep it simple. How about the attached patch?

Yes, this one works, I'll refine it.

thanks,
Chenyu
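
For reference, the cases discussed above can be written down as a small
userspace sketch. This is only a model, not kernel code: the helper names
(key_disabled_old(), key_disabled_new(), uses_test_and_set(),
check_discussed_cases()) are made up for illustration, the two predicates
mirror the '-' and '+' lines of the quoted diff, and the call site follows
Dave's "which is logically" simplification rather than the real
pv_enabled()/virt_spin_lock() code.

```c
#include <assert.h>
#include <stdbool.h>

/* '-' line of the quoted diff: the key is disabled only when
 * PARAVIRT_SPINLOCKS is built in and no hypervisor is detected. */
static bool key_disabled_old(bool pv_spinlocks, bool hypervisor)
{
	return pv_spinlocks && !hypervisor;
}

/* '+' line of the quoted diff: the key is disabled when
 * PARAVIRT_SPINLOCKS is not built in, or no hypervisor is detected. */
static bool key_disabled_new(bool pv_spinlocks, bool hypervisor)
{
	return !pv_spinlocks || !hypervisor;
}

/* Hypothetical model of the single call site as simplified in the thread:
 * a PARAVIRT_SPINLOCKS build never looks at the key; otherwise an enabled
 * key means the lock is taken with test-and-set. */
static bool uses_test_and_set(bool pv_spinlocks, bool key_disabled)
{
	if (pv_spinlocks)
		return false;
	return !key_disabled;
}

/* The configurations called out in the thread, all PARAVIRT_SPINLOCKS=n. */
static void check_discussed_cases(void)
{
	/* Reported regression: bare metal still uses test-and-set. */
	assert(uses_test_and_set(false, key_disabled_old(false, false)));
	/* The '+' condition disables the key on bare metal ... */
	assert(!uses_test_and_set(false, key_disabled_new(false, false)));
	/* ... but a guest that had test-and-set before loses it as well. */
	assert(uses_test_and_set(false, key_disabled_old(false, true)));
	assert(!uses_test_and_set(false, key_disabled_new(false, true)));
}
```

Calling check_discussed_cases() from a test main() passes silently, which
matches the reading above: the '+' condition fixes bare metal but also turns
off test-and-set for a PARAVIRT_SPINLOCKS=n guest.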