So why is the rcu_virt_note_context_switch(smp_processor_id());
in guest_enter_irqoff not good enough?

This was actually supposed to tell rcu that being in the guest
is an extended quiescing period (like userspace).

What has changed?

On 07/11/2018 07:03 PM, David Woodhouse wrote:
> On Wed, 2018-07-11 at 09:49 -0700, Paul E. McKenney wrote:
>> And here is an updated v4.15 patch with Marius's Reported-by and David's
>> fix to my lost exclamation point.
>
> Thanks. Are you sending the original version of that to Linus? It'd be
> useful to have the commit ID so that we can watch for it landing, and
> chase this one up to Greg.
>
> As discussed on IRC, this patch reduces synchronize_sched() latency for
> us from ~4600s to ~160ms, which is nice.
>
> However, it isn't going to be sufficient in the NO_HZ_FULL case. For
> that you want a patch like the one below, which happily reduces the
> latency in our (!NO_HZ_FULL) case still further to ~40ms.
>
> Adding kvm list for better review...
>
> From: David Woodhouse <dwmw@xxxxxxxxxxxx>
> Subject: [PATCH] kvm/x86: Inform RCU of quiescent state when entering guest mode
>
> RCU can spend long periods of time waiting for a CPU which is actually in
> KVM guest mode, entirely pointlessly. Treat it like the idle and userspace
> modes, and don't wait for it.
>
> Signed-off-by: David Woodhouse <dwmw@xxxxxxxxxxxx>
> ---
>  arch/x86/kvm/x86.c      |  2 ++
>  include/linux/rcutree.h |  2 ++
>  kernel/rcu/tree.c       | 16 ++++++++++++++++
>  3 files changed, 20 insertions(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 0046aa70205a..b0c82f70afa7 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7458,7 +7458,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  		vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
>  	}
>
> +	rcu_kvm_enter();
>  	kvm_x86_ops->run(vcpu);
> +	rcu_kvm_exit();
>
>  	/*
>  	 * Do this here before restoring debug registers on the host.  And
> diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
> index 914655848ef6..6d07af5a50fc 100644
> --- a/include/linux/rcutree.h
> +++ b/include/linux/rcutree.h
> @@ -82,6 +82,8 @@ void cond_synchronize_sched(unsigned long oldstate);
>
>  void rcu_idle_enter(void);
>  void rcu_idle_exit(void);
> +void rcu_kvm_enter(void);
> +void rcu_kvm_exit(void);
>  void rcu_irq_enter(void);
>  void rcu_irq_exit(void);
>  void rcu_irq_enter_irqson(void);
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index aa7cade1b9f3..df7893273939 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1019,6 +1019,22 @@ void rcu_irq_enter_irqson(void)
>  	local_irq_restore(flags);
>  }
>
> +/*
> + * These are currently identical to the _idle_ versions but let's
> + * explicitly have separate copies to keep Paul honest in future.
> + */
> +void rcu_kvm_enter(void)
> +{
> +	rcu_idle_enter();
> +}
> +EXPORT_SYMBOL_GPL(rcu_kvm_enter);
> +
> +void rcu_kvm_exit(void)
> +{
> +	rcu_idle_exit();
> +}
> +EXPORT_SYMBOL_GPL(rcu_kvm_exit);
> +
>  /**
>   * rcu_is_watching - see if RCU thinks that the current CPU is idle
>   *
>