Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested

"Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> · Tue, 17 Jul 2018 05:56:53 -0700

On Tue, Jul 17, 2018 at 10:19:08AM +0200, David Woodhouse wrote:
> On Mon, 2018-07-16 at 08:40 -0700, Paul E. McKenney wrote:
> > Most of the weekend was devoted to testing today's upcoming pull request,
> > but I did get a bit more testing done on this.
> > 
> > I was able to make this happen more often by tweaking rcutorture a
> > bit, but I still do not yet have statistically significant results.
> > Nevertheless, I have thus far only seen failures with David's patch or
> > with both David's and my patch.  And I actually got a full-up rcutorture
> > failure (a too-short grace period) in addition to the aforementioned
> > close calls.
> > 
> > Over this coming week I expect to devote significant testing time to
> > the commit just prior to David's in my stack.  If I don't see failures
> > on that commit, we will need to spend some quality time with the KVM
> > folks on whether or not kvm_x86_ops->run() and friends have the option of
> > failing to return, but instead causing control to pop up somewhere else.
> > Or someone could tell me how I am being blind to some obvious bug in
> > the two commits that allow RCU to treat KVM guest-OS execution as an
> > extended quiescent state.  ;-)
> 
> One thing we can try, if my patch is implicated, is moving the calls to
> rcu_kvm_en{ter,xit} closer to the actual VM entry. Let's try putting
> them around the large asm block in arch/x86/kvm/vmx.c::vmx_vcpu_run()
> for example. If that fixes it, then we know we've missed something else
> interesting that's happening in the middle.

I don't have enough data to say anything with too much certainty, but
my patch has rcu_kvm_en{ter,xit}() quite a bit farther apart than yours
does, and I am not seeing massive increases in error rate in my patch
compared to yours.  Which again might or might not mean anything.

Plus I haven't proven that your patch isn't an innocent bystander yet.
If it isn't just an innocent bystander, that will take most of this
week to demonstrate given current failure rates.

I am also working on improving rcutorture diagnostics, which should help
me work out how to change rcutorture so as to find this more quickly.

> Testing on Skylake shows a guest CPUID goes from ~3000 cycles to ~3500
> with this patch, so in the next iteration it definitely needs to be
> ifdef CONFIG_NO_HZ_FULL anyway, because it's actually required there
> (AFAICT) and it's too expensive otherwise as Christian pointed out.

Makes sense!

							Thanx, Paul