On Mon, 24 Sep 2012 16:50:13 +0200
Avi Kivity <avi@xxxxxxxxxx> wrote:

> Afterwards, most exits are APIC and interrupt related, HLT, and MMIO.
> Of these, some are special (HLT, interrupt injection) and some are not
> (read/write most APIC registers). I don't think one group dominates the
> other. So already vcpu->requests processing is not such a slow path, it
> is relatively common. We still see a lot of page faults during boot and
> during live migration though.
>
> With AVIC/APIC-V (still in the future) the mix will change again, with
> both special and non-special exits eliminated. We'll be left mostly
> with APIC timer and HLT (and ICR writes for APIC-V).
>
> So maybe the direction of your patch makes sense. Things like
> KVM_REQ_EVENT (or anything above 2-3% of exits) shouldn't be in
> vcpu->requests or maybe they deserve special treatment.

I see the point.

Since KVM_REQ_EVENT must be checked after handling some other requests,
it needs special treatment anyway -- if we count defining it as the last
flag for for_each_set_bit() as a kind of special treatment.

As Gleb and you pointed out, KVM_REQ_STEAL_UPDATE needs to be fixed
first so that it is not set unnecessarily. Then, by special-casing
KVM_REQ_EVENT (either a one-line change or moving it out of
vcpu->requests), we can see whether further improvement is needed.

If a few requests exceed the threshold (2-3%, as you wrote?), we can
also define a mask indicating which requests should be treated as
"not unlikely".

> > BTW, schedule() is really rare? We do either cond_resched() or
> > heavy weight exit, no?
>
> If 25% of exits are HLT (like a ping workload), then 25% of your exits
> end up in schedule().
>
> On modern hardware, a relatively larger percentage of exits are
> heavyweight (same analysis as above). On AVIC hardware most exits will
> be mmio, HLT, and host interrupts. Of these only host interrupts that
> don't lead to a context switch will be lightweight.
>
> > I always see vcpu threads actively move around the cores.
> > (When I do not pin them.)
>
> Sure, but the frequency is quite low. If not, that's a bug.

That's what I was originally testing for: whether vcpu threads were
being scheduled as expected. I forget why I ended up here.

> >> Modern processors will eliminate KVM_REQ_EVENT in many cases, so the
> >> optimization is wasted on them.
> >
> > Then, my Nehalem server was not so modern.
>
> Well, I was referring to APIC-v/AVIC hardware which nobody has. On
> current hardware they're very common. So stuffing it in the
> vcpu->requests slow path is not warranted.
>
> My patch is cleaner than yours as it handles the problem generically,
> but yours matches reality better.

I guess so. I remember someone once tried to inline the functions used
inside for_each_set_bit(), complaining that it was slow. A generic
approach needs some scale to win.

> > I did something like this:
> >
> >     if requests == KVM_REQ_EVENT
> >         ++counter1;
> >     if requests == KVM_REQ_STEAL_UPDATE
> >         ++counter2;
> >     ...
> >
> > in vcpu_enter_guest() and saw KVM_REQ_EVENT many times.
>
> (In theory perf probe can do this. But figuring out how is often more
> time consuming than patching the kernel.)

Yes, actually I was playing with perf before counting each pattern
directly. But since I could not see the details easily (because of
inlining or ...), I ended up going my own way.

Thanks,
	Takuya
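
P.S. To make "defining it as the last flag for for_each_set_bit()" a bit
more concrete, here is a stand-alone model in plain userspace C. It is
not the actual kvm code; the request numbers and handle_request() are
made up for illustration:

/*
 * Model of the vcpu->requests walk: with a lowest-bit-first walk like
 * for_each_set_bit(), making REQ_EVENT the highest flag means it is
 * naturally handled after every other pending request.
 * Build with gcc/clang (uses the __builtin_ctzl builtin).
 */
#include <stdio.h>

#define REQ_TLB_FLUSH           0
#define REQ_STEAL_UPDATE        1
#define REQ_EVENT               2       /* deliberately the last flag */

static void handle_request(int req)
{
        printf("handling request %d\n", req);
}

int main(void)
{
        /* pretend two requests were posted before entering the guest */
        unsigned long pending = (1UL << REQ_STEAL_UPDATE) | (1UL << REQ_EVENT);

        /* lowest set bit first, like for_each_set_bit() */
        while (pending) {
                int req = __builtin_ctzl(pending);

                pending &= pending - 1;         /* clear the bit just taken */
                handle_request(req);            /* REQ_EVENT comes out last */
        }
        return 0;
}

The "not unlikely" mask would then just be another constant: test
(pending & mask) without unlikely() and leave the remaining rare
requests on the slow path.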
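
P.P.S. The counting hack quoted above, redone as a compilable sketch
(illustrative only; counting per bit instead of comparing the whole word
also catches exits where several requests are pending at once):

/*
 * Per-request counting of the kind that can be hacked into
 * vcpu_enter_guest(). Request numbers are illustrative.
 */
#include <stdio.h>

enum { REQ_TLB_FLUSH, REQ_STEAL_UPDATE, REQ_EVENT, NR_REQS };

static unsigned long long req_count[NR_REQS];

static void count_requests(unsigned long requests)
{
        int req;

        for (req = 0; req < NR_REQS; req++)
                if (requests & (1UL << req))
                        req_count[req]++;
}

int main(void)
{
        int req;

        /* two fake "exits": one with two requests pending, one with one */
        count_requests((1UL << REQ_EVENT) | (1UL << REQ_STEAL_UPDATE));
        count_requests(1UL << REQ_EVENT);

        for (req = 0; req < NR_REQS; req++)
                printf("request %d seen %llu times\n", req, req_count[req]);
        return 0;
}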