Re: [PATCH 3/3] KVM: MMU: Separate trivial NULL check out from rmap_get_next()

Avi Kivity <avi@xxxxxxxxxx> · Thu, 15 Mar 2012 14:01:11 +0200

On 03/15/2012 12:15 PM, Takuya Yoshikawa wrote:
> Avi Kivity <avi@xxxxxxxxxx> wrote:
>
> > > Although using "inline" like this does not look clean, we could see
> > > measurable performance improvements: get_dirty_log for 1GB dirty memory
> > > became faster by more than 10% on my test box.
> > >
> > 
> > WOW.  I'd have assumed the processor deals better with this; it should
> > be 100% predicted branches.
> > 
> > But I won't argue with cold data.
>
> What I checked was:
>
> original   with-patch2   with-patch3
> 8.7ms      8.5ms         7.5ms

What are the per-call numbers?

> I assumed that without "inline" only __rmap_get_next() would be inlined
> into rmap_get_next() so did like this.
>
> I thought the improvement was just from removing one function call for
> each rmap_write_protect.  Not sure if anything was changed with branch
> predictions.

What I mean is, modern CPUs effectively inline simple function calls by
predicting the call, the branches within the function, and the return, so
they don't have to stall their pipelines at any of these points.  But
again, the numbers talk louder than speculation about CPU architecture.
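
For reference, the pattern under discussion looks roughly like the
userspace sketch below: the trivial NULL check lives in an inline wrapper,
and only the non-trivial walk pays for a real function call.  The names
(struct entry, sum_entries, __sum_entries) are hypothetical and this is
not the actual rmap_get_next() code; the noinline attribute is the
GCC/Clang spelling.

```c
#include <stddef.h>

/* Hypothetical stand-in for a list entry; not a real KVM type. */
struct entry {
	struct entry *next;
	int value;
};

/*
 * Out-of-line part: the non-trivial work.  Marked noinline so the
 * compiler keeps it as a real call, mirroring the split in the patch.
 */
static __attribute__((noinline)) int __sum_entries(const struct entry *e)
{
	int sum = 0;

	for (; e; e = e->next)
		sum += e->value;
	return sum;
}

/*
 * Inline part: only the trivial NULL check.  Callers with an empty
 * list return here without making any function call at all.
 */
static inline int sum_entries(const struct entry *head)
{
	if (!head)
		return 0;
	return __sum_entries(head);
}
```

Whether this wins anything over letting the CPU predict the call and
return is exactly what has to be measured; the sketch only shows the
code shape, not the speedup.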

-- 
error compiling committee.c: too many arguments to function
