Re: MMU notifiers review and some proposals

Nick Piggin <npiggin@xxxxxxx> · Sun, 27 Jul 2008 14:08:14 +0200

On Sat, Jul 26, 2008 at 03:16:51PM +0200, Andrea Arcangeli wrote:
> On Sat, Jul 26, 2008 at 03:04:06PM +0200, Nick Piggin wrote:
> > On Sat, Jul 26, 2008 at 01:38:13PM +0200, Andrea Arcangeli wrote:
> > > 
> > > 1) absolute minimal intrusion into the kernel common code, and
> > >    absolute minimum number of branches added to the kernel fast
> > >    paths. Kernel is faster than your "minimal" type of notifiers when
> > >    they're disarmed.
> > 
> > BTW. is this really significant? Having one branch per pte
> > I don't think is necessarily slower than 2 branches per unmap.
> > 
> > The 2 branches will use more icache and more branch history. One
> > branch even once per pte in the unmapping loop is going to remain
> > hot in icache and branch history isn't it?
> 
> Even if branch-predicted and icached, it's still more executable to
> compute in a tight loop. Even if quick it'll accumulate cycles. Said

True, but having 2 branches and a larger icache footprint is more
likely to cause a branch mispredict or an icache miss, either of which
costs as much as a *lot* of cached, predicted branches.

It's all speculation, but my point is that it is not accurate to
say my version would be slower, because in some cases it would be
the opposite.

> that perhaps you're right that my point 1 wasn't that important or not
> a tangible positive, but surely doing a secondary mmu invalidate for
> each pte zapped isn't ideal... that's the whole point of the
> tlb-gather logic, nobody wants to do that not even for the primary
> tlb, and surely not for the secondary-mmu that may not even be as fast
> as the primary-tlb at invalidating. Hence the very simple patch is
> clearly inferior when they're armed (if only equivalent when they're
> disarmed)...

See, the thing is that I don't actually dispute that in some cases
the range start/end approach will definitely be faster. However,
firstly KVM, as you say, doesn't really care, and secondly we don't
have numbers for GRU (I'm talking about start/end vs gather).

> I think we can argue once you've reduced the frequency of the
> secondary mmu invalidates of a factor of 500 by mangling over the tlb
> gather logic per-arch.

OK, we'll see...
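
To make the trade-off being argued above concrete, here is a minimal
userspace sketch that only counts "is a notifier armed?" branches and
secondary-MMU invalidates for the three schemes discussed: one hook per
pte, range start/end, and a hook at tlb-gather flush time. Everything
in it is an assumption for illustration: the function names, the
struct, and the batch size of 500 (borrowed from the "factor of 500"
mentioned above) are hypothetical and are not kernel APIs, and it says
nothing about icache or branch-predictor behaviour, which is the part
nobody has numbers for.

```c
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
enum { GATHER_BATCH = 500 };    /* hypothetical ptes per gather flush */

struct counts {
	size_t branches;        /* "notifier armed?" checks executed */
	size_t invalidates;     /* secondary-MMU invalidates issued */
};

/* One check, and one invalidate if armed, for every pte zapped. */
static struct counts zap_per_pte(size_t nr_ptes, int armed)
{
	struct counts c = { 0, 0 };

	for (size_t i = 0; i < nr_ptes; i++) {
		c.branches++;
		if (armed)
			c.invalidates++;
	}
	return c;
}

/*
 * Range start/end: two checks per unmap regardless of size. The model
 * assumes the start hook invalidates the whole range at once.
 */
static struct counts zap_range_start_end(size_t nr_ptes, int armed)
{
	struct counts c = { 0, 0 };

	(void)nr_ptes;          /* cost does not depend on the pte count */
	c.branches++;           /* range-start check */
	if (armed)
		c.invalidates++;
	c.branches++;           /* range-end check */
	return c;
}

/* Gather: one check, and one invalidate if armed, per batch flush. */
static struct counts zap_gather(size_t nr_ptes, int armed)
{
	struct counts c = { 0, 0 };
	size_t pending = 0;

	for (size_t i = 0; i < nr_ptes; i++) {
		if (++pending == GATHER_BATCH) {
			c.branches++;
			if (armed)
				c.invalidates++;
			pending = 0;
		}
	}
	if (pending) {          /* final partial batch */
		c.branches++;
		if (armed)
			c.invalidates++;
	}
	return c;
}
```

Under these assumptions, unmapping 1000 ptes with a notifier armed
issues 1000 invalidates per pte, 1 with start/end, and 2 with gather.
Disarmed, the per-pte scheme still executes 1000 checks against 2 for
either of the others, and whether those checks cost anything
measurable is exactly the open question.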
