On Mon, Apr 05, 2010 at 06:08:51PM -0700, Linus Torvalds wrote:
> 
> 
> On Mon, 5 Apr 2010, Linus Torvalds wrote:
> > 
> > In particular, when you quote 6% improvement for a kernel compile, your
> > own numbers make [me] seriously wonder how many percentage points you'd
> > get from just faulting in 8 pages at a time when you have lots of memory
> > free, and use a single 3-order allocation to get those eight pages?
> 
> THIS PATCH IS TOTALLY UNTESTED!
> 
> It's very very unlikely to work, but it compiles for me at least in one
> particular configuration. So it must be perfect. Ship it.
> 
> It basically tries to just fill in anonymous memory PTE entries roughly
> one cacheline at a time, avoiding extra page-faults and extra memory
> allocations.
> 
> It's probably buggy as hell, I don't dare try to actually boot the crap I
> write. It literally started out as a pseudo-code patch that I then ended
> up expanding until it compiled and then fixed up some corner cases in.
> 
> IOW, it's not really a serious patch, although when I look at it, it
> doesn't really look all that horrible.
> 
> Now, I'm pretty sure that allocating the page with a single order-3
> allocation, and then treating it as 8 individual order-0 pages is broken
> and probably makes various things unhappy. That "make_single_page()"
> monstrosity may or may not be sufficient.
> 
> In other words, what I'm trying to say is: treat this patch as a request
> for discussion, rather than something that necessarily _works_.

This will provide a 0% speedup to a kernel compile in a guest, where
transparent hugepage support (or hugetlbfs too) would provide a 6%
speedup. I evaluated the prefault approach before I finalized my design,
generating a huge pmd once the whole hugepage was mapped. It's all
worthless complexity in my view. In fact, except at boot time we likely
won't be interested in taking advantage of this, as it is not a free
optimization: it magnifies the time each fault spends in
clear-page/copy-page (a trivial user-space illustration of that per-fault
cost is sketched at the end of this mail). That is why I initially tried
to prefault only whole hugepages, and after benchmarking I figured out it
wasn't worth it, and it would have been hugely more complicated too. The
only case where it is worth mapping more than one 4k page is when we can
take advantage of the tlb-miss speedup and of the 2M tlb; otherwise it's
better to stick to 4k page faults, do a 4k clear-page/copy-page, and not
risk taking more than 4k of memory. And let khugepaged do the rest.

I think I already mentioned it in the previous email, but seeing your
patch I feel obliged to re-post:

---------------
hugepages in the virtualization hypervisor (and also in the guest!) are
much more important than in a regular host not using virtualization,
because with NPT/EPT they decrease the tlb-miss cacheline accesses from
24 to 19 in case only the hypervisor uses transparent hugepages, and they
decrease the tlb-miss cacheline accesses from 19 to 15 in case both the
linux hypervisor and the linux guest use this patch (though the guest
will limit the additional speedup to anonymous regions only for now...).
Even more important is that the tlb miss handler is much slower on a
NPT/EPT guest than for a regular shadow paging or no-virtualization
scenario. So maximizing the amount of virtual memory cached by the TLB
pays off significantly more with NPT/EPT than without (even if there
would be no significant speedup in the tlb-miss runtime).
----------------
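For reference, the 24/19/15 cacheline-access figures come from the usual
two-dimensional page-walk accounting for nested paging: each of the g
guest radix levels needs a full h-level walk of the host page tables plus
the guest entry read itself, and the final guest-physical address needs
one more host walk, i.e. (g+1)*(h+1)-1 accesses, with a 2M mapping
dropping one level on its side of the walk. The snippet below is only my
illustration of that arithmetic (the formula and the 4-level x86-64 walk
are the standard NPT/EPT walk model, nothing taken from either patch):

/*
 * Illustrative only: memory/cacheline accesses per TLB miss under
 * nested paging (NPT/EPT), using the standard two-dimensional walk
 * model.  Each of the g guest page-table levels is a guest-physical
 * reference needing a full h-level host walk plus the guest entry
 * read itself, and the final guest-physical address needs one more
 * host walk: g*h + g + h = (g+1)*(h+1) - 1 accesses.
 */
#include <stdio.h>

static int nested_walk_accesses(int guest_levels, int host_levels)
{
	return (guest_levels + 1) * (host_levels + 1) - 1;
}

int main(void)
{
	/* x86-64 walks 4 radix levels; a 2M mapping skips the last one */
	printf("4k guest, 4k host: %d\n", nested_walk_accesses(4, 4)); /* 24 */
	printf("4k guest, 2M host: %d\n", nested_walk_accesses(4, 3)); /* 19 */
	printf("2M guest, 2M host: %d\n", nested_walk_accesses(3, 3)); /* 15 */
	return 0;
}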
The text between the dashed lines is in the changelog of the "transparent
hugepage core" patch too, and here as well:
http://linux-mm.org/TransparentHugepage?action=AttachFile&do=get&target=transparent-hugepage.pdf
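P.S. Purely to illustrate the clear-page cost argument above, here is the
kind of trivial user-space sketch I mean (it is an illustration, not the
benchmark I referred to; the buffer size, chunking and timing method are
arbitrary assumptions). It clears the same 2M buffer in 4k chunks and
then in a single 2M chunk and prints the average clearing time per chunk,
which is roughly the extra first-fault latency a fault handler takes on
when it populates 2M instead of 4k:

/*
 * Purely illustrative, user-space only: average time spent clearing
 * memory per "fault" when populating 4k at a time vs 2M at a time.
 * The buffer is touched once up front so the clears are timed, not
 * the initial page faults.  Build: gcc clearcost.c (-lrt on old glibc)
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define BUF_SIZE	(2UL * 1024 * 1024)	/* one 2M "hugepage" worth */
#define PAGE_4K		4096UL

static double ns_per_chunk(char *buf, size_t chunk)
{
	struct timespec t0, t1;
	double total_ns;
	size_t off;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (off = 0; off < BUF_SIZE; off += chunk)
		memset(buf + off, 0, chunk);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	total_ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
	return total_ns / (BUF_SIZE / chunk);
}

int main(void)
{
	char *buf = malloc(BUF_SIZE);

	if (!buf)
		return 1;
	memset(buf, 1, BUF_SIZE);	/* fault everything in first */

	printf("clear per 4k fault: %.0f ns\n", ns_per_chunk(buf, PAGE_4K));
	printf("clear per 2M fault: %.0f ns\n", ns_per_chunk(buf, BUF_SIZE));

	/* read something back so the clears can't be optimized away */
	printf("(last byte: %d)\n", buf[BUF_SIZE - 1]);
	free(buf);
	return 0;
}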