Re: Interleave policy on 2M pages (was Re: [RFC][BUGFIX][PATCH 1/2] memcg: fix charge bypass route of migration)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Apr 16, 2010 at 11:13:10AM -0500, Christoph Lameter wrote:
> On Thu, 15 Apr 2010, Andrea Arcangeli wrote:
> 
> > 2) add alloc_pages_vma for numa awareness in the huge page faults
> 
> How do interleave policies work with alloc_pages_vma? So far the semantics
> is to spread 4k pages over different nodes. With 2M pages this can no
> longer work the way is was.

static struct page *alloc_page_interleave(gfp_t gfp, unsigned order,
       	      	   				     unsigned nid)

See the order parameter, so I hope it's already solved. I assume the
idea would be to interleave 2M pages to avoid the CPU the memory
overhead of the pte layer and to decrease the tlb misses, but still
maxing out the bandwidth of the system when multiple threads accesses
memory that is stored in different nodes with random access. It should
be ideal for hugetlbfs too for the large shared memory pools of the
DB. Surely it'll be better than having all hugepages from the same
node despite MPOL_INTERLEAVE is set.

Said that, it'd also be possible to disable hugepages if the vma has
MPOL_INTERLEAVE set, but I doubt we want to do that by default. Maybe
we can add a sysfs control later for that which can be further tweaked
at boot time by per-arch quirks, dunno... It's really up to you, you
know numa better, but I've no doubt that MPOL_INTERLEAVE also can make
sense with hugepages (both hugetlbfs and transparent hugepage
support).

Thanks,
Andrea

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>

[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]