Re: [RFC PATCH 2/3] topology: support node_numa_mem() for determining the fallback node

On Tue, 22 Jul 2014, Nishanth Aravamudan wrote:

> > I think there are two use cases of interest:
> > 
> >  - allocating from a memoryless node where numa_node_id() is memoryless, 
> >    and
> > 
> >  - using node_to_mem_node() for a possibly-memoryless node for kmalloc().
> > 
> > I believe the first should have its own node_zonelist[0], whether it's 
> > memoryless or not, that points to a list of zones that start with those 
> > with the smallest distance.
> 
> Ok, and that would be used for falling back in the appropriate priority?
> 

There's no real fallback since there's never a case when you can allocate 
on a memoryless node.  The zonelist defines the order in which to try to 
allocate from zones, so the outcome depends on things like the use of 
numa_node_id() in alloc_pages_current() and whether the zonelist for a 
memoryless node is properly initialized or whether that call needs to be 
numa_mem_id() instead.  It depends on the intended behavior of calling 
alloc_pages_{node,vma}() with a memoryless node.  The complexity of 
(re-)building the zonelists is not a concern, since bootstrap and memory 
hotplug aren't hot paths.
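
To make the ordering concrete, here is a rough userspace sketch (not kernel 
code) of a zonelist-style fallback order built for a possibly-memoryless 
node.  struct toy_topology, TOY_MAX_NODES and toy_build_fallback_order() are 
hypothetical names made up for illustration, and the distance table only 
stands in for what node_distance() would report.

```c
#include <assert.h>

#define TOY_MAX_NODES 4

/* Hypothetical toy topology; these are not kernel data structures. */
struct toy_topology {
	int nr_nodes;
	int has_memory[TOY_MAX_NODES];              /* 0 = memoryless */
	int distance[TOY_MAX_NODES][TOY_MAX_NODES]; /* stand-in for node_distance() */
};

/*
 * Fill order[] with every node that has memory, nearest to 'node' first.
 * A memoryless 'node' never appears in its own order, so the first entry
 * is simply the closest node that can actually satisfy an allocation.
 * Returns the number of entries written.
 */
int toy_build_fallback_order(const struct toy_topology *t, int node,
			     int order[TOY_MAX_NODES])
{
	int used[TOY_MAX_NODES] = { 0 };
	int n = 0;

	assert(node >= 0 && node < t->nr_nodes);

	for (;;) {
		int best = -1;

		/* Selection sort by distance; ties keep the lower node id. */
		for (int i = 0; i < t->nr_nodes; i++) {
			if (used[i] || !t->has_memory[i])
				continue;
			if (best < 0 ||
			    t->distance[node][i] < t->distance[node][best])
				best = i;
		}
		if (best < 0)
			break;
		used[best] = 1;
		order[n++] = best;
	}
	return n;
}
```

With node 0 memoryless and distances of 40 to node 1 and 20 to node 2, the 
order built for node 0 is node 2 followed by node 1.  Rebuilding such a list 
is a full pass over the nodes, which is acceptable because it only happens 
at bootstrap and on hotplug.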

This choice would also impact MPOL_PREFERRED mempolicies when MPOL_F_LOCAL 
is set.

> > I think its own node_zonelist[1], for __GFP_THISNODE allocations,
> > should point to the node with present memory that has the smallest
> > distance.
> 
> And so would this, but with the caveat that we can fail here and don't
> go further? Semantically, __GFP_THISNODE then means "as close as
> physically possible ignoring run-time memory constraints". I say that
> because obviously we might get off-node memory without memoryless nodes,
> but that shouldn't be used to satisfy __GFP_THISNODE allocations.
> 

alloc_pages_current() replaces any existing mempolicy with the default 
local policy when __GFP_THISNODE is set, and that would require local 
allocation.  That, currently, is numa_node_id() and not numa_mem_id().

The slab allocator already only uses __GFP_THISNODE for numa_mem_id() so 
it will allocate remotely anyway.
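
The difference between the two node ids can be sketched in userspace as 
follows.  This is only a toy model under stated assumptions: struct 
toy_system, toy_node_to_mem_node() and toy_alloc_thisnode() are hypothetical 
names, the first mimicking what node_to_mem_node()/numa_mem_id() are meant 
to return and the second mimicking a strict __GFP_THISNODE request that 
never falls back.

```c
#include <assert.h>

#define TOY_NR_NODES 3

/* Hypothetical per-system state for illustration only. */
struct toy_system {
	int present_pages[TOY_NR_NODES]; /* 0 = memoryless node */
	int free_pages[TOY_NR_NODES];
	int distance[TOY_NR_NODES][TOY_NR_NODES];
};

/*
 * Analogue of node_to_mem_node(): the nearest node with present memory,
 * which is 'node' itself when it has memory.  Returns -1 if no node has
 * any memory at all.
 */
int toy_node_to_mem_node(const struct toy_system *s, int node)
{
	int best = -1;

	assert(node >= 0 && node < TOY_NR_NODES);

	for (int i = 0; i < TOY_NR_NODES; i++) {
		if (s->present_pages[i] == 0)
			continue;
		if (best < 0 || s->distance[node][i] < s->distance[node][best])
			best = i;
	}
	return best;
}

/*
 * Strict, __GFP_THISNODE-like request: take one page from 'target' only.
 * Returns the node the page came from, or -1 on failure (no fallback).
 */
int toy_alloc_thisnode(struct toy_system *s, int target)
{
	assert(target >= 0 && target < TOY_NR_NODES);

	if (s->free_pages[target] == 0)
		return -1;
	s->free_pages[target]--;
	return target;
}
```

Passing the memoryless node itself (the numa_node_id() case) always fails in 
this model, while passing the result of toy_node_to_mem_node() (the 
numa_mem_id() case, as slab does) succeeds from the nearest node with memory 
until that node runs out of free pages.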
