Re: [PATCH] mem-hotplug: Don't clear the only node in new_node_page()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 20 Sep 2016, Vlastimil Babka wrote:

> On 09/12/2016 11:18 AM, Michal Hocko wrote:
> > On Mon 05-09-16 16:18:29, Vlastimil Babka wrote:
> > 
> > > Also OOM is skipped for __GFP_THISNODE
> > > allocations, so we might also consider the same for nodemask-constrained
> > > allocations?
> > > 
> > > > The patch checks whether it is the last node on the system, and if it
> > > is, then
> > > > don't clear the nid in the nodemask.
> > > 
> > > I'd rather see the allocation not OOM, and rely on the fallback in
> > > new_node_page() that doesn't have nodemask. But I suspect it might also
> > > make
> > > sense to treat empty nodemask as something unexpected and put some WARN_ON
> > > (instead of OOM) in the allocator.
> > 
> > To be honest I am really not all that happy about 394e31d2ceb4
> > ("mem-hotplug: alloc new page from a nearest neighbor node when
> > mem-offline") and find it a bit fishy. I would rather re-iterate that
> > patch rather than build new hacks on top.
> 
> OK, IIRC I suggested the main idea of clearing the current node from nodemask
> and relying on nodelist to get us the other nodes sorted by their distance.
> Which I thought was an easy way to get to the theoretically optimal result.
> How would you rewrite it then? (but note that the fix is already mainline).
> 

This is a mess.  Commit 9bb627be47a5 ("mem-hotplug: don't clear the only 
node in new_node_page()") is wrong because it's clearing nid when the next 
node in node_online_map doesn't match.  node_online_map is wrong because 
it includes memoryless nodes.  (Nodes with closest NUMA distance also do 
not need to have adjacent node ids.)

This is all protected by mem_hotplug_begin() and the zonelists will be 
stable.  The solution is to rewrite new_node_page() to work correctly.  
Use node_states[N_MEMORY] as mask, clear page_to_nid(page).  If mask is 
not empty, do

__alloc_pages_nodemask(gfp_mask, 0,
node_zonelist(page_to_nid(page), gfp_mask), &mask) 

and fallback to alloc_page(gfp_mask), which should also be used if the 
mask is empty -- do not try to allocate memory from the empty set of 
nodes.

mm-page_alloc-warn-about-empty-nodemask.patch is a rather ridiculous 
warning to need.  The largest user of a page allocator nodemask is 
mempolicies which makes sure it doesn't pass an empty set.  If it's really 
required, it should at least be unlikely() since the vast majority of 
callers will have ac->nodemask == NULL.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]