Re: [PATCH] mem-hotplug: Don't clear the only node in new_node_page()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 09/06/2016 10:13 AM, Li Zhong wrote:

On Sep 5, 2016, at 22:18, Vlastimil Babka <vbabka@xxxxxxx> wrote:

On 09/05/2016 04:59 AM, Li Zhong wrote:
Commit 394e31d2c introduced new_node_page() for memory hotplug.

In new_node_page(), the nid is cleared before calling __alloc_pages_nodemask().
But if it is the only node of the system,

So the use case is that we are partially offlining the only online node?

Yes.

and the first round allocation fails,
it will not be able to get memory from an empty nodemask, and trigger oom.

Hmm triggering OOM due to empty nodemask sounds like a wrong thing to do. CCing some OOM experts for insight. Also OOM is skipped for __GFP_THISNODE allocations, so we might also consider the same for nodemask-constrained allocations?

The patch checks whether it is the last node on the system, and if it is, then
don't clear the nid in the nodemask.

I'd rather see the allocation not OOM, and rely on the fallback in new_node_page() that doesn't have nodemask. But I suspect it might also make sense to treat empty nodemask as something unexpected and put some WARN_ON (instead of OOM) in the allocator.

I think it would be much easier to understand these kind of empty nodemask allocation failure with this WARN_ON(), how about something like this?

===
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a2214c6..57edf18 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3629,6 +3629,11 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
                .migratetype = gfpflags_to_migratetype(gfp_mask),
        };

+       if (nodemask && nodes_empty(*nodemask)) {
+               WARN_ON(1);
+               return NULL;
+       }
+
        if (cpusets_enabled()) {
                alloc_mask |= __GFP_HARDWALL;
                alloc_flags |= ALLOC_CPUSET;
===

If that’s ok, maybe I can send a separate patch for this?

Something like that, but please not in the hotpath. I think the earliest suitable place is in __alloc_pages_slowpath() after the get_page_from_freelist() fails. And probably the best way would be to do something like pr_warn("nodemask is empty") and then clear __GFP_NOWARN from gfp_mask and goto nopage.

Thanks, Vlastimil

Thanks, Zhong


Reported-by: John Allen <jallen@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Li Zhong <zhong@xxxxxxxxxxxxxxxxxx>

Acked-by: Vlastimil Babka <vbabka@xxxxxxx>
Fixes: 394e31d2ceb4 ("mem-hotplug: alloc new page from a nearest neighbor node when mem-offline")

---
mm/memory_hotplug.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 41266dc..b58906b 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1567,7 +1567,9 @@ static struct page *new_node_page(struct page *page, unsigned long private,
		return alloc_huge_page_node(page_hstate(compound_head(page)),
					next_node_in(nid, nmask));

-	node_clear(nid, nmask);
+	if (nid != next_node_in(nid, nmask))
+		node_clear(nid, nmask);
+
	if (PageHighMem(page)
	    || (zone_idx(page_zone(page)) == ZONE_MOVABLE))
		gfp_mask |= __GFP_HIGHMEM;






--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]