Re: [PATCH 0/6] mm: make movable onlining suck less

On Tue 04-04-17 16:43:39, Reza Arbab wrote:
> On Tue, Apr 04, 2017 at 09:41:22PM +0200, Michal Hocko wrote:
> >On Tue 04-04-17 13:30:13, Reza Arbab wrote:
> >>I think I found another edge case.  You
> >>get an oops when removing all of a node's memory:
> >>
> >>__nr_to_section
> >>__pfn_to_section
> >>find_biggest_section_pfn
> >>shrink_pgdat_span
> >>__remove_zone
> >>__remove_section
> >>__remove_pages
> >>arch_remove_memory
> >>remove_memory
> >
> >Is this something new or an old issue? I believe the state after the
> >online should be the same as before. So if you onlined the full node
> >then there shouldn't be any difference. Let me have a look...
> 
> It's new. Without this patchset, I can repeatedly
> add_memory()->online_movable->offline->remove_memory() all of a node's
> memory.

OK, I know what is going on here.
shrink_pgdat_span: start_pfn=0x1ff00, end_pfn=0x20000, pgdat_start_pfn=0x0, pgdat_end_pfn=0x20000
[...]
find_biggest_section_pfn loop: pfn=0xff, sec_nr = 0x0

So the node starts at pfn 0 while we are trying to remove a range
starting from pfn=255 (1MB). Rather than going with
find_smallest_section_pfn we go with the other branch, and that
underflows as already mentioned. I seriously doubt that the node really
starts at pfn 0. I am not sure which arch you are testing on, but I
believe we reserve the lowest pfn range on all arches. The previous code
presumably handled that properly because the original node/zone started
at the lowest possible address and the zone shifting then preserved
that.
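
To illustrate the underflow, here is a minimal userspace sketch, not the
kernel code itself. It assumes the backward scan steps an unsigned pfn
down by one section at a time while pfn >= start_pfn. The value 0x100 for
PAGES_PER_SECTION and the helper name backward_scan_wraps are
hypothetical, chosen only so the numbers line up with the trace above.

```c
#include <stdbool.h>

/* Hypothetical value for illustration; the real one depends on arch/config. */
#define PAGES_PER_SECTION 0x100UL

/*
 * Mimic a backward scan that starts at end_pfn - 1 and steps down one
 * section at a time while pfn >= start_pfn. Return true if the next
 * decrement would wrap the unsigned pfn past zero instead of letting
 * the loop condition terminate the scan.
 */
static bool backward_scan_wraps(unsigned long start_pfn, unsigned long end_pfn)
{
	unsigned long pfn = end_pfn - 1;

	while (pfn >= start_pfn) {
		if (pfn < PAGES_PER_SECTION)
			return true;	/* pfn - PAGES_PER_SECTION would underflow */
		pfn -= PAGES_PER_SECTION;
	}
	return false;
}
```

With start_pfn=0x0 and end_pfn=0x1ff00 the scan reaches pfn=0xff and
would wrap on the next step. With start_pfn=0x100 it stops cleanly.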

My code doesn't do that though. So I guess I have to sanitize. Does this
help? Please drop the "mm, memory_hotplug: get rid of zone/node
shrinking" patch.
---
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index acf2b5eb5ecb..2c5613d19eb6 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -750,6 +750,15 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
 	int ret;
 	struct memory_notify arg;
 
+	do {
+		if (pfn_valid(pfn))
+			break;
+		pfn++;
+	} while (--nr_pages > 0);
+
+	if (!nr_pages)
+		return -EINVAL;
+
 	nid = pfn_to_nid(pfn);
 	if (!allow_online_pfn_range(nid, pfn, nr_pages, online_type))
 		return -EINVAL;
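
For anyone who wants to poke at the sanitizing loop outside the kernel,
a rough userspace sketch follows. pfn_valid_stub, first_valid and
skip_invalid_prefix are hypothetical stand-ins made up for this example;
only the loop shape is taken from the hunk above. Like the hunk, it
assumes the caller passes nr_pages > 0.

```c
#include <stdbool.h>

/* Hypothetical stand-in for pfn_valid(): pfns below first_valid are invalid. */
static unsigned long first_valid = 0x100;

static bool pfn_valid_stub(unsigned long pfn)
{
	return pfn >= first_valid;
}

/*
 * Advance *pfn past any leading invalid pfns and shrink *nr_pages by the
 * number skipped. Returns false when no valid pfn is left in the range,
 * which is the case where the patch returns -EINVAL.
 */
static bool skip_invalid_prefix(unsigned long *pfn, unsigned long *nr_pages)
{
	do {
		if (pfn_valid_stub(*pfn))
			break;
		(*pfn)++;
	} while (--(*nr_pages) > 0);

	return *nr_pages != 0;
}
```

Starting from pfn=0 with nr_pages=0x200, the range is trimmed to start at
pfn=0x100 with 0x100 pages left. A range that lies entirely below
first_valid is rejected.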
-- 
Michal Hocko
SUSE Labs
