On 26.09.19 11:12, Aneesh Kumar K.V wrote: > David Hildenbrand <david@xxxxxxxxxx> writes: > >> Let's limit shrinking to !ZONE_DEVICE so we can fix the current code. We >> should never try to touch the memmap of offline sections where we could >> have uninitialized memmaps and could trigger BUGs when calling >> page_to_nid() on poisoned pages. >> >> Stopping to shrink the ZONE_DEVICE is fine as set_zone_contiguous() cannot >> deal with ZONE_DEVICE either way. The zones will always be !contiguous >> already and zone shrinking is therefore of limited use. > > Can you add more details w.r.t ZONE_DEVICE? I can convert that to "There is no reliable way to distinguish an uninitialized memmap from an initialized memmap that belongs to ZONE_DEVICE, as we don't have anything like SECTION_IS_ONLINE we can use similar to pfn_to_online_section() for !ZONE_DEVICE memory. E.g., set_zone_contiguous() similarly relies on pfn_to_online_section() and will therefore never set a ZONE_DEVICE zone consecutive. Stopping to shrink the ZONE_DEVICE therefore results in no observable changes, besides /proc/zoneinfo indicating different boundaries - something we can totally live with." > >> >> Before commit d0dc12e86b31 ("mm/memory_hotplug: optimize memory >> hotplug"), the memmap was initialized with 0 and the node with the >> right value. So the zone might be wrong but not garbage. After that >> commit, both the zone and the node will be garbage when touching >> uninitialized memmaps. >> >> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> >> Cc: Oscar Salvador <osalvador@xxxxxxx> >> Cc: David Hildenbrand <david@xxxxxxxxxx> >> Cc: Michal Hocko <mhocko@xxxxxxxx> >> Cc: Pavel Tatashin <pasha.tatashin@xxxxxxxxxx> >> Cc: Dan Williams <dan.j.williams@xxxxxxxxx> >> Fixes: d0dc12e86b31 ("mm/memory_hotplug: optimize memory hotplug") >> Reported-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx> >> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx> >> --- >> mm/memory_hotplug.c | 14 +++++++++++--- >> 1 file changed, 11 insertions(+), 3 deletions(-) >> >> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c >> index ddba8d786e4a..e0d1f6a9dfeb 100644 >> --- a/mm/memory_hotplug.c >> +++ b/mm/memory_hotplug.c >> @@ -331,7 +331,7 @@ static unsigned long find_smallest_section_pfn(int nid, struct zone *zone, >> unsigned long end_pfn) >> { >> for (; start_pfn < end_pfn; start_pfn += PAGES_PER_SUBSECTION) { >> - if (unlikely(!pfn_valid(start_pfn))) >> + if (unlikely(!pfn_to_online_page(start_pfn))) >> continue; >> >> if (unlikely(pfn_to_nid(start_pfn) != nid)) >> @@ -356,7 +356,7 @@ static unsigned long find_biggest_section_pfn(int nid, struct zone *zone, >> /* pfn is the end pfn of a memory section. */ >> pfn = end_pfn - 1; >> for (; pfn >= start_pfn; pfn -= PAGES_PER_SUBSECTION) { >> - if (unlikely(!pfn_valid(pfn))) >> + if (unlikely(!pfn_to_online_page(pfn))) >> continue; >> >> if (unlikely(pfn_to_nid(pfn) != nid)) >> @@ -415,7 +415,7 @@ static void shrink_zone_span(struct zone *zone, unsigned long start_pfn, >> */ >> pfn = zone_start_pfn; >> for (; pfn < zone_end_pfn; pfn += PAGES_PER_SUBSECTION) { >> - if (unlikely(!pfn_valid(pfn))) >> + if (unlikely(!pfn_to_online_page(pfn))) >> continue; >> >> if (page_zone(pfn_to_page(pfn)) != zone) >> @@ -463,6 +463,14 @@ static void __remove_zone(struct zone *zone, unsigned long start_pfn, >> struct pglist_data *pgdat = zone->zone_pgdat; >> unsigned long flags; >> >> + /* >> + * Zone shrinking code cannot properly deal with ZONE_DEVICE. So >> + * we will not try to shrink the zones - which is okay as >> + * set_zone_contiguous() cannot deal with ZONE_DEVICE either way. >> + */ >> + if (zone_idx(zone) == ZONE_DEVICE) >> + return; > > Can you describe here what is special about ZONE_DEVICE? I think adding more details to the description is sufficient, especially as this also applies to set_zone_contiguous(). Thanks! -- Thanks, David / dhildenb