On Wed, May 23, 2018 at 10:16:09AM +0200, Michal Hocko wrote: > On Wed 23-05-18 09:52:39, Michal Hocko wrote: > [...] > > Yeah, the current code is far from optimal. We > > used to have a retry count but that one was removed exactly because of > > premature failures. There are three things here > > 1) zone_movable should contain any bootmem or otherwise non-migrateable > > pages > > 2) start_isolate_page_range should fail when seeing such pages - maybe > > has_unmovable_pages is overly optimistic and it should check all > > pages even in movable zones. > > 3) migrate_pages should really tell us whether the failure is temporal > > or permanent. I am not sure we can do that easily though. > > 2) should be the most simple one for now. Could you give it a try? Btw. > the exact configuration that led to boothmem pages in zone_movable would > be really appreciated: I will try it out and I will paste the config. > --- > From 6aa144a9b1c01255c89a4592221d706ccc4b4eea Mon Sep 17 00:00:00 2001 > From: Michal Hocko <mhocko@xxxxxxxx> > Date: Wed, 23 May 2018 10:04:20 +0200 > Subject: [PATCH] mm, memory_hotplug: make has_unmovable_pages more robust > > Oscar has reported: > : Due to an unfortunate setting with movablecore, memblocks containing bootmem > : memory (pages marked by get_page_bootmem()) ended up marked in zone_movable. > : So while trying to remove that memory, the system failed in do_migrate_range > : and __offline_pages never returned. > > This is because we rely on start_isolate_page_range resp. has_unmovable_pages > to do their jobb. The first one isolates the whole range to be offlined > so that we do not allocate from it anymore and the later makes sure we > are not stumbling over non-migrateable pages. > > has_unmovable_pages is overly optimistic, however. It doesn't check all > the pages if we are withing zone_movable because we rely that those > pages will be always migrateable. As it turns out we are still not > perfect there. While bootmem pages in zonemovable sound like a clear bug > which should be fixed let's remove the optimization for now and warn if > we encounter unmovable pages in zone_movable in the meantime. That > should help for now at least. > > Btw. this wasn't a real problem until 72b39cfc4d75 ("mm, memory_hotplug: > do not fail offlining too early") because we used to have a small number > of retries and then failed. This turned out to be too fragile though. > > Reported-by: Oscar Salvador <osalvador@xxxxxxxxxxxxxxxxxx> > Signed-off-by: Michal Hocko <mhocko@xxxxxxxx> > --- > mm/page_alloc.c | 16 ++++++++++------ > 1 file changed, 10 insertions(+), 6 deletions(-) > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index 3c6f4008ea55..b9a45753244d 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -7629,11 +7629,12 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count, > unsigned long pfn, iter, found; > > /* > - * For avoiding noise data, lru_add_drain_all() should be called > - * If ZONE_MOVABLE, the zone never contains unmovable pages > + * TODO we could make this much more efficient by not checking every > + * page in the range if we know all of them are in MOVABLE_ZONE and > + * that the movable zone guarantees that pages are migratable but > + * the later is not the case right now unfortunatelly. E.g. movablecore > + * can still lead to having bootmem allocations in zone_movable. > */ > - if (zone_idx(zone) == ZONE_MOVABLE) > - return false; > > /* > * CMA allocations (alloc_contig_range) really need to mark isolate > @@ -7654,7 +7655,7 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count, > page = pfn_to_page(check); > > if (PageReserved(page)) > - return true; > + goto unmovable; > > /* > * Hugepages are not in LRU lists, but they're movable. > @@ -7704,9 +7705,12 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count, > * page at boot. > */ > if (found > count) > - return true; > + goto unmovable; > } > return false; > +unmovable: > + WARN_ON_ONCE(zone_idx(zone) == ZONE_MOVABLE); > + return true; > } > > #if (defined(CONFIG_MEMORY_ISOLATION) && defined(CONFIG_COMPACTION)) || defined(CONFIG_CMA) > -- > 2.17.0 Thanks Oscar Salvador