My testing of the latest kernel supporting thp migration found an
infinite loop when offlining a memory block that is filled with shmem
thps. We can get out of the loop with a signal, but the kernel should
return with failure in this case.

What happens in the loop is that scan_movable_pages() keeps returning
the same pfn without making any progress. That's because page migration
always fails for shmem thps.

In the memory offline code, memory blocks containing unmovable pages
should be prevented from being offline targets by has_unmovable_pages()
inside start_isolate_page_range(). So this patch simply does that for
non-anonymous thps.

Fixes: commit 72b39cfc4d75 ("mm, memory_hotplug: do not fail offlining too early")
Signed-off-by: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx # v4.15+
---
Actually I'm not sure which commit we should set in the "Fixes" tag.
Commit 8135d8926c08 ("mm: memory_hotplug: memory hotremove supports thp
migration") failed to introduce the code that this patch is suggesting.
But the infinite loop became visible with commit 72b39cfc4d75 ("mm,
memory_hotplug: do not fail offlining too early"), where the retry code
was removed.
---
 mm/page_alloc.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git v4.16-rc7-mmotm-2018-03-28-16-05/mm/page_alloc.c v4.16-rc7-mmotm-2018-03-28-16-05_patched/mm/page_alloc.c
index 905db9d..dbbe8fa 100644
--- v4.16-rc7-mmotm-2018-03-28-16-05/mm/page_alloc.c
+++ v4.16-rc7-mmotm-2018-03-28-16-05_patched/mm/page_alloc.c
@@ -7682,6 +7682,18 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
 
 		if (!PageLRU(page))
 			found++;
+
+		/*
+		 * Thp migration is available only for anonymous thps for now.
+		 * So let's consider non-anonymous thps as unmovable pages.
+		 */
+		if (PageTransCompound(page)) {
+			if (PageAnon(page))
+				iter += (1 << page_order(page)) - 1;
+			else
+				found++;
+		}
+
 		/*
 		 * If there are RECLAIMABLE pages, we need to check
 		 * it. But now, memory offline itself doesn't call
-- 
2.7.0
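
For reviewers who want to see the failure mode in isolation, here is a rough userspace sketch of the loop described above. It is a toy model, not kernel code: the page kinds, block_has_unmovable(), scan_first_in_use(), migrate_page_away() and try_offline() are all hypothetical names made up for illustration, and the max_iters cap only stands in for the signal that breaks the real loop. It assumes, as the description says, that migration never succeeds for shmem thps.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page kinds for this toy model; these are not kernel types. */
enum page_kind { PAGE_FREE, PAGE_ANON_THP, PAGE_SHMEM_THP };

/* Toy stand-in for has_unmovable_pages(): non-anonymous thps are unmovable. */
static bool block_has_unmovable(const enum page_kind *block, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (block[i] == PAGE_SHMEM_THP)
			return true;
	return false;
}

/* Toy stand-in for scan_movable_pages(): first in-use index, or n if none. */
static size_t scan_first_in_use(const enum page_kind *block, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (block[i] != PAGE_FREE)
			return i;
	return n;
}

/* Toy migration: assumed to succeed only for anonymous thps. */
static bool migrate_page_away(enum page_kind *block, size_t i)
{
	if (block[i] != PAGE_ANON_THP)
		return false;
	block[i] = PAGE_FREE;
	return true;
}

/*
 * Returns 0 when the block is fully evacuated, -EBUSY when the optional
 * pre-check rejects the block up front, and -EINTR when max_iters
 * migration attempts were used up (standing in for the user's signal).
 */
static int try_offline(enum page_kind *block, size_t n, bool precheck,
		       unsigned int max_iters, unsigned int *iters_used)
{
	assert(n > 0);
	*iters_used = 0;

	if (precheck && block_has_unmovable(block, n))
		return -EBUSY;

	for (;;) {
		size_t i = scan_first_in_use(block, n);

		if (i == n)
			return 0;
		if (*iters_used == max_iters)
			return -EINTR;
		(*iters_used)++;
		/* A failed migration is not reported, so the next scan
		 * returns the same index again and no progress is made. */
		migrate_page_away(block, i);
	}
}
```

Calling try_offline() with precheck == false on a block that contains a PAGE_SHMEM_THP entry spins until the cap is reached, while precheck == true fails immediately with -EBUSY, which is the behaviour the patch is aiming for.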