On Wed 11-04-18 12:27:39, Andrew Morton wrote:
> On Wed, 11 Apr 2018 11:26:11 +0200 Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> > On Fri 06-04-18 03:07:11, Naoya Horiguchi wrote:
> > > From e31ec037701d1cc76b26226e4b66d8c783d40889 Mon Sep 17 00:00:00 2001
> > > From: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
> > > Date: Fri, 6 Apr 2018 10:58:35 +0900
> > > Subject: [PATCH] mm: enable thp migration for shmem thp
> > >
> > > My testing for the latest kernel supporting thp migration showed an
> > > infinite loop in offlining the memory block that is filled with shmem
> > > thps. We can get out of the loop with a signal, but kernel should
> > > return with failure in this case.
> > >
> > > What happens in the loop is that scan_movable_pages() repeats returning
> > > the same pfn without any progress. That's because page migration always
> > > fails for shmem thps.
> > >
> > > In memory offline code, memory blocks containing unmovable pages should
> > > be prevented from being offline targets by has_unmovable_pages() inside
> > > start_isolate_page_range().
> > >
> > > So it's possible to change migratability
> > > for non-anonymous thps to avoid the issue, but it introduces more complex
> > > and thp-specific handling in migration code, so it might not good.
> > >
> > > So this patch is suggesting to fix the issue by enabling thp migration
> > > for shmem thp. Both of anon/shmem thp are migratable so we don't need
> > > precheck about the type of thps.
> > >
> > > Fixes: commit 72b39cfc4d75 ("mm, memory_hotplug: do not fail offlining too early")
> > > Signed-off-by: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
> > > Cc: stable@xxxxxxxxxxxxxxx # v4.15+
> >
> > I do not really feel qualified to give my ack but this is the right
> > approach for the fix. We simply do expect that LRU pages are migrateable
> > as well as zone_movable pages.
> >
> > Andrew, do you plan to take it (with Kirill's ack).
> >
>
> Sure. What happened with "Michal's fix in another email"
> (https://lkml.kernel.org/r/20180406051452.GB23467@xxxxxxxxxxxxxxxxxxxxxxxxxxxx)?

I guess you meant
http://lkml.kernel.org/r/20180405190405.GS6312@xxxxxxxxxxxxxx

Well, that would be a workaround in case we didn't have a proper fix. It
is much simpler but it wouldn't make backporting to older kernels any
easier because it depends on other non-trivial changes you already have
in your tree. So having a full THP pagecache migration support is
preferred of course.
-- 
Michal Hocko
SUSE Labs