Re: FAILED: patch "[PATCH] mm: migration: fix migration of huge PMD shared pages" failed to apply to 4.4-stable tree

On Thu 11-10-18 15:42:59, Mike Kravetz wrote:
> From: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
> 
> mm: migration: fix migration of huge PMD shared pages
> 
> commit 017b1660df89f5fb4bfe66c34e35f7d2031100c7 upstream
> 
> The page migration code employs try_to_unmap() to try and unmap the
> source page.  This is accomplished by using rmap_walk to find all
> vmas where the page is mapped.  This search stops when page mapcount
> is zero.  For shared PMD huge pages, the page map count is always 1
> no matter the number of mappings.  Shared mappings are tracked via
> the reference count of the PMD page.  Therefore, try_to_unmap stops
> prematurely and does not completely unmap all mappings of the source
> page.
> 
> This problem can result in data corruption as writes to the original
> source page can happen after contents of the page are copied to the
> target page.  Hence, data is lost.
> 
> This problem was originally seen as DB corruption of shared global
> areas after a huge page was soft offlined due to ECC memory errors.
> DB developers noticed they could reproduce the issue by (hotplug)
> offlining memory used to back huge pages.  A simple testcase can
> reproduce the problem by creating a shared PMD mapping (note that
> this must be at least PUD_SIZE in size and PUD_SIZE aligned (1GB on
> x86)), and using migrate_pages() to migrate process pages between
> nodes while continually writing to the huge pages being migrated.
> 
> To fix, have the try_to_unmap_one routine check for huge PMD sharing
> by calling huge_pmd_unshare for hugetlbfs huge pages.  If it is a
> shared mapping it will be 'unshared' which removes the page table
> entry and drops the reference on the PMD page.  After this, flush
> caches and TLB.
> 
> mmu notifiers are called before locking page tables, but we can not
> be sure of PMD sharing until page tables are locked.  Therefore,
> check for the possibility of PMD sharing before locking so that
> notifiers can prepare for the worst possible case.
> 
> Fixes: 39dde65c9940 ("shared page table for hugetlb page")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>

Acked-by: Michal Hocko <mhocko@xxxxxxxx>

with one minor nit: please document why we are using the range
invalidation (_start/_end) only for hugetlb pages. This is a
divergence from upstream, and we had better be explicit about
it.
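
To make the "worst possible case" for the notifiers concrete, here is a
minimal user-space sketch of how an invalidation range could be widened
when PMD sharing is possible. It is not the code from the patch: the
struct vma_range type, its may_share flag, and the function name
widen_range_for_pmd_sharing() are hypothetical stand-ins, and PUD_SIZE
is hard-coded to 1GB on the assumption of x86-64 with 4K base pages.

```c
#include <assert.h>

/* Assumption: x86-64, 4K base pages, so one PMD page maps PUD_SIZE = 1GB. */
#define PUD_SHIFT 30
#define PUD_SIZE  (1UL << PUD_SHIFT)
#define PUD_MASK  (~(PUD_SIZE - 1))

/* Hypothetical stand-in for the few VMA fields this sketch needs. */
struct vma_range {
	unsigned long vm_start;
	unsigned long vm_end;
	int may_share;		/* nonzero for a shared hugetlb mapping */
};

/* True if [start, end) lies entirely inside the VMA. */
static int range_inside(const struct vma_range *vma,
			unsigned long start, unsigned long end)
{
	return start >= vma->vm_start && end <= vma->vm_end;
}

/*
 * Widen [*start, *end) to PUD_SIZE boundaries wherever the enclosing
 * PUD_SIZE-aligned region fits inside the VMA. Only such regions can
 * be backed by a shared PMD page, so only they need the wider range.
 */
static void widen_range_for_pmd_sharing(const struct vma_range *vma,
					unsigned long *start,
					unsigned long *end)
{
	unsigned long addr;

	assert(*start < *end);

	if (!vma->may_share)
		return;

	for (addr = *start; addr < *end; addr += PUD_SIZE) {
		unsigned long a_start = addr & PUD_MASK;
		unsigned long a_end = a_start + PUD_SIZE;

		if (!range_inside(vma, a_start, a_end))
			continue;
		if (a_start < *start)
			*start = a_start;
		if (a_end > *end)
			*end = a_end;
	}
}
```

With a shared VMA covering [1GB, 3GB) and a single 2MB huge page at
1GB + 2MB, the range grows to [1GB, 2GB). A VMA that does not contain a
full aligned 1GB region, or one that is not shared, leaves the range
unchanged.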

I've been testing without that in place and the memory corruption hasn't
reproduced so far. I cannot give 100% confidence, though, due to some
unrelated roadblocks when testing. I might also just be lucky in not
having any notifier consumer for my workload.

Seeing an Acked-by from Jerome on this would be reassuring, of course.
-- 
Michal Hocko
SUSE Labs


