The patch titled
     Subject: mm: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_*
has been added to the -mm tree.  Its filename is
     mm-check-__pg_hwpoison-separately-from-page_flags_check_at_.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-check-__pg_hwpoison-separately-from-page_flags_check_at_.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-check-__pg_hwpoison-separately-from-page_flags_check_at_.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
Subject: mm: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_*

The race condition addressed in commit add05cecef80 ("mm: soft-offline:
don't free target page in successful page migration") was not closed
completely, because it can happen not only for soft-offline but also for
hard-offline.  Consider a slab page that is about to be freed into the
buddy pool: if an uncorrected memory error hits the page just after
__free_one_page() is entered, VM_BUG_ON_PAGE(page->flags &
PAGE_FLAGS_CHECK_AT_PREP) is triggered, even though that is unnecessary
because the data on the affected page is not consumed.

To solve it, this patch drops __PG_HWPOISON from the page flag checks at
allocation/free time.  I think this is justified because the __PG_HWPOISON
flag is defined to prevent the page from being reused, and setting it
outside the page's alloc-free cycle is designed behavior (not a bug).

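To make the mask change concrete, here is a minimal user-space sketch, not
kernel code.  The bit positions and every DEMO_* name are hypothetical
stand-ins for the kernel's enum pageflags and its macros; only the mask
arithmetic mirrors what the patch does to PAGE_FLAGS_CHECK_AT_PREP.

```c
#include <assert.h>

/*
 * Hypothetical, simplified bit layout for illustration only.  The real
 * positions come from the kernel's enum pageflags and depend on the
 * kernel configuration.
 */
enum demo_pageflags {
	DEMO_PG_locked,
	DEMO_PG_slab,
	DEMO_PG_hwpoison,
	DEMO_NR_PAGEFLAGS
};

#define DEMO_PG_HWPOISON	(1UL << DEMO_PG_hwpoison)

/* Mask before the patch: every flag bit, hwpoison included. */
#define DEMO_CHECK_AT_PREP_OLD	((1UL << DEMO_NR_PAGEFLAGS) - 1)

/* Mask after the patch: every flag bit except hwpoison. */
#define DEMO_CHECK_AT_PREP_NEW	\
	(((1UL << DEMO_NR_PAGEFLAGS) - 1) & ~DEMO_PG_HWPOISON)

/* Returns 1 if any bit selected by mask is set in flags, else 0. */
static int demo_check_trips(unsigned long flags, unsigned long mask)
{
	return (flags & mask) != 0;
}

/* Walks through the three cases that matter for this patch. */
static void demo_self_check(void)
{
	unsigned long poisoned = DEMO_PG_HWPOISON;
	unsigned long stale_slab = 1UL << DEMO_PG_slab;

	/* Old mask: a poisoned but otherwise clean page trips the check. */
	assert(demo_check_trips(poisoned, DEMO_CHECK_AT_PREP_OLD));
	/* New mask: the same page passes the generic flag check. */
	assert(!demo_check_trips(poisoned, DEMO_CHECK_AT_PREP_NEW));
	/* New mask: any other leftover flag is still caught. */
	assert(demo_check_trips(stale_slab, DEMO_CHECK_AT_PREP_NEW));
}
```

Calling demo_self_check() from a small test program exercises all three
cases.  In the patch itself the hwpoison bit is not ignored: it is tested
on its own in check_new_page(), as the mm/page_alloc.c hunk shows.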
For the past few months I have been annoyed by a BUG_ON that triggers when
a soft-offlined page remains on the lru cache list for a while.  This is
avoided by calling put_page() instead of putback_lru_page() in page
migration's success path.  That means this patch reverts a major change
from commit add05cecef80 about the new refcounting rule of soft-offlined
pages, so the "reuse window" revives.  It will be closed by a subsequent
patch.

Signed-off-by: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
Cc: Andi Kleen <andi@xxxxxxxxxxxxxx>
Cc: Dean Nelson <dnelson@xxxxxxxxxx>
Cc: Tony Luck <tony.luck@xxxxxxxxx>
Cc: "Kirill A. Shutemov" <kirill@xxxxxxxxxxxxx>
Cc: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: David Rientjes <rientjes@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 include/linux/page-flags.h |   10 +++++++---
 mm/huge_memory.c           |    7 +------
 mm/migrate.c               |    5 ++++-
 mm/page_alloc.c            |    4 ++++
 4 files changed, 16 insertions(+), 10 deletions(-)

diff -puN include/linux/page-flags.h~mm-check-__pg_hwpoison-separately-from-page_flags_check_at_ include/linux/page-flags.h
--- a/include/linux/page-flags.h~mm-check-__pg_hwpoison-separately-from-page_flags_check_at_
+++ a/include/linux/page-flags.h
@@ -631,15 +631,19 @@ static inline void ClearPageSlabPfmemall
 	 1 << PG_private | 1 << PG_private_2 | \
 	 1 << PG_writeback | 1 << PG_reserved | \
 	 1 << PG_slab | 1 << PG_swapcache | 1 << PG_active | \
-	 1 << PG_unevictable | __PG_MLOCKED | __PG_HWPOISON | \
+	 1 << PG_unevictable | __PG_MLOCKED | \
 	 __PG_COMPOUND_LOCK)
 
 /*
  * Flags checked when a page is prepped for return by the page allocator.
- * Pages being prepped should not have any flags set.  It they are set,
+ * Pages being prepped should not have these flags set.  It they are set,
  * there has been a kernel bug or struct page corruption.
+ *
+ * __PG_HWPOISON is exceptional because it needs to be kept beyond page's
+ * alloc-free cycle to prevent from reusing the page.
  */
-#define PAGE_FLAGS_CHECK_AT_PREP	((1 << NR_PAGEFLAGS) - 1)
+#define PAGE_FLAGS_CHECK_AT_PREP	\
+	(((1 << NR_PAGEFLAGS) - 1) & ~__PG_HWPOISON)
 
 #define PAGE_FLAGS_PRIVATE				\
 	(1 << PG_private | 1 << PG_private_2)
diff -puN mm/huge_memory.c~mm-check-__pg_hwpoison-separately-from-page_flags_check_at_ mm/huge_memory.c
--- a/mm/huge_memory.c~mm-check-__pg_hwpoison-separately-from-page_flags_check_at_
+++ a/mm/huge_memory.c
@@ -1676,12 +1676,7 @@ static void __split_huge_page_refcount(s
 		/* after clearing PageTail the gup refcount can be released */
 		smp_mb__after_atomic();
 
-		/*
-		 * retain hwpoison flag of the poisoned tail page:
-		 *   fix for the unsuitable process killed on Guest Machine(KVM)
-		 *   by the memory-failure.
-		 */
-		page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP | __PG_HWPOISON;
+		page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
 		page_tail->flags |= (page->flags &
 				     ((1L << PG_referenced) |
 				      (1L << PG_swapbacked) |
diff -puN mm/migrate.c~mm-check-__pg_hwpoison-separately-from-page_flags_check_at_ mm/migrate.c
--- a/mm/migrate.c~mm-check-__pg_hwpoison-separately-from-page_flags_check_at_
+++ a/mm/migrate.c
@@ -950,7 +950,10 @@ out:
 		list_del(&page->lru);
 		dec_zone_page_state(page, NR_ISOLATED_ANON +
 				page_is_file_cache(page));
-		if (reason != MR_MEMORY_FAILURE)
+		/* Soft-offlined page shouldn't go through lru cache list */
+		if (reason == MR_MEMORY_FAILURE)
+			put_page(page);
+		else
 			putback_lru_page(page);
 	}
 
diff -puN mm/page_alloc.c~mm-check-__pg_hwpoison-separately-from-page_flags_check_at_ mm/page_alloc.c
--- a/mm/page_alloc.c~mm-check-__pg_hwpoison-separately-from-page_flags_check_at_
+++ a/mm/page_alloc.c
@@ -1296,6 +1296,10 @@ static inline int check_new_page(struct
 		bad_reason = "non-NULL mapping";
 	if (unlikely(atomic_read(&page->_count) != 0))
 		bad_reason = "nonzero _count";
+	if (unlikely(page->flags & __PG_HWPOISON)) {
+		bad_reason = "HWPoisoned (hardware-corrupted)";
+		bad_flags = __PG_HWPOISON;
+	}
 	if (unlikely(page->flags & PAGE_FLAGS_CHECK_AT_PREP)) {
 		bad_reason = "PAGE_FLAGS_CHECK_AT_PREP flag set";
 		bad_flags = PAGE_FLAGS_CHECK_AT_PREP;
_

Patches currently in -mm which might be from n-horiguchi@xxxxxxxxxxxxx are

mm-memory-failure-unlock_page-before-put_page.patch
mm-memory-failure-fix-race-in-counting-num_poisoned_pages.patch
mm-memory-failure-give-up-error-handling-for-non-tail-refcounted-thp.patch
mm-check-__pg_hwpoison-separately-from-page_flags_check_at_.patch
mm-memory-failure-set-pagehwpoison-before-migrate_pages.patch
hugetlb-make-the-function-vma_shareable-bool.patch
pagemap-check-permissions-and-capabilities-at-open-time.patch
pagemap-switch-to-the-new-format-and-do-some-cleanup.patch
pagemap-rework-hugetlb-and-thp-report.patch
pagemap-hide-physical-addresses-from-non-privileged-users.patch
pagemap-add-mmap-exclusive-bit-for-marking-pages-mapped-only-here.patch
pagemap-update-documentation.patch
pagemap-update-documentation-fix.patch
mm-page_isolation-remove-bogus-tests-for-isolated-pages.patch
mm-rename-and-move-get-set_freepage_migratetype.patch
mm-hugetlb-add-cache-of-descriptors-to-resv_map-for-region_add.patch
mm-hugetlb-add-region_del-to-delete-a-specific-range-of-entries.patch
mm-hugetlb-expose-hugetlb-fault-mutex-for-use-by-fallocate.patch
hugetlbfs-hugetlb_vmtruncate_list-needs-to-take-a-range-to-delete.patch
hugetlbfs-truncate_hugepages-takes-a-range-of-pages.patch
mm-hugetlb-vma_has_reserves-needs-to-handle-fallocate-hole-punch.patch
mm-hugetlb-alloc_huge_page-handle-areas-hole-punched-by-fallocate.patch
hugetlbfs-new-huge_add_to_page_cache-helper-routine.patch
hugetlbfs-add-hugetlbfs_fallocate.patch
hugetlbfs-add-hugetlbfs_fallocate-fix.patch
mm-madvise-allow-remove-operation-for-hugetlbfs.patch
mempolicy-get-rid-of-duplicated-check-for-vmavm_pfnmap-in-queue_pages_range.patch
mm-page_isolation-make-set-unset_migratetype_isolate-file-local.patch
mm-compaction-more-robust-check-for-scanners-meeting.patch
mm-compaction-simplify-handling-restart-position-in-free-pages-scanner.patch
mm-compaction-encapsulate-resetting-cached-scanner-positions.patch
mm-compaction-skip-compound-pages-by-order-in-free-scanner.patch
page-flags-trivial-cleanup-for-pagetrans-helpers.patch
page-flags-introduce-page-flags-policies-wrt-compound-pages.patch
page-flags-define-pg_locked-behavior-on-compound-pages.patch
page-flags-define-behavior-of-fs-io-related-flags-on-compound-pages.patch
page-flags-define-behavior-of-lru-related-flags-on-compound-pages.patch
page-flags-define-behavior-slb-related-flags-on-compound-pages.patch
page-flags-define-behavior-of-xen-related-flags-on-compound-pages.patch
page-flags-define-pg_reserved-behavior-on-compound-pages.patch
page-flags-define-pg_swapbacked-behavior-on-compound-pages.patch
page-flags-define-pg_swapcache-behavior-on-compound-pages.patch
page-flags-define-pg_mlocked-behavior-on-compound-pages.patch
page-flags-define-pg_uncached-behavior-on-compound-pages.patch
page-flags-define-pg_uptodate-behavior-on-compound-pages.patch
page-flags-look-on-head-page-if-the-flag-is-encoded-in-page-mapping.patch
mm-sanitize-page-mapping-for-tail-pages.patch
do_shared_fault-check-that-mmap_sem-is-held.patch