Subject: + mm-hwpoison-fix-race-against-poison-thp.patch added to -mm tree
To: liwanp@xxxxxxxxxxxxxxxxxx,andi@xxxxxxxxxxxxxx,n-horiguchi@xxxxxxxxxxxxx,tony.luck@xxxxxxxxx
From: akpm@xxxxxxxxxxxxxxxxxxxx
Date: Mon, 26 Aug 2013 14:42:39 -0700


The patch titled
     Subject: mm/hwpoison: fix race against poison thp
has been added to the -mm tree.  Its filename is
     mm-hwpoison-fix-race-against-poison-thp.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-hwpoison-fix-race-against-poison-thp.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-hwpoison-fix-race-against-poison-thp.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Wanpeng Li <liwanp@xxxxxxxxxxxxxxxxxx>
Subject: mm/hwpoison: fix race against poison thp

There is a race between poisoning and unpoisoning a page.
memory_failure() sets PageHWPoison and increases num_poisoned_pages
without holding the page lock, and for a thp only one page is accounted
in num_poisoned_pages.  However, unpoison can occur before
memory_failure() takes the page lock and splits the transparent
hugepage.  In that case unpoison decreases num_poisoned_pages by
1 << compound_order, since memory_failure() has not yet split the
transparent hugepage with the page lock held.  That means we account one
page for hwpoison and 1 << compound_order pages for unpoison.

This patch fixes it by inserting a PageTransHuge check before
TestClearPageHWPoison: unpoison then fails without clearing PageHWPoison
or decreasing num_poisoned_pages.

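The accounting mismatch described above can be replayed in user space. The
following is only a sketch under stated assumptions: it is a single-threaded
model that runs the racy interleaving in a fixed order, and every name in it
(struct model_page, num_poisoned, the model_* functions) is hypothetical and
not a kernel API. It only illustrates why the counter drifts and why an early
PageTransHuge-style check avoids the drift.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model_page {
	bool hwpoison;       /* stands in for PageHWPoison */
	bool trans_huge;     /* stands in for PageTransHuge (thp not yet split) */
	unsigned int order;  /* stands in for compound_order() */
};

static long num_poisoned;    /* stands in for num_poisoned_pages */

/* First half of memory_failure(): flag and account, page lock not held yet. */
static void model_poison_begin(struct model_page *p)
{
	p->hwpoison = true;
	num_poisoned += 1;   /* a thp is not a hugetlbfs page: one page counted */
}

/*
 * unpoison_memory() racing in before the thp has been split.
 * Returns true if the poison flag was cleared.
 */
static bool model_unpoison(struct model_page *p, bool with_fix)
{
	long nr_pages;

	if (with_fix && p->trans_huge)
		return false;        /* yield to memory_failure(), unpoison fails */

	nr_pages = 1L << p->order;
	if (!p->hwpoison)
		return false;
	p->hwpoison = false;         /* models TestClearPageHWPoison */
	num_poisoned -= nr_pages;
	return true;
}

/* Second half of memory_failure(): lock taken, flag rechecked, thp split. */
static void model_poison_finish(struct model_page *p)
{
	if (!p->hwpoison)
		return;              /* unpoison won the race, nothing to do */
	p->trans_huge = false;       /* models split_huge_page() */
	p->order = 0;
}

/* Replay the interleaving A, B, A and return the final counter value. */
static long model_race(unsigned int order, bool with_fix)
{
	struct model_page p = { false, true, order };

	assert(order < 31);          /* keep the shift well defined */
	num_poisoned = 0;
	model_poison_begin(&p);
	model_unpoison(&p, with_fix);
	model_poison_finish(&p);
	return num_poisoned;
}
```

For an order-9 thp, model_race(9, false) returns -511 (one page added, 512
subtracted), while model_race(9, true) returns 1 because the unpoison attempt
backs off and the page stays accounted as poisoned.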
	A					B
memory_failure
TestSetPageHWPoison(p);
if (PageHuge(p))
	nr_pages = 1 << compound_order(hpage);
else
	nr_pages = 1;
atomic_long_add(nr_pages, &num_poisoned_pages);
					unpoison_memory
					nr_pages = 1 << compound_trans_order(page);
					if (TestClearPageHWPoison(p))
						atomic_long_sub(nr_pages, &num_poisoned_pages);
lock page
if (!PageHWPoison(p))
	unlock page and return
hwpoison_user_mappings
if (PageTransHuge(hpage))
	split_huge_page(hpage);

Signed-off-by: Wanpeng Li <liwanp@xxxxxxxxxxxxxxxxxx>
Suggested-by: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
Cc: Andi Kleen <andi@xxxxxxxxxxxxxx>
Cc: Tony Luck <tony.luck@xxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/memory-failure.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff -puN mm/memory-failure.c~mm-hwpoison-fix-race-against-poison-thp mm/memory-failure.c
--- a/mm/memory-failure.c~mm-hwpoison-fix-race-against-poison-thp
+++ a/mm/memory-failure.c
@@ -1339,6 +1339,16 @@ int unpoison_memory(unsigned long pfn)
 		return 0;
 	}
 
+	/*
+	 * unpoison_memory() can encounter thp only when the thp is being
+	 * worked by memory_failure() and the page lock is not held yet.
+	 * In such case, we yield to memory_failure() and make unpoison fail.
+	 */
+	if (PageTransHuge(page)) {
+		pr_info("MCE: Memory failure is now running on %#lx\n", pfn);
+		return 0;
+	}
+
 	nr_pages = 1 << compound_order(page);
 
 	if (!get_page_unless_zero(page)) {
_

Patches currently in -mm which might be from liwanp@xxxxxxxxxxxxxxxxxx are

mm-zswapc-get-swapper-address_space-by-using-macro.patch
mm-fix-potential-null-pointer-dereference.patch
mm-hugetlb-move-up-the-code-which-check-availability-of-free-huge-page.patch
mm-hugetlb-trivial-commenting-fix.patch
mm-hugetlb-clean-up-alloc_huge_page.patch
mm-hugetlb-fix-and-clean-up-node-iteration-code-to-alloc-or-free.patch
mm-hugetlb-remove-redundant-list_empty-check-in-gather_surplus_pages.patch
mm-hugetlb-do-not-use-a-page-in-page-cache-for-cow-optimization.patch
mm-hugetlb-add-vm_noreserve-check-in-vma_has_reserves.patch
mm-hugetlb-remove-decrement_hugepage_resv_vma.patch
mm-hugetlb-decrement-reserve-count-if-vm_noreserve-alloc-page-cache.patch
mm-vmalloc-remove-useless-variable-in-vmap_block.patch
mm-vmalloc-use-well-defined-find_last_bit-func.patch
mm-hotplug-verify-hotplug-memory-range.patch
mm-hotplug-verify-hotplug-memory-range-fix.patch
mm-hotplug-remove-stop_machine-from-try_offline_node.patch
mm-hugetlb-protect-reserved-pages-when-soft-offlining-a-hugepage.patch
mm-hugetlb-change-variable-name-reservations-to-resv.patch
mm-hugetlb-fix-subpool-accounting-handling.patch
mm-hugetlb-remove-useless-check-about-mapping-type.patch
mm-hugetlb-grab-a-page_table_lock-after-page_cache_release.patch
mm-hugetlb-return-a-reserved-page-to-a-reserved-pool-if-failed.patch
mm-migrate-make-core-migration-code-aware-of-hugepage.patch
mm-soft-offline-use-migrate_pages-instead-of-migrate_huge_page.patch
migrate-add-hugepage-migration-code-to-migrate_pages.patch
mm-migrate-add-hugepage-migration-code-to-move_pages.patch
mm-mbind-add-hugepage-migration-code-to-mbind.patch
mm-migrate-remove-vm_hugetlb-from-vma-flag-check-in-vma_migratable.patch
mm-memory-hotplug-enable-memory-hotplug-to-handle-hugepage.patch
mm-migrate-check-movability-of-hugepage-in-unmap_and_move_huge_page.patch
mm-prepare-to-remove-proc-sys-vm-hugepages_treat_as_movable.patch
mm-prepare-to-remove-proc-sys-vm-hugepages_treat_as_movable-v2.patch
mm-mempolicy-rename-check_range-to-queue_pages_range.patch
mm-sparse-introduce-alloc_usemap_and_memmap.patch
mm-writeback-make-writeback_inodes_wb-static.patch
mm-vmalloc-use-wrapper-function-get_vm_area_size-to-caculate-size-of-vm-area.patch
hwpoison-always-unset-migrate_isolate-before-returning-from-soft_offline_page.patch
mm-hwpoison-fix-loss-of-pg_dirty-for-errors-on-mlocked-pages.patch
mm-hwpoison-dont-need-to-hold-compound-lock-for-hugetlbfs-page.patch
mm-hwpoison-fix-race-against-poison-thp.patch
mm-hwpoison-replace-atomic_long_sub-with-atomic_long_dec.patch
mm-hwpoison-dont-set-migration-type-twice-to-avoid-holding-heavily-contend-zone-lock.patch
mm-hwpoison-drop-forward-reference-declarations-__soft_offline_page.patch
mm-hwpoison-add-to-madvise_hwpoison.patch
mm-hwpoison-fix-memory-failure-still-holding-reference-count-after-unpoisoning-empty-zero-page.patch
linux-next.patch
mm-drop-actor-argument-of-do_generic_file_read.patch

--
To unsubscribe from this list: send the line "unsubscribe mm-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html