The patch titled
     Subject: mm/hugetlb: fix suboptimal migration/hwpoisoned entry check
has been added to the -mm tree.  Its filename is
     mm-hugetlb-fix-suboptimal-migration-hwpoisoned-entry-check.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-hugetlb-fix-suboptimal-migration-hwpoisoned-entry-check.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-hugetlb-fix-suboptimal-migration-hwpoisoned-entry-check.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
Subject: mm/hugetlb: fix suboptimal migration/hwpoisoned entry check

Currently hugetlb_fault() first checks whether the pte of the faulted
address is a migration or hwpoisoned entry, which means that we call
huge_ptep_get() twice in a single hugetlb_fault().  This is suboptimal.

The reason for the up-front check is that without it, huge_pte_alloc()
could trigger a BUG_ON(), because pmd_huge() returned false for a
non-present hugetlb entry.  With a previous patch in this series,
pmd_huge() now returns true for non-present entries, so we no longer
need this dirty workaround.  Let's move the check to the proper place.
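In outline, the reworked fault path looks like the sketch below.  This is
a simplified condensation of the hunks in the diff that follows, not the
literal kernel code; all identifiers are taken from that diff, and the
elided parts ("...") are unchanged:

	ptep = huge_pte_alloc(mm, address, huge_page_size(h));
	if (!ptep)
		return VM_FAULT_OOM;
	...
	entry = huge_ptep_get(ptep);	/* now the only huge_ptep_get() call */
	if (!pte_present(entry)) {
		/* non-present: either under migration or hwpoisoned */
		if (is_hugetlb_entry_migration(entry))
			need_wait_migration = 1;
		else if (is_hugetlb_entry_hwpoisoned(entry))
			ret = VM_FAULT_HWPOISON_LARGE |
				VM_FAULT_SET_HINDEX(hstate_index(h));
		goto out_mutex;
	}
	...
out_mutex:
	mutex_unlock(&htlb_fault_mutex_table[hash]);
	/* wait for migration only after dropping the fault mutex */
	if (need_wait_migration)
		migration_entry_wait_huge(vma, mm, ptep);
	...

Deferring migration_entry_wait_huge() until after mutex_unlock() means we
do not sleep on a migration while holding the fault mutex; the retried
fault then finds a present pte.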
Signed-off-by: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
Cc: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: James Hogan <james.hogan@xxxxxxxxxx>
Cc: David Rientjes <rientjes@xxxxxxxxxx>
Cc: Mel Gorman <mel@xxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxx>
Cc: Rik van Riel <riel@xxxxxxxxxx>
Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Cc: Luiz Capitulino <lcapitulino@xxxxxxxxxx>
Cc: Nishanth Aravamudan <nacc@xxxxxxxxxxxxxxxxxx>
Cc: Lee Schermerhorn <lee.schermerhorn@xxxxxx>
Cc: Steve Capper <steve.capper@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/hugetlb.c |   26 +++++++++++---------------
 1 file changed, 11 insertions(+), 15 deletions(-)

diff -puN mm/hugetlb.c~mm-hugetlb-fix-suboptimal-migration-hwpoisoned-entry-check mm/hugetlb.c
--- a/mm/hugetlb.c~mm-hugetlb-fix-suboptimal-migration-hwpoisoned-entry-check
+++ a/mm/hugetlb.c
@@ -3136,20 +3136,10 @@ int hugetlb_fault(struct mm_struct *mm,
 	struct hstate *h = hstate_vma(vma);
 	struct address_space *mapping;
 	int need_wait_lock = 0;
+	int need_wait_migration = 0;
 
 	address &= huge_page_mask(h);
 
-	ptep = huge_pte_offset(mm, address);
-	if (ptep) {
-		entry = huge_ptep_get(ptep);
-		if (unlikely(is_hugetlb_entry_migration(entry))) {
-			migration_entry_wait_huge(vma, mm, ptep);
-			return 0;
-		} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
-			return VM_FAULT_HWPOISON_LARGE |
-				VM_FAULT_SET_HINDEX(hstate_index(h));
-	}
-
 	ptep = huge_pte_alloc(mm, address, huge_page_size(h));
 	if (!ptep)
 		return VM_FAULT_OOM;
@@ -3176,12 +3166,16 @@ int hugetlb_fault(struct mm_struct *mm,
 	/*
 	 * entry could be a migration/hwpoison entry at this point, so this
 	 * check prevents the kernel from going below assuming that we have
-	 * a active hugepage in pagecache. This goto expects the 2nd page fault,
-	 * and is_hugetlb_entry_(migration|hwpoisoned) check will properly
-	 * handle it.
+	 * a active hugepage in pagecache.
 	 */
-	if (!pte_present(entry))
+	if (!pte_present(entry)) {
+		if (is_hugetlb_entry_migration(entry))
+			need_wait_migration = 1;
+		else if (is_hugetlb_entry_hwpoisoned(entry))
+			ret = VM_FAULT_HWPOISON_LARGE |
+				VM_FAULT_SET_HINDEX(hstate_index(h));
 		goto out_mutex;
+	}
 
 	/*
 	 * If we are going to COW the mapping later, we examine the pending
@@ -3247,6 +3241,8 @@ out_ptl:
 	}
 out_mutex:
 	mutex_unlock(&htlb_fault_mutex_table[hash]);
+	if (need_wait_migration)
+		migration_entry_wait_huge(vma, mm, ptep);
 	/*
 	 * Generally it's safe to hold refcount during waiting page lock. But
 	 * here we just wait to defer the next page fault to avoid busy loop and
_

Patches currently in -mm which might be from n-horiguchi@xxxxxxxxxxxxx are

mm-add-kpf_zero_page-flag-for-proc-kpageflags.patch
mm-hugetlb-reduce-arch-dependent-code-around-follow_huge_.patch
mm-hugetlb-pmd_huge-returns-true-for-non-present-hugepage.patch
mm-hugetlb-take-page-table-lock-in-follow_huge_pmd.patch
mm-hugetlb-fix-getting-refcount-0-page-in-hugetlb_fault.patch
mm-hugetlb-add-migration-hwpoisoned-entry-check-in-hugetlb_change_protection.patch
mm-hugetlb-add-migration-entry-check-in-__unmap_hugepage_range.patch
mm-hugetlb-fix-suboptimal-migration-hwpoisoned-entry-check.patch
mm-hugetlb-cleanup-and-rename-is_hugetlb_entry_migrationhwpoisoned.patch
do_shared_fault-check-that-mmap_sem-is-held.patch

--
To unsubscribe from this list: send the line "unsubscribe mm-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html