Subject: + migrate-add-migrate_entry_wait_huge.patch added to -mm tree To: n-horiguchi@xxxxxxxxxxxxx,andi@xxxxxxxxxxxxxx,kosaki.motohiro@xxxxxxxxxxxxxx,liwanp@xxxxxxxxxxxxxxxxxx,mgorman@xxxxxxx,mhocko@xxxxxxx,riel@xxxxxxxxxx,stable@xxxxxxxxxxxxxxx From: akpm@xxxxxxxxxxxxxxxxxxxx Date: Fri, 31 May 2013 12:30:20 -0700 The patch titled Subject: migrate: add migrate_entry_wait_huge() has been added to the -mm tree. Its filename is migrate-add-migrate_entry_wait_huge.patch Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/SubmitChecklist when testing your code *** The -mm tree is included into linux-next and is updated there every 3-4 working days ------------------------------------------------------ From: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx> Subject: migrate: add migrate_entry_wait_huge() When we have a page fault for the address which is backed by a hugepage under migration, the kernel can't wait correctly and do busy looping on hugepage fault until the migration finishes. This is because pte_offset_map_lock() can't get a correct migration entry or a correct page table lock for hugepage. This patch introduces migration_entry_wait_huge() to solve this. Note that the caller, hugetlb_fault(), gets the pointer to the "leaf" entry with huge_pte_offset() inside which all the arch-dependency of the page table structure are. So migration_entry_wait_huge() and __migration_entry_wait() are free from arch-dependency. Signed-off-by: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx> Reviewed-by: Rik van Riel <riel@xxxxxxxxxx> Reviewed-by: Wanpeng Li <liwanp@xxxxxxxxxxxxxxxxxx> Cc: Mel Gorman <mgorman@xxxxxxx> Cc: Andi Kleen <andi@xxxxxxxxxxxxxx> Cc: Michal Hocko <mhocko@xxxxxxx> Cc: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> Cc: <stable@xxxxxxxxxxxxxxx> [2.6.35+] Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- include/linux/swapops.h | 3 +++ mm/hugetlb.c | 2 +- mm/migrate.c | 23 ++++++++++++++++++----- 3 files changed, 22 insertions(+), 6 deletions(-) diff -puN include/linux/swapops.h~migrate-add-migrate_entry_wait_huge include/linux/swapops.h --- a/include/linux/swapops.h~migrate-add-migrate_entry_wait_huge +++ a/include/linux/swapops.h @@ -137,6 +137,7 @@ static inline void make_migration_entry_ extern void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd, unsigned long address); +extern void migration_entry_wait_huge(struct mm_struct *mm, pte_t *pte); #else #define make_migration_entry(page, write) swp_entry(0, 0) @@ -148,6 +149,8 @@ static inline int is_migration_entry(swp static inline void make_migration_entry_read(swp_entry_t *entryp) { } static inline void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd, unsigned long address) { } +static inline void migration_entry_wait_huge(struct mm_struct *mm, + pte_t *pte) { } static inline int is_write_migration_entry(swp_entry_t entry) { return 0; diff -puN mm/hugetlb.c~migrate-add-migrate_entry_wait_huge mm/hugetlb.c --- a/mm/hugetlb.c~migrate-add-migrate_entry_wait_huge +++ a/mm/hugetlb.c @@ -2851,7 +2851,7 @@ int hugetlb_fault(struct mm_struct *mm, if (ptep) { entry = huge_ptep_get(ptep); if (unlikely(is_hugetlb_entry_migration(entry))) { - migration_entry_wait(mm, (pmd_t *)ptep, address); + migration_entry_wait_huge(mm, ptep); return 0; } else if (unlikely(is_hugetlb_entry_hwpoisoned(entry))) return VM_FAULT_HWPOISON_LARGE | diff -puN mm/migrate.c~migrate-add-migrate_entry_wait_huge mm/migrate.c --- a/mm/migrate.c~migrate-add-migrate_entry_wait_huge +++ a/mm/migrate.c @@ -200,15 +200,14 @@ static void remove_migration_ptes(struct * get to the page and wait until migration is finished. * When we return from this function the fault will be retried. */ -void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd, - unsigned long address) +static void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep, + spinlock_t *ptl) { - pte_t *ptep, pte; - spinlock_t *ptl; + pte_t pte; swp_entry_t entry; struct page *page; - ptep = pte_offset_map_lock(mm, pmd, address, &ptl); + spin_lock(ptl); pte = *ptep; if (!is_swap_pte(pte)) goto out; @@ -236,6 +235,20 @@ out: pte_unmap_unlock(ptep, ptl); } +void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd, + unsigned long address) +{ + spinlock_t *ptl = pte_lockptr(mm, pmd); + pte_t *ptep = pte_offset_map(pmd, address); + __migration_entry_wait(mm, ptep, ptl); +} + +void migration_entry_wait_huge(struct mm_struct *mm, pte_t *pte) +{ + spinlock_t *ptl = huge_pte_lockptr(mm, pte); + __migration_entry_wait(mm, pte, ptl); +} + #ifdef CONFIG_BLOCK /* Returns true if all buffers are successfully locked */ static bool buffer_migrate_lock_buffers(struct buffer_head *head, _ Patches currently in -mm which might be from n-horiguchi@xxxxxxxxxxxxx are mm-memory-failurec-fix-memory-leak-in-successful-soft-offlining.patch hugetlbfs-support-split-page-table-lock.patch migrate-add-migrate_entry_wait_huge.patch -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html