The patch titled mm: optimize page_lock_anon_vma() fast-path has been added to the -mm tree. Its filename is mm-optimize-page_lock_anon_vma-fast-path.patch Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/SubmitChecklist when testing your code *** See http://userweb.kernel.org/~akpm/stuff/added-to-mm.txt to find out what to do about this The current -mm tree may be found at http://userweb.kernel.org/~akpm/mmotm/ ------------------------------------------------------ Subject: mm: optimize page_lock_anon_vma() fast-path From: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> Optimize the page_lock_anon_vma() fast path to be one atomic op, instead of two. Signed-off-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> Cc: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx> Cc: David Miller <davem@xxxxxxxxxxxxx> Cc: Martin Schwidefsky <schwidefsky@xxxxxxxxxx> Cc: Russell King <rmk@xxxxxxxxxxxxxxxx> Cc: Paul Mundt <lethal@xxxxxxxxxxxx> Cc: Jeff Dike <jdike@xxxxxxxxxxx> Cc: Richard Weinberger <richard@xxxxxx> Cc: Tony Luck <tony.luck@xxxxxxxxx> Cc: Hugh Dickins <hughd@xxxxxxxxxx> Cc: Mel Gorman <mel@xxxxxxxxx> Cc: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> Cc: Nick Piggin <npiggin@xxxxxxxxx> Cc: Namhyung Kim <namhyung@xxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- mm/rmap.c | 86 +++++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 82 insertions(+), 4 deletions(-) diff -puN mm/rmap.c~mm-optimize-page_lock_anon_vma-fast-path mm/rmap.c --- a/mm/rmap.c~mm-optimize-page_lock_anon_vma-fast-path +++ a/mm/rmap.c @@ -86,6 +86,29 @@ static inline struct anon_vma *anon_vma_ static inline void anon_vma_free(struct anon_vma *anon_vma) { VM_BUG_ON(atomic_read(&anon_vma->refcount)); + + /* + * Synchronize against page_lock_anon_vma() such that + * we can safely hold the lock without the anon_vma getting + * freed. + * + * Relies on the full mb implied by the atomic_dec_and_test() from + * put_anon_vma() against the acquire barrier implied by + * mutex_trylock() from page_lock_anon_vma(). This orders: + * + * page_lock_anon_vma() VS put_anon_vma() + * mutex_trylock() atomic_dec_and_test() + * LOCK MB + * atomic_read() mutex_is_locked() + * + * LOCK should suffice since the actual taking of the lock must + * happen _before_ what follows. + */ + if (mutex_is_locked(&anon_vma->root->mutex)) { + anon_vma_lock(anon_vma); + anon_vma_unlock(anon_vma); + } + kmem_cache_free(anon_vma_cachep, anon_vma); } @@ -372,20 +395,75 @@ out: return anon_vma; } +/* + * Similar to page_get_anon_vma() except it locks the anon_vma. + * + * Its a little more complex as it tries to keep the fast path to a single + * atomic op -- the trylock. If we fail the trylock, we fall back to getting a + * reference like with page_get_anon_vma() and then block on the mutex. + */ struct anon_vma *page_lock_anon_vma(struct page *page) { - struct anon_vma *anon_vma = page_get_anon_vma(page); + struct anon_vma *anon_vma = NULL; + unsigned long anon_mapping; - if (anon_vma) - anon_vma_lock(anon_vma); + rcu_read_lock(); + anon_mapping = (unsigned long) ACCESS_ONCE(page->mapping); + if ((anon_mapping & PAGE_MAPPING_FLAGS) != PAGE_MAPPING_ANON) + goto out; + if (!page_mapped(page)) + goto out; + + anon_vma = (struct anon_vma *) (anon_mapping - PAGE_MAPPING_ANON); + if (mutex_trylock(&anon_vma->root->mutex)) { + /* + * If we observe a !0 refcount, then holding the lock ensures + * the anon_vma will not go away, see __put_anon_vma(). + */ + if (!atomic_read(&anon_vma->refcount)) { + anon_vma_unlock(anon_vma); + anon_vma = NULL; + } + goto out; + } + + /* trylock failed, we got to sleep */ + if (!atomic_inc_not_zero(&anon_vma->refcount)) { + anon_vma = NULL; + goto out; + } + if (!page_mapped(page)) { + put_anon_vma(anon_vma); + anon_vma = NULL; + goto out; + } + + /* we pinned the anon_vma, its safe to sleep */ + rcu_read_unlock(); + anon_vma_lock(anon_vma); + + if (atomic_dec_and_test(&anon_vma->refcount)) { + /* + * Oops, we held the last refcount, release the lock + * and bail -- can't simply use put_anon_vma() because + * we'll deadlock on the anon_vma_lock() recursion. + */ + anon_vma_unlock(anon_vma); + __put_anon_vma(anon_vma); + anon_vma = NULL; + } + + return anon_vma; + +out: + rcu_read_unlock(); return anon_vma; } void page_unlock_anon_vma(struct anon_vma *anon_vma) { anon_vma_unlock(anon_vma); - put_anon_vma(anon_vma); } /* _ Patches currently in -mm which might be from a.p.zijlstra@xxxxxxxxx are linux-next.patch net-convert-%p-usage-to-%pk.patch mm-mmu_gather-rework.patch mm-mmu_gather-rework-fix.patch powerpc-mmu_gather-rework.patch sparc-mmu_gather-rework.patch s390-mmu_gather-rework.patch arm-mmu_gather-rework.patch sh-mmu_gather-rework.patch ia64-mmu_gather-rework.patch um-mmu_gather-rework.patch mm-now-that-all-old-mmu_gather-code-is-gone-remove-the-storage.patch mm-powerpc-move-the-rcu-page-table-freeing-into-generic-code.patch mm-extended-batches-for-generic-mmu_gather.patch lockdep-mutex-provide-mutex_lock_nest_lock.patch mm-remove-i_mmap_lock-lockbreak.patch mm-convert-i_mmap_lock-to-a-mutex.patch mm-revert-page_lock_anon_vma-lock-annotation.patch mm-improve-page_lock_anon_vma-comment.patch mm-use-refcounts-for-page_lock_anon_vma.patch mm-convert-anon_vma-lock-to-a-mutex.patch mm-optimize-page_lock_anon_vma-fast-path.patch -- To unsubscribe from this list: send the line "unsubscribe mm-commits" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html