On Fri, 2011-06-17 at 20:18 +0200, Peter Zijlstra wrote: > On Fri, 2011-06-17 at 11:01 -0700, Linus Torvalds wrote: > > > So I do think that "page_referenced_anon()" should do a trylock, and > > return "referenced" if the trylock fails. Comments? > > The only problem I can immediately see with that is when a single > process' anon memory is dominant, then such an allocation will never > succeed in freeing these pages because the one lock will pin pretty much > all anon. Then again, there's always a few file pages to drop. > > That said, its rather unlikely, and iirc people were working on removing > direct reclaim, or at least rely less on it. > > something like so I guess, completely untested etc.. Also, there's a page_lock_anon_vma() user in split_huge_page(), which is used in mm/swap_state.c:add_to_swap(), which is also in the reclaim path, not quite sure what to do there. Aside from the THP thing there's a user in memory-failure.c, which looks to be broken as it is because its calling things under tasklist_lock which isn't preemptible, but it looks like we can simply swap the tasklist_lock vs page_lock_anon_vma. --- kernel/Makefile | 1 + mm/rmap.c | 39 +++++++++++++++++++++++++++++++++++++-- 2 files changed, 38 insertions(+), 2 deletions(-) diff --git a/kernel/Makefile b/kernel/Makefile index 2d64cfc..f6d05de 100644 --- a/kernel/Makefile +++ b/kernel/Makefile @@ -101,6 +101,7 @@ obj-$(CONFIG_RING_BUFFER) += trace/ obj-$(CONFIG_TRACEPOINTS) += trace/ obj-$(CONFIG_SMP) += sched_cpupri.o obj-$(CONFIG_IRQ_WORK) += irq_work.o +obj-m += test.o obj-$(CONFIG_PERF_EVENTS) += events/ diff --git a/mm/rmap.c b/mm/rmap.c index 0eb463e..40cd399 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -400,6 +400,41 @@ out: return anon_vma; } +struct anon_vma *page_trylock_anon_vma(struct page *page) +{ + struct anon_vma *anon_vma = NULL; + struct anon_vma *root_anon_vma; + unsigned long anon_mapping; + + rcu_read_lock(); + anon_mapping = (unsigned long) ACCESS_ONCE(page->mapping); + if ((anon_mapping & PAGE_MAPPING_FLAGS) != PAGE_MAPPING_ANON) + goto out; + if (!page_mapped(page)) + goto out; + + anon_vma = (struct anon_vma *) (anon_mapping - PAGE_MAPPING_ANON); + root_anon_vma = ACCESS_ONCE(anon_vma->root); + if (!mutex_trylock(&root_anon_vma->mutex)) { + anon_vma = NULL; + goto out; + } + + /* + * If the page is still mapped, then this anon_vma is still + * its anon_vma, and holding the mutex ensures that it will + * not go away, see anon_vma_free(). + */ + if (!page_mapped(page)) { + mutex_unlock(&root_anon_vma->mutex); + anon_vma = NULL; + } + +out: + rcu_read_unlock(); + return anon_vma; +} + /* * Similar to page_get_anon_vma() except it locks the anon_vma. * @@ -694,7 +729,7 @@ static int page_referenced_anon(struct page *page, struct anon_vma_chain *avc; int referenced = 0; - anon_vma = page_lock_anon_vma(page); + anon_vma = page_trylock_anon_vma(page); if (!anon_vma) return referenced; @@ -1396,7 +1431,7 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags) struct anon_vma_chain *avc; int ret = SWAP_AGAIN; - anon_vma = page_lock_anon_vma(page); + anon_vma = page_trylock_anon_vma(page); if (!anon_vma) return ret; -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>