The patch titled

     page invalidation cleanup

has been added to the -mm tree.  Its filename is

     page-invalidation-cleanup.patch

See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
out what to do about this

------------------------------------------------------
Subject: page invalidation cleanup
From: Nick Piggin <nickpiggin@xxxxxxxxxxxx>

Clean up the invalidate code, and use a common function to safely remove
the page from pagecache.

Signed-off-by: Nick Piggin <npiggin@xxxxxxx>
Cc: Hugh Dickins <hugh@xxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxx>
---

 mm/truncate.c |   25 ++++++++-----------------
 mm/vmscan.c   |   27 +++++++++++++++++++++++----
 2 files changed, 31 insertions(+), 21 deletions(-)

diff -puN mm/truncate.c~page-invalidation-cleanup mm/truncate.c
--- a/mm/truncate.c~page-invalidation-cleanup
+++ a/mm/truncate.c
@@ -9,6 +9,7 @@
 
 #include <linux/kernel.h>
 #include <linux/mm.h>
+#include <linux/swap.h>
 #include <linux/module.h>
 #include <linux/pagemap.h>
 #include <linux/pagevec.h>
@@ -78,36 +79,26 @@ truncate_complete_page(struct address_sp
 /*
  * This is for invalidate_inode_pages(). That function can be called at
  * any time, and is not supposed to throw away dirty pages. But pages can
- * be marked dirty at any time too. So we re-check the dirtiness inside
- * ->tree_lock. That provides exclusion against the __set_page_dirty
- * functions.
+ * be marked dirty at any time too, so use remove_mapping which safely
+ * discards clean, unused pages.
  *
  * Returns non-zero if the page was successfully invalidated.
  */
 static int
 invalidate_complete_page(struct address_space *mapping, struct page *page)
 {
+	int ret;
+
 	if (page->mapping != mapping)
 		return 0;
 
 	if (PagePrivate(page) && !try_to_release_page(page, 0))
 		return 0;
 
-	write_lock_irq(&mapping->tree_lock);
-	if (PageDirty(page))
-		goto failed;
-	if (page_count(page) != 2)	/* caller's ref + pagecache ref */
-		goto failed;
-
-	BUG_ON(PagePrivate(page));
-	__remove_from_page_cache(page);
-	write_unlock_irq(&mapping->tree_lock);
+	ret = remove_mapping(mapping, page);
 	ClearPageUptodate(page);
-	page_cache_release(page);	/* pagecache ref */
-	return 1;
-failed:
-	write_unlock_irq(&mapping->tree_lock);
-	return 0;
+
+	return ret;
 }
 
 /**
diff -puN mm/vmscan.c~page-invalidation-cleanup mm/vmscan.c
--- a/mm/vmscan.c~page-invalidation-cleanup
+++ a/mm/vmscan.c
@@ -384,11 +384,30 @@ int remove_mapping(struct address_space 
 	BUG_ON(mapping != page_mapping(page));
 
 	write_lock_irq(&mapping->tree_lock);
-
 	/*
-	 * The non-racy check for busy page. It is critical to check
-	 * PageDirty _after_ making sure that the page is freeable and
-	 * not in use by anybody. (pagecache + us == 2)
+	 * The non racy check for a busy page.
+	 *
+	 * Must be careful with the order of the tests. When someone has
+	 * a ref to the page, it may be possible that they dirty it then
+	 * drop the reference. So if PageDirty is tested before page_count
+	 * here, then the following race may occur:
+	 *
+	 * get_user_pages(&page);
+	 * [user mapping goes away]
+	 * write_to(page);
+	 *				!PageDirty(page)    [good]
+	 * SetPageDirty(page);
+	 * put_page(page);
+	 *				!page_count(page)   [good, discard it]
+	 *
+	 * [oops, our write_to data is lost]
+	 *
+	 * Reversing the order of the tests ensures such a situation cannot
+	 * escape unnoticed. The smp_rmb is needed to ensure the page->flags
+	 * load is not satisfied before that of page->_count.
+	 *
+	 * Note that if SetPageDirty is always performed via set_page_dirty,
+	 * and thus under tree_lock, then this ordering is not required.
 	 */
 	if (unlikely(page_count(page) != 2))
 		goto cannot_free;
_

Patches currently in -mm which might be from nickpiggin@xxxxxxxxxxxx are

fix-longstanding-load-balancing-bug-in-the-scheduler.patch
git-gfs2.patch
radix-tree-rcu-lockless-readside-semicolon.patch
adix-tree-rcu-lockless-readside-update-tidy.patch
adix-tree-rcu-lockless-readside-fix-3.patch
radix-tree-cleanup-radix_tree_deref_slot-and.patch
cleanup-radix_tree_derefreplace_slot-calling-conventions.patch
page-migration-replace-radix_tree_lookup_slot-with-radix_tree_lockup.patch
mm-remove_mapping-safeness-fix.patch
page-invalidation-cleanup.patch
do_sched_setscheduler-dont-take-tasklist_lock.patch
introduce-is_rt_policy-helper.patch
sched_setscheduler-fix-policy-checks.patch
reparent_to_init-use-has_rt_policy.patch
select_bad_process-kill-a-bogus-pf_dead-task_dead-check.patch
oom_kill_task-cleanup-mm-checks.patch
sched-remove-unnecessary-sched-group-allocations.patch
sched-remove-unnecessary-sched-group-allocations-fix.patch
lower-migration-thread-stop-machine-prio.patch
sched-generic-sched_group-cpu-power-setup.patch
sched2-sched-domain-sysctl.patch

-
To unsubscribe from this list: send the line "unsubscribe mm-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
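
Illustrative note (not part of the patch): the ordering rule described in the
remove_mapping() comment can be modelled in userspace. The sketch below is a
rough model only, assuming C11 atomics as stand-ins for the kernel primitives.
struct model_page, model_can_discard() and model_writer_finish() are
hypothetical names invented for this example, and the acquire fence only
approximates what smp_rmb() does in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: a refcount and a dirty flag only. */
struct model_page {
	atomic_int count;	/* models page->_count */
	atomic_bool dirty;	/* models the PageDirty bit in page->flags */
};

/*
 * Models the test order used by remove_mapping(): refcount first, then a
 * read barrier, then the dirty flag. Returns true if the page may be
 * discarded (only the pagecache and the caller hold a reference, and the
 * page is clean).
 */
static bool model_can_discard(struct model_page *page)
{
	if (atomic_load_explicit(&page->count, memory_order_relaxed) != 2)
		return false;

	/* Approximates smp_rmb(): the dirty load must not pass the count load. */
	atomic_thread_fence(memory_order_acquire);

	if (atomic_load_explicit(&page->dirty, memory_order_relaxed))
		return false;

	return true;
}

/*
 * Models a writer that holds a reference, dirties the page and then drops
 * its reference, as in the SetPageDirty(); put_page() sequence.
 */
static void model_writer_finish(struct model_page *page)
{
	int old;

	atomic_store_explicit(&page->dirty, true, memory_order_relaxed);

	/* Release ordering: the dirty store is visible before the ref drop. */
	old = atomic_fetch_sub_explicit(&page->count, 1, memory_order_release);
	assert(old > 0);	/* the writer must have held a reference */
}
```

In this model, a page with a count of 3 (pagecache, caller, writer) is refused
while the writer still holds its reference. Once model_writer_finish() has run,
the count is back to 2 but the dirty flag is set, so the page is still refused
and the written data is not lost. Only a clean page with a count of exactly 2
is reported as discardable.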