Re: [GIT PULL] Hole puch vs page cache filling races fixes for 5.14-rc1

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu 01-07-21 11:04:28, Linus Torvalds wrote:
> On Thu, Jul 1, 2021 at 9:19 AM Jan Kara <jack@xxxxxxx> wrote:
> >
> > That being said I don't expect the optimization to matter too much
> > because in do_read_fault() we first call do_fault_around() which will
> > exactly map pages that are already in cache and uptodate
> 
> Yeah, I think that ends up saving the situation.
> 
> > So do you think the optimization is still worth it despite
> > do_fault_around()?
> 
> I suspect it doesn't matter that much for performance as you say due
> to any filesystem that cares about performance having the "map_pages"
> function pointing to filemap_map_pages, but I reacted to it just from
> looking at the patch, and it just seems conceptually wrong. Taking the
> lock in a situation where it's not actually needed will just cause
> problems later when somebody decides that the lock protects something
> else entirely.

OK, I did some experiments and indeed in my totally unscientific boot and
compile tests I've seen ~90% of page faults to not get to filemap_fault()
(they were either anon memory or satisfied with filemap_map_pages()). From
the remaining 10% which got to filemap_fault() about 95% already had a page
in the page cache (so the optimization would help) - usually because we
were doing a write fault. So the optimization seems worth it.  I've added
attached patch on top of the series which implements the optimization you
suggested. I've also pushed out the complete series to

git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git hole_punch_fixes

so that people can see the whole picture. Review is welcome!

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
>From 347157fb9dfaf5a7d10e723c773affe147cdff34 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@xxxxxxx>
Date: Tue, 6 Jul 2021 16:45:42 +0200
Subject: [PATCH] mm: Acquire invalidate_lock in filemap_fault() only when
 creating pages

We don't need to acquire invalidate_lock in filemap_fault() when the
page is already in the page cache and uptodate since the existence of
the page itself (and its page lock) protect from races with
invalidation. This is rather common situation especially for write
faults (for read faults filemap_map_pages() generally handles this
situation). So let's avoid the unnecessary invalidate_lock acquisition
for this fast path.

Reported-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Jan Kara <jack@xxxxxxx>
---
 mm/filemap.c | 37 +++++++++++++++++++++++++++++--------
 1 file changed, 29 insertions(+), 8 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index b8e9bccecd9f..82269aaa715e 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3030,6 +3030,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 	pgoff_t max_off;
 	struct page *page;
 	vm_fault_t ret = 0;
+	bool mapping_locked = false;
 
 	max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
 	if (unlikely(offset >= max_off))
@@ -3053,12 +3054,16 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 		fpin = do_sync_mmap_readahead(vmf);
 	}
 
-	/*
-	 * See comment in filemap_create_page() why we need invalidate_lock
-	 */
-	filemap_invalidate_lock_shared(mapping);
 	if (!page) {
 retry_find:
+		/*
+		 * See comment in filemap_create_page() why we need
+		 * invalidate_lock
+		 */
+		if (!mapping_locked) {
+			filemap_invalidate_lock_shared(mapping);
+			mapping_locked = true;
+		}
 		page = pagecache_get_page(mapping, offset,
 					  FGP_CREAT|FGP_FOR_MMAP,
 					  vmf->gfp_mask);
@@ -3068,6 +3073,9 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 			filemap_invalidate_unlock_shared(mapping);
 			return VM_FAULT_OOM;
 		}
+	} else if (unlikely(!PageUptodate(page))) {
+		filemap_invalidate_lock_shared(mapping);
+		mapping_locked = true;
 	}
 
 	if (!lock_page_maybe_drop_mmap(vmf, page, &fpin))
@@ -3085,8 +3093,20 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 	 * We have a locked page in the page cache, now we need to check
 	 * that it's up-to-date. If not, it is going to be due to an error.
 	 */
-	if (unlikely(!PageUptodate(page)))
+	if (unlikely(!PageUptodate(page))) {
+		/*
+		 * The page was in cache and uptodate and now it is not.
+		 * Strange but possible since we didn't hold the page lock all
+		 * the time. Let's drop everything get the invalidate lock and
+		 * try again.
+		 */
+		if (!mapping_locked) {
+			unlock_page(page);
+			put_page(page);
+			goto retry_find;
+		}
 		goto page_not_uptodate;
+	}
 
 	/*
 	 * We've made it this far and we had to drop our mmap_lock, now is the
@@ -3097,6 +3117,8 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 		unlock_page(page);
 		goto out_retry;
 	}
+	if (mapping_locked)
+		filemap_invalidate_unlock_shared(mapping);
 
 	/*
 	 * Found the page and have a reference on it.
@@ -3106,11 +3128,9 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 	if (unlikely(offset >= max_off)) {
 		unlock_page(page);
 		put_page(page);
-		filemap_invalidate_unlock_shared(mapping);
 		return VM_FAULT_SIGBUS;
 	}
 
-	filemap_invalidate_unlock_shared(mapping);
 	vmf->page = page;
 	return ret | VM_FAULT_LOCKED;
 
@@ -3141,7 +3161,8 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 	 */
 	if (page)
 		put_page(page);
-	filemap_invalidate_unlock_shared(mapping);
+	if (mapping_locked)
+		filemap_invalidate_unlock_shared(mapping);
 	if (fpin)
 		fput(fpin);
 	return ret | VM_FAULT_RETRY;
-- 
2.26.2


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux