Re: [PATCH RFC] mm: migrate: Fix races of __find_get_block() and page migration

On Fri, Jul 12, 2019 at 02:21:11PM -0700, Andrew Morton wrote:
> On Fri, 12 Jul 2019 13:39:35 +0100 Mel Gorman <mgorman@xxxxxxx> wrote:
> 
> > > So although I still think that just failing the migration if we cannot
> > > invalidate buffer heads is a safer choice, just extending the private_lock
> > > protected section does not seem as bad as I feared.
> > > 
> > 
> > That does not seem too bad and your revised patch looks functionally
> > fine. I'd leave out the tracepoints though because a perf probe would have
> > got roughly the same data and the tracepoint may be too specific to track
> > another class of problem. Whether the tracepoint survives or not, and
> > with a changelog added:
> > 
> > Acked-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> > 
> > Andrew, which version do you want to go with, the original version or
> > this one that holds private_lock for slightly longer during migration?
> 
> The revised version looks much more appealing for a -stable backport. 
> I expect any mild performance issues can be addressed in the usual
> fashion.  My main concern is not to put a large performance regression
> into mainline and stable kernels.  How confident are we that this is
> (will be) sufficiently tested from that point of view?
> 

Fairly confident. If we agree on this patch in principle (build tested
only), I'll make sure it gets tested from a functional point of view
and queue up a few migration-intensive tests while a metadata workload
is running in the background to see what falls out. Furthermore, not all
filesystems take this path even in migration-intensive situations, so some
setups will never notice a difference.

---8<---
From a4f07d789ba5742cd2fe6bcfb635502f2a1de004 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@xxxxxxx>
Date: Wed, 10 Jul 2019 11:31:01 +0200
Subject: [PATCH] mm: migrate: Fix race with __find_get_block()

buffer_migrate_page_norefs() can race with bh users in the following way:

CPU1                                    CPU2
buffer_migrate_page_norefs()
  buffer_migrate_lock_buffers()
  checks bh refs
  spin_unlock(&mapping->private_lock)
                                        __find_get_block()
                                          spin_lock(&mapping->private_lock)
                                          grab bh ref
                                          spin_unlock(&mapping->private_lock)
  move page                               do bh work

This can result in various issues such as lost updates to buffers (i.e.
metadata corruption) or use-after-free of the old page.

This patch closes the race by holding mapping->private_lock while the
mapping is being moved to a new page. Ordinarily, a reference can be taken
outside of the private_lock using the per-CPU BH LRU, but here the
references are checked and the LRU invalidated if necessary. The
private_lock is held once the references are known, so the buffer lookup
slow path will spin on the private_lock. With both the page lock and the
private_lock held, it should be impossible for other references to be
acquired or for updates to happen during the migration.

[mgorman@xxxxxxxxxxxxxxxxxxx: Changelog, removed tracing]
Fixes: 89cb0888ca14 ("mm: migrate: provide buffer_migrate_page_norefs()")
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Jan Kara <jack@xxxxxxx>
Signed-off-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
---
 mm/migrate.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index e9594bc0d406..a59e4aed6d2e 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -771,12 +771,12 @@ static int __buffer_migrate_page(struct address_space *mapping,
 			}
 			bh = bh->b_this_page;
 		} while (bh != head);
-		spin_unlock(&mapping->private_lock);
 		if (busy) {
 			if (invalidated) {
 				rc = -EAGAIN;
 				goto unlock_buffers;
 			}
+			spin_unlock(&mapping->private_lock);
 			invalidate_bh_lrus();
 			invalidated = true;
 			goto recheck_buffers;
@@ -809,6 +809,8 @@ static int __buffer_migrate_page(struct address_space *mapping,
 
 	rc = MIGRATEPAGE_SUCCESS;
 unlock_buffers:
+	if (check_refs)
+		spin_unlock(&mapping->private_lock);
 	bh = head;
 	do {
 		unlock_buffer(bh);

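As a side note for anyone reviewing the locking change, below is a minimal
userspace model of the rule the patch enforces. It is a sketch only: a
pthread mutex stands in for mapping->private_lock, a plain integer for the
buffer_head reference count, and a generation counter for the page move.
All names are invented for the illustration and none of this is kernel
code; where the kernel would retry or fail with -EAGAIN, the model simply
skips the move.

/*
 * Model of the fixed scheme: the "migration" side holds private_lock
 * across both the refcount check and the move itself, so the "lookup"
 * side can never take a new reference to a buffer mid-move.
 *
 * Build with: cc -pthread -o model model.c
 */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t private_lock = PTHREAD_MUTEX_INITIALIZER;
static int bh_refs;          /* stands in for buffer_head refcounts */
static int page_generation;  /* bumped each time the "page" moves */

static void *migrate(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&private_lock);
	if (bh_refs == 0)
		page_generation++;  /* move only while no refs exist */
	/* the kernel would retry or fail with -EAGAIN in the busy case */
	pthread_mutex_unlock(&private_lock);
	return NULL;
}

static void *find_get_block(void *arg)
{
	int gen;

	(void)arg;
	pthread_mutex_lock(&private_lock);
	bh_refs++;              /* grab a "bh" reference under the lock */
	gen = page_generation;  /* remember which page we saw */
	pthread_mutex_unlock(&private_lock);

	/* "do bh work" happens here, outside the lock */

	pthread_mutex_lock(&private_lock);
	if (gen != page_generation)
		printf("BUG: page moved while a reference was held\n");
	bh_refs--;
	pthread_mutex_unlock(&private_lock);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, migrate, NULL);
	pthread_create(&b, NULL, find_get_block, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	puts("ok");
	return 0;
}

However the two threads interleave, the BUG line can never print: a
reference either blocks the move or is taken after it completes, which is
exactly the property the extended private_lock section restores.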
-- 
Mel Gorman
SUSE Labs


