On Sat, Aug 06 2016, Shaohua Li wrote: > On Thu, Aug 04, 2016 at 01:16:49PM +1000, Neil Brown wrote: >> On Wed, Aug 03 2016, NeilBrown wrote: >> >> > [ Unknown signature status ] >> > On Sun, Jul 31 2016, shli@xxxxxxxxxx wrote: >> > >> >> From: Shaohua Li <shli@xxxxxx> >> >> >> >> .quiesce is called with mddev lock hold at most places. There are few >> >> exceptions. Calling .quesce without the lock hold could create races. For >> >> example, the .quesce of raid1 can't be recursively. The purpose of the patches >> >> is to fix a race in raid5-cache. The raid5-cache .quesce will write md >> >> superblock and should be called with mddev lock hold. >> >> >> >> Cc: NeilBrown <neilb@xxxxxxxx> >> >> Signed-off-by: Shaohua Li <shli@xxxxxx> >> > >> > Acked-by: NeilBrown <neilb@xxxxxxxx> >> > >> > This should be safe but I'm not sure I really like it. >> > The raid1 quiesce could be changed so that it can be called recursively. >> > The raid5-cache situation would be harder to get right and maybe this is >> > the best solution... It's just that 'quiesce' should be a fairly >> > light-weight operation, just waiting for pending requests to flush. It >> > shouldn't really *need* a lock. >> >> Actually, the more I think about this, the less I like it. >> >> I would much rather make .quiesce lighter weight so that no locking was >> needed. >> >> For r5l_quiesce, that probable means removed the "r5l_do_reclaim()". >> Stopping and restarting the reclaim thread seems reasonable, but calling >> r5l_do_reclaim() should not be needed. It should be done periodically >> by the thread, and at 'stop' time, but otherwise isn't needed. >> You would need to hold some mutex while calling md_register_thread, but >> that could be probably be log->io_mutex, or maybe even some other new >> mutex > > We will have the same deadlock issue with just stopping/restarting the reclaim > thread. As stopping the thread will wait for the thread, which probably is > doing r5l_do_reclaim and writting the superblock. Since we are writting the > superblock, we must hold the reconfig_mutex. When you say "writing the superblock" you presumably mean "blocked in r5l_write_super_and_discard_space(), waiting for MD_CHANGE_PENDING to be cleared" ?? With a bit of care you could wait for MD_CHANGE_PENDING to clear, or for ->quiesce to be set, and then exit gracefully. > > Letting raid5_quiesce call r5l_do_reclaim gives us a clean log. Just > stop/restart the reclaim thread can't guarantee this, as it's possible some > space aren't reclaimed yet. A clean log will simplify a lot of things, for > example we change the layout of the array. The log doesn't need to remember > which part is for the old layout and which part is the new layout. I really think you are putting too much functionality into quiesce. When we change the shape of the array, we do much more than just quiesce it. We also call check_reshape and start_reshape etc. They are called with reconfig_mutex held and it would be perfectly appropriate to finish of the r5l_do_reclaim() work in there. > > I think we can add a new parameter for .quiesce to indicate if reconfig_mutex > is hold. raid5_quiesce can check the parameter and hold reconfig_mutex if > necessary. Adding a new parameter because it happens to be convenient in one case is not necessarily a good idea. It is often a sign that the interface isn't well designed, or isn't well understood, or is being used poorly. I really really don't think ->quiesce() should care about whether reconfig_mutex is held. All it should do is drain all IO and stop new IO so that other threads can do unusually things in race-free ways. NeilBrown
Attachment:
signature.asc
Description: PGP signature