Re: [PATCH 1/3] MD: hold mddev lock for .quiesce in md_do_sync

On Sat, Aug 06 2016, Shaohua Li wrote:

> On Thu, Aug 04, 2016 at 01:16:49PM +1000, Neil Brown wrote:
>> On Wed, Aug 03 2016, NeilBrown wrote:
>> 
>> > [ Unknown signature status ]
>> > On Sun, Jul 31 2016, shli@xxxxxxxxxx wrote:
>> >
>> >> From: Shaohua Li <shli@xxxxxx>
>> >>
>> >> .quiesce is called with the mddev lock held in most places. There are a
>> >> few exceptions. Calling .quiesce without the lock held could create races. For
>> >> example, the .quiesce of raid1 can't be called recursively. The purpose of these
>> >> patches is to fix a race in raid5-cache. The raid5-cache .quiesce will write the md
>> >> superblock and should be called with the mddev lock held.
>> >>
>> >> Cc: NeilBrown <neilb@xxxxxxxx>
>> >> Signed-off-by: Shaohua Li <shli@xxxxxx>
>> >
>> > Acked-by: NeilBrown <neilb@xxxxxxxx>
>> >
>> > This should be safe but I'm not sure I really like it.
>> > The raid1 quiesce could be changed so that it can be called recursively.
>> > The raid5-cache situation would be harder to get right and maybe this is
>> > the best solution... It's just that 'quiesce' should be a fairly
>> > light-weight operation, just waiting for pending requests to flush.  It
>> > shouldn't really *need* a lock.
>> 
>> Actually, the more I think about this, the less I like it.
>> 
>> I would much rather make .quiesce lighter weight so that no locking was
>> needed.
>> 
>> For r5l_quiesce, that probably means removing the "r5l_do_reclaim()".
>> Stopping and restarting the reclaim thread seems reasonable, but calling
>> r5l_do_reclaim() should not be needed.  It should be done periodically
>> by the thread, and at 'stop' time, but otherwise isn't needed.
>> You would need to hold some mutex while calling md_register_thread, but
>> that could probably be log->io_mutex, or maybe even some other new
>> mutex.
>
> We will have the same deadlock issue with just stopping/restarting the reclaim
> thread, as stopping the thread will wait for it, and it is probably in
> r5l_do_reclaim writing the superblock. Since we are writing the
> superblock, we must hold the reconfig_mutex.

When you say "writing the superblock" you presumably mean "blocked in
r5l_write_super_and_discard_space(), waiting for MD_CHANGE_PENDING to
be cleared" ??
With a bit of care you could wait for MD_CHANGE_PENDING to clear, or for
->quiesce to be set, and then exit gracefully.

>
> Letting raid5_quiesce call r5l_do_reclaim gives us a clean log. Just
> stopping/restarting the reclaim thread can't guarantee this, as it's possible
> some space isn't reclaimed yet. A clean log will simplify a lot of things, for
> example when we change the layout of the array: the log doesn't need to remember
> which part is for the old layout and which part is for the new layout.

I really think you are putting too much functionality into quiesce.
When we change the shape of the array, we do much more than just
quiesce it.  We also call check_reshape and start_reshape etc.
They are called with reconfig_mutex held, and it would be perfectly
appropriate to finish off the r5l_do_reclaim() work in there.

>
> I think we can add a new parameter for .quiesce to indicate whether reconfig_mutex
> is held. raid5_quiesce can check the parameter and take reconfig_mutex if
> necessary.

Adding a new parameter because it happens to be convenient in one case
is not necessarily a good idea.  It is often a sign that the interface
isn't well designed, or isn't well understood, or is being used poorly.

I really really don't think ->quiesce() should care about whether
reconfig_mutex is held.  All it should do is drain all IO and stop new
IO so that other threads can do unusual things in race-free ways.

NeilBrown

Attachment: signature.asc
Description: PGP signature

