Re: [PATCH RFC] md/raid1: fix deadlock between freeze_array() and wait_barrier().

On Mon, Jun 27 2016, Alexander Lyakas wrote:

> When we call wait_barrier, we might have some bios waiting
> in current->bio_list, which prevents the freeze_array call from
> completing. Those can only be internal READs, which have already
> passed the wait_barrier call (thus incrementing nr_pending), but
> have not yet been submitted to the lower level, due to the
> generic_make_request logic that avoids recursive calls. In such a
> case, we have a deadlock:
> - array_frozen is already set to 1, so wait_barrier unconditionally waits, so
> - internal READ bios will not be submitted, thus freeze_array will
> never complete
>
> This problem was originally fixed in commit:
> d6b42dc md/raid1,raid10: avoid deadlock during resync/recovery.
>
> But then it was broken in commit:
> b364e3d raid1: Add a field array_frozen to indicate whether raid in
> freeze state.

Thanks for the great analysis.
I think this is primarily a problem in generic_make_request().  It queues
requests in the *wrong* order.

Please try the patch from
  https://lkml.org/lkml/2016/7/7/428

and see if it helps.  If two requests for a raid1 are in the
generic_make_request queue, this patch causes the sub-requests created
by the first to be handled before the second is attempted.
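
To make the ordering difference concrete, here is a toy model in Python.
It is not kernel code and does not reproduce the patch itself: the
function run(), the children_first flag and the bio names are all
made up for illustration. It only assumes that a raid1 request, once
handled, produces one sub-request for a lower device, and compares
queueing those sub-requests at the tail of the pending list with
handling them before the next older request.

```python
from collections import deque


def run(top_level, children, children_first):
    """Toy model of a pending-bio list (hypothetical, not the kernel's code).

    top_level:      names of requests already queued, oldest first.
    children:       maps a request name to the sub-requests it generates
                    when it is handled.
    children_first: False appends sub-requests to the tail of the list;
                    True places them ahead of the older queued requests.
    Returns the order in which requests are handled.
    """
    pending = deque(top_level)
    handled = []
    while pending:
        bio = pending.popleft()
        handled.append(bio)
        subs = children.get(bio, [])
        if children_first:
            # Put sub-requests at the head, keeping their relative order.
            pending.extendleft(reversed(subs))
        else:
            # Sub-requests wait behind everything already queued.
            pending.extend(subs)
    return handled


# Two raid1 requests are queued; each generates one request to a lower device.
children = {"raid1-A": ["A-disk0"], "raid1-B": ["B-disk0"]}

tail_order = run(["raid1-A", "raid1-B"], children, children_first=False)
head_order = run(["raid1-A", "raid1-B"], children, children_first=True)

print(tail_order)  # ['raid1-A', 'raid1-B', 'A-disk0', 'B-disk0']
print(head_order)  # ['raid1-A', 'A-disk0', 'raid1-B', 'B-disk0']
```

In the first ordering, raid1-B is attempted (and in the real code would
reach wait_barrier) while A-disk0 is still sitting unsubmitted in the
list, which is the situation described in the quoted analysis.  In the
second ordering, A-disk0 is handled before raid1-B is attempted.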

Thanks,
NeilBrown
