Re: [PATCH V3 1/2] RAID1: a new I/O barrier implementation to remove resync window

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Feb 17 2017, Coly Li wrote:

> On 2017/2/16 下午3:04, NeilBrown wrote:
>> I know you are going to change this as Shaohua wantsthe spitting to
>> happen in a separate function, which I agree with, but there is 
>> something else wrong here. Calling bio_split/bio_chain repeatedly
>> in a loop is dangerous. It is OK for simple devices, but when one
>> request can wait for another request to the same device it can
>> deadlock. This can happen with raid1.  If a resync request calls
>> raise_barrier() between one request and the next, then the next has
>> to wait for the resync request, which has to wait for the first
>> request. As the first request will be stuck in the queue in 
>> generic_make_request(), you get a deadlock.
>
> For md raid1, queue in generic_make_request(), can I understand it as
> bio_list_on_stack in this function? And queue in underlying device,
> can I understand it as the data structures like plug->pending and
> conf->pending_bio_list ?

Yes, the queue in generic_make_request() is the bio_list_on_stack.  That
is the only queue I am talking about.  I'm not referring to
plug->pending or conf->pending_bio_list at all.

>
> I still don't get the point of deadlock, let me try to explain why I
> don't see the possible deadlock. If a bio is split, and the first part
> is processed by make_request_fn(), and then a resync comes and it will
> raise a barrier, there are 3 possible conditions,
> - the resync I/O tries to raise barrier on same bucket of the first
> regular bio. Then the resync task has to wait to the first bio drops
> its conf->nr_pending[idx]

Not quite.
First, the resync task (in raise_barrier()) will wait for ->nr_waiting[idx]
to be zero.  We can assume this happens immediately.
Then the resync_task will increment ->barrier[idx].
Only then will it wait for the first bio to drop ->nr_pending[idx].
The processing of that first bio will have submitted bios to the
underlying device, and they will be in the bio_list_on_stack queue, and
will not be processed until raid1_make_request() completes.

The loop in raid1_make_request() will then call make_request_fn() which
will call wait_barrier(), which will wait for ->barrier[idx] to be zero.

So raid1_make_request is waiting for the resync to progress, and resync
is waiting for a bio which is on bio_list_on_stack which won't be
processed until raid1_make_request() completes.

NeilBrown

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux