On Fri, Feb 17 2017, Coly Li wrote: > On 2017/2/16 下午3:04, NeilBrown wrote: >> I know you are going to change this as Shaohua wantsthe spitting to >> happen in a separate function, which I agree with, but there is >> something else wrong here. Calling bio_split/bio_chain repeatedly >> in a loop is dangerous. It is OK for simple devices, but when one >> request can wait for another request to the same device it can >> deadlock. This can happen with raid1. If a resync request calls >> raise_barrier() between one request and the next, then the next has >> to wait for the resync request, which has to wait for the first >> request. As the first request will be stuck in the queue in >> generic_make_request(), you get a deadlock. > > For md raid1, queue in generic_make_request(), can I understand it as > bio_list_on_stack in this function? And queue in underlying device, > can I understand it as the data structures like plug->pending and > conf->pending_bio_list ? Yes, the queue in generic_make_request() is the bio_list_on_stack. That is the only queue I am talking about. I'm not referring to plug->pending or conf->pending_bio_list at all. > > I still don't get the point of deadlock, let me try to explain why I > don't see the possible deadlock. If a bio is split, and the first part > is processed by make_request_fn(), and then a resync comes and it will > raise a barrier, there are 3 possible conditions, > - the resync I/O tries to raise barrier on same bucket of the first > regular bio. Then the resync task has to wait to the first bio drops > its conf->nr_pending[idx] Not quite. First, the resync task (in raise_barrier()) will wait for ->nr_waiting[idx] to be zero. We can assume this happens immediately. Then the resync_task will increment ->barrier[idx]. Only then will it wait for the first bio to drop ->nr_pending[idx]. The processing of that first bio will have submitted bios to the underlying device, and they will be in the bio_list_on_stack queue, and will not be processed until raid1_make_request() completes. The loop in raid1_make_request() will then call make_request_fn() which will call wait_barrier(), which will wait for ->barrier[idx] to be zero. So raid1_make_request is waiting for the resync to progress, and resync is waiting for a bio which is on bio_list_on_stack which won't be processed until raid1_make_request() completes. NeilBrown
Attachment:
signature.asc
Description: PGP signature