On Wed, Dec 14 2016, Jinpu Wang wrote:

> As you suggested, I re-ran the same test with 4.4.36 without our own patch on MD.
> I can still reproduce the same bug; nr_pending on the healthy leg (loop1) is still 1.
>

Thanks.

I have a hypothesis.

md_make_request() calls blk_queue_split().
If that does split the request, it will call generic_make_request()
on the first half. That will call back into md_make_request() and
raid1_make_request(), which will submit requests to the underlying
devices. These will get caught on the bio_list_on_stack queue in
generic_make_request(). This is a queue which is not accounted in
nr_queued.

When blk_queue_split() completes, 'bio' will be the second half of the
bio. This enters raid1_make_request(), and by this time the array has
been frozen. So wait_barrier() has to wait for pending requests to
complete, and that includes the one that is stuck in bio_list_on_stack,
which will never complete now.

To see if this might be happening, please change the

	blk_queue_split(q, &bio, q->bio_split);

call in md_make_request() to

	struct bio *tmp = bio;
	blk_queue_split(q, &bio, q->bio_split);
	WARN_ON_ONCE(bio != tmp);

If that ever triggers, then the above is a real possibility.

Fixing the problem isn't very easy...

You could try:

1/ write a function in raid1.c which calls punt_bios_to_rescuer()
   (which you will need to export from block/bio.c),
   passing mddev->queue->bio_split as the bio_set.

2/ change the wait_event_lock_irq() call in wait_barrier() to
   wait_event_lock_irq_cmd(), and pass the new function as the command.

That way, if wait_barrier() ever blocks, all the requests in
bio_list_on_stack will be handled by a separate thread.

NeilBrown
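
P.S. The queueing behaviour behind the hypothesis can be modelled in
userspace. This is only a rough sketch, not kernel code: every toy_* name
is made up for illustration, the leg "completes" a request immediately,
and the freeze is assumed to happen between the two halves. Instead of
sleeping at the barrier, the model records how many requests are still
pending at that point.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; none of these are the kernel's real types or functions. */
enum toy_level { TOY_MD, TOY_DISK };

struct toy_bio {
	enum toy_level level;
	struct toy_bio *next;
};

struct toy_bio_list {
	struct toy_bio *head, *tail;
};

static struct toy_bio_list *toy_current_list;	/* models current->bio_list */
static int toy_nr_pending;		/* sent to a leg, not yet completed */
static int toy_pending_at_barrier = -1;	/* snapshot taken at the barrier */

static void toy_list_add(struct toy_bio_list *l, struct toy_bio *bio)
{
	bio->next = NULL;
	if (l->tail)
		l->tail->next = bio;
	else
		l->head = bio;
	l->tail = bio;
}

static struct toy_bio *toy_list_pop(struct toy_bio_list *l)
{
	struct toy_bio *bio = l->head;

	if (bio) {
		l->head = bio->next;
		if (!l->head)
			l->tail = NULL;
		bio->next = NULL;
	}
	return bio;
}

static void toy_make_request(struct toy_bio *bio);

static void toy_generic_make_request(struct toy_bio *bio)
{
	struct toy_bio_list on_stack = { NULL, NULL };

	if (toy_current_list) {
		/* Already inside a make_request function: park the bio. */
		toy_list_add(toy_current_list, bio);
		return;
	}
	toy_current_list = &on_stack;
	do {
		toy_make_request(bio);
		bio = toy_list_pop(&on_stack);
	} while (bio);
	toy_current_list = NULL;
}

static void toy_make_request(struct toy_bio *bio)
{
	static struct toy_bio leg_bio = { TOY_DISK, NULL };

	if (bio->level == TOY_DISK) {
		toy_nr_pending--;	/* the leg completes the request at once */
		return;
	}
	/* First half: passes the barrier and is sent to a leg. */
	toy_nr_pending++;
	toy_generic_make_request(&leg_bio);	/* parked, not dispatched */
	/*
	 * Second half: assume the array was frozen in the meantime. A real
	 * wait_barrier() would sleep here; the model only records the count.
	 */
	toy_pending_at_barrier = toy_nr_pending;
}

/* Returns how many requests were still pending when the barrier was hit. */
int toy_run(void)
{
	struct toy_bio md_bio = { TOY_MD, NULL };

	toy_nr_pending = 0;
	toy_pending_at_barrier = -1;
	toy_generic_make_request(&md_bio);
	/* The parked bio is dispatched only after the outer call returns. */
	assert(toy_nr_pending == 0);
	return toy_pending_at_barrier;
}
```

In this model toy_run() returns 1: the leg request is still parked on the
on-stack list when the second half reaches the barrier, and it is only
dispatched once the outer call returns. A thread that slept at the barrier
would never get that far, which would be consistent with nr_pending
staying at 1.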