Re: [BUG] MD/RAID1 hung forever on freeze_array

On Wed, Dec 14 2016, Jinpu Wang wrote:

>
> As you suggested, I re-ran the same test on 4.4.36 without any of our own MD patches.
> I can still reproduce the same bug; nr_pending on the healthy leg (loop1) is still 1.
>

Thanks.

I have a hypothesis.

md_make_request() calls blk_queue_split().
If that does split the request it will call generic_make_request()
on the first half. That will call back into md_make_request() and
raid1_make_request() which will submit requests to the underlying
devices.  These will get caught on the bio_list_on_stack queue in
generic_make_request().
This is a queue which is not accounted for in nr_queued.

When blk_queue_split() completes, 'bio' will be the second half of the
bio.
This enters raid1_make_request() and by this time the array has been
frozen.
So wait_barrier() has to wait for pending requests to complete, and that
includes the one that is stuck in bio_list_on_stack, which will never
complete now.
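
To make the sequence concrete, here is a rough userspace sketch of it.
It is a toy model, not kernel code: every toy_* name is made up for
illustration, the bios are reduced to counters, and it assumes that the
freeze only finishes once nr_pending equals nr_queued.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state; field names echo raid1's counters, but this is not kernel code. */
struct toy_conf {
	int nr_pending;    /* requests admitted and not yet completed */
	int nr_queued;     /* requests parked on md's own retry list */
	int on_stack;      /* bios parked on the submitter's bio_list_on_stack */
	bool array_frozen;
};

/* First half of the split: admitted, counted, and its child bio parked. */
static void toy_submit_first_half(struct toy_conf *c)
{
	assert(!c->array_frozen);
	c->nr_pending++;
	c->on_stack++;
}

/* Assumed freeze condition: every pending request is accounted for. */
static bool toy_freeze_can_finish(const struct toy_conf *c)
{
	return c->nr_pending == c->nr_queued;
}

/* The second half may not pass the barrier while the array is frozen. */
static bool toy_wait_barrier_blocks(const struct toy_conf *c)
{
	return c->array_frozen;
}

/* Returns true if neither the freezer nor the submitter can make progress. */
bool toy_scenario_deadlocks(void)
{
	struct toy_conf c = { 0, 0, 0, false };

	toy_submit_first_half(&c);   /* nr_pending = 1, one bio parked */
	c.array_frozen = true;       /* the freeze starts in another thread */

	/*
	 * The parked bio is only issued once the submitter returns, but the
	 * submitter is about to sleep while handling the second half.
	 */
	bool submitter_stuck = toy_wait_barrier_blocks(&c) && c.on_stack > 0;
	bool freezer_stuck = !toy_freeze_can_finish(&c);

	return submitter_stuck && freezer_stuck;
}
```

In this model the parked bio shows up in nr_pending but not in
nr_queued, so both sides wait on each other.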

To see if this might be happening, please change the

	blk_queue_split(q, &bio, q->bio_split);

call in md_make_request() to

	struct bio *tmp = bio;
	blk_queue_split(q, &bio, q->bio_split);
	WARN_ON_ONCE(bio != tmp);

If that ever triggers, then the above is a real possibility.

Fixing the problem isn't very easy...

You could try:
1/ write a function in raid1.c which calls punt_bios_to_rescuer()
  (which you will need to export from block/bio.c),
  passing mddev->queue->bio_split as the bio_set.

2/ change the wait_event_lock_irq() call in wait_barrier() to
   wait_event_lock_irq_cmd(), and pass the new function as the command.

That way, if wait_barrier() ever blocks, all the requests in
bio_list_on_stack will be handled by a separate thread.
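
A rough userspace sketch of that idea follows. Again it is only a toy:
the toy_* names are hypothetical, the "rescuer" is an ordinary function
call rather than a thread, and each request is assumed to have a single
child bio whose completion drops nr_pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_bio { struct toy_bio *next; };
struct toy_list { struct toy_bio *head; };

struct toy_conf {
	int nr_pending;    /* requests admitted and not yet completed */
	int nr_queued;     /* requests parked on md's own retry list */
	bool array_frozen;
};

static void toy_push(struct toy_list *l, struct toy_bio *b)
{
	b->next = l->head;
	l->head = b;
}

static struct toy_bio *toy_pop(struct toy_list *l)
{
	struct toy_bio *b = l->head;

	if (b) {
		l->head = b->next;
		b->next = NULL;
	}
	return b;
}

/* Stand-in for the "cmd" run before sleeping: hand parked bios to the rescuer. */
static void toy_punt_to_rescuer(struct toy_list *on_stack, struct toy_list *rescuer)
{
	struct toy_bio *b;

	while ((b = toy_pop(on_stack)) != NULL)
		toy_push(rescuer, b);
}

/* Stand-in for the rescuer: issue each bio; its completion drops nr_pending. */
static void toy_rescuer_run(struct toy_conf *c, struct toy_list *rescuer)
{
	while (toy_pop(rescuer) != NULL) {
		assert(c->nr_pending > 0);
		c->nr_pending--;
	}
}

/* Returns nr_pending - nr_queued after the punt; 0 means the freeze can finish. */
int toy_pending_after_punt(void)
{
	struct toy_conf c = { 0, 0, false };
	struct toy_list on_stack = { NULL }, rescuer = { NULL };
	struct toy_bio child = { NULL };

	c.nr_pending++;              /* first half admitted */
	toy_push(&on_stack, &child); /* its child bio is parked */
	c.array_frozen = true;       /* the freeze starts in another thread */

	/* The second half is about to sleep: run the command first. */
	if (c.array_frozen)
		toy_punt_to_rescuer(&on_stack, &rescuer);
	toy_rescuer_run(&c, &rescuer);

	return c.nr_pending - c.nr_queued;
}
```

Once the parked bio has been handed off and completed, the counters
balance again and nothing is left waiting on the sleeping submitter.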

NeilBrown
