Re: blk-mq request allocation stalls [was: Re: [PATCH v3 0/8] dm: add request-based blk-mq support]

On 01/09/2015 06:48 PM, Mike Snitzer wrote:
On Fri, Jan 09 2015 at  7:27pm -0500,
Jens Axboe <axboe@xxxxxxxxx> wrote:

I sent out the half-done v3, unfortunately. Can you try this? Both the
cases with substantial nr_free are at the end of an index.

I initially thought it was fixed since I didn't see any failures on boot
(where I normally see 3-4).  I then ran the kernel "make install" to
this virtio-blk root device and also didn't see any failures on the
first run.  But the 2nd run triggered these:

[   83.711724] __bt_get: values before for loop: last_tag=55, index=1
[   83.713395] __bt_get: values after  for loop: last_tag=32, index=1
[   83.714464] bt_get: __bt_get() returned -1
[   83.715183] queue_num=0, nr_tags=128, reserved_tags=0, bits_per_word=5
[   83.716297] nr_free=128, nr_reserved=0
[   83.716940] active_queues=0

[   88.716241] __bt_get: values before for loop: last_tag=15, index=0
[   88.717890] __bt_get: values after  for loop: last_tag=0, index=0
[   88.718956] bt_get: __bt_get() returned -1
[   88.719682] queue_num=0, nr_tags=128, reserved_tags=0, bits_per_word=5
[   88.720866] nr_free=128, nr_reserved=0
[   88.721536] active_queues=0

A third "make install" resulted in:

[  543.711782] __bt_get: values before for loop: last_tag=114, index=3
[  543.713411] __bt_get: values after  for loop: last_tag=96, index=3
[  543.714495] bt_get: __bt_get() returned -1
[  543.715222] queue_num=0, nr_tags=128, reserved_tags=0, bits_per_word=5
[  543.716351] nr_free=128, nr_reserved=0
[  543.717016] active_queues=0

(things definitely do seem better, e.g. failures are less frequent and I
no longer see the last_tag=127 case)
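
The numbers in the dumps are consistent with a tag being split into a word index (high bits) and a bit offset (low bits). With bits_per_word=5 each word covers 32 tags, so nr_tags=128 spans 4 words, last_tag=55 falls in word 1, and the post-loop last_tag=32 is the first tag of that same word. A minimal sketch of that arithmetic follows. The helper names are hypothetical and chosen for illustration only; they are not taken from the kernel source or from the debug patch.

```c
#include <assert.h>

/*
 * Hypothetical helpers (assumed mapping, not the kernel's own macros):
 * a tag is treated as (word index << bits_per_word) | bit offset.
 */
static unsigned int tag_to_index(unsigned int tag, unsigned int bits_per_word)
{
	assert(bits_per_word < 32);
	return tag >> bits_per_word;
}

static unsigned int tag_to_bit(unsigned int tag, unsigned int bits_per_word)
{
	assert(bits_per_word < 32);
	return tag & ((1U << bits_per_word) - 1);
}

/* First tag covered by a given word, i.e. bit offset 0 of that word. */
static unsigned int index_to_first_tag(unsigned int index,
				       unsigned int bits_per_word)
{
	assert(bits_per_word < 32);
	return index << bits_per_word;
}
```

Under this assumed mapping, the three dumps line up: 55 and 32 both map to index 1, 15 and 0 to index 0, and 114 and 96 to index 3.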

So if we end up freeing in batches, it's not totally unlikely that we could hit the case where all tags were busy and then got freed in between. It does seem a bit peculiar, though. Is the dump above from the first failing call to __bt_get()? I don't see the:

_still_ returned -1

which would seem to back up the theory, though. So I think this might actually be good, even if you hit that case.
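
The batch-free theory can be replayed in a small single-threaded simulation: a scan sees every tag busy and returns -1, a batch completion then releases all tags, the diagnostic count reports nr_free=128, and a retry succeeds (so no "_still_ returned -1" would be printed). This is only a sketch under stated assumptions. The struct and function names are hypothetical, a set bit is assumed to mean "busy", and the real code is concurrent and uses atomic bit operations, which this does not model.

```c
#include <assert.h>
#include <stdint.h>

#define NR_WORDS      4  /* 128 tags / 32 tags per word, as in the dump */
#define BITS_PER_WORD 5

/* Hypothetical stand-in for the tag bitmap: a set bit means "tag busy". */
struct tag_map {
	uint32_t word[NR_WORDS];
};

/* Scan every word once; return the first free tag, or -1 if all are busy. */
static int scan_for_free_tag(const struct tag_map *map)
{
	for (int i = 0; i < NR_WORDS; i++)
		for (int bit = 0; bit < (1 << BITS_PER_WORD); bit++)
			if (!(map->word[i] & (UINT32_C(1) << bit)))
				return (i << BITS_PER_WORD) + bit;
	return -1;
}

/* Count free tags, as a later diagnostic dump would. */
static int count_free_tags(const struct tag_map *map)
{
	int nr_free = 0;

	for (int i = 0; i < NR_WORDS; i++)
		for (int bit = 0; bit < (1 << BITS_PER_WORD); bit++)
			if (!(map->word[i] & (UINT32_C(1) << bit)))
				nr_free++;
	return nr_free;
}

/* Simulate a batch completion that releases every tag at once. */
static void free_all_tags(struct tag_map *map)
{
	for (int i = 0; i < NR_WORDS; i++)
		map->word[i] = 0;
}

/* Replay the suspected sequence: failed scan, batch free, dump, retry. */
static void replay_batch_free_race(void)
{
	struct tag_map map;

	for (int i = 0; i < NR_WORDS; i++)
		map.word[i] = UINT32_MAX;        /* every tag busy */

	int first = scan_for_free_tag(&map);     /* initial attempt fails */
	free_all_tags(&map);                     /* batch free lands here */
	int nr_free = count_free_tags(&map);     /* what the dump would see */
	int retry = scan_for_free_tag(&map);     /* second attempt */

	assert(first == -1);
	assert(nr_free == 128);
	assert(retry == 0);
}
```

If the retry also returned -1 with nr_free=128, the batch-free explanation would no longer fit, which is the case worth reporting.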

Bart, could you try the patch (the -v4) and your DM hang and see if it solves it for you?


If this one doesn't solve it, I'll reproduce it myself to save the
ping-pong effort :-)

I don't mind testing it since it is really quick.  But OK.

OK, then we can stick to that. Let me know if you hit the case of both the initial -1 and the following -1, since that would indicate it's not fixed.


--
Jens Axboe

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel