On 01/09/2015 06:48 PM, Mike Snitzer wrote:
On Fri, Jan 09 2015 at 7:27pm -0500,
Jens Axboe <axboe@xxxxxxxxx> wrote:
I sent out the half-done v3, unfortunately. Can you try this? Both the
cases with substantial nr_free are at the end of an index.
I initially thought it was fixed since I didn't see any failures on boot
(where I normally see 3-4). I then ran the kernel "make install" to
this virtio-blk root device and also didn't see any failures on the
first run. But the 2nd run triggered these:
[ 83.711724] __bt_get: values before for loop: last_tag=55, index=1
[ 83.713395] __bt_get: values after for loop: last_tag=32, index=1
[ 83.714464] bt_get: __bt_get() returned -1
[ 83.715183] queue_num=0, nr_tags=128, reserved_tags=0, bits_per_word=5
[ 83.716297] nr_free=128, nr_reserved=0
[ 83.716940] active_queues=0
[ 88.716241] __bt_get: values before for loop: last_tag=15, index=0
[ 88.717890] __bt_get: values after for loop: last_tag=0, index=0
[ 88.718956] bt_get: __bt_get() returned -1
[ 88.719682] queue_num=0, nr_tags=128, reserved_tags=0, bits_per_word=5
[ 88.720866] nr_free=128, nr_reserved=0
[ 88.721536] active_queues=0
A third "make install" resulted in:
[ 543.711782] __bt_get: values before for loop: last_tag=114, index=3
[ 543.713411] __bt_get: values after for loop: last_tag=96, index=3
[ 543.714495] bt_get: __bt_get() returned -1
[ 543.715222] queue_num=0, nr_tags=128, reserved_tags=0, bits_per_word=5
[ 543.716351] nr_free=128, nr_reserved=0
[ 543.717016] active_queues=0
(things definitely do seem better, e.g. failures are less frequent and I
no longer see the last_tag=127 case)
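
For anyone following the numbers in these dumps: with bits_per_word=5 each
bitmap word covers 32 tags, so the index is the tag shifted right by 5. In
all three dumps above, the last_tag printed after the loop is the first tag
of the word that the starting last_tag fell in. The helpers below are a
minimal sketch of that arithmetic only; the names tag_to_index() and
word_first_tag() are made up for illustration and are not taken from the
blk-mq source.

```c
/* Hypothetical helpers, for illustration only (not kernel code). */

/* Index of the bitmap word holding `tag`, with 2^bits_per_word tags per word. */
unsigned int tag_to_index(unsigned int tag, unsigned int bits_per_word)
{
	return tag >> bits_per_word;
}

/* First tag belonging to the same bitmap word as `tag`. */
unsigned int word_first_tag(unsigned int tag, unsigned int bits_per_word)
{
	return tag_to_index(tag, bits_per_word) << bits_per_word;
}
```

With bits_per_word=5 this gives 55 -> index 1, word start 32; 15 -> index 0,
word start 0; 114 -> index 3, word start 96, matching the before/after
values in the logs.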
So if we end up freeing in batches, it's not unlikely that we hit the
case where all tags were busy and then got freed in between. Does
seem a bit peculiar, though. The dump above, is that for the first
failure case of invoking __bt_get()? I don't see the:
_still_ returned -1
which would seem to back up the theory, though. So I think this might
actually be good, even if you hit that case.
Bart, could you try the patch (the -v4) and your DM hang and see if it
solves it for you?
If this one doesn't solve it, I'll reproduce it myself to save the
ping-pong effort :-)
I don't mind testing it since it is really quick. But OK.
OK, then we can stick to that. Let me know if you hit the case where
both the initial -1 and the following -1 show up, since that would
indicate it's not fixed.
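
To spell out the criterion, here is a rough sketch of the "initial -1, then
retry" check as I read it from this thread. It is not the debug patch
itself: classify_get(), the bt_outcome enum, and the fail_n_times() stand-in
for __bt_get() are all hypothetical names used only for illustration.

```c
/* Hypothetical sketch, not the actual debug patch. */

enum bt_outcome {
	BT_OK,              /* first attempt returned a tag */
	BT_TRANSIENT_MISS,  /* initial -1, retry succeeded (tags freed in between) */
	BT_PERSISTENT_MISS  /* initial -1 and the retry _still_ returned -1 */
};

/* Stand-in signature for a single tag allocation attempt. */
typedef int (*try_get_fn)(void *ctx);

/* Try once, retry once on -1, and report which of the three cases was hit. */
enum bt_outcome classify_get(try_get_fn try_get, void *ctx, int *tag_out)
{
	int tag = try_get(ctx);

	if (tag != -1) {
		*tag_out = tag;
		return BT_OK;
	}

	/* First attempt failed; retry to see whether the miss was transient. */
	tag = try_get(ctx);
	if (tag != -1) {
		*tag_out = tag;
		return BT_TRANSIENT_MISS;
	}

	*tag_out = -1;
	return BT_PERSISTENT_MISS;
}

/* Stand-in allocator: fails while *ctx > 0 (decrementing it), then returns tag 0. */
int fail_n_times(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -1;
	}
	return 0;
}
```

Only the BT_PERSISTENT_MISS case would correspond to seeing both messages in
the log; a lone initial -1 followed by a successful retry fits the
"freed in between" theory.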
--
Jens Axboe
--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel