On Wed, Nov 08, 2017 at 03:48:51PM -0700, Jens Axboe wrote:
> This patch attempts to make the case of hctx re-running on driver tag
> failure more robust. Without this patch, it's pretty easy to trigger a
> stall condition with shared tags. An example is using null_blk like
> this:
>
> modprobe null_blk queue_mode=2 nr_devices=4 shared_tags=1 submit_queues=1 hw_queue_depth=1
>
> which sets up 4 devices, sharing the same tag set with a depth of 1.
> Running a fio job ala:
>
> [global]
> bs=4k
> rw=randread
> norandommap
> direct=1
> ioengine=libaio
> iodepth=4
>
> [nullb0]
> filename=/dev/nullb0
> [nullb1]
> filename=/dev/nullb1
> [nullb2]
> filename=/dev/nullb2
> [nullb3]
> filename=/dev/nullb3
>
> will inevitably end with one or more threads being stuck waiting for a
> scheduler tag. That IO is then stuck forever, until someone else
> triggers a run of the queue.
>
> Ensure that we always re-run the hardware queue, if the driver tag we
> were waiting for got freed before we added our leftover request entries
> back on the dispatch list.
>
> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>

Reviewed-by: Omar Sandoval <osandov@xxxxxx>
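
For readers following along, the race described in the quoted text can be sketched with a small deterministic toy model. This is not the kernel code and not the patch itself: the names SharedTags, HwQueue, rerun_if_woken and tag_freed_midway are hypothetical and exist only for illustration. The model assumes a shared tag set of depth 1 and replays one ordering: queue b fails to get a tag, the tag is freed (and b is woken) before b has put its leftover request back on its dispatch list, and only the "re-run if woken" check rescues the request.

```python
class SharedTags:
    """Toy shared tag set; waiters are queues to run when a tag is freed."""

    def __init__(self, depth):
        self.free = depth
        self.waiters = []

    def get(self):
        if self.free > 0:
            self.free -= 1
            return True
        return False

    def put(self):
        self.free += 1
        waiters, self.waiters = self.waiters, []
        for q in waiters:
            q.woken = True
            q.run()  # wakeup triggers a queue run


class HwQueue:
    """Toy hardware queue with a dispatch list (hypothetical model)."""

    def __init__(self, tags, rerun_if_woken):
        self.tags = tags
        self.rerun_if_woken = rerun_if_woken
        self.dispatch = []   # leftover requests waiting for a tag
        self.inflight = []   # requests that obtained a tag
        self.woken = False

    def run(self):
        pending, self.dispatch = self.dispatch, []
        self.dispatch_list(pending)

    def dispatch_list(self, pending, tag_freed_midway=None):
        while pending:
            if not self.tags.get():
                self.woken = False
                self.tags.waiters.append(self)
                if tag_freed_midway:
                    tag_freed_midway()  # race window: tag freed right here
                self.dispatch.extend(pending)  # leftovers go back too late
                # The idea from the patch description: if we were woken
                # before the leftovers were back on the list, run again.
                if self.rerun_if_woken and self.woken:
                    self.run()
                return
            self.inflight.append(pending.pop(0))


def simulate(rerun_if_woken):
    tags = SharedTags(depth=1)
    a = HwQueue(tags, rerun_if_woken)
    b = HwQueue(tags, rerun_if_woken)
    a.dispatch_list(["a0"])  # a takes the only tag

    def complete_a():
        a.inflight.pop()
        tags.put()  # wakes b, whose dispatch list is still empty

    b.dispatch_list(["b0"], tag_freed_midway=complete_a)
    return b


stalled = simulate(rerun_if_woken=False)
fixed = simulate(rerun_if_woken=True)
print("without re-run:", stalled.dispatch, stalled.inflight)
print("with re-run:   ", fixed.dispatch, fixed.inflight)
```

In the run without the check, b0 is left on the dispatch list with a free tag and no registered waiter, so nothing will run the queue again unless someone else triggers it. With the check, the same ordering ends with b0 in flight.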