On Tue, Jan 04, 2022 at 02:15:58PM +0200, Max Gurtovoy wrote:
> This patch worked for me with 2 namespaces for NVMe PCI.
>
> I'll check it later on with my RDMA queue_rqs patches as well. There
> we also have tagset sharing with the connect_q (and not only with
> multiple namespaces).
>
> But the connect_q is using reserved tags only (for the connect
> commands).
>
> I saw some strange things that I couldn't understand:
>
> 1. running randread fio with libaio ioengine didn't call
> nvme_queue_rqs - expected
>
> 2. running randwrite fio with libaio ioengine did call
> nvme_queue_rqs - Not expected !!
>
> 3. running randread fio with io_uring ioengine (and
> --iodepth_batch=32) didn't call nvme_queue_rqs - Not expected !!
>
> 4. running randwrite fio with io_uring ioengine (and
> --iodepth_batch=32) did call nvme_queue_rqs - expected
>
> 5. running randread fio with io_uring ioengine (and --iodepth_batch=32
> --runtime=30) didn't finish after 30 seconds and got stuck for 300
> seconds (the fio jobs required "kill -9 fio" to release the refcounts
> on nvme_core) - Not expected !!
>
> debug print: fio: job 'task_nvme0n1' (state=5) hasn't exited in 300
> seconds, it appears to be stuck. Doing forceful exit of this job.
>
> 6. running randwrite fio with io_uring ioengine (and
> --iodepth_batch=32 --runtime=30) didn't finish after 30 seconds and
> got stuck for 300 seconds (the fio jobs required "kill -9 fio" to
> release the refcounts on nvme_core) - Not expected !!
>
> debug print: fio: job 'task_nvme0n1' (state=5) hasn't exited in 300
> seconds, it appears to be stuck. Doing forceful exit of this job.
>
> Any idea what could cause these unexpected scenarios? At least
> unexpected for me :)

Not sure about all the scenarios. I believe queue_rqs should be called
any time we finish a plugged list of requests, as long as the requests
all come from the same request_queue and the plug isn't being flushed
from io_schedule(). A rough sketch of the condition I have in mind is
at the bottom of this mail.

The stuck fio jobs might be due to a lost request, which is what this
series should address. It would be unusual to see such an error happen
in normal operation, though; I had to synthesize errors to verify the
bug and the fix.

In any case, I'll run more multi-namespace tests to see if I can find
any other issues with shared tags.
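For illustration, here is a minimal C sketch of the condition described
above. This is simplified and not the actual block layer code; the
struct and function names (plug_sketch, can_use_queue_rqs) are made up
for this example:

#include <stdbool.h>

/* Stand-in for the per-task plug state the block layer tracks. */
struct plug_sketch {
        unsigned short rq_count;        /* number of plugged requests */
        bool multiple_queues;           /* requests span request_queues */
        bool has_elevator;              /* an I/O scheduler is attached */
};

/* from_schedule is true when the flush comes from io_schedule() */
static bool can_use_queue_rqs(const struct plug_sketch *plug,
                              bool from_schedule)
{
        /*
         * The whole batch can go to ->queue_rqs only when every plugged
         * request targets the same request_queue and this is a normal
         * flush, not one triggered by io_schedule().  (I believe an
         * attached I/O scheduler also disqualifies the batch path.)
         */
        return plug->rq_count > 0 &&
               !plug->multiple_queues &&
               !plug->has_elevator &&
               !from_schedule;
}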