On 12/17/24 12:37 PM, Damien Le Moal wrote:
> On 2024/12/17 11:33, Jens Axboe wrote:
>> On 12/17/24 12:28 PM, Damien Le Moal wrote:
>>> On 2024/12/17 11:25, Bart Van Assche wrote:
>>>> On 12/17/24 11:20 AM, Damien Le Moal wrote:
>>>>> For a simple fio "--zonemode=zbd --rw=randwrite --numjobs=X" for X > 1
>>>>
>>>> Please note that this e-mail thread started by discussing a testcase
>>>> with --numjobs=1.
>>>
>>> I missed that. Then io_uring should be fine and behave the same way as libaio.
>>> Since it seems to not be working, we may have a bug beyond the recently fixed
>>> REQ_NOWAIT handling I think. That needs to be looked at.
>>
>> Inflight collision, yes that's what I was getting at - there seems to be
>> another bug here, and misunderstandings on how io_uring works is causing
>> it to be ignored and/or not understood.
>
> OK. Will dig into this because I definitely do not fully understand where the
> issue is.

As per earlier replies, it's either -EAGAIN being mishandled, OR it's
driving more IOs than the device supports. For the latter case, io_uring
will NOT block, but libaio will. This means that libaio will sit there
waiting on previous IO to complete, and then issue the next one.
io_uring will punt that IO to io-wq, and then all bets are off in terms
of ordering if you have multiple of these threads blocking on tags and
doing issues.

The test case looks like it's freezing the queue, which means you don't
even need more than QD number of IOs inflight. When that happens, guess
what libaio does? That's right, it blocks waiting on the queue, and
io_uring will not block but rather punt those IOs to io-wq. If you have
QD=2, then you now have 2 threads doing IO submission, and either of
them could wake and submit before the other.

Like Christoph alluded to in his first reply, driving more than 1
request inflight is going to be trouble, potentially.

-- 
Jens Axboe
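
The ordering hazard described above (QD=2, two submitters, either of
which may go first) can be sketched with a toy model. This is an
illustrative Python simulation only, not kernel, io_uring, or fio code.
The Zone class, the BLOCK size, and the replay() helper are hypothetical
names introduced for the sketch, and it assumes a zone that only accepts
a write starting exactly at its write pointer.

```python
from itertools import permutations

BLOCK = 4096  # hypothetical write size in bytes


class Zone:
    """Toy model of a sequential-write-required zone (not a kernel API)."""

    def __init__(self):
        self.write_pointer = 0

    def write(self, offset, length):
        # Assumption: a write is accepted only if it starts at the write pointer.
        if offset != self.write_pointer:
            return False
        self.write_pointer += length
        return True


def replay(submission_order):
    """Apply writes to a fresh zone in the given order; return per-write success."""
    zone = Zone()
    return [zone.write(offset, BLOCK) for offset in submission_order]


# Two writes queued in order by the application (QD=2).
queued = [0, BLOCK]

# Blocking submitter: a single thread issues them, so the queued order is kept.
in_order = replay(queued)

# Two independent workers: either may wake and submit first, so every
# permutation of the queued writes is a possible submission order.
outcomes = {order: replay(order) for order in permutations(queued)}

for order, results in outcomes.items():
    print(order, results)
```

In this model only the submission order that matches the queued order
has both writes accepted. In the swapped order the write at offset BLOCK
arrives while the write pointer is still 0 and is rejected.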