On 2024/12/17 11:41, Jens Axboe wrote:
> On 12/17/24 12:37 PM, Damien Le Moal wrote:
>> On 2024/12/17 11:33, Jens Axboe wrote:
>>> On 12/17/24 12:28 PM, Damien Le Moal wrote:
>>>> On 2024/12/17 11:25, Bart Van Assche wrote:
>>>>> On 12/17/24 11:20 AM, Damien Le Moal wrote:
>>>>>> For a simple fio "--zonemode=zbd --rw=randwrite --numjobs=X" for X > 1
>>>>>
>>>>> Please note that this e-mail thread started by discussing a testcase
>>>>> with --numjobs=1.
>>>>
>>>> I missed that. Then io_uring should be fine and behave the same way as libaio.
>>>> Since it seems to not be working, we may have a bug beyond the recently fixed
>>>> REQ_NOWAIT handling I think. That needs to be looked at.
>>>
>>> Inflight collision, yes that's what I was getting at - there seems to be
>>> another bug here, and misunderstandings on how io_uring works is causing
>>> it to be ignored and/or not understood.
>>
>> OK. Will dig into this because I definitely do not fully understand where the
>> issue is.
>
> As per earlier replies, it's either -EAGAIN being mishandled, OR it's
> driving more IOs than the device supports. For the latter case, io_uring
> will NOT block, but libaio will. This means that libaio will sit there
> waiting on previous IO to complete, and then issue the next one.
> io_uring will punt that IO to io-wq, and then all bets are off in terms
> of ordering if you have multiple of these threads blocking on tags and
> doing issues. The test case looks like it's freezing the queue, which
> means you don't even need more than QD number of IOs inflight. When that
> happens, guess what libaio does? That's right, it blocks waiting on the
> queue, and io_uring will not block but rather punt those IOs to io-wq.
> If you have QD=2, then you now have 2 threads doing IO submission, and
> either of them could wake and submit before the other.
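
The ordering hazard in the quoted analysis can be illustrated with a toy model. This is only a sketch in Python, not io_uring, libaio, or kernel code: `Zone`, `run`, and the write sizes are hypothetical names and values chosen for illustration. It assumes a sequential-write zone that accepts a write only at its write pointer, and it models "either worker may wake first" by enumerating every possible submission order.

```python
from itertools import permutations


class Zone:
    """Toy sequential-write zone: a write is accepted only at the write pointer."""

    def __init__(self):
        self.wp = 0

    def write(self, offset, length):
        if offset != self.wp:
            return False  # unaligned write; the model treats it as a failure
        self.wp += length
        return True


def run(order, writes):
    """Issue `writes` in the given order; True only if every write succeeds."""
    zone = Zone()
    return all(zone.write(*writes[i]) for i in order)


# Two sequential writes prepared by the application for one zone (QD=2).
writes = [(0, 4), (4, 4)]

# Blocking submitter (libaio-like in this model): one thread, program order.
blocking_ok = run((0, 1), writes)

# Punted submission (io-wq-like in this model): one worker per write, and
# either worker may wake first, so every permutation is a possible order.
punted_results = {
    order: run(order, writes) for order in permutations(range(len(writes)))
}
```

In this model the blocking submitter always succeeds, while the punted case succeeds only when the workers happen to wake in program order; the `(1, 0)` order fails because the second write reaches the zone before the write pointer has advanced.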

That sounds like a very good analysis :)

> Like Christoph alluded to in his first reply, driving more than 1
> request inflight is going to be trouble, potentially.

Yes. I think the confusion is with which "inflight" we are talking about. Between
the block layer and the device, zone write plugging prevents more than 1 write
per zone, so things are OK (modulo bugs...). But between the application and the
block layer, that is not well managed and as your analysis above shows, bad
things can happen. I will look into it to see if we can do something sensible.
If not, we should at least warn the user, or just outright fail using io_uring
with zoned block devices to avoid bad surprises.

-- 
Damien Le Moal
Western Digital Research
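
The distinction between the two meanings of "inflight" can be sketched with a toy model. This is a simplified illustration in Python, not the kernel's zone write plugging implementation: `ZoneWritePlug`, `drain`, and the offsets are hypothetical. It assumes the plug holds writes in the order they arrive at the block layer and sends at most one write per zone to the device at a time.

```python
from collections import deque


class ZoneWritePlug:
    """Toy per-zone plug: at most one write is in flight to the device."""

    def __init__(self):
        self.wp = 0             # device write pointer for this zone
        self.pending = deque()  # writes held back, in arrival order
        self.inflight = None    # the single write currently at the device
        self.errors = []        # offsets of writes that missed the write pointer

    def submit(self, offset, length):
        # Called in whatever order writes arrive at the block layer.
        self.pending.append((offset, length))
        self._issue_next()

    def _issue_next(self):
        # Issue the oldest pending write only if nothing is in flight.
        if self.inflight is None and self.pending:
            self.inflight = self.pending.popleft()

    def complete(self):
        # The device finishes the in-flight write, then the next one is issued.
        offset, length = self.inflight
        if offset == self.wp:
            self.wp += length
        else:
            self.errors.append(offset)
        self.inflight = None
        self._issue_next()


def drain(arrival_order):
    """Submit all writes in arrival order, then complete them one by one."""
    plug = ZoneWritePlug()
    for write in arrival_order:
        plug.submit(*write)
    while plug.inflight is not None:
        plug.complete()
    return plug


in_order = drain([(0, 4), (4, 4), (8, 4)])
reordered = drain([(4, 4), (0, 4), (8, 4)])
```

In this model the plug keeps the device side safe when writes arrive in order, but it only preserves arrival order. If the application side has already reordered the writes before they reach the block layer, the plug issues them in that wrong order and the unaligned ones fail.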