On 12/17/24 1:59 PM, Damien Le Moal wrote:
> On 2024/12/17 11:58, Jens Axboe wrote:
>> On 12/17/24 12:54 PM, Jens Axboe wrote:
>>> io_uring does support ordering writes - not because of zoning, but to
>>> avoid buffered writes being spread over a bunch of threads and hence
>>> just hammering the inode mutex rather than doing actual useful work. You
>>> could potentially use that. Then all pending writes for that inode would
>>> be ordered, even if punted to io-wq.
>>
>> See io_uring/io_uring.c:io_prep_async_work(), which is called when an IO
>> is added for io-wq execution; io_wq_hash_work() makes sure it'll be
>> ordered. However, this will still not work if you're driving beyond the
>> limit of the device queue depth, or if you're doing IOs that may
>> spuriously trigger -EAGAIN, as you can still have two issuers - the
>> task itself submitting IO, and the one io-wq worker tasked with doing
>> blocking writes on this zoned device.
>
> Thanks for the pointer. Will have a look. It may be as simple as
> always using the io-wq worker for zone writes and have these ordered
> (__WQ_ORDERED). Maybe.

Right, that should work if you force everything to be served by io-wq
and ensure it's hashed.

-- 
Jens Axboe
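
For illustration, the serialization that hashing provides can be modelled in userspace roughly as below. This is only a toy sketch, not the io-wq code: every name in it (toy_wq, toy_work, toy_enqueue, toy_grab, toy_complete, NR_BUCKETS) is hypothetical, and it assumes the simplified rule that work hashed to the same bucket (standing in for the same inode) is queued FIFO and that only one worker may run work from a given bucket at a time.

```c
#include <assert.h>
#include <stddef.h>

#define NR_BUCKETS 8	/* hypothetical size, not the kernel's value */

/* One queued write; inode_key stands in for the inode being written. */
struct toy_work {
	unsigned long inode_key;
	int seq;			/* submission order, for inspection */
	struct toy_work *next;
};

/* Per-bucket FIFO plus a "busy" flag marking a bucket owned by a worker. */
struct toy_wq {
	struct toy_work *head[NR_BUCKETS];
	struct toy_work *tail[NR_BUCKETS];
	int busy[NR_BUCKETS];
};

static unsigned int toy_hash(unsigned long key)
{
	return (unsigned int)(key % NR_BUCKETS);
}

/* Append work to the FIFO of the bucket its inode hashes to. */
static void toy_enqueue(struct toy_wq *wq, struct toy_work *w)
{
	unsigned int b = toy_hash(w->inode_key);

	w->next = NULL;
	if (wq->tail[b])
		wq->tail[b]->next = w;
	else
		wq->head[b] = w;
	wq->tail[b] = w;
}

/*
 * Hand out the next runnable item. Buckets that are already being
 * served are skipped, so two items for the same inode never run
 * concurrently and always run in submission order.
 */
static struct toy_work *toy_grab(struct toy_wq *wq)
{
	for (unsigned int b = 0; b < NR_BUCKETS; b++) {
		struct toy_work *w = wq->head[b];

		if (wq->busy[b] || !w)
			continue;
		wq->head[b] = w->next;
		if (!wq->head[b])
			wq->tail[b] = NULL;
		wq->busy[b] = 1;
		return w;
	}
	return NULL;
}

/* Mark the item done, releasing its bucket for the next queued item. */
static void toy_complete(struct toy_wq *wq, struct toy_work *w)
{
	unsigned int b = toy_hash(w->inode_key);

	assert(wq->busy[b]);
	wq->busy[b] = 0;
}
```

With two writes queued for the same key and one for a different key, toy_grab() hands out the first write and the unrelated one, then returns NULL until toy_complete() is called on the first write. Note the model only covers work that goes through the queue: a second issuer that bypasses it (the submitting task issuing inline) is exactly the case that breaks the ordering, hence forcing everything through io-wq.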