On 12/17/24 7:04 AM, Damien Le Moal wrote:
On 2024/12/16 20:15, Christoph Hellwig wrote:
On Mon, Dec 16, 2024 at 11:24:24AM -0800, Bart Van Assche wrote:
Hi Damien,
If 'qd=1' is changed into 'qd=2' in tests/zbd/012 then this test fails
against all kernel versions I tried, including kernel version 6.9. Do
you agree that this test should pass?
That test case is not very well documented and you're not explaining
how it fails.
As far as I can tell the test uses fio to write to a SCSI debug device
using the zbd randwrite mode and the io_uring I/O engine of fio.
Of note about io_uring: if writes are submitted from multiple jobs to multiple
queues, then you will see unaligned write errors, but the same test with libaio
will work just fine. The reason is that IO submission in the fio io_uring engine
only adds write requests to the IO rings, which the kernel ring handling then
submits later. But at that time, the ordering information is lost and if
the rings are processed in the wrong order, you'll get unaligned write errors.
io_uring is thus unsafe for writes to zoned block devices. Trying to do
something about it has been on my to-do list for a while. Been too busy to do
anything yet. The best solution is of course zone append. If the user wants to
use regular writes, then it had better tightly control its own write IO issuing
to be QD=1 per zone, as relying on zone write plugging will not be enough.
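
To make the failure mode above concrete, here is a minimal sketch of a
sequential-write-required zone. It is a toy model, not kernel or fio code:
the Zone class, UnalignedWriteError and replay() are hypothetical names made
up for this illustration, and the zone and write sizes are arbitrary. The
only rule it models is that a regular write must start exactly at the zone
write pointer.

```python
class UnalignedWriteError(Exception):
    """Raised when a write does not start at the zone write pointer."""


class Zone:
    """Toy model of a sequential-write-required zone (hypothetical)."""

    def __init__(self, start, size):
        self.start = start
        self.size = size
        self.wp = start  # write pointer: next offset the zone accepts

    def write(self, offset, length):
        # A regular write is only accepted at the current write pointer.
        if offset != self.wp:
            raise UnalignedWriteError(
                f"write at {offset}, write pointer at {self.wp}")
        if offset + length > self.start + self.size:
            raise ValueError("write crosses the zone boundary")
        self.wp += length


def replay(zone, writes):
    """Apply (offset, length) writes in the order the zone sees them."""
    for offset, length in writes:
        zone.write(offset, length)
    return zone.wp


# Two writes issued back to back by the application (QD=2 on one zone).
writes = [(0, 4096), (4096, 4096)]

# Submission order preserved: both writes land at the write pointer.
in_order_wp = replay(Zone(0, 65536), writes)

# Same writes, but the order was lost between issue and submission.
try:
    replay(Zone(0, 65536), list(reversed(writes)))
    reordered_ok = True
except UnalignedWriteError:
    reordered_ok = False
```

With QD=1 per zone there is never a second write in flight that could be
reordered, which is why that restriction sidesteps the problem in this model.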
We've never guaranteed ordering of multiple outstanding asynchronous user
writes on zoned block devices, so from that point of view a "failure" due
to write pointer violations when changing the test to use QD=2 is
entirely expected.
Not for libaio since the io_submit() call goes down to submit_bio(). So if the
issuing user application does the right synchronization (which fio does), libaio
is safe as we are guaranteed that the writes are placed in order in the zone
write plugs. As explained above, that is not the case with io_uring though.
Thanks Damien for having shared this information. After having switched
to libaio, the higher queue depth test cases pass with Jens'
block-for-next branch. See also
https://github.com/osandov/blktests/pull/156.
Bart.