On 7/23/19 2:07 AM, Stefan Hajnoczi wrote: > Hi, > io_uring O_DIRECT writes can fail with EIO on ext4. Please see the > function graph trace from Linux 5.3.0-rc1 below for details. It was > produced with the following qemu-io command (using Aarushi's QEMU > patches from https://github.com/rooshm/qemu/commits/io_uring): > > $ qemu-io --cache=none --aio=io_uring --format=qcow2 -c 'writev -P 185 131072 65536' tests/qemu-iotests/scratch/test.qcow2 > > This issue is specific to ext4. XFS and the underlying LVM logical > volume both work. > > The storage configuration is an LVM logical volume (device-mapper linear > target), on top of LUKS, on top of a SATA disk. The logical volume's > request_queue does not have mq_ops and this causes > generic_make_request_checks() to fail: > > if ((bio->bi_opf & REQ_NOWAIT) && !queue_is_mq(q)) > goto not_supported; > > I guess this could be worked around by deferring the request to the > io_uring work queue to avoid REQ_NOWAIT. But XFS handles this fine so > how can io_uring.c detect this case cleanly or is there a bug in ext4? I actually think it's XFS that's broken here, it's not passing down the IOCB_NOWAIT -> IOMAP_NOWAIT -> REQ_NOWAIT. This means we lose that important request bit, and we just block instead of triggering the not_supported case. Outside of that, that case needs similar treatment to what I did for the EAGAIN case here: http://git.kernel.dk/cgit/linux-block/commit/?h=for-linus&id=893a1c97205a3ece0cbb3f571a3b972080f3b4c7 It was a big mistake to pass back these values in an async fashion, and it also means that some accounting in other drivers are broken as we can get completions without the bio actually being submitted. This was fixed for just the EAGAIN case here: http://git.kernel.dk/cgit/linux-block/commit/?h=for-linus&id=c9b3007feca018d3f7061f5d5a14cb00766ffe9b but that's still broken for EOPNOTSUPP. tldr is that we should pass these errors back sync, and it was a mistake to EVER try and do that through ->bi_end_io(). That behavior was introduced by: http://git.kernel.dk/cgit/linux-block/commit/?h=for-linus&id=03a07c92a9ed9938d828ca7f1d11b8bc63a7bb89 I'll add a patch on top of my for-linus branch that makes us handle the EOPNOTSUPP case similarly. You are right that in those cases we should just punt to the async worker internally. -- Jens Axboe