[RFC] block: dio: immediately fail when count < logical block size

Luis Chamberlain <mcgrof@xxxxxxxxxx> · Tue, 2 May 2023 02:00:18 -0700

When using direct IO of say 4k on a 32k physical block size device
we crash. The amount of data requested must match the minimum IO supported
by the device but instead we take it for a ride right now and try to fail
later after checking alignments.

Use the logical block size to ensure the data passed on matches our minimum
supported.

Without this we end up in a crash below:

kernel BUG at lib/iov_iter.c:999!
invalid opcode: 0000 [#1] PREEMPT SMP PTI
CPU: 4 PID: 949 Comm: fio Not tainted 6.3.0-large-block-20230426-dirty #28
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-debian-1.16.0-5        04/01/2014                                                                              [   43.513057] RIP: 0010:iov_iter_revert.part.0+0x16e/0x170
Code: f9 40 a2 63 af 74 07 03 56 08 89 d8 29 d0 89 45 08 44 89 6d 20 <etc>
RSP: 0018:ffffaa52006cfc60 EFLAGS: 00010246
RAX: 0000000000000016 RBX: 0000000000000016 RCX: 0000000000000000
RDX: 0000000000000004 RSI: 0000000000000006 RDI: ffffaa52006cfd08
RBP: ffffaa52006cfd08 R08: 0000000000000000 R09: ffffaa52006cfb40
R10: 0000000000000003 R11: ffffffffafcc21e8 R12: 0000000000004000
R13: 0000000000003fea R14: ffff9de3d7565e00 R15: ffff9de3c1f68600
FS:  00007f8bfe726c40(0000) GS:ffff9de43bd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8bf5eadd68 CR3: 0000000102c76001 CR4: 0000000000770ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <TASK>
 blkdev_direct_write+0xf0/0x160
 blkdev_write_iter+0x11b/0x230
 io_write+0x10c/0x420
 ? kmem_cache_alloc_bulk+0x2a1/0x410
 ? fget+0x79/0xb0
 io_issue_sqe+0x60/0x3b0
 ? io_prep_rw+0x5a/0x190
 io_submit_sqes+0x1e6/0x640
 __do_sys_io_uring_enter+0x54c/0xb90
 ? handle_mm_fault+0x9a/0x340
 ? preempt_count_add+0x47/0xa0
 ? up_read+0x37/0x70
 ? do_user_addr_fault+0x27c/0x780
 do_syscall_64+0x37/0x90
 entry_SYSCALL_64_after_hw

The issue is we end up calling iov_iter_revert() at the end of
blkdev_direct_write() due to the writes not being valid and
being unaligned. We can fail twice, for on __blkdev_direct_IO_simple()
and later on __blkdev_direct_IO_async().

Instead of allowing such writes through and letting them in for
an attempt to ride, kill them early and fail early.

Signed-off-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>
---

But this can't be right, this should mean other failed unaligned
writes were possibly doing the wrong accounting also with
iov_iter_revert(). So this patch can't be right, right?

 block/fops.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/block/fops.c b/block/fops.c
index 524b8a828aad..c03c1bafcb1b 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -583,9 +583,14 @@ static int blkdev_close(struct inode *inode, struct file *filp)
 static ssize_t
 blkdev_direct_write(struct kiocb *iocb, struct iov_iter *from)
 {
+	struct file *file = iocb->ki_filp;
+	struct block_device *bdev = file->private_data;
 	size_t count = iov_iter_count(from);
 	ssize_t written;
 
+	if (count < bdev_logical_block_size(bdev))
+		return -EINVAL;
+
 	written = kiocb_invalidate_pages(iocb, count);
 	if (written) {
 		if (written == -EBUSY)
-- 
2.39.2