On 8/1/19 3:21 AM, Damien Le Moal wrote:
The recent fix to properly handle IOCB_NOWAIT for async O_DIRECT IO
(patch 6a43074e2f46) introduced two problems with BIO fragment handling
for direct IOs:
1) The dio size processed is claculated by incrementing the ret variable
by the size of the bio fragment issued for the dio. However, this size
is obtained directly from bio->bi_iter.bi_size AFTER the bio submission
which may result in referencing the bi_size value after the bio
completed, resulting in an incorrect value use.
2) The ret variable is not incremented by the size of the last bio
fragment issued for the bio, leading to an invalid IO size being
returned to the user.
Fix both problem by using dio->size (which is incremented before the bio
submission) to update the value of ret after bio submissions, including
for the last bio fragment issued.
Fixes: 6a43074e2f46 ("block: properly handle IOCB_NOWAIT for async O_DIRECT IO")
Reported-by: Masato Suzuki <masato.suzuki@xxxxxxx>
Signed-off-by: Damien Le Moal <damien.lemoal@xxxxxxx>
---
fs/block_dev.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/block_dev.c b/fs/block_dev.c
index c2a85b587922..75cc7f424b3a 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -439,6 +439,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
ret = -EAGAIN;
goto error;
}
+ ret = dio->size;
if (polled)
WRITE_ONCE(iocb->ki_cookie, qc);
@@ -465,7 +466,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
ret = -EAGAIN;
goto error;
}
- ret += bio->bi_iter.bi_size;
+ ret = dio->size;
bio = bio_alloc(gfp, nr_pages);
if (!bio) {
Hi Damien,
Had you verified this patch with blktests and KASAN enabled? I think the
above patch introduced the following KASAN complaint:
==================================================================
BUG: KASAN: use-after-free in blkdev_direct_IO+0x9d5/0xa20
Read of size 8 at addr ffff888114d872c8 by task fio/12456
CPU: 3 PID: 12456 Comm: fio Not tainted 5.3.0-rc3-dbg+ #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
Call Trace:
dump_stack+0x86/0xca
print_address_description+0x74/0x32d
__kasan_report.cold.6+0x1b/0x36
kasan_report+0x12/0x17
__asan_load8+0x54/0x90
blkdev_direct_IO+0x9d5/0xa20
generic_file_direct_write+0x10e/0x200
__generic_file_write_iter+0x120/0x2a0
blkdev_write_iter+0x13c/0x220
aio_write+0x1bb/0x2d0
io_submit_one+0xe30/0x18e0
__x64_sys_io_submit+0x117/0x350
do_syscall_64+0x71/0x270
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Allocated by task 12456:
save_stack+0x21/0x90
__kasan_kmalloc.constprop.9+0xc7/0xd0
kasan_slab_alloc+0x11/0x20
kmem_cache_alloc+0xf9/0x3a0
mempool_alloc_slab+0x15/0x20
mempool_alloc+0xef/0x260
bio_alloc_bioset+0x148/0x310
blkdev_direct_IO+0x223/0xa20
generic_file_direct_write+0x10e/0x200
__generic_file_write_iter+0x120/0x2a0
blkdev_write_iter+0x13c/0x220
aio_write+0x1bb/0x2d0
io_submit_one+0xe30/0x18e0
__x64_sys_io_submit+0x117/0x350
do_syscall_64+0x71/0x270
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 26:
save_stack+0x21/0x90
__kasan_slab_free+0x139/0x190
kasan_slab_free+0xe/0x10
slab_free_freelist_hook+0x67/0x1e0
kmem_cache_free+0x102/0x320
mempool_free_slab+0x17/0x20
mempool_free+0x63/0x160
bio_free+0x144/0x210
bio_put+0x59/0x60
blkdev_bio_end_io+0x172/0x1f0
bio_endio+0x2b8/0x4c0
blk_update_request+0x180/0x4f0
end_clone_bio+0xbb/0xd0
bio_endio+0x2b8/0x4c0
blk_update_request+0x180/0x4f0
blk_mq_end_request+0x33/0x200
nvme_complete_rq+0xff/0x420 [nvme_core]
nvme_rdma_complete_rq+0xba/0x100 [nvme_rdma]
blk_mq_complete_request+0x174/0x250
nvme_rdma_recv_done+0x330/0x5a0 [nvme_rdma]
__ib_process_cq+0x8c/0x100 [ib_core]
ib_poll_handler+0x47/0xb0 [ib_core]
irq_poll_softirq+0xf5/0x270
__do_softirq+0x128/0x60f
(gdb) list *(blkdev_direct_IO+0x9d5)
0x49c5 is in blkdev_direct_IO (fs/block_dev.c:442).
437 if (qc == BLK_QC_T_EAGAIN) {
438 if (!ret)
439 ret = -EAGAIN;
440 goto error;
441 }
442 ret = dio->size;
443
444 if (polled)
445 WRITE_ONCE(iocb->ki_cookie, qc);
446 break;