On Mon, Nov 04, 2024 at 04:38:08PM +0000, Pavel Begunkov wrote:
> On 11/4/24 13:35, Ming Lei wrote:
> > On Mon, Nov 04, 2024 at 01:24:09PM +0000, Pavel Begunkov wrote:
> ...
> > > > > > > any private data, then the buffer should've already been initialised by
> > > > > > > the time it was lease. Initialised is in the sense that it contains no
> > > > > >
> > > > > > For block IO the practice is to zero the remainder after short read, please
> > > > > > see example of loop, lo_complete_rq() & lo_read_simple().
> > > > >
> > > > > It's more important for me to understand what it tries to fix, whether
> > > > > we can leak kernel data without the patch, and whether it can be exploited
> > > > > even with the change. We can then decide if it's nicer to zero or not.
> > > > >
> > > > > I can also ask it in a different way, can you tell is there some security
> > > > > concern if there is no zeroing? And if so, can you describe what's the exact
> > > > > way it can be triggered?
> > > >
> > > > Firstly the zeroing follows loop's handling for short read
> > >
> > > > Secondly, if the remainder part of one page cache buffer isn't zeroed, it might
> > > > be leaked to userspace via another read() or mmap() on same page.
> > >
> > > What kind of data this leaked buffer can contain? Is it uninitialised
> > > kernel memory like a freshly kmalloc'ed chunk would have? Or is it private
> > > data of some user process?
> >
> > Yes, the page may be uninitialized, and might contain random kernel data.
>
> I see now, the user is obviously untrusted, but you're saying the ublk
> server user space is trusted enough to see that kind of kernel data.

The ublk server isn't allowed to read from an uninitialized page either,
which is why the `dir` field is added to `io_uring_kernel_buf`. For READ IO,
the buffer is write-only, and I will extend io_mapped_ubuf to cover it as
suggested by Jens.

> Sounds like a security concern, is there a precedent allowing such?

User-emulated storage is in the same situation. Take virtio-blk: in
virtio_queue_rq(), virtblk_add_req() is called to add the READ request's
sglist to the virt-queue, and then qemu is woken up to handle the read IO.
qemu retrieves the guest sg list from the virt-queue and makes sure DATA is
filled into the sg buffer.

Another example is null_blk without memback, which does nop for any READ IO.

> Is it what ublk normally does even without this zero copy proposal?

Without zero copy, userspace provides one IO buffer for each io command. The
ublk server reads data into this IO buffer first, then notifies the ublk
driver via uring_cmd to complete the read IO, and the ublk driver copies the
data from the ublk server IO buffer to the original READ request buffer.

Thanks,
Ming
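
For illustration only, here is a minimal userspace sketch of the "zero the
remainder after a short read" practice mentioned earlier in the thread. It is
not the loop driver's code: short_read() and zero_remainder() are hypothetical
helpers, and the 0xAA fill merely stands in for whatever stale data an
uninitialized page might hold.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical backend read: copies at most `avail` bytes from `src`
 * into `buf` and returns how many bytes were actually filled. When
 * avail < len this models a short read.
 */
static size_t short_read(unsigned char *buf, size_t len,
                         const unsigned char *src, size_t avail)
{
    size_t n = avail < len ? avail : len;

    memcpy(buf, src, n);
    return n;
}

/*
 * Hypothetical helper: after a read that filled only `done` of `len`
 * bytes, clear the tail so stale buffer contents are never exposed to
 * a later reader of the same buffer.
 */
static void zero_remainder(unsigned char *buf, size_t len, size_t done)
{
    assert(done <= len);
    if (done < len)
        memset(buf + done, 0, len - done);
}
```

A caller would run short_read() on the request buffer and then pass the
returned byte count to zero_remainder() before completing the request, so
that only the bytes actually read plus zeroes are ever visible.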