Re: Question about ublk and NEED_GET_DATA

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Apr 19, 2023 at 12:03:40AM +0000, Harris, James R wrote:
> Hi,
> 
> I’m working on adding NEED_GET_DATA support to the SPDK ublk server, to avoid allocating I/O buffers until they are actually needed.
> 
> It is very clear how this works for write commands with NEED_GET_DATA.  We wait to allocate the buffer until we get UBLK_IO_RES_NEED_GET_DATA completion and submit again using UBLK_IO_NEED_GET_DATA.  After we get the UBLK_IO_RES_OK completion from ublk, we submit the block request to the SPDK bdev layer.  After it completes, we submit using UBLK_IO_COMMIT_AND_FETCH_REQ and can free the I/O buffer because the data has been committed.
> 
> But how does this work for the read path?  On a read, I can wait to allocate the buffer until I get the UBLK_IO_RES_OK completion.  But after the read operation is completed and SPDK submits the UBLK_IO_COMMIT_AND_FETCH_REQ, how do I know when ublk has finished copying data out of the buffer so that I can reuse that buffer?
> 
> I’m sure I’m missing something obvious, if anyone can provide a pointer I would appreciate it.

Yeah, it is one known issue.

But the buffer is guaranteed to be ready for reuse after io_uring_enter()
returns.

Also the patch[1] exposes read/write interface to /dev/ublkcN,
then NEED_GET_DATA becomes not necessary any more. The initial
motivation is for zero-copy, but it can be used for non-zc too, such as
don't do any copy in driver side, and simply let userspace cover
everything:

1) ublk server gets one io command, allocate buffer and handle this
command

2) if it is WRITE command, read buffer data via read() from
/dev/ublkcN, then handle the write by this buffer 

3) if it is READ command, handle the read using the allocated buffer,
and after the data is ready in this buffer, write data in buffer to
/dev/ublkcN.

This approach could be cleaner for both sides, but add one extra cost
of read/write /dev/ublkcN in userspace, and it can be done via io_uring
too.

What do you think of this approach?

[1] https://lore.kernel.org/linux-block/20230330113630.1388860-16-ming.lei@xxxxxxxxxx/


Thanks,
Ming




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux