Re: [PATCH liburing] man/io_uring_enter.2: document IORING_OP_SEND_ZC

On 9/5/22 4:09 PM, Pavel Begunkov wrote:
> Signed-off-by: Pavel Begunkov <asml.silence@xxxxxxxxx>
> ---
> 
> Doc writing is not my strongest side, comments are welcome.
> 
>  man/io_uring_enter.2 | 44 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 44 insertions(+)
> 
> diff --git a/man/io_uring_enter.2 b/man/io_uring_enter.2
> index 1a9311e..7fd275c 100644
> --- a/man/io_uring_enter.2
> +++ b/man/io_uring_enter.2
> @@ -1059,6 +1059,50 @@ value being passed in. This request type can be used to either just wake or
>  interrupt anyone waiting for completions on the target ring, or it can be used
>  to pass messages via the two fields. Available since 5.18.
>  
> +.TP
> +.B IORING_OP_SEND_ZC
> +Issue the zerocopy equivalent of a
> +.BR send(2)
> +system call. It's similar to IORING_OP_SEND, but when the
> +.I flags
> +field of the
> +.I "struct io_uring_cqe"
> +contains IORING_CQE_F_MORE, the userspace should expect a second cqe, a.k.a.
> +notification, and until then it should not modify data in the buffer. The
> +notification will have the same
> +.I user_data
> +as the first one and its
> +.I flags
> +field will contain the
> +.I IORING_CQE_F_NOTIF
> +flag. It's guaranteed that IORING_CQE_F_MORE is set IFF the result is
> +non-negative.
> +.I fd
> +must be set to the socket file descriptor,
> +.I addr
> +must contain a pointer to the buffer,
> +.I len
> +denotes the length of the buffer to send, and
> +.I msg_flags
> +holds the flags associated with the system call. When
> +.I addr2
> +is non-zero it points to the address of the target with
> +.I addr_len
> +specifying its size, turning the request into a
> +.BR sendto(2)
> +system call equivalent.
> +
> +.B IORING_OP_SEND_ZC
> +tries to avoid making intermediate data copies but still may fall back to
> +copying. Furthermore, zerocopy is not always faster, especially when the
> +per-request payload size is small. The two completion model is needed because
> +the kernel might hold on to buffers for a long time, e.g. waiting for a TCP ACK,
> +and having a separate cqe for request completions allows the userspace to push
> +more data without extra delays. Note, notifications don't guarantee that the
> +data has been or will ever be received by the other endpoint.

I'd probably reorder this a bit to introduce it with the fact that
it's like SEND, but zero-copy. Then explain the mechanics of how MORE is
set for the two-stage completion notification if zc is done. I can shuffle
it around a bit if you want me to - just let me know!
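
For reference, the two-stage completion described in the patch text could
be tracked on the application side roughly like this. It is only a sketch
under assumptions: struct cqe is a local stand-in for the three fields of
struct io_uring_cqe, struct zc_send and zc_send_handle_cqe() are
hypothetical names, and the flag values are copied from <linux/io_uring.h>
so it builds without kernel headers. The caller is assumed to set
buffer_busy to true when it submits the request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits as defined in <linux/io_uring.h>; redefined locally so the
 * sketch does not depend on kernel headers being installed. */
#define CQE_F_MORE  (1U << 1)   /* IORING_CQE_F_MORE  */
#define CQE_F_NOTIF (1U << 3)   /* IORING_CQE_F_NOTIF */

/* Local stand-in for the fields of struct io_uring_cqe. */
struct cqe {
	uint64_t user_data;
	int32_t  res;
	uint32_t flags;
};

/* Hypothetical per-request state kept by the application. */
struct zc_send {
	bool    result_seen;  /* first (result) cqe has arrived */
	bool    buffer_busy;  /* kernel may still reference the buffer */
	int32_t res;          /* send result taken from the first cqe */
};

/*
 * Feed every cqe that carries this request's user_data into this function.
 * Returns true once the buffer may be modified or reused.
 */
static bool zc_send_handle_cqe(struct zc_send *s, const struct cqe *c)
{
	assert(s != NULL && c != NULL);

	if (c->flags & CQE_F_NOTIF) {
		/* Second cqe: the kernel no longer needs the buffer. */
		s->buffer_busy = false;
	} else {
		/* First cqe: carries the send(2)-style result. */
		s->result_seen = true;
		s->res = c->res;
		/* Without F_MORE no notification will follow. */
		s->buffer_busy = (c->flags & CQE_F_MORE) != 0;
	}
	return s->result_seen && !s->buffer_busy;
}
```

With this, a successful send returns false on the first cqe (F_MORE set)
and true on the notification, while a failed send without F_MORE returns
true right away.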

> +Available since 5.20.

Should be 6.0 here.

-- 
Jens Axboe


