On 2022/10/19 03:15, Jonathan Lemon wrote:
> This series is a RFC for io_uring/zctap.  This is an evolution of
> the earlier zctap work, re-targeted to use io_uring as the userspace
> API.  The current code is intended to provide a zero-copy RX path for
> upper-level networking protocols (aka TCP and UDP).  The current draft
> focuses on host-provided memory (not GPU memory).
>
> This RFC contains the upper-level core code required for operation,
> with the intent of soliciting feedback on the general API.  This does
> not contain the network driver side changes required for complete
> operation.  Also please note that as an RFC, there are some things
> which are incomplete or in need of rework.
>
> The intent is to use a network driver which provides header/data
> splitting, so the frame header (which is processed by the networking
> stack) does not reside in user memory.
>
> The code is successfully receiving a zero-copy TCP stream from a
> remote sender.  An RFC, the intent is to solicit feedback on the
> API and overall design.  The current code will also work with
> system pages, copying the data out to the application - this is
> intended as a fallback/testing path.
>
> Performance numbers are coming soon!
>
> High level description:
>
> The application allocates a frame backing store, and provides this
> to the kernel for use.  An interface queue is requested from the
> networking device, and incoming frames are deposited into the provided
> memory region.
>
> Responsibility for correctly steering incoming frames to the queue
> is outside the scope of this work - it is assumed that the user
> has set steering rules up separately.
>
> Incoming frames are sent up the stack as skb's and eventually
> land in the application's socket receive queue.  This differs
> from AF_XDP, which receives raw frames directly to userspace,
> without protocol processing.
>
> The RECV_ZC opcode then returns an iov[] style vector which points
> to the data in userspace memory.  When the application has completed
> processing of the data, the buffer is returned back to the kernel
> through a fill ring for reuse.
>
> Changelog:
>  v1: initial version
>  v2: Remove separate PROVIDE_REGION opcode, fold this functionality
>      into REGISTER_IFQ.  Remove page_pool hooks, as it appears the
>      page pool is currently incompatible with user-mapped memory.
>      Add io_zctap_buffers and network driver API.
>      .
>
> Jonathan Lemon (13):
>   io_uring: add zctap ifq definition
>   netdevice: add SETUP_ZCTAP to the netdev_bpf structure
>   io_uring: add register ifq opcode
>   io_uring: create a zctap region for a mapped buffer
>   io_uring: create page freelist for the ifq region
>   io_uring: Provide driver API for zctap packet buffers.
>   io_uring: Allocate the zctap buffers for the device
>   io_uring: Add zctap buffer get/put functions and refcounting.
>   skbuff: Introduce SKBFL_FIXED_FRAG and skb_fixed()
>   io_uring: Allocate a uarg for use by the ifq RX
>   io_uring: Define the zctap iov[] returned to the user.
>   io_uring: add OP_RECV_ZC command.
>   io_uring: Make remove_ifq_region a delayed work call
>
>  include/linux/io_uring.h       |  35 ++
>  include/linux/io_uring_types.h |  11 +
>  include/linux/netdevice.h      |   6 +
>  include/linux/skbuff.h         |  10 +-
>  include/uapi/linux/io_uring.h  |  24 +
>  io_uring/Makefile              |   3 +-
>  io_uring/io_uring.c            |   8 +
>  io_uring/kbuf.c                |  13 +
>  io_uring/kbuf.h                |   2 +
>  io_uring/net.c                 | 123 +++++
>  io_uring/opdef.c               |  15 +
>  io_uring/zctap.c               | 842 +++++++++++++++++++++++++++++++++
>  io_uring/zctap.h               |  16 +
>  13 files changed, 1106 insertions(+), 2 deletions(-)
>  create mode 100644 io_uring/zctap.c
>  create mode 100644 io_uring/zctap.h
>

Hi Jonathan,

We are interested in your work, too. I think the API is better than v1.

I have a question: is this patchset still incomplete? We'd like to know
how to split the message header and body using XDP with the io_uring
RECV_ZC opcode. Alternatively, could you please share a runnable demo
that works with your previous liburing patch?

Regards,
Zhang
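
For reference, the buffer lifecycle in the quoted high-level description
(the application provides a region, frames land in it, an iov[]-style
descriptor points at the data, and the buffer goes back through a fill
ring) can be modelled roughly as below. This is only a toy userspace
sketch: struct model_iov, struct fill_ring, model_rx(), model_recycle()
and the sizes are all hypothetical names invented for illustration, not
the zctap or io_uring API, and the memcpy() merely stands in for the NIC
placing payload into the region.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NBUFS     4     /* buffers in the region (made-up size) */
#define BUF_SIZE  2048  /* bytes per buffer (made-up size) */
#define RING_SIZE 8     /* fill ring entries, must be a power of two */

/* Hypothetical stand-in for one iov[]-style entry handed to the app. */
struct model_iov {
	uint32_t off;  /* offset of the payload inside the region */
	uint32_t len;  /* payload length in bytes */
};

/* Ring of free buffer indices: the app produces, the "kernel" consumes. */
struct fill_ring {
	uint32_t head;
	uint32_t tail;
	uint32_t ent[RING_SIZE];
};

static unsigned char region[NBUFS * BUF_SIZE]; /* application-owned memory */

static int ring_push(struct fill_ring *r, uint32_t idx)
{
	if (r->tail - r->head == RING_SIZE)
		return -1;                      /* ring full */
	r->ent[r->tail++ & (RING_SIZE - 1)] = idx;
	return 0;
}

static int ring_pop(struct fill_ring *r, uint32_t *idx)
{
	if (r->head == r->tail)
		return -1;                      /* no free buffers */
	*idx = r->ent[r->head++ & (RING_SIZE - 1)];
	return 0;
}

/* "Kernel" side: take a free buffer, place the payload, describe it. */
static int model_rx(struct fill_ring *r, const void *payload, uint32_t len,
		    struct model_iov *iov)
{
	uint32_t idx;

	if (len > BUF_SIZE || ring_pop(r, &idx) < 0)
		return -1;
	memcpy(region + idx * BUF_SIZE, payload, len);
	iov->off = idx * BUF_SIZE;
	iov->len = len;
	return 0;
}

/* Application side: done with the data, hand the buffer back for reuse. */
static int model_recycle(struct fill_ring *r, const struct model_iov *iov)
{
	assert(iov->off % BUF_SIZE == 0);
	return ring_push(r, iov->off / BUF_SIZE);
}

/* One full cycle: fill, receive until empty, recycle, receive again. */
static int model_demo(void)
{
	struct fill_ring ring = {0};
	struct model_iov iov;
	uint32_t i;

	for (i = 0; i < NBUFS; i++)
		if (ring_push(&ring, i) < 0)
			return -1;
	for (i = 0; i < NBUFS; i++)
		if (model_rx(&ring, "data", 4, &iov) < 0)
			return -1;
	if (model_rx(&ring, "data", 4, &iov) == 0)
		return -1;      /* must fail: every buffer is in use */
	if (model_recycle(&ring, &iov) < 0)
		return -1;
	if (model_rx(&ring, "more", 4, &iov) < 0)
		return -1;      /* succeeds again after the recycle */
	return memcmp(region + iov.off, "more", iov.len) == 0 ? 0 : -1;
}
```

To try it, add "int main(void) { return model_demo(); }" and build with
any C99 compiler. The point of the model is only that receive stalls once
the fill ring is empty and resumes when the application returns a buffer.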