[PATCH v1 00/15] zero-copy RX for io_uring

This series introduces network RX zerocopy for io_uring.

This is an evolution of the earlier zctap work, re-targeted to use
io_uring as the userspace API.  The code is intended to provide a
ZC RX path for upper-level networking protocols (such as TCP and UDP),
with a focus on host-provided memory (not GPU memory).

This patch series contains the upper-level core code required for
operation, but does not contain the network driver side changes
required for true zero-copy operation.  The io_uring RECV_ZC opcode
will work without hardware support, albeit in copy mode.

The intent is to use a network driver which provides header/data
splitting, so the frame header which is processed by the networking
stack is not placed in user memory.

The code is successfully receiving a zero-copy TCP stream from a
remote sender.

There is a liburing fork providing the needed wrappers:

    https://github.com/jlemon/liburing/tree/zctap

It contains an examples/io_uring-net test application exercising
these features.  A sample run:

  # ./io_uring-net -i eth1 -q 20 -p 9999 -r 3000
   copy bytes: 1938872
     ZC bytes: 996683008
  Total bytes: 998621880, nsec:1025219375
         Rate: 7.79 Gb/s

If no queue is specified, then non-zc mode is used:

  # ./io_uring-net -p 9999
   copy bytes: 998621880
     ZC bytes: 0
  Total bytes: 998621880, nsec:1051515726
         Rate: 7.60 Gb/s
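
For reference, the Rate lines above are consistent with a plain
bytes-over-time calculation on the Total bytes and nsec values, and the
copy and ZC byte counts add up to the total.  The helper below is only a
sketch of that arithmetic; rate_gbps() is a hypothetical name, and it is
an assumption that the test application computes the figure this way.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the test application.
 * Bits per nanosecond is numerically equal to Gbit/s
 * (1e9 bits per 1e9 ns), so no extra scaling is needed.
 */
static double rate_gbps(uint64_t total_bytes, uint64_t nsec)
{
    assert(nsec > 0);  /* caller must supply a non-zero elapsed time */
    return (double)total_bytes * 8.0 / (double)nsec;
}
```

Calling rate_gbps(998621880, 1025219375) gives roughly 7.79, and
rate_gbps(998621880, 1051515726) gives roughly 7.60, matching the two
runs shown.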

There is also an iperf3 fork:

   https://github.com/jlemon/iperf/tree/io_uring

This allows running single tests with either:
   * select (normal iperf3)
   * io_uring READ
   * io_uring RECV_ZC copy mode
   * io_uring RECV_ZC hardware mode

Current testing shows similar bandwidth between RECV_ZC and READ modes
(running at 22Gbit/sec), but RECV_ZC reduces memory bandwidth use by
~50%.

High level description:

The application allocates a frame backing store, and provides this
to the kernel for use.  An interface queue is requested from the
networking device, and incoming frames are deposited into the provided
memory region.  The NIC should provide a header splitting feature, so
only the frame payload is placed in the user space area.
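
As a rough illustration of the application side of this step, the
sketch below allocates a page-aligned region and carves it into
page-sized frames.  Everything here is an assumption made for
illustration: struct frame_region and its helpers are hypothetical
names, the one-page frame size is arbitrary, and the call that actually
registers the region with the kernel is left out because it is defined
by the patches and the liburing fork, not by this sketch.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical userspace bookkeeping for the frame backing store. */
struct frame_region {
    void   *base;        /* start of the mapping, page aligned */
    size_t  len;         /* total length in bytes */
    size_t  frame_size;  /* assumed: one page per frame */
};

/* Map nr_frames page-sized frames of anonymous memory; 0 on success. */
static int frame_region_alloc(struct frame_region *r, size_t nr_frames)
{
    long page = sysconf(_SC_PAGESIZE);

    if (page <= 0 || nr_frames == 0 || nr_frames > SIZE_MAX / (size_t)page)
        return -1;

    r->frame_size = (size_t)page;
    r->len = nr_frames * r->frame_size;
    r->base = mmap(NULL, r->len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (r->base == MAP_FAILED) {
        r->base = NULL;
        return -1;
    }
    return 0;
}

/* Address of frame idx inside the region. */
static void *frame_addr(const struct frame_region *r, size_t idx)
{
    assert(idx < r->len / r->frame_size);
    return (char *)r->base + idx * r->frame_size;
}

static void frame_region_free(struct frame_region *r)
{
    if (r->base)
        munmap(r->base, r->len);
    r->base = NULL;
}
```

An anonymous mmap() is used here only because it guarantees page
alignment; any page-aligned allocation would serve the same purpose in
this sketch.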

Responsibility for correctly steering incoming frames to the queue
is outside the scope of this work; it is assumed that the user
has set up steering rules separately.

Incoming frames are sent up the stack as skb's and eventually
land in the application's socket receive queue.  This differs
from AF_XDP, which receives raw frames directly to userspace,
without protocol processing.

The RECV_ZC opcode then returns an iov[] style vector which points
to the data in userspace memory.  When the application has finished
processing the data, the buffers are returned to the kernel through
a fill ring for reuse.
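
The buffer return path can be pictured as a single-producer,
single-consumer ring: the application pushes the offsets of buffers it
has finished with, and the other side pops them for reuse.  The sketch
below is a generic userspace model of that idea using C11 atomics.  It
is not the layout used by this series; struct fillq, FILLQ_ENTRIES and
the function names are hypothetical, and the pop side merely stands in
for the kernel consumer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define FILLQ_ENTRIES 8u  /* hypothetical size; must be a power of two */

static_assert((FILLQ_ENTRIES & (FILLQ_ENTRIES - 1)) == 0,
              "ring size must be a power of two");

/* Hypothetical ring layout, for illustration only. */
struct fillq {
    _Atomic uint32_t head;          /* consumer index */
    _Atomic uint32_t tail;          /* producer index */
    uint64_t slots[FILLQ_ENTRIES];  /* offsets of returned buffers */
};

static void fillq_init(struct fillq *q)
{
    atomic_init(&q->head, 0);
    atomic_init(&q->tail, 0);
}

/* Producer (application): hand one buffer back.  False if the ring is full. */
static bool fillq_push(struct fillq *q, uint64_t buf_off)
{
    uint32_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&q->head, memory_order_acquire);

    if (tail - head == FILLQ_ENTRIES)
        return false;

    q->slots[tail & (FILLQ_ENTRIES - 1)] = buf_off;
    /* Publish the slot contents before the new tail becomes visible. */
    atomic_store_explicit(&q->tail, tail + 1, memory_order_release);
    return true;
}

/* Consumer: take one returned buffer.  False if the ring is empty. */
static bool fillq_pop(struct fillq *q, uint64_t *buf_off)
{
    uint32_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);

    if (head == tail)
        return false;

    *buf_off = q->slots[head & (FILLQ_ENTRIES - 1)];
    /* Release the slot back to the producer. */
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return true;
}
```

The indices are free-running 32-bit counters masked on access, so the
full and empty conditions can be told apart without sacrificing a slot.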

Jonathan Lemon (15):
  io_uring: add zctap ifq definition
  netdevice: add SETUP_ZCTAP to the netdev_bpf structure
  io_uring: add register ifq opcode
  io_uring: create a zctap region for a mapped buffer
  io_uring: mark pages in ifq region with zctap information.
  io_uring: Provide driver API for zctap packet buffers.
  io_uring: Allocate zctap device buffers and dma map them.
  io_uring: Add zctap buffer get/put functions and refcounting.
  skbuff: Introduce SKBFL_FIXED_FRAG and skb_fixed()
  io_uring: Allocate a uarg for use by the ifq RX
  io_uring: Define the zctap iov[] returned to the user.
  io_uring: add OP_RECV_ZC command.
  io_uring: Make remove_ifq_region a delayed work call
  io_uring: Add a buffer caching mechanism for zctap.
  io_uring: Notify the application as the fillq is drained.

 include/linux/io_uring.h       |   47 ++
 include/linux/io_uring_types.h |   12 +
 include/linux/netdevice.h      |    6 +
 include/linux/skbuff.h         |   10 +-
 include/uapi/linux/io_uring.h  |   24 +
 io_uring/Makefile              |    3 +-
 io_uring/io_uring.c            |    8 +
 io_uring/kbuf.c                |   13 +
 io_uring/kbuf.h                |    2 +
 io_uring/net.c                 |  123 ++++
 io_uring/opdef.c               |   15 +
 io_uring/zctap.c               | 1001 ++++++++++++++++++++++++++++++++
 io_uring/zctap.h               |   31 +
 13 files changed, 1293 insertions(+), 2 deletions(-)
 create mode 100644 io_uring/zctap.c
 create mode 100644 io_uring/zctap.h

-- 
2.30.2