Re: [RFC PATCH v3 10/12] tcp: RX path for devmem TCP

On 11/9/23 16:07, Edward Cree wrote:
On 09/11/2023 02:39, Mina Almasry wrote:
On Wed, Nov 8, 2023 at 7:36 AM Edward Cree <ecree.xilinx@xxxxxxxxx> wrote:
  If not then surely the way to return a memory area
  in an io_uring idiom is just to post a new read sqe ('RX descriptor')
  pointing into it, rather than explicitly returning it with setsockopt.

We're interested in using this with regular TCP sockets, not
necessarily io_uring.
Fair.  I just wanted to push against the suggestion upthread that "oh,
  since io_uring supports setsockopt() we can just ignore it and it'll
  all magically work later" (paraphrased).

IMHO, that'd be horrible, but that's why there are io_uring zc rx
patches, and we'll be sending an update soon:

https://lore.kernel.org/all/20231107214045.2172393-1-dw@xxxxxxxxxxx/


If you can keep the "allocate buffers out of a devmem region" and "post
  RX descriptors built on those buffers" APIs separate (inside the
  kernel; obviously both triggered by a single call to the setsockopt()
  uAPI) that'll likely make things simpler for the io_uring interface I
  describe, which will only want the latter.
PS: Here's a crazy idea that I haven't thought through at all: what if
  you allow device memory to be mmap()ed into process address space
  (obviously with none of r/w/x because it's unreachable), so that your
  various uAPIs can just operate on pointers (e.g. the setsockopt
  becomes the madvise it's named after; recvmsg just uses or populates
  the iovec rather than needing a cmsg).  Then if future devices have
  their memory CXL accessible that can potentially be enabled with no
  change to the uAPI (userland just starts being able to access the
  region without faulting).
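
To make the mmap() idea concrete, here is a minimal userspace sketch. It is
not code from the patch series: an anonymous PROT_NONE mapping stands in for
the device memory, and the helper names devmem_reserve() and devmem_offset()
are hypothetical. It only shows that pointers into an inaccessible range can
still serve as buffer handles in an iovec, as long as nothing dereferences
them.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/uio.h>

/* Hypothetical stand-in for mapping a devmem region: reserve address
 * space with no r/w/x rights, so any dereference would fault. */
static void *devmem_reserve(size_t len)
{
    void *p = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Translate an iovec entry back into a byte offset inside the region.
 * Returns -1 if the entry does not lie entirely within the region.
 * Only pointer values are compared; the memory is never touched. */
static long devmem_offset(const void *base, size_t len, const struct iovec *iov)
{
    uintptr_t b = (uintptr_t)base;
    uintptr_t p = (uintptr_t)iov->iov_base;

    if (p < b || iov->iov_len > len || p - b > len - iov->iov_len)
        return -1;
    return (long)(p - b);
}
```

If the memory later became CPU-reachable, the same pointers would simply
stop faulting, which is the property the PS is after.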
And you can maybe add a semantic flag to recvmsg saying "if you don't
  use all the buffers in my iovec, keep hold of the rest of them for
  future incoming traffic, and if I post new buffers with my next
  recvmsg, add those to the tail of the RXQ rather than replacing the
  ones you've got".  That way you can still have the "userland
  directly fills the RX ring" behaviour even with TCP sockets.
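
The "keep the unused buffers, append new ones to the tail" semantics can be
modelled in plain userspace C. This is only an illustration of the queueing
behaviour under assumed names (struct rxq, rxq_post(), rxq_consume(),
RXQ_CAP are all hypothetical); no such recvmsg flag exists today and nothing
here touches a socket.

```c
#include <stddef.h>
#include <sys/uio.h>

#define RXQ_CAP 8  /* arbitrary ring size for the sketch */

/* Hypothetical per-socket queue of buffers the application has posted. */
struct rxq {
    struct iovec slot[RXQ_CAP];
    size_t head;   /* index of the oldest unused buffer */
    size_t count;  /* number of buffers currently held */
};

/* Append newly posted buffers to the tail instead of replacing what is
 * already queued. Returns how many buffers were accepted. */
static size_t rxq_post(struct rxq *q, const struct iovec *iov, size_t n)
{
    size_t i;

    for (i = 0; i < n && q->count < RXQ_CAP; i++) {
        q->slot[(q->head + q->count) % RXQ_CAP] = iov[i];
        q->count++;
    }
    return i;
}

/* Model nbytes of incoming traffic: consume buffers from the head until
 * the data fits. Buffers that were not needed stay queued for later
 * traffic. Returns the number of buffers used. */
static size_t rxq_consume(struct rxq *q, size_t nbytes)
{
    size_t used = 0;

    while (nbytes > 0 && q->count > 0) {
        size_t cap = q->slot[q->head].iov_len;

        nbytes -= nbytes < cap ? nbytes : cap;
        q->head = (q->head + 1) % RXQ_CAP;
        q->count--;
        used++;
    }
    return used;
}
```

Posting three buffers, receiving data that fills two, then posting one more
leaves the third original buffer at the head with the new one queued behind
it.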


--
Pavel Begunkov