On Tue, Nov 7, 2023 at 4:01 PM David Ahern <dsahern@xxxxxxxxxx> wrote:
>
> On 11/7/23 4:55 PM, Mina Almasry wrote:
> > On Mon, Nov 6, 2023 at 4:03 PM Willem de Bruijn
> > <willemdebruijn.kernel@xxxxxxxxx> wrote:
> >>
> >> On Mon, Nov 6, 2023 at 3:55 PM David Ahern <dsahern@xxxxxxxxxx> wrote:
> >>>
> >>> On 11/6/23 4:32 PM, Stanislav Fomichev wrote:
> >>>>> The concise notification API returns tokens as a range for
> >>>>> compression, encoding as two 32-bit unsigned integers start + length.
> >>>>> It allows for even further batching by returning multiple such ranges
> >>>>> in a single call.
> >>>>
> >>>> Tangential: should tokens be u64? Otherwise we can't have more than
> >>>> 4gb unacknowledged. Or that's a reasonable constraint?
> >>>>
> >>>
> >>> Was thinking the same and with bits reserved for a dmabuf id to allow
> >>> multiple dmabufs in a single rx queue (future extension, but build the
> >>> capability in now). e.g., something like a 37b offset (128GB dmabuf
> >>> size), 19b length (large GRO), 8b dmabuf id (lots of dmabufs to a queue).
> >>
> >> Agreed. Converting to 64b now sounds like a good forward looking revision.
> >
> > The concept of IDing a dma-buf came up in a couple of different
> > contexts. First, in the context of us giving the dma-buf ID to the
> > user on recvmsg() to tell the user the data is in this specific
> > dma-buf. The second context is here, to bind dma-bufs with multiple
> > user-visible IDs to an rx queue.
> >
> > My issue here is that I don't see anything in the struct dma_buf that
> > can practically serve as an ID:
> >
> > https://elixir.bootlin.com/linux/v6.6-rc7/source/include/linux/dma-buf.h#L302
> >
> > Actually, from the userspace, only the name of the dma-buf seems
> > queryable. That's only unique if the user sets it as such. The dmabuf
> > FD can't serve as an ID.
> > For our use case we need to support 1 process
> > doing the dma-buf bind via netlink, sharing the dma-buf FD to another
> > process, and that process receives the data. In this case the FDs
> > shown by the 2 processes may be different. Converting to 64b is a
> > trivial change I can make now, but I'm not sure how to ID these
> > dma-bufs. Suggestions welcome. I'm not sure the dma-buf guys will
> > allow adding a new ID + APIs to query said dma-buf ID.
> >
>
> The API can be unique to this usage: e.g., add a dmabuf id to the
> netlink API. Userspace manages the ids (tells the kernel what value to
> use with an instance), the kernel validates no 2 dmabufs have the same
> id and then returns the value here.
>
>

Seems reasonable, will do.

On Wed, Nov 8, 2023 at 7:36 AM Edward Cree <ecree.xilinx@xxxxxxxxx> wrote:
>
> On 06/11/2023 21:17, Stanislav Fomichev wrote:
> > I guess I'm just wondering whether other people have any suggestions
> > here. Not sure Jonathan's way was better, but we fundamentally
> > have two queues between the kernel and the userspace:
> > - userspace receiving tokens (recvmsg + magical flag)
> > - userspace refilling tokens (setsockopt + magical flag)
> >
> > So having some kind of shared memory producer-consumer queue feels natural.
> > And using 'classic' socket api here feels like a stretch, idk.
>
> Do 'refilled tokens' (returned memory areas) get used for anything other
> than subsequent RX?

Hi Ed! Not really, it's only the subsequent RX.

> If not then surely the way to return a memory area
> in an io_uring idiom is just to post a new read sqe ('RX descriptor')
> pointing into it, rather than explicitly returning it with setsockopt.

We're interested in using this with regular TCP sockets, not
necessarily io_uring. The io_uring interface to devmem TCP may very
well use what you suggest and can drop the setsockopt.
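
To make the shared-memory producer-consumer idea quoted above a bit more
concrete, here is a minimal userspace sketch of a single-producer,
single-consumer token ring using C11 atomics. It is only an illustration
of the idiom, not what the series implements: struct tok_ring, ring_push(),
ring_pop() and RING_SLOTS are hypothetical names, and a real
kernel/userspace ring would sit in memory mapped by both sides.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ring size; must be a power of two so masking works. */
#define RING_SLOTS 256u

struct tok_ring {
	_Atomic uint32_t head;       /* next slot the consumer reads */
	_Atomic uint32_t tail;       /* next slot the producer writes */
	uint64_t slots[RING_SLOTS];  /* 64-bit tokens, as discussed */
};

/* Producer side: returns false when the ring is full. */
static bool ring_push(struct tok_ring *r, uint64_t tok)
{
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

	if (tail - head == RING_SLOTS)
		return false;

	r->slots[tail & (RING_SLOTS - 1)] = tok;
	/* Release so the consumer sees the slot before the new tail. */
	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
	return true;
}

/* Consumer side: returns false when the ring is empty. */
static bool ring_pop(struct tok_ring *r, uint64_t *tok)
{
	uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (head == tail)
		return false;

	*tok = r->slots[head & (RING_SLOTS - 1)];
	/* Release so the producer can safely reuse the slot. */
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return true;
}
```

With two such rings (one per direction), the receive side and the refill
side would each become a push on one end and a pop on the other, with no
syscall per token. Whether that is worth it over recvmsg + setsockopt is
exactly the open question here.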
> (Being async means you can post lots of these, unlike recvmsg(), so you
> don't need any kernel management to keep the RX queue filled; it can
> just be all handled by the userland thus simplifying APIs overall.)
> Or I'm misunderstanding something?
>
> -e

-- 
Thanks,
Mina
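
For reference, the 64-bit token split floated earlier in the thread (37b
offset, 19b length, 8b dmabuf id) could be packed roughly as below. This
is a sketch only: the field order, the TOK_* macros and the tok_*()
helpers are assumptions for illustration, and nothing in the thread fixes
them.

```c
#include <stdint.h>

/*
 * Hypothetical layout (field order is an assumption):
 *   bits  0..36  offset into the dmabuf (37 bits, up to 128 GiB)
 *   bits 37..55  length (19 bits, up to 512 KiB - 1)
 *   bits 56..63  dmabuf id (8 bits, up to 256 dmabufs per queue)
 */
#define TOK_OFF_BITS 37
#define TOK_LEN_BITS 19
#define TOK_ID_BITS  8

#define TOK_OFF_MASK ((UINT64_C(1) << TOK_OFF_BITS) - 1)
#define TOK_LEN_MASK ((UINT64_C(1) << TOK_LEN_BITS) - 1)
#define TOK_ID_MASK  ((UINT64_C(1) << TOK_ID_BITS) - 1)

/* Pack the three fields; returns -1 if offset or length do not fit. */
static inline int tok_pack(uint64_t off, uint32_t len, uint8_t id,
			   uint64_t *tok)
{
	if (off > TOK_OFF_MASK || len > TOK_LEN_MASK)
		return -1;

	*tok = off |
	       ((uint64_t)len << TOK_OFF_BITS) |
	       ((uint64_t)id << (TOK_OFF_BITS + TOK_LEN_BITS));
	return 0;
}

static inline uint64_t tok_off(uint64_t tok)
{
	return tok & TOK_OFF_MASK;
}

static inline uint32_t tok_len(uint64_t tok)
{
	return (uint32_t)((tok >> TOK_OFF_BITS) & TOK_LEN_MASK);
}

static inline uint8_t tok_id(uint64_t tok)
{
	return (uint8_t)((tok >> (TOK_OFF_BITS + TOK_LEN_BITS)) & TOK_ID_MASK);
}
```

The three widths add up to exactly 64 bits, so there is no room left for
flags; if any are needed later, one of the fields would have to shrink.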