Thanks a lot, Magnus.

If I understand correctly, you mean creating some storage of free umem
addresses? Something like:
https://github.com/xdp-project/xdp-tutorial/blob/master/advanced03-AF_XDP/af_xdp_user.c#L147

If so, I did it and it works well; I just have to keep them separate
per thread.

Regarding the new peek you mentioned, do you think such a patch would be
worth merging into libxdp? That would be a motivation for me to dive
deep into it. And would it be thread safe, as I suppose?

One more question: can I also use poll() for the completion queue?
It might be useful for me.

Best Regards
Julius

On Thu, Jun 29, 2023 at 8:32 AM Magnus Karlsson <magnus.karlsson@xxxxxxxxx> wrote:
>
> On Wed, 28 Jun 2023 at 20:26, Július Milan <julius.milan.22@xxxxxxxxx> wrote:
> >
> > Hi all
> >
> > I am writing an AF_XDP based user space application.
> > However in my use case, packets payload get fragmented while
> > processing, basically new packets are constructed inside and sent
> > further.
> > I probably cannot avoid memcpy anyway.
> >
> > So I plan to solve it simply - one umem per port, no locking, no
> > keeping track of umem frames presence (kernel / user space). Just
> > usage of the rings, one half of the frames to circulate between the RX
> > <-> fill queue, the other half TX <-> completion queue.
> >
> > Is it actually possible to initialize the rings in such a way that at
> > the very beginning I would fill the completion queue by some frames?
> > This is to avoid multithreaded access to the free frames without
> > locking (initial TX would take a look for free frames inside the
> > completion queue).
>
> You could fill the completion ring with entries from user space at
> initialization time. As long as you always pick the first entry in the
> completion ring before putting it on the Tx ring, the kernel would not
> overwrite your entries. One complication is that you would have to
> construct a new peek() routine for the completion queue, as the normal
> one would indicate no entries found even though you have written
> entries in it.
>
> Maybe an easier idea is to just have some code like this:
>
> /* Total N entries in Tx and completion ring.
>  * allocated initialized to 0 somewhere */
> if (allocated < N) {
>     allocated++;
>     return N;
> }
> return next entry on completion queue;
>
> Returns the buffer number in the umem and assumes your Tx buffers are
> first in the umem. Though this scheme would introduce an if statement
> in the path, it would be easier to start with. You need an address not
> a number, and you likely also need some indication of "no buffer
> available". Tried to keep it simple.
>
> > If it's a bad idea, what else would you suggest?
> >
> > Thank you
> > Julius
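
For reference, here is a minimal C sketch of how the scheme quoted above could look once it returns addresses and signals "no buffer available". It is only one reading of the pseudocode, not Magnus's exact code: the first N requests hand out each Tx frame once (frame index times frame size, assuming the Tx frames sit first in the umem), and later requests reuse addresses that were previously read from the completion queue. The names tx_pool, tx_pool_get, tx_pool_put, TX_FRAMES, FRAME_SIZE and INVALID_ADDR are made up for illustration, the sizes are arbitrary, and one pool per thread is assumed, so there is no locking.

```c
#include <assert.h>
#include <stdint.h>

#define TX_FRAMES    4u          /* N: hypothetical number of Tx frames */
#define FRAME_SIZE   4096u       /* hypothetical umem frame size in bytes */
#define INVALID_ADDR UINT64_MAX  /* marker for "no buffer available" */

/* Per-thread pool of Tx frame addresses (hypothetical helper type). */
struct tx_pool {
    uint32_t allocated;             /* frames handed out for the first time */
    uint32_t n_free;                /* recycled addresses currently stored */
    uint64_t free_addrs[TX_FRAMES]; /* addresses taken off the completion queue */
};

/* Store an address that the caller has read from the completion queue. */
static void tx_pool_put(struct tx_pool *p, uint64_t addr)
{
    /* Cannot get back more frames than exist. */
    assert(p->n_free < TX_FRAMES);
    p->free_addrs[p->n_free++] = addr;
}

/* Return a umem address for the next Tx frame, or INVALID_ADDR if none. */
static uint64_t tx_pool_get(struct tx_pool *p)
{
    /* Initial phase: hand out each Tx frame once. Assumes the Tx frames
     * occupy the first TX_FRAMES slots of the umem. */
    if (p->allocated < TX_FRAMES)
        return (uint64_t)p->allocated++ * FRAME_SIZE;

    /* Steady state: reuse an address returned through the completion queue. */
    if (p->n_free > 0)
        return p->free_addrs[--p->n_free];

    return INVALID_ADDR;
}
```

In a real application the addresses passed to tx_pool_put() would come from the completion ring, for example via libxdp's xsk_ring_cons__peek(), xsk_ring_cons__comp_addr() and xsk_ring_cons__release(). That part is left out here so the sketch compiles without libxdp.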