Re: Switching packets between queues in XDP program

On Wed, 3 Jan 2024 at 10:12, Magnus Karlsson <magnus.karlsson@xxxxxxxxx> wrote:
>
> On Tue, 2 Jan 2024 at 17:16, Yuval El-Hanany <yuvale@xxxxxxxxxxx> wrote:
> >
> > Wow,
> > I did not expect to have the option of somebody else doing the work for me ;-). I wanted to take a stab at it, but I see that coincidentally the same root cause applies to another thread (Redirect to AF_XDP socket not working with bond interface in native mode). The same fix would apply to both, I think, and I’m on vacation so I can't work on it until mid-January. Unless it takes longer, I guess I’ll take that stab at my next target, which will be getting TSO to work. However, I will be able to test the patch on our platform once it’s out.
>
> Yes, you are correct that they are related, though there needs to be
> one more patch for the bonding case to work. The current thinking is
> that something like this would solve your case (and be the base for
> the bonding problem fix):
>
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index 9f13aa3353e3..f626bf1284de 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -309,10 +309,13 @@ static bool xsk_is_bound(struct xdp_sock *xs)
>
>  static int xsk_rcv_check(struct xdp_sock *xs, struct xdp_buff *xdp, u32 len)
>  {
> +       struct net_device *dev = xdp->rxq->dev;
> +       u32 qid = xdp->rxq->queue_index;
> +
>         if (!xsk_is_bound(xs))
>                 return -ENXIO;
>
> -       if (xs->dev != xdp->rxq->dev || xs->queue_id != xdp->rxq->queue_index)
> +       if (xs->umem != dev->_rx[qid].pool->umem)
>                 return -EINVAL;
>
>         if (len > xsk_pool_get_rx_frame_size(xs->pool) && !xs->sg) {
>
> The only thing that matters is if the socket that is found in the
> xskmap is bound to the same umem as the packet was received in. I
> think this should do the trick and solve your problem, but need to try
> it out first.
>
> Unfortunately, this is not enough for the bonding case as xs->dev
> points to the bonding device and xdp->rxq->dev points to the real NIC
> device but only _rx of the bonding device is populated with the pool
> pointer (xdp->rxq->dev->_rx is empty above) as the AF_XDP core code
> has no idea that the bonding device really consists of multiple
> devices. So some code is needed in the bonding driver to transfer this
> state from the bonding device to the active real device. But would be
> nice if I could come up with a better solution for that case, so need
> to think some more.
>
> Enjoy your vacation. I can submit this in a couple of days. Just need
> to verify that it is a good solution.

Just to let you know that the solution above works except for the
bonding driver. It will crash the system as dev->_rx[qid].pool is NULL
for the bonded device. I have been trying to find a good solution for
the bonding driver, but the one I have is rather complicated and I
would prefer something simpler. I probably should just post something
on the mailing list and ask for help as I might need some new ideas.
Anyway, it would be nice to solve the bonding problem first so that we
end up with this simple solution for this piece of functionality. That
is why this is taking its time, sorry.
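
For illustration only, the check discussed above can be modelled in user
space. This is a sketch, not the kernel code: the struct definitions are
simplified stand-ins, and rcv_check_model, the bound flag and the
num_rx_queues bounds check are hypothetical names introduced here. It shows
the umem comparison from the diff plus a NULL guard on the pool pointer,
which is the dereference reported to crash with the bonding driver. The
frame size check from the real function is left out.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; NOT the real definitions. */
struct umem { int id; };
struct pool { struct umem *umem; };
struct rx_queue { struct pool *pool; };
struct net_device { struct rx_queue *_rx; unsigned int num_rx_queues; };
struct rxq_info { struct net_device *dev; unsigned int queue_index; };
struct xdp_buff { struct rxq_info *rxq; };
struct xdp_sock { struct umem *umem; int bound; };

/*
 * Hypothetical model of the relaxed receive check: accept the packet if the
 * socket is bound to the same umem as the queue the packet arrived on.
 */
static int rcv_check_model(const struct xdp_sock *xs, const struct xdp_buff *xdp)
{
	const struct net_device *dev = xdp->rxq->dev;
	unsigned int qid = xdp->rxq->queue_index;
	const struct pool *pool;

	if (!xs->bound)
		return -ENXIO;

	/* Assumed guard: the receiving device may have no usable _rx entry. */
	if (!dev->_rx || qid >= dev->num_rx_queues)
		return -EINVAL;

	/* Assumed guard: no pool registered on this queue (the bonding case). */
	pool = dev->_rx[qid].pool;
	if (!pool || xs->umem != pool->umem)
		return -EINVAL;

	return 0;
}
```

In this model the guard only turns the crash into -EINVAL. Packets arriving
on a real NIC underneath a bond would still be rejected, so the bonding case
needs a separate fix as described above.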

> Thanks: Magnus
>
> > Happy New Year, though this year, I’ll settle for a year better than the last to all people and nationalities,
> > Yuval.
> >
> > On 2 Jan 2024, at 11:15, Magnus Karlsson <magnus.karlsson@xxxxxxxxx> wrote:
> >
> > On Sat, 23 Dec 2023 at 23:25, Yuval El-Hanany <yuvale@xxxxxxxxxxx> wrote:
> >
> >
> > Hi,
> >        I’m designing a POC to port a large application from DPDK to XDP. Switching ingressing packets between different processes is part of the core concept of the POC. I saw a question in the Q&A of AF_XDP but the question and answer seem a bit mismatched. The question is about switching umems in copy mode and the answer is generic about switching queues.
> >
> > Q: Can I use the XSKMAP to implement a switch between different umems in copy mode?
> > A: The short answer is no, that is not supported at the moment. The XSKMAP can only be used to switch traffic coming in on queue id X to sockets bound to the same queue id X. The XSKMAP can contain sockets bound to different queue ids, for example X and Y, but only traffic coming in from queue id Y can be directed to sockets bound to the same queue id Y. In zero-copy mode, you should use the switch, or other distribution mechanism, in your NIC to direct traffic to the correct queue id and socket.
> >
> >        My follow-up question is whether this applies if I use a umem shared by all queues and devices. Obviously, it does not apply in user mode, as it’s possible to send the packets to any device and queue sharing the umem. Is it impossible to send packets to different queues in the XDP program using the XSKMAP, even if they share a umem? Is this a hard limit or a safety measure that may be lifted using some kernel patch? For the POC, the limitation may fail the whole port. I've tried to switch packets between queues in a simple single-process application in skb mode with a shared umem, and indeed it seems the packets did not reach their destination.
> >
> >
> > Sorry for the delay, but now I am finally back from the holidays.
> >
> > In zero-copy mode, it should be possible to lift this restriction when
> > the umem is shared between different netdevs and queue ids. My guess
> > is that this restriction in the code is there from before the time the
> > shared umem concept was introduced and it was not lifted when I
> > introduced that. The only complication I can see is that in
> > user-space, you do not know which fill ring the buffer came from.
> > So a simple scheme such as: "after processing a buffer from the Rx
> > ring, return the buffer to the corresponding fill ring" does not work
> > anymore. But that is easy to solve, so should not be a problem.
> >
> > Do you want to take a stab at a patch or do you prefer me to do it?
> >
> > There is also the case of copy-mode that we should support, but we
> > might get that for free as it amounts to the same thing when the umem
> > is shared. However, it is also possible to support your use case (in
> > copy-mode) when the umem is not shared as we are copying the packet
> > anyway in copy-mode. Just have to pick the correct umem as the
> > destination.
> >
> >
> >        Thanks,
> >                Yuval.
> >
> >
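
One possible way to handle the fill ring complication mentioned in the
thread is sketched below. It is an assumption, not necessarily the scheme
Magnus had in mind: user space keeps a small table mapping each umem frame
to the fill ring it was last posted to, so a buffer taken from any Rx ring
can be returned to that ring. FRAME_SIZE, NUM_FRAMES, frame_set_owner and
frame_get_owner are hypothetical names, and the sketch assumes aligned chunk
mode, where dividing a descriptor address by the chunk size yields the frame
index.

```c
#include <stdint.h>

#define FRAME_SIZE 2048u /* assumed umem chunk size */
#define NUM_FRAMES 8u    /* assumed umem size in frames, tiny for illustration */

/* frame_owner[i] = index of the fill ring that frame i was last posted to. */
static uint8_t frame_owner[NUM_FRAMES];

/* Record the owning fill ring when a frame address is posted to it. */
static int frame_set_owner(uint64_t addr, uint8_t fill_ring)
{
	uint64_t idx = addr / FRAME_SIZE;

	if (idx >= NUM_FRAMES)
		return -1;
	frame_owner[idx] = fill_ring;
	return 0;
}

/*
 * Look up the owning fill ring for an address taken from an Rx descriptor.
 * The address may point into the middle of the frame (headroom), so it is
 * rounded down to the frame index first. Returns -1 if out of range.
 */
static int frame_get_owner(uint64_t addr)
{
	uint64_t idx = addr / FRAME_SIZE;

	if (idx >= NUM_FRAMES)
		return -1;
	return frame_owner[idx];
}
```

An application would call frame_set_owner() each time it posts a frame to a
fill ring and frame_get_owner() after consuming an Rx descriptor, then
return the frame to the ring index it gets back.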




