Re: AF_XDP integration with FDio VPP? (Was: Questions about XDP)

Hi Guys,

Thanks for your insight.

@William, I'll definitely take a look at
https://github.com/openvswitch/ovs/blob/05629ed271a67c737227a92f35bcff199648d604/lib/netdev-afxdp.c

I'll give the latest libbpf-based implementation a try.
Shared umem is not the preferred case, but so far it was the only option
where I was able to get past one successful bind.
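
For reference, the shared and non-shared cases differ only in how struct
sockaddr_xdp is filled before bind(). The sketch below is illustrative, not
the sample's code: fill_sxdp() is a hypothetical helper, and the AF_XDP
fallback define is only there for older libc headers. One possible
explanation for the EBUSY, assuming every socket binds the same queue id
(as the sample does via opt_queue), is that the kernel accepts only one
umem per (ifindex, queue) pair, so a second non-shared socket would need a
different queue id.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/if_xdp.h>

#ifndef AF_XDP
#define AF_XDP 44 /* fallback for libc headers that predate AF_XDP */
#endif

/*
 * Hypothetical helper: prepare the address passed to bind().
 * shared_umem_fd < 0  -> socket brings its own umem (non-shared case)
 * shared_umem_fd >= 0 -> reuse the umem registered on that socket fd
 */
static void fill_sxdp(struct sockaddr_xdp *sxdp, uint32_t ifindex,
                      uint32_t queue_id, int shared_umem_fd)
{
        assert(sxdp != NULL);
        memset(sxdp, 0, sizeof(*sxdp));
        sxdp->sxdp_family = AF_XDP;
        sxdp->sxdp_ifindex = ifindex;
        sxdp->sxdp_queue_id = queue_id;
        if (shared_umem_fd >= 0) {
                sxdp->sxdp_flags = XDP_SHARED_UMEM;
                sxdp->sxdp_shared_umem_fd = (uint32_t)shared_umem_fd;
        }
}
```

A worker with its own umem would call something like
fill_sxdp(&sxdp, ifindex, worker_queue_id, -1) with a distinct
worker_queue_id per worker, then bind(sfd, (struct sockaddr *)&sxdp,
sizeof(sxdp)) as in the sample.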

@VPP vs DPDK
I do not know the details of the original project, but there were multiple
layers of VPP running on a single node and the performance wasn't as
expected. Therefore we are exploring XDP as a lightweight alternative.

Regards,
marek

On Fri, Aug 23, 2019 at 2:23 PM Marek Zavodsky <marek.zavodsky@xxxxxxxxx> wrote:
>
> Hi Jesper,
>
> Thanks for your reply.
> I apologize, I'm new to kernel dev, so I may be missing some background.
>
> Let me shed some more light on this. We are using kernel 5.0.0 and
> used samples/bpf/xdpsock as a starting point.
> I checked master, and the example has evolved (e.g. by adding cleanup
> mechanisms), but in terms of what I need from it, it looks equivalent
> (and even more complicated, because attaching XDP to the interface is
> now interleaved with XSK allocation).
> I built the latest kernel, but it refused to boot, so I haven't had a
> chance to try the latest yet.
>
> I took the _user part and split it into two:
> "loader" - Executed once to set up the environment and once to clean up;
> loads _kern.o, attaches it to the interface and pins the maps under
> /sys/fs/bpf.
> and
> "worker" - Executed as many times as required. Every instance loads the
> maps from /sys/fs/bpf, creates one AF_XDP socket, updates the xsks record
> and starts listening for and processing packets from AF_XDP (in the test
> scenario we are using l2fwd because of write-back). I had to add the
> missing cleanups there (close(fd), munmap()). This should be VPP in the
> final solution.
> So far so good.
> I'm unable to start more than one worker due to the previously mentioned
> error. The first instance works properly, every other one fails on bind
> (lineno may not match due to local changes):
> xdpsock_user.c:xsk_configure:595: Assertion failed: bind(sfd, (struct
> sockaddr *)&sxdp, sizeof(sxdp)) == 0: errno: 16/"Device or resource
> busy"
>
> I modified it to allocate multiple sockets within one process, and I
> was successful with shared umem:
> num_socks = 0;
> xsks[num_socks++] = xsk_configure(NULL);
> for (; num_socks < opt_alloc; num_socks++)
>         xsks[num_socks] = xsk_configure(xsks[0]->umem);
>
>
> but got the same behavior (first OK, second failed on bind) when I tried non-shared:
> num_socks = 0;
> for (; num_socks < opt_alloc; num_socks++)
>       xsks[num_socks] = xsk_configure(NULL);
>
>
>
> And the TX processing... as a workaround we moved VLAN pop/push to the
> "worker" and XDP only does xsk-map redirects based on vlan-id, but that
> defeats the purpose. Is there any estimate of when we could expect
> something on the XDP TX front?
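
To make the userspace workaround concrete, here is a minimal sketch of
VLAN push/pop on a raw Ethernet frame as a worker might do it. The names
vlan_push() and vlan_pop() are hypothetical and not taken from the sample
or from VPP; the code assumes a single 802.1Q tag (TPID 0x8100) with
PCP/DEI left at zero, and it shifts the payload in place rather than
using umem headroom.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDRS_LEN 12 /* destination MAC + source MAC */
#define ETH_HDR_LEN   14 /* MACs + EtherType */
#define VLAN_TAG_LEN   4 /* TPID (2 bytes) + TCI (2 bytes) */

/* Insert an 802.1Q tag after the MAC addresses.
 * Returns the new frame length, or 0 if the frame is too short,
 * the buffer too small, or the VLAN id out of range. */
static size_t vlan_push(uint8_t *frame, size_t len, size_t cap, uint16_t vid)
{
        assert(frame != NULL);
        if (len < ETH_HDR_LEN || len + VLAN_TAG_LEN > cap || vid > 0x0FFF)
                return 0;
        /* Make room: shift EtherType and payload up by four bytes. */
        memmove(frame + ETH_ADDRS_LEN + VLAN_TAG_LEN, frame + ETH_ADDRS_LEN,
                len - ETH_ADDRS_LEN);
        frame[12] = 0x81; /* TPID 0x8100, network byte order */
        frame[13] = 0x00;
        frame[14] = (uint8_t)(vid >> 8); /* TCI: PCP/DEI = 0 */
        frame[15] = (uint8_t)(vid & 0xFF);
        return len + VLAN_TAG_LEN;
}

/* Remove an 802.1Q tag and report its VLAN id.
 * Returns the new frame length, or 0 if the frame carries no tag. */
static size_t vlan_pop(uint8_t *frame, size_t len, uint16_t *vid)
{
        assert(frame != NULL && vid != NULL);
        if (len < ETH_HDR_LEN + VLAN_TAG_LEN ||
            frame[12] != 0x81 || frame[13] != 0x00)
                return 0;
        *vid = (uint16_t)(((frame[14] & 0x0F) << 8) | frame[15]);
        /* Close the gap: shift inner EtherType and payload down. */
        memmove(frame + ETH_ADDRS_LEN, frame + ETH_ADDRS_LEN + VLAN_TAG_LEN,
                len - ETH_ADDRS_LEN - VLAN_TAG_LEN);
        return len - VLAN_TAG_LEN;
}
```

With frame headroom available, moving the 12 bytes of MAC addresses
instead of the payload and adjusting the descriptor address would
probably be cheaper; the in-place variant is shown only because it is
simpler to follow.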
>
> I guess changing opt_ifindex to xsk->fd in
> bpf_set_link_xdp_fd(opt_ifindex, prog_fd, opt_xdp_flags); won't help,
> as they are 2 different things, right? :)
>
> One side question. I noticed that bpf_trace_printk creates sparse
> entries in /sys/kernel/debug/tracing/trace.
> When I run a sample of 100 packets I may get anywhere from zero to many
> entries there.
> It's a bit annoying to run a "load test" just to verify I hit the
> correct code path. Is it doing sampling? Can I tweak it somehow?
> Is there any trick to using tail -f on /sys/kernel/debug/tracing/trace?
>
> Thanks,
> marek
>
>
> On Fri, Aug 23, 2019 at 12:43 PM Jesper Dangaard Brouer
> <brouer@xxxxxxxxxx> wrote:
> >
> >
> > Bringing these questions to the xdp-newbies list, where they belong.
> > Answers inlined below.
> >
> > On Tue, 20 Aug 2019 21:17:57 +0200 Július Milan <Julius.Milan@xxxxxxxxxxxxx>
> > >
> > > I am writing AF_XDP driver for FDio VPP. I have 2 questions.
> > >
> >
> > That sounds excellent.  I was hoping someone would do this for FDio VPP.
> > Do note that DPDK now also has AF_XDP support.  IMHO it makes a lot
> > of sense to implement AF_XDP for FDio, and avoid the DPDK dependency.
> > (AFAIK FDio already has back-ends other than DPDK).
> >
> >
> > > 1 - I created a simple driver according to the sample in the kernel.
> > > I load my XDP program and pin the maps.
> > >
> > >   Then in the user application I create a socket, mmap the memory and
> > > push it to the xskmap in the program. All fine so far.
> > >
> > >   Then I start another instance of the user application and do the
> > > same: create a socket, mmap the memory and try to push it somewhere
> > > else into the map. But I got errno: 16 "Device or resource busy"
> > > when trying to bind.
> > >
> > >   I guess the memory can’t be mmaped 2 times, but should be
> > > shared, is that correct?
> >
> > I'm cc'ing the AF_XDP experts, as I'm not sure myself.  I mostly deal
> > with the in-kernel XDP path.  (AF_XDP is essentially kernel bypass :-O)
> >
> >
> > >   If so, I am wondering how to solve this nicely.
> > >
> > >   Can I store the value of first socket (that created the mmaped
> > > memory) in some special map in my XDP program to avoid complicated
> > > inter-process communication?
> > >
> > >   And what happens if this first socket is closed while any other
> > > sockets are still alive (using its shared mmaped memory)?
> > >
> > >   What would you recommend? Maybe you have some sample.
> >
> > We just added a sample (by Eelco, Cc'ed) to the XDP-tutorial:
> >  https://github.com/xdp-project/xdp-tutorial/tree/master/advanced03-AF_XDP
> >
> > At least read the README.org file... to get over the common gotchas.
> >
> > AFAIK the sample doesn't cover your use-case.  I guess we/someone
> > should extend the sample to illustrate how multiple interfaces can
> > share the same UMEM.
> >
> > The official documentation is:
> >  https://www.kernel.org/doc/html/latest/networking/af_xdp.html
> >
> >
> > >   Can I do also atomic operations? (I want it just for such rare
> > > cases as initialization of next socket, to check if there already is
> > > one, that mmaped the memory)
> > >
> > >
> > >
> > > 2 – We also want to do some decap/encap at the XDP layer, before
> > > redirecting the packet to the socket.
> > >
> >
> > Decap at the XDP layer is an excellent use-case that demonstrates the
> > cooperation between XDP and the AF_XDP kernel-bypass facility.
> >
> >
> > >   On the RX path it is easy, I do what I want and redirect it to the
> > > socket, but can I achieve the same on TX as well?
> > >
> >
> > (Yes, RX case is easy)
> >
> > We don't have an XDP TX hook yet... but so many people have requested
> > this, that we should add this.
> >
> > >   Can I catch the packet while TX in XDP and do something with it
> > > (encapsulate it) before sending it out?
> >
> > Usually, we recommend people use the TC egress BPF hook to do the encap
> > on TX.  For the AF_XDP use-case, the TC hook isn't there... so that is
> > not an option.  Again, an argument for an XDP-TX hook.  You could
> > of course add the encap header in your AF_XDP userspace program, but I
> > do understand it would make architectural sense for in-kernel XDP
> > to act as a decap/encap layer.
> >
> >
> > >   If so what about performance?
> > >
> >
> > For AF_XDP, the RX-side is really, really fast, even in copy-mode.
> >
> > For AF_XDP, the TX-side in copy-mode is rather slow, as it allocates
> > SKBs etc.  We could optimize this further but we have not.  When
> > enabling AF_XDP zero-copy mode, the TX-side is also super fast.
> >
> > Another hint, for AF_XDP TX-side, remember to "produce" several packets
> > before doing the sendmsg system call.  Thus, effectively doing bulking
> > on the TX-ring.
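
The bulking hint can be sketched with a toy model of the TX ring. This is
not the kernel or libbpf ring API: struct tx_ring, tx_enqueue_batch() and
tx_kick() are hypothetical stand-ins, and the kicks counter takes the
place of the real wakeup call (the xdpsock sample uses a sendto() on the
socket for that). The point is only the shape of the loop: write several
descriptors, publish the producer index once, then make one system call.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8u /* must be a power of two for the index mask */

struct desc {
        uint64_t addr; /* offset of the frame inside the umem */
        uint32_t len;  /* frame length in bytes */
};

struct tx_ring {
        struct desc descs[RING_SIZE];
        uint32_t producer; /* next slot userspace will fill */
        uint32_t consumer; /* next slot the kernel will take */
        unsigned kicks;    /* stand-in for the number of syscalls made */
};

/* Copy up to n descriptors into the ring and publish them in one step.
 * Returns how many were accepted (fewer than n if the ring is full). */
static unsigned tx_enqueue_batch(struct tx_ring *r, const struct desc *d,
                                 unsigned n)
{
        unsigned free_slots = RING_SIZE - (r->producer - r->consumer);

        assert((RING_SIZE & (RING_SIZE - 1)) == 0);
        if (n > free_slots)
                n = free_slots;
        for (unsigned i = 0; i < n; i++)
                r->descs[(r->producer + i) & (RING_SIZE - 1)] = d[i];
        /* A real shared ring needs a write barrier before this store. */
        r->producer += n;
        return n;
}

/* One kick per batch, instead of one per packet. */
static void tx_kick(struct tx_ring *r)
{
        r->kicks++;
}
```

A worker would then call tx_enqueue_batch() with everything it has ready
and follow it with a single tx_kick(), so the syscall cost is spread over
the whole batch.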
> >
> >
> > >
> > > By the way, great job with XDP ;)
> >
> > Thanks!
> >
> > --
> > Best regards,
> >   Jesper Dangaard Brouer
> >   MSc.CS, Principal Kernel Engineer at Red Hat
> >   LinkedIn: http://www.linkedin.com/in/brouer



