Re: Running XSK on a tun device

On Fri, Apr 3, 2020 at 3:07 AM Gilberto Bertin <me@xxxxxxx> wrote:
>
> I am trying to bind an XSK socket to a tun device, so that I can run some
> automated tests on an XSK based server I'm working on. A tun device would in
> fact allow me to have fine control over what packets I'm sending to and
> receiving from the server (as opposed for example to an approach where the
> server listens on a regular interface and tests interact with it over sockets).
>
> The XSK logic of the server is largely based on the one presented in the
> xdpsock_user sample in samples/bpf in the Linux kernel (the server is using the
> XDP_USE_NEED_WAKEUP bind flag).
>
> When I manually interact with it using a pair of veth devices and netcat,
> everything works as expected: the server receives and then sends back packets
> properly.
>
> The troubles start when I try to bind it to a tun device as I am not able to move
> any packet between the device and the server.
>
> I then tried to reproduce the issue with a simplified setup based on the
> xdpsock_user sample, and I got the same results (I tested different
> combinations of options such as driver mode vs skb mode, poll vs non-poll
> mode, need-wakeup vs no-need-wakeup, all with the same outcome).
>
> By inspecting more closely the behavior of the sample program I found that:
>
> - packets are actually being received in the rx ring, as poll returns 1 each time
>   something is written on the fd of the tun device
> - the program gets stuck in rx_drop() [1], more precisely in:
>
>         ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd, &idx_fq);
>         while (ret != rcvd) {
>                 if (ret < 0)
>                         exit_with_error(-ret);
>                 if (xsk_ring_prod__needs_wakeup(&xsk->umem->fq))
>                         ret = poll(fds, num_socks, opt_timeout);
>                 ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd, &idx_fq);
>         }
>
>   where xsk_ring_prod__reserve keeps returning 0.

Which kernel version are you running? If my memory serves me right, in
versions prior to 5.6 the global state that signals that there is space
available in the fill ring was updated lazily. If you are not using the
latest kernel, could you please try it? It might give us some hints
about what is going on.

I should also say that the sample program is quite simplistic. If you
cannot reserve entries in the fill ring at some point, you should
just go ahead and do something else (receive, for example) and come
back later to try again. It is not critical to always be able to
refill it immediately, even though it is good practice in a
high-performance situation to keep it as full as possible to minimize
the risk of packet loss.
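
A minimal sketch of that "try once, then move on" pattern, in case it
helps. To keep it self-contained it uses a toy ring rather than libbpf:
struct toy_ring, toy_reserve(), struct refill_state and refill_try() are
all hypothetical names made up for this illustration. The only property
borrowed from the real API is that xsk_ring_prod__reserve() is
all-or-nothing (it returns either the requested count or 0).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a fill ring; NOT libbpf's struct xsk_ring_prod. */
struct toy_ring {
	uint32_t size; /* number of slots in the ring */
	uint32_t prod; /* slots handed out to the producer so far */
	uint32_t cons; /* slots the consumer (the kernel, in reality) has taken */
};

/* All-or-nothing reserve, mirroring xsk_ring_prod__reserve(): nb or 0. */
static uint32_t toy_reserve(struct toy_ring *r, uint32_t nb, uint32_t *idx)
{
	uint32_t used = r->prod - r->cons;

	assert(used <= r->size);
	if (r->size - used < nb)
		return 0;
	*idx = r->prod;
	r->prod += nb;
	return nb;
}

/* Frames that were received but not yet handed back to the fill ring. */
struct refill_state {
	uint32_t pending;
};

/*
 * Try exactly once to hand the pending frames back. On failure the frames
 * stay pending and the caller carries on with RX/TX instead of spinning.
 * Returns the number of frames handed back (0 or the pending count).
 */
static uint32_t refill_try(struct toy_ring *fq, struct refill_state *st)
{
	uint32_t idx, done;

	if (st->pending == 0)
		return 0;
	if (toy_reserve(fq, st->pending, &idx) != st->pending)
		return 0; /* no room right now: retry on a later iteration */

	/* A real program would write the frame addresses at idx and submit. */
	done = st->pending;
	st->pending = 0;
	return done;
}
```

The main loop would call refill_try() once per iteration, next to the
receive path, instead of looping until the reserve succeeds.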

Note that there is no zero-copy support for TUN, but there is XDP
support, so copy mode and XDP_DRV should work. Also note that I have
never tried TUN with AF_XDP, so you may have stumbled upon something
new.
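
For reference, a rough sketch of the flag combination that corresponds
to "copy mode and XDP_DRV". The constants come from the kernel uapi
headers; struct tun_xsk_flags and tun_xsk_flags_copy_drv() are
hypothetical helpers for this example only. The two values are meant to
end up in the xdp_flags and bind_flags fields of libbpf's
struct xsk_socket_config. As said above, this is untested on TUN.

```c
#include <assert.h>
#include <stdint.h>
#include <linux/if_link.h> /* XDP_FLAGS_DRV_MODE */
#include <linux/if_xdp.h>  /* XDP_COPY, XDP_ZEROCOPY, XDP_USE_NEED_WAKEUP */

/* Hypothetical container for the two flag words; not a libbpf type. */
struct tun_xsk_flags {
	uint32_t xdp_flags;  /* for xsk_socket_config.xdp_flags */
	uint16_t bind_flags; /* for xsk_socket_config.bind_flags */
};

/* Driver-mode XDP with an explicit request for copy mode. */
static struct tun_xsk_flags tun_xsk_flags_copy_drv(int need_wakeup)
{
	struct tun_xsk_flags f = {
		.xdp_flags = XDP_FLAGS_DRV_MODE,
		.bind_flags = XDP_COPY,
	};

	if (need_wakeup)
		f.bind_flags |= XDP_USE_NEED_WAKEUP;

	/* Sanity check: zero-copy must never be requested here. */
	assert((f.bind_flags & XDP_ZEROCOPY) == 0);
	return f;
}
```

Requesting XDP_COPY explicitly makes the bind fail loudly rather than
silently picking another mode if copy mode is not available.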

/Magnus

> I'm not sure why this is happening as most of the descriptors in the fill ring
> should be available (especially since this exact same code works fine for other
> devices like veth).
>
> As I'm still getting acquainted with the codebase, it's not obvious to me
> where I should start looking to understand the underlying cause of this
> issue, so I'd really appreciate some help/pointers on this.
>
> Cheers,
> Gilberto
>
> [1] https://github.com/torvalds/linux/blob/8ed47e140867a6e7d56170f325c8d4fdee6d6b66/samples/bpf/xdpsock_user.c#L873-L880


