Re: ixgbe: xdpsock txonly hangs and does not complete

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 11 Oct 2023 at 08:06, Srivats P <pstavirs@xxxxxxxxx> wrote:
>
> Hi,
>
> While debugging a TX problem with my AF_XDP app it seems that there
> might be a bug in the ixgbe driver (see thread:
> https://www.spinics.net/lists/xdp-newbies/msg02406.html)
>
> So I decided to try xdpsock txonly and I see a similar behaviour as my
> app with xdpsock as well.
>
> Essentially, after sending 'n' packets on Tx ring, the app (or xdpsock
> for that matter) expects to free 'n' packets from the completion ring,
> but it either gets less or sometimes more packets/descriptors from the
> completion ring.
>
> Please see the log below.
>
> <log>
> root@tditwtga002:~# ./xdpsock -t -i eno49
>
>  sock0@eno49:0 txonly xdp-drv
>                    pps            pkts           1.00
> rx                 0              0
> tx                 7,388,073      7,390,400
>
>  sock0@eno49:0 txonly xdp-drv
>                    pps            pkts           1.00
> rx                 0              0
> tx                 9,689,548      17,082,880
>
>  sock0@eno49:0 txonly xdp-drv
>                    pps            pkts           1.00
> rx                 0              0
> tx                 9,660,591      26,745,152
>
>  sock0@eno49:0 txonly xdp-drv
>                    pps            pkts           1.00
> rx                 0              0
> tx                 9,690,513      36,437,184
>
>  sock0@eno49:0 txonly xdp-drv
>                    pps            pkts           1.00
> rx                 0              0
> tx                 9,684,898      46,123,840
>
>  sock0@eno49:0 txonly xdp-drv
>                    pps            pkts           1.00
> rx                 0              0
> tx                 9,688,898      55,815,168
>
>  sock0@eno49:0 txonly xdp-drv
>                    pps            pkts           1.00
> rx                 0              0
> tx                 9,654,194      65,471,488
>
>  sock0@eno49:0 txonly xdp-drv
>                    pps            pkts           1.00
> rx                 0              0
> tx                 9,681,793      75,154,880
>
>  sock0@eno49:0 txonly xdp-drv
>                    pps            pkts           1.00
> rx                 0              0
> tx                 9,664,164      84,821,376
>
>  sock0@eno49:0 txonly xdp-drv
>                    pps            pkts           1.00
> rx                 0              0
> tx                 9,681,688      94,504,768
>
>  sock0@eno49:0 txonly xdp-drv
>                    pps            pkts           1.00
> rx                 0              0
> tx                 9,637,019      104,143,552
>
>  sock0@eno49:0 txonly xdp-drv
>                    pps            pkts           1.00
> rx                 0              0
> tx                 9,684,780      113,830,720
> ^Coutstanding 2466 (-64)
> outstanding 2402 (-64)
> outstanding 2338 (-64)
> outstanding 2274 (-64)
> outstanding 2210 (-64)
> outstanding 2146 (-64)
> outstanding 2082 (-64)
> outstanding 2020 (-62)
> outstanding 1956 (-64)
> outstanding 1892 (-64)
> outstanding 1828 (-64)
> outstanding 1764 (-64)
> outstanding 1700 (-64)
> outstanding 1636 (-64)
> outstanding 1572 (-64)
> outstanding 1510 (-62)
> outstanding 1446 (-64)
> outstanding 1382 (-64)
> outstanding 1318 (-64)
> outstanding 1254 (-64)
> outstanding 1190 (-64)
> outstanding 1126 (-64)
> outstanding 1062 (-64)
> outstanding 1000 (-62)
> outstanding 936 (-64)
> outstanding 872 (-64)
> outstanding 808 (-64)
> outstanding 744 (-64)
> outstanding 680 (-64)
> outstanding 616 (-64)
> outstanding 552 (-64)
> outstanding 490 (-62)
> outstanding 426 (-64)
> outstanding 362 (-64)
> outstanding 298 (-64)
> outstanding 234 (-64)
> outstanding 170 (-64)
> outstanding 106 (-64)
> outstanding 42 (-64)
> outstanding 1 (-41)
>
>  sock0@eno49:0 txonly xdp-drv
>                    pps            pkts           1.00
> rx                 0              0
> tx                 5,625,834      119,457,536
>
> ^C
>
> ^C
>
> ^C
> ^[[A^H^H^H^H^C
> ^C
> ^C
> ^C^Z
> [1]+  Stopped                 ./xdpsock -t -i eno49
> root@tditwtga002:~#
> </log>
>
> As you can see, the code gets into a infinite loop waiting for 1
> descriptor which never appears on the completion ring.
>
> Note that I added the code to print the outstanding count and also
> call complete_tx_only_all() in case of Ctrl-C also and not just when
> number of packets were specified. Here's the exact diff (the sample
> code was taken from 5.15 Linux kernel tree) -
>
> <diff>
> --- xdpsock_user.c.orig 2023-10-11 11:20:47.553580604 +0530
> +++ xdpsock_user.c      2023-10-07 12:18:33.849399960 +0530
> @@ -1174,7 +1174,7 @@ static inline void complete_tx_l2fwd(str
>  }
>
>  static inline void complete_tx_only(struct xsk_socket_info *xsk,
> -                                   int batch_size)
> +                                   int batch_size, bool print)
>  {
>         unsigned int rcvd;
>         u32 idx;
> @@ -1191,6 +1191,9 @@ static inline void complete_tx_only(stru
>         if (rcvd > 0) {
>                 xsk_ring_cons__release(&xsk->umem->cq, rcvd);
>                 xsk->outstanding_tx -= rcvd;
> +                if (print)
> +                    fprintf(stderr, "outstanding %u (-%02u) \n",
> +                            xsk->outstanding_tx, rcvd);
>         }
>  }
>
> @@ -1271,7 +1274,7 @@ static void tx_only(struct xsk_socket_in
>
>         while (xsk_ring_prod__reserve(&xsk->tx, batch_size, &idx) <
>                                       batch_size) {
> -               complete_tx_only(xsk, batch_size);
> +               complete_tx_only(xsk, batch_size, false);
>                 if (benchmark_done)
>                         return;
>         }
> @@ -1288,7 +1291,7 @@ static void tx_only(struct xsk_socket_in
>         xsk->outstanding_tx += batch_size;
>         *frame_nb += batch_size;
>         *frame_nb %= NUM_FRAMES;
> -       complete_tx_only(xsk, batch_size);
> +       complete_tx_only(xsk, batch_size, false);
>  }
>
>  static inline int get_batch_size(int pkt_cnt)
> @@ -1311,7 +1314,7 @@ static void complete_tx_only_all(void)
>                 pending = false;
>                 for (i = 0; i < num_socks; i++) {
>                         if (xsks[i]->outstanding_tx) {
> -                               complete_tx_only(xsks[i], opt_batch_size);
> +                               complete_tx_only(xsks[i], opt_batch_size, true);
>                                 pending = !!xsks[i]->outstanding_tx;
>                         }
>                 }
> @@ -1353,7 +1356,9 @@ static void tx_only_all(void)
>                         break;
>         }
>
> +#if 0
>         if (opt_pkt_count)
> +#endif
>                 complete_tx_only_all();
>  }
> </diff>
>
> Distro/Kernel:
> Ubuntu 22.04.3 LTS (GNU/Linux 5.15.0-86-generic x86_64)

Thanks for reporting. Could you please try with a bleeding edge kernel
please and check if the problem is still there? 6.5 for example.

> Driver:
> # ethtool -i eno49
> driver: ixgbe
> version: 5.19.6
> firmware-version: 0x80000887, 1.2688.0
> expansion-rom-version:
> bus-info: 0000:04:00.0
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> supports-priv-flags: yes
>
> I would be grateful if someone can try the same and let me know if
> they see a similar behaviour.
>
> Thanks in advance,
> Srivats



[Index of Archives]     [Linux Networking Development]     [Fedora Linux Users]     [Linux SCTP]     [DCCP]     [Gimp]     [Yosemite Campsites]

  Powered by Linux