Re: Excessive network latency when using Realtek R8168/R8111 et al NIC

Rod Webster <rod@xxxxxxxxxx> · Fri, 19 May 2023 21:41:30 +1000

There is something weird going on, I have an i5 PC with R8111 NIC
which was problematic until I installed the R8168-dkms driver and the
6.3 kernel I compiled. It regularly threw packet errors on a RT thread
that fires every 1 ms.
I wanted to revert it so I could do these tests.
So I purged  the R8168-dkms driver to revert to the R8169 and rolled
back to a Debian 6.1 RT kernel. It's still running isolcpus=2,3
The darn thing refused to generate a packet error until I dropped the
thread timing to fire every 0.15 ms.
So it seems the process of installing and uninstalling has changed the
driver environment.
I do recall having something similar accidentally happen to another PC
once before.  I think it was related to the install and uninstall of
the 8168-dkms driver.
Any ideas what could be going on?

I will do some testing tomorrow now I know what's going on but I
wonder how relevant it will be given this turnaround?

Rod Webster

Rod Webster

VMN®

www.vmn.com.au

Ph: 1300 896 832

Mob: +61 435 765 611

On Fri, 19 May 2023 at 18:37, Sebastian Andrzej Siewior
<bigeasy@xxxxxxxxxxxxx> wrote:
>
> On 2023-05-18 20:15:20 [+1000], Rod Webster wrote:
> > Sebastian,
> Hi Rod,
>
> > I have cc'd Marcelo here as he has offered to help.
> >
> > I am attaching the last 1000 lines or so of our trace (I hope
> > attachments are allowed)
> >
> > Our bash script to terminate the trace was called PacketErrorTest. It
> > loops until the hardware reports a packet error.
> > The script contains:
> > #!/usr/bin/bash
> > stat=0
> > while (($stat < 1))
> > do
> > stat=`halcmd getp hm2_7i96s.0.packet-error-total`
> > done
> > trace-cmd stop
> >
> > I must say we have no idea what we are looking at but can't see
> > anything obvious to a layperson.
> > Please advise next steps
>
> I don't see much wrong here. Maybe the trace isn't long enough.
> However what I saw is that your networking interrupt isn't threaded and
> then all networking is outsourced to ksoftirqrd which might get less
> processing time if there is much going on.
> It would help if you enable NAPI threads: As root, please do
>
>         echo 1 > /sys/class/net/enp2s0/threaded
>
> This will spawn a few threads named "napi/enp2s0-$napi_id. The
> actual number of threads depends on the number of NAPI instances
> (usually queues) used by the driver.
> Once you identified them, please pin them the right CPU and adjust the
> priority:
>         taskset -pc 2 $pid
>         chrt -p -f 50 $pid
>
> The taskset command will pin the NAPI thread to CPU2. This is what I see
> in your trace where the interrupt is active. It is good to have them
> (the interrupt and the interrupt thread or the NAPI thread in your
> case) running on the same CPU. The chrt will adjust the priority of the
> thread from SCHED_OTHER to a SCHED_FIFO priority 50.
>
> By default all interrupt threads run at priority 50 (middle priority).
> They get preferred over all tasks with lower priority including
> SCHED_OTHER tasks. SCHED_OTHER (the default scheduling class) tasks get
> preempted once they used their time slice so that they don't lock up the
> system. This isn't the case with SCHED_FIFO.
>
> This realtek/r8169 driver is not using interrupts threads in your
> configuration. While I don't see any API breakage I am rethinking the
> whole softirq concept and moving to NAPI threads might not be that bad.
> Be aware that _now_ you don't have any "overloading" concept. I.e. what
> should be done if the system is receiving a lot of packages that it
> didn't ask for (say ping flood or so). Before this change, the NAPI
> would hand over to ksoftirqd running at SCHED_OTHER so it would be
> balanced with other tasks within the system.
>
> An explanation of the trace you sent:
> |  PacketErrorTest-2552  [002]   191.050337: irq_handler_entry:    irq=125 name=enp2s0
>    Running Task          <CPU>               Interrupt occurred    network card
>
> |  PacketErrorTest-2552  [002]   191.050339: softirq_raise:        vec=3 [action=NET_RX]
> |  PacketErrorTest-2552  [002]   191.050339: irq_handler_exit:     irq=125 ret=handled
> |  PacketErrorTest-2552  [002]   191.050340: sched_waking:         comm=ksoftirqd/2 pid=37 prio=120 target_cpu=002
> |  PacketErrorTest-2552  [002]   191.050343: sched_wakeup:         ksoftirqd/2:37 [120] CPU:002
>
> It scheduled NAPI (NET_RX) which is processed in softirq. Since this
> happened in hardirq context ("trace-cmd report -l" would reveal that) it
> wakes ksoftirqd for the processing.
>
> |  PacketErrorTest-2552  [002]   191.050344: sched_stat_runtime:   comm=PacketErrorTest pid=2552 runtime=587973 [ns] vruntime=15925574432 [ns]
> |  PacketErrorTest-2552  [002]   191.050346: sched_switch:         PacketErrorTest:2552 [120] R ==> ksoftirqd/2:37 [120]
> |      ksoftirqd/2-37    [002]   191.050347: softirq_entry:        vec=3 [action=NET_RX]
> |      ksoftirqd/2-37    [002]   191.050351: softirq_exit:         vec=3 [action=NET_RX]
>
> Here receiving of network packets is completed. Timing wise,
> 191.050351 - 191.050337 = 14us after the interrupt triggered. Looking
> through the whole file, it ranges from 6us to 14us (17 samples).
> So it doesn't look bad. If processing would get preempted and would need
> a few millisecond then it would be bad.
>
> > Rod Webster
> >
> > VMN®
> >
> > www.vmn.com.au
>
> Sebastian