Re: Excessive network latency when using Realtek R8168/R8111 et al NIC

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, May 22, 2023 at 08:06:39PM +1000, Rod Webster wrote:
> Yes, I won't be using a DKMS driver and will test with the in-tree driver.
> I rolled a PC back to the Debian 6.1 RT kernel and the default in-tree
> R8169 driver yesterday with isolcpus=2,3.
> 
> I had a shot at it last night with the napi thread but was confused by
> the trace results so will try again.
> I could not see our PacketErrorTest script appearing in the trace but
> it stopped the trace when it should.

Rod,

Can you describe the PacketErrorTest script and how do you know, from
it, that latency has crossed the threshold? What was confusing about
the traces?

Earlier you wrote:
I've had some feedback from a hardware supplier.
The Linuxcnc RT ethernet driver has a pin that reports ethernet packet
error count.
This increments when the ethernet response exceeds our real time servo
thread period (typically 1 ms)
We should be able to monitor this pin and stop the trace if it is >  0
without too much effort.

So you are seeing that pin increase its value?

When you do, you should write to the "tracing_on" file (a zero):

        /*
         * if we're not tracing and the tracing_on fd is not open,
         * open the tracing_on file so that we can stop the trace
         * if we hit a breaktrace threshold
         */
        if (trace_fd < 0) {
                sprintf(path, "%s/%s", fileprefix, "tracing_on");
                if ((trace_fd = open(path, O_WRONLY)) < 0)
                        warn("unable to open tracing_on file: %s\n", path);
        }

        /* now stop any trace */
        write(trace_fd, "0\n", 2);

The trace file (/sys/kernel/debug/tracing/trace) should now contain
the trace up to the write to tracing_on file.

Sebastian: didnt see anything weird in the traces as well:

"Follows the delta between a list of timestamped events,
where each event is: "TIMESTAMP: rtapi is scheduled IN CPU-3"
(there is at one time only a single process scheduled in cpu-3).

[191.050245, 191.050278]: 0.000033
[191.050278, 191.050293]: 0.000015
[191.050293, 191.050307]: 0.000014
[191.050307, 191.050322]: 0.000015
[191.050322, 191.050336]: 0.000014

rtapi_app goes to sleep for approximately 15us, and is awakened
by a timer.

[191.050336, 191.050350]: 0.000014
[191.050350, 191.050365]: 0.000015
[191.050365, 191.050379]: 0.000014
[191.050379, 191.050393]: 0.000014
[191.050393, 191.050408]: 0.000015
[191.050408, 191.051243]: 0.000835
[191.051243, 191.051272]: 0.000029

It eventually sleeps for 800us (and this pattern repeats)".

Perhaps tracing the system call which sends out the packet would be
useful as well.

> Also our RT process did not appear to be on an isolated thread which
> seemed odd. I did not get as far as setting the affinity for napi.
> 
> I've also installed  Bookworm RC3 on another PC with the default RT
> kernel on a Gigabyte Brix N3160 with an R8111/R8168 NIC and the
> in-tree R8169 driver.
> This PC historically had max latency of about 130000 ns  back with
> kernel 4.x and now its reporting 23000 ns with the 6.1 kernel over 24
> hours testing with the linuxcnc tools. Quite amazing what you guys
> have achieved! It will be a few days before I can lug my hardware to
> work to test on it as well for another data point.
> 
> I really appreciate your willingness to help. This remains a
> frustrating issue for many Linuxcnc users who are using my compiled
> 6.3 kernel to get out of jail.
> Bear with me as I bumble my way through!
> 
> Rod Webster
> 
> 
> 
> 
> 
> Rod Webster
> 
> VMN®
> 
> www.vmn.com.au
> 
> Ph: 1300 896 832
> 
> Mob: +61 435 765 611
> 
> 
> 
> 
> On Mon, 22 May 2023 at 19:32, Sebastian Andrzej Siewior
> <bigeasy@xxxxxxxxxxxxx> wrote:
> >
> > On 2023-05-19 21:41:30 [+1000], Rod Webster wrote:
> > > I will do some testing tomorrow now I know what's going on but I
> > > wonder how relevant it will be given this turnaround?
> >
> > I'm not going to look at the dkms/OOT driver. If there is an actual
> > problem or something odd with mainline and RT and I will look into it.
> > I managed to find an older box which is using the r8169 and I'm going to
> > boot and see if there the driver is doing something odd what I didn't
> > see earlier…
> >
> > Please make sure that you are using the in-tree driver. If you
> > experience high latencies then I'm willing to help investigate.
> >
> > > Rod Webster
> >
> > Sebastian
> 
> 




[Index of Archives]     [RT Stable]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]

  Powered by Linux