On Tue, 23 May 2023, Rod Webster wrote:
Date: Tue, 23 May 2023 06:02:13 +1000
From: Rod Webster <rod@xxxxxxxxxx>
To: Marcelo Tosatti <mtosatti@xxxxxxxxxx>
Cc: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>,
    linux-rt-users@xxxxxxxxxxxxxxx
Subject: Re: Excessive network latency when using Realtek R8168/R8111 et al NIC

This stuff is hard!

I just realised that rtapi_app is a red herring! rtapi_app is Linuxcnc and there is nothing wrong with it. Its thread is on a 1000us cycle so it seems it gets all its jobs done in 200us and then sleeps for 800us which makes perfect sense!

The issue we have is deeper than that. I think we should be looking at the NIC interrupt (but don't trust the novice!). The network communication is consuming more than the 800us slack from time to time. When that happens, our hardware sees the timing overrun and increments an internal packet error count. If too many of these happen in succession, the hardware decides the RT environment can't be relied on, disables further communication and returns an "error finishing read" to Linuxcnc to say it's given up.

Marcelo, we didn't resort to C. We were able to use a bash script and use a linuxcnc tool called halcmd to query the hardware as shown here.

    #!/usr/bin/bash
    stat=0
    while (($stat < 1))
    do
        stat=`halcmd getp hm2_7i96s.0.packet-error-total`
    done
    trace-cmd stop

I think we need to increase the stat threshold so we get more samples in our trace before stopping it. The current trace will only have one instance.

Thanks for letting me see the issue more clearly.

Rod Webster
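For illustration, the idea of raising the threshold could be sketched roughly as below. This is only a sketch under assumptions: the function names `read_errors` and `wait_for_errors`, the `POLL_INTERVAL` variable and the example threshold are hypothetical and not part of the original script; only the `halcmd getp hm2_7i96s.0.packet-error-total` query and `trace-cmd stop` come from the message above.

```shell
#!/usr/bin/bash
# Hypothetical helper: print the current packet error count.
# The halcmd query itself is taken from the original script.
read_errors() {
    halcmd getp hm2_7i96s.0.packet-error-total
}

# Hypothetical helper: poll until the error count reaches the given
# threshold, then print the final count.
# POLL_INTERVAL (seconds between polls) is an assumption; the original
# script polls in a tight loop with no sleep.
wait_for_errors() {
    local threshold=$1
    local count=0
    while (( count < threshold )); do
        count=$(read_errors)
        # Treat anything that is not a plain number as zero so the
        # arithmetic comparison above cannot fail on odd output.
        [[ $count =~ ^[0-9]+$ ]] || count=0
        sleep "${POLL_INTERVAL:-0.01}"
    done
    echo "$count"
}
```

With these helpers, something like `wait_for_errors 5 && trace-cmd stop` would let five packet errors accumulate before the trace is stopped. The value 5 is arbitrary. A short sleep keeps the polling loop from competing with the traced workload, at the cost of stopping the trace slightly later than a tight loop would.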
I should note that, at least for Intel MACs, the 6.3.1-rt13 and 6.4.0-rc2-rt1 kernels seem to solve the issue. I'm not sure what changed, but the maximum read time is now in the 200 to 250 usec region (about 100 usec more than average). This is the peak read latency after about 3 days of videos, compiling and local network activity.
Sadly, 6.4.0-rc3-rt2 has regressed slightly in network latency on my test systems.
My test systems were all Intel CPUs with 4 cores, isolcpus=3 and the Ethernet IRQ pinned to CPU3.

Peter Wallace
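
For anyone wanting to reproduce the IRQ pinning mentioned above, a minimal sketch follows. It assumes the standard /proc/irq/<n>/smp_affinity interface, which takes a hexadecimal CPU bitmask; the function names `cpu_to_mask` and `pin_irq` are hypothetical, and the IRQ number has to be looked up in /proc/interrupts for the NIC in question.

```shell
#!/usr/bin/bash
# Hypothetical helper: convert a CPU index into the hexadecimal bitmask
# format that /proc/irq/<n>/smp_affinity expects (CPU3 -> 8).
cpu_to_mask() {
    printf '%x\n' $(( 1 << $1 ))
}

# Hypothetical helper: pin one IRQ to one CPU. Needs root.
# $1 = IRQ number (from /proc/interrupts), $2 = CPU index.
pin_irq() {
    local irq=$1 cpu=$2
    cpu_to_mask "$cpu" > "/proc/irq/${irq}/smp_affinity"
}
```

A call such as `pin_irq 125 3` would then route IRQ 125 to CPU3. The number 125 is a made-up example. isolcpus=3 itself is a kernel boot parameter and is set on the kernel command line rather than from a script. If irqbalance is running it may rewrite the affinity, so it is probably worth checking the file again after a while.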