Hi Ramon, On 01/05/24 11:59 pm, Ramón Nordin Rodriguez wrote: > EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe > >>> n | name | min | avg | max | rx dropped | samples >>> 1 | no mod | 827K | 846K | 891K | 945 | 5 >>> 2 | no log | 711K | 726K | 744K | 562 | 5 >>> 3 | less irq | 815K | 833K | 846K | N/A | 5 >>> 4 | no irq | 914K | 924K | 931K | N/A | 5 >>> 5 | simple | 857K | 868K | 879K | 615 | 5 >> >> That is odd. >> >> Side question: What CONFIG_HZ= do you have? 100, 250, 1000? Try >> 1000. I've seen problems where the driver wants to sleep for a short >> time, but the CONFIG_HZ value limits how short a time it can actually >> sleep. It ends up sleeping much longer than it wants. >> > > I have been doing my best to abuse the link some more. In brief tweaking > CONFIG_HZ has some but limited effect. > Saturating the link with the rx buffer interrupt enabled breaks the driver. > Saturating the link with the rx buffer interrupt disabled has poor > performance. > > The following scenario has been tested. Both ends of the link run: > * server.py > * client.py > > One end is an arm64 quad core running at 1.2GHz with the lan8650 macphy. > The other end is an amd 3950x running the lan8670 usb eval board. > Both systems should be fast enough that running python should not be a > limiting factor. > > -- The test code -- > server.py > #!/bin/env python3 > import socket > > def serve(sock: socket.socket): > while True: > client, addr = sock.accept() > print(f'connection from: {addr}') > while len(client.recv(2048)) > 0: > pass > print('client disconnected') > client.close() > > if __name__ == '__main__': > sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) > sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1) > sock.bind(('0.0.0.0', 4040)) > sock.listen(1) > serve(sock) > print("something went wrong") > > client.py > #!/bin/env python3 > import socket > import sys > > if __name__ == '__main__': > sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) > sock.connect((sys.argv[1], 4040)) > > while True: > sock.sendall(b'0'*2048) > > -- test runs -- > run 1 - all interrupts enabled > Time to failure: > 1 min or less > > Kernel output: > [ 94.361312] sched: RT throttling activated > > top output: > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 145 root -51 0 0 0 0 R 95.5 0.0 1:11.22 oa-tc6-spi-thread > > link stats: > 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000 > link/ether 32:c2:7e:22:93:99 brd ff:ff:ff:ff:ff:ff > RX: bytes packets errors dropped missed mcast > 3371902 7186 0 48 0 0 > RX errors: length crc frame fifo overrun > 0 0 0 0 0 > TX: bytes packets errors dropped carrier collsns > 10341438 8071 0 0 0 0 > TX errors: aborted fifo window heartbt transns > 0 0 0 0 1 > state: > Completly borked, can't ping in or out, bringing the interface down then up > has no effect. > There is no SPI clock and no interrupts generated by the mac-phy. > The worker thread seems to have live locked. > > -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- > run 2 - RX_BUFFER_OVERLOW interrupt disabled > > state: > Runs just fine but the oa-tc6-spi-thread is consuming 10-20% cpu > Ping times have increased from 1-2ms to 8-35ms > > > -- additional notes -- > When tweaking CONFIG_HZ I do get some changes in behaviour, the cpu > consumption stays stable at 20%+-2 with CONFIG_HZ=250, when increased to > CONFIG_HZ=1000 it jumps up and down between 10-20%. > > I don't have access to a logic analyzer but my old oscilloscope is > almost reliable. I could confirm that the spi clock is indeed running at > the expected 25MHz, but I could observe some gaps of up to 320µs so > that's 8k spi cycles spent doing something else. > These gaps were observed on the SPI clock and the macphy interrupt was > active for the same ammount of time(though this was measured independently > and not on the same trigger). > I've been drinking way to much coffe, so soldering is not gonna happen > today (shaky hands), but if it helps I can solder wires to attach both > probes to confirm that the gap in the SPI clock happens at the same time > or not as the interrupt is active. > > I'd be keen on hearing what Microchips plans to address. If tracking > down performance issues is a priority I'll probably not spend any time > on it, if not then I'll definetly dig into it more. I tried to reproduce the issue in my setup with your above applications. But surprisingly I couldn't reproduce the issue you are facing. One end is Raspberry Pi 4 with lan8651 MAC-PHY and the other end is Raspberry Pi 4 with EVB-LAN8670-USB Stick. lan8651 MAC-PHY side: --------------------- pi@raspberrypi:~/lan865x/v4_mainline $ python3 server.py connection from: ('192.168.5.101', 46608) Top output: ----------- top - 14:28:14 up 1:14, 4 users, load average: 0.68, 0.67, 0.63 Tasks: 201 total, 1 running, 200 sleeping, 0 stopped, 0 zombie %Cpu(s): 0.2 us, 1.1 sy, 0.0 ni, 98.7 id, 0.0 wa, 0.0 hi, 0.1 si, 0.0 st MiB Mem : 7810.0 total, 7110.5 free, 392.3 used, 432.4 buff/cache MiB Swap: 100.0 total, 100.0 free, 0.0 used. 7417.7 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 7219 root -51 0 0 0 0 S 2.7 0.0 1:13.11 oa-tc6-spi-thread 35307 pi 20 0 16280 9216 5248 S 0.7 0.1 0:16.53 python3 88831 pi 20 0 11728 4864 2816 R 0.3 0.1 0:00.22 top 89819 vnc 20 0 2320 384 384 S 0.3 0.0 0:00.04 sh 1 root 20 0 168652 11580 8436 S 0.0 0.1 0:03.85 systemd 2 root 20 0 0 0 0 S 0.0 0.0 0:00.01 kthreadd 3 root 20 0 0 0 0 S 0.0 0.0 0:00.00 pool_workqueue_release 4 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kworker/R-rcu_g 5 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kworker/R-rcu_p 6 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kworker/R-slub_ 7 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kworker/R-netns 11 root 20 0 0 0 0 I 0.0 0.0 0:00.03 kworker/u8:0-ext4-rsv-conversion 12 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kworker/R-mm_pe ifconfig output: ---------------- eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500 inet 192.168.5.100 netmask 255.255.255.0 broadcast 0.0.0.0 ether 04:05:06:01:02:03 txqueuelen 1000 (Ethernet) RX packets 1589879 bytes 2391045582 (2.2 GiB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 1022419 bytes 71954905 (68.6 MiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 device interrupt 55 dmesg output: ------------- pi@raspberrypi:~/lan865x/v4_mainline $ dmesg [ 234.019968] LAN865X Rev.B0 Internal Phy spi0.0:00: attached PHY driver (mii_bus:phy_addr=spi0.0:00, irq=POLL) [ 234.069387] lan865x spi0.0 eth1: Link is Up - 10Mbps/Half - flow control off EVB-LAN8670-USB stick side: --------------------------- pi@pv-rpi-tp2:~/microchip/t1s-usb $ python3 client.py 192.168.5.100 ifconfig output: ---------------- eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500 inet 192.168.5.101 netmask 255.255.255.0 broadcast 192.168.5.255 ether 00:1e:c0:d1:ca:39 txqueuelen 1000 (Ethernet) RX packets 1034606 bytes 58330335 (55.6 MiB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 1722400 bytes 2628199197 (2.4 GiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 Let me know if you want to try anything more. Best regards, Parthiban V > > Let me know if anything is unclear or if I can help out with anything > specific. > > R