Hi Jason and Leon,

We recently switched from MLNX-OFED to upstream OFED, and we noticed that IPoIB stopped working with upstream kernel 5.4.102 on Mellanox CX-5 HCAs; it works fine on CX-2/CX-3. I also tested a 5.11 kernel, and it behaves the same.

The symptoms: the IPoIB child interfaces are UP and ready, but ping doesn't work at all, and a simple ifdown/ifup of the child interface doesn't change anything.

The workaround is to bring up the parent interface with "ip link set ib0 up".

Basic config from "ip a":

jwang@xxxxxxxxxxxxxx:~$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 0c:c4:7a:ff:07:d0 brd ff:ff:ff:ff:ff:ff
    inet 10.41.3.146/22 brd 10.41.3.255 scope global eth0
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 0c:c4:7a:ff:07:d1 brd ff:ff:ff:ff:ff:ff
4: ib0: <BROADCAST,MULTICAST> mtu 4092 qdisc noop state DOWN group default qlen 1024
    link/infiniband 00:00:11:07:fe:80:00:00:00:00:00:00:98:03:9b:03:00:66:de:52 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
5: ib1: <BROADCAST,MULTICAST> mtu 4092 qdisc noop state DOWN group default qlen 1024
    link/infiniband 00:00:19:07:fe:80:00:00:00:00:00:00:98:03:9b:03:00:66:de:53 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
6: ib0.beef@ib0: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 4092 qdisc mq state UP group default qlen 1024
    link/infiniband 00:00:11:4b:fe:80:00:00:00:00:00:00:98:03:9b:03:00:66:de:52 brd 00:ff:ff:ff:ff:12:40:1b:be:ef:00:00:00:00:00:00:ff:ff:ff:ff
    inet 10.42.3.146/20 brd 10.42.15.255 scope global ib0.beef
       valid_lft forever preferred_lft forever
    inet6 fe80::9a03:9b03:66:de52/64 scope link
       valid_lft forever preferred_lft forever
7: ib0.dddd@ib0: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 4092 qdisc mq state UP group default qlen 1024
    link/infiniband 00:00:12:87:fe:80:00:00:00:00:00:00:98:03:9b:03:00:66:de:52 brd 00:ff:ff:ff:ff:12:40:1b:dd:dd:00:00:00:00:00:00:ff:ff:ff:ff
    inet6 2a02:247f:401:1:2:0:a:392/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::9a03:9b03:66:de52/64 scope link
       valid_lft forever preferred_lft forever
8: ib1.beef@ib1: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 4092 qdisc mq state UP group default qlen 1024
    link/infiniband 00:00:19:4b:fe:80:00:00:00:00:00:00:98:03:9b:03:00:66:de:53 brd 00:ff:ff:ff:ff:12:40:1b:be:ef:00:00:00:00:00:00:ff:ff:ff:ff
    inet 10.43.3.146/20 brd 10.43.15.255 scope global ib1.beef
       valid_lft forever preferred_lft forever
    inet6 fe80::9a03:9b03:66:de53/64 scope link
       valid_lft forever preferred_lft forever
9: ib1.dddd@ib1: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 4092 qdisc mq state UP group default qlen 1024
    link/infiniband 00:00:1a:87:fe:80:00:00:00:00:00:00:98:03:9b:03:00:66:de:53 brd 00:ff:ff:ff:ff:12:40:1b:dd:dd:00:00:00:00:00:00:ff:ff:ff:ff
    inet6 2a02:247f:402:1:2:0:a:392/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::9a03:9b03:66:de53/64 scope link
       valid_lft forever preferred_lft forever

jwang@xxxxxxxxxxxxxx:~$ dmesg | egrep 'mlx|ib'
[ 0.000000] Command line: BOOT_IMAGE=(http)/live-images/liveboot-2021.76/vmlinuz BOOTIF=0c:c4:7a:ff:07:d0 boot=live fetch=http://mgmt/live-images/liveboot-2021.76/root.squashfs consoleblank=0 PHASE=Testing crashkernel=512M quiet salt-master=salt-master.stg.profitbricks.net saltenv=base pillarenv=base ib_ipoib.debug_level=1 liveboot.sdn2
[ 0.889525] Kernel command line: BOOT_IMAGE=(http)/live-images/liveboot-2021.76/vmlinuz BOOTIF=0c:c4:7a:ff:07:d0 boot=live fetch=http://mgmt/live-images/liveboot-2021.76/root.squashfs consoleblank=0 PHASE=Testing crashkernel=512M quiet salt-master=salt-master.stg.profitbricks.net saltenv=base pillarenv=base ib_ipoib.debug_level=1 liveboot.sdn2
[ 1.997444] Calibrating delay loop (skipped), value calculated using timer frequency.. 4200.00 BogoMIPS (lpj=21000000)
[ 2.422119] MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.
[ 2.422119] TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.
[ 2.992059] pci_bus 0000:03: extended config space not accessible
[ 3.024991] pci 0000:03:00.0: vgaarb: bridge control possible
[ 5.287548] tsc: Refined TSC clocksource calibration: 2099.999 MHz
[ 16.839146] systemd[1]: File /lib/systemd/system/systemd-journald.service:12 configures an IP firewall (IPAddressDeny=any), but the local system does not support BPF/cgroup based firewalling.
[ 16.874155] systemd[1]: /lib/systemd/system/tap-offloads-trk.service:10: PIDFile= references path below legacy directory /var/run/, updating /var/run/tap-offloads-trk.pid → /run/tap-offloads-trk.pid; please update the unit file accordingly.
[ 16.893383] systemd[1]: Listening on initctl Compatibility Named Pipe.
[ 23.244067] mlx5_core 0000:af:00.0: firmware version: 16.27.2008
[ 23.244103] mlx5_core 0000:af:00.0: 126.016 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x16 link)
[ 23.274277] libata version 3.00 loaded.
[ 23.555901] mlx5_core 0000:af:00.0: Port module event: module 0, Cable plugged
[ 23.556314] mlx5_core 0000:af:00.0: mlx5_pcie_event:296:(pid 7): PCIe slot advertised sufficient power (75W).
[ 23.573895] mlx5_core 0000:af:00.1: firmware version: 16.27.2008
[ 23.573950] mlx5_core 0000:af:00.1: 126.016 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x16 link)
[ 23.885989] mlx5_core 0000:af:00.1: Port module event: module 1, Cable plugged
[ 23.886133] mlx5_core 0000:af:00.1: mlx5_pcie_event:296:(pid 3256): PCIe slot advertised sufficient power (75W).
[ 27.924069] mlx5_core 0000:af:00.0: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 27.924076] mlx5_core 0000:af:00.0: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 27.999211] ib0: Not flushing - IPOIB_FLAG_ADMIN_UP not set.
[ 28.000387] mlx5_core 0000:af:00.1: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 28.000393] mlx5_core 0000:af:00.1: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 28.086111] ib1: Not flushing - IPOIB_FLAG_ADMIN_UP not set.
[ 29.415045] ib0: Event 12 on device mlx5_0 port 1
[ 29.415147] ib0: Not flushing - IPOIB_FLAG_ADMIN_UP not set.
[ 29.415661] ib0: Event 12 on device mlx5_0 port 1
[ 29.415742] ib0: Not flushing - IPOIB_FLAG_ADMIN_UP not set.
[ 29.416497] ib0: Event 12 on device mlx5_0 port 1
[ 29.416591] ib0: Not flushing - IPOIB_FLAG_ADMIN_UP not set.
[ 29.419656] ib0: Event 17 on device mlx5_0 port 1
[ 29.419669] ib0: Not flushing - IPOIB_FLAG_INITIALIZED not set.
[ 29.420226] ib0: Event 11 on device mlx5_0 port 1
[ 29.420240] ib0: Not flushing - IPOIB_FLAG_INITIALIZED not set.
[ 29.420257] ib1: Event 12 on device mlx5_1 port 1
[ 29.420317] ib1: Not flushing - IPOIB_FLAG_ADMIN_UP not set.
[ 29.420840] ib1: Event 12 on device mlx5_1 port 1
[ 29.420898] ib1: Not flushing - IPOIB_FLAG_ADMIN_UP not set.
[ 29.421190] ib1: Event 12 on device mlx5_1 port 1
[ 29.421247] ib1: Not flushing - IPOIB_FLAG_ADMIN_UP not set.
[ 29.421632] ib1: Event 11 on device mlx5_1 port 1
[ 29.421640] ib1: Not flushing - IPOIB_FLAG_INITIALIZED not set.
[ 29.422261] ib1: Event 17 on device mlx5_1 port 1
[ 29.422276] ib1: Not flushing - IPOIB_FLAG_INITIALIZED not set.
[ 29.749430] ib0: Event 9 on device mlx5_0 port 1
[ 29.749441] ib0: Not flushing - IPOIB_FLAG_INITIALIZED not set.
[ 29.751349] ib1: Event 9 on device mlx5_1 port 1
[ 29.751365] ib1: Not flushing - IPOIB_FLAG_INITIALIZED not set.
[ 46.707421] mlx5_core 0000:af:00.0: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 46.707434] mlx5_core 0000:af:00.0: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 46.725944] ib0.beef: bringing up interface
[ 46.968005] ib0.beef: Created ah 00000000cb29051b
[ 47.000529] IPv6: ADDRCONF(NETDEV_CHANGE): ib0.beef: link becomes ready
[ 47.004101] ib0.beef: Created ah 000000001338d4ae
[ 47.007399] ib0.beef: Created ah 000000002947be1d
[ 47.010668] ib0.beef: Created ah 00000000a8586948
[ 47.013871] ib0.beef: Created ah 00000000e584ea42
[ 47.033747] ib0.beef: Created ah 0000000086cb1ff9
[ 47.189454] mlx5_core 0000:af:00.0: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 47.189465] mlx5_core 0000:af:00.0: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 47.215051] ib0.dddd: bringing up interface
[ 47.457634] ib0.dddd: Created ah 000000009bb41171
[ 47.490564] IPv6: ADDRCONF(NETDEV_CHANGE): ib0.dddd: link becomes ready
[ 47.494065] ib0.dddd: Created ah 00000000531ff3b3
[ 47.497206] ib0.dddd: Created ah 0000000006238049
[ 47.500281] ib0.dddd: Created ah 00000000a2776703
[ 47.503453] ib0.dddd: Created ah 000000006f839ea0
[ 47.506697] ib0.dddd: Created ah 00000000d3218392
[ 47.523579] ib0.dddd: Created ah 000000004e8a14c7
[ 48.894389] ib0.dddd: Created ah 00000000c664dbd4
[ 48.897657] ib0.beef: Created ah 00000000c446a0e6
[ 49.593055] mlx5_core 0000:af:00.1: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 49.593064] mlx5_core 0000:af:00.1: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 49.610051] ib1.beef: bringing up interface
[ 49.857979] ib1.beef: Created ah 000000003571492a
[ 49.890521] IPv6: ADDRCONF(NETDEV_CHANGE): ib1.beef: link becomes ready
[ 49.893951] ib1.beef: Created ah 00000000aea98452
[ 49.897011] ib1.beef: Created ah 000000004e23c357
[ 49.899995] ib1.beef: Created ah 00000000ed62df50
[ 49.903036] ib1.beef: Created ah 0000000041605d6d
[ 49.915754] mlx5_core 0000:af:00.1: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 49.915765] mlx5_core 0000:af:00.1: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0)
[ 49.923955] ib1.beef: Created ah 00000000f5d6b457
[ 49.943153] ib1.dddd: bringing up interface
[ 50.187608] ib1.dddd: Created ah 00000000cebeba47
[ 50.220523] IPv6: ADDRCONF(NETDEV_CHANGE): ib1.dddd: link becomes ready
[ 50.224347] ib1.dddd: Created ah 00000000c6f96f11
[ 50.227539] ib1.dddd: Created ah 000000004fe70418
[ 50.230691] ib1.dddd: Created ah 00000000ae96df99
[ 50.233810] ib1.dddd: Created ah 000000004af47f93
[ 50.236892] ib1.dddd: Created ah 0000000064aca082
[ 50.264221] ib1.dddd: Created ah 00000000f330012e
[ 51.774399] ib1.beef: Created ah 000000007f1ef527
[ 52.094689] ib1.dddd: Created ah 00000000210b80b4
[ 57.215935] ib0.dddd: Created ah 00000000f07b9547
[ 57.216368] ib1.beef: Created ah 00000000f3a87dc7
[ 57.219420] ib1.beef: Created ah 00000000b7d4d592
[ 57.225647] ib0.beef: Created ah 00000000e65557a4
[ 57.228334] ib1.dddd: Created ah 000000001914b301
[ 57.228819] ib0.beef: Created ah 0000000070b21f1c
[ 57.264003] ib1.beef: Created ah 0000000070b3a6e8
[ 57.264079] ib0.beef: Created ah 00000000be1feac1
[ 137.514460] ib0.beef: neigh free for ffffff ff12:601b:beef:0000:0000:0001:ff66:de52
[ 137.514461] ib0.dddd: neigh free for ffffff ff12:601b:dddd:0000:0000:0001:ff0a:0392
[ 137.514471] ib0.dddd: neigh free for ffffff ff12:601b:dddd:0000:0000:0001:ff66:de52
[ 137.514473] ib0.beef: neigh free for ffffff ff12:401b:beef:0000:0000:0000:0000:0016
[ 137.514477] ib0.dddd: neigh free for ffffff ff12:601b:dddd:0000:0000:0000:0000:0016
[ 137.514478] ib0.beef: neigh free for ffffff ff12:601b:beef:0000:0000:0000:0000:0016
[ 140.074531] ib1.beef: neigh free for ffffff ff12:401b:beef:0000:0000:0000:0000:0016
[ 140.074541] ib1.beef: neigh free for ffffff ff12:601b:beef:0000:0000:0000:0000:0016
[ 140.074545] ib1.beef: neigh free for ffffff ff12:601b:beef:0000:0000:0001:ff66:de53
[ 140.714539] ib1.dddd: neigh free for ffffff ff12:601b:dddd:0000:0000:0001:ff0a:0392
[ 140.714549] ib1.dddd: neigh free for ffffff ff12:601b:dddd:0000:0000:0000:0000:0016
[ 140.714553] ib1.dddd: neigh free for ffffff ff12:601b:dddd:0000:0000:0001:ff66:de53
[ 144.470916] ib0.dddd: Created ah 000000009d40e279
[ 177.320655] ib0.dddd: Created ah 0000000023a374d0
[ 177.321583] ib1.beef: Created ah 00000000b54aadfc
[ 177.324385] ib0.beef: Created ah 00000000f4507818
[ 177.325263] ib1.beef: Created ah 00000000132b48ff
[ 177.328056] ib0.beef: Created ah 000000004e093b7c
[ 177.328715] ib1.dddd: Created ah 00000000b274652f
[ 177.358792] ib0.beef: Created ah 0000000076e40813
[ 177.358863] ib1.dddd: Created ah 00000000146f0ae3
[ 177.361796] ib1.beef: Created ah 00000000d7c8cff5
[ 177.362033] ib0.beef: Created ah 0000000086031b72
[ 177.365082] ib0.dddd: Created ah 0000000083e723db
[ 177.365086] ib1.beef: Created ah 0000000029b2b4cb
[ 200.215825] ib1.beef: neigh free for ffffff ff12:401b:beef:0000:0000:0000:ffff:ffff

I suspect it might be related to a change in this patchset:
https://lore.kernel.org/linux-rdma/20180729083500.5352-1-leon@xxxxxxxxxx/

Is this expected behavior? How can we fix it?

Thanks!
--
Jinpu Wang
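
For anyone trying to reproduce this, here is a minimal sketch for spotting the state shown in the "ip a" output, where the child interfaces carry the M-DOWN flag while ib0 and ib1 are DOWN. It assumes that M-DOWN is how iproute2 marks an interface whose lower (parent) device is not administratively up, and that the one-line format of "ip -o link show" matches the header lines quoted in this mail. The function name find_down_parents is hypothetical and not part of any tool.

```shell
#!/bin/sh
# Sketch only: find_down_parents is a hypothetical helper, not an iproute2 command.
# It reads "ip -o link show" style lines on stdin, for example
#   6: ib0.beef@ib0: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 4092 ...
# and prints each parent device that has at least one M-DOWN child.
find_down_parents() {
    awk '$3 ~ /M-DOWN/ {
        split($2, name, "@")        # name[1] = child, name[2] = "parent:"
        sub(/:$/, "", name[2])      # drop the trailing colon
        if (name[2] != "") print name[2]
    }' | sort -u
}

# Dry run on a live system; this only prints the workaround commands:
#   ip -o link show | find_down_parents | sed 's/^/ip link set /; s/$/ up/'
```

With the interfaces listed in this mail, the dry run would be expected to print one "ip link set ... up" line each for ib0 and ib1, which is the workaround described at the top. It does not explain why the parent has to be up on CX-5.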