I found a ConnectX-3 (non-pro) and wired it up. So in bridge mode, it
seems like I can get ib_read_bw to work (still with a warm-up error
message), but as a router, I'm still having trouble.

192.168.21.17 ----- Linux bridge ------ 192.168.21.18

# ib_read_bw -d mlx5_0 -F -a 192.168.21.17
---------------------------------------------------------------------------------------
Device not recognized to implement inline feature. Disabling it
---------------------------------------------------------------------------------------
                    RDMA_Read BW Test
 Dual-port       : OFF          Device         : mlx5_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 TX depth        : 128
 CQ Moderation   : 100
 Mtu             : 1024[B]
 Link type       : Ethernet
 Gid index       : 0
 Outstand reads  : 16
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0000 QPN 0x0135 PSN 0x12f108 OUT 0x10 RKey 0x009f79 VAddr 0x007f1c82d1f000
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:21:18
 remote address: LID 0000 QPN 0x0175 PSN 0x37982e OUT 0x10 RKey 0x00eac9 VAddr 0x007f54c1405000
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:21:17
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]   MsgRate[Mpps]
Conflicting CPU frequency values detected: 3698.669000 != 3102.661000
Test integrity may be harmed !
Warning: measured timestamp frequency 3499.86 differs from nominal 3698.67 MHz
 2          1000           0.65               0.65                 0.341088
Conflicting CPU frequency values detected: 3699.310000 != 1199.920000
Test integrity may be harmed !
Warning: measured timestamp frequency 3500.01 differs from nominal 3699.31 MHz
 4          1000           0.10               0.10                 0.025750
Conflicting CPU frequency values detected: 3681.579000 != 1199.920000
Test integrity may be harmed !
Warning: measured timestamp frequency 3499.99 differs from nominal 3681.58 MHz
 8          1000           2.77               2.77                 0.363689
Conflicting CPU frequency values detected: 3602.325000 != 3265.655000
Test integrity may be harmed !
Warning: measured timestamp frequency 3499.99 differs from nominal 3602.32 MHz
 16         1000           5.37               5.36                 0.351569
Conflicting CPU frequency values detected: 3600.830000 != 3265.655000
Test integrity may be harmed !
Warning: measured timestamp frequency 3499.97 differs from nominal 3600.83 MHz
 32         1000           11.30              11.29                0.370062
Conflicting CPU frequency values detected: 3599.761000 != 3265.655000
Test integrity may be harmed !
Warning: measured timestamp frequency 3500.01 differs from nominal 3599.76 MHz
 64         1000           22.39              22.28                0.365108
Conflicting CPU frequency values detected: 3599.975000 != 3265.655000
Test integrity may be harmed !
Warning: measured timestamp frequency 3500.01 differs from nominal 3599.97 MHz
 128        1000           45.09              45.08                0.369316
Conflicting CPU frequency values detected: 3599.761000 != 3265.655000
Test integrity may be harmed !
Warning: measured timestamp frequency 3499.99 differs from nominal 3599.76 MHz
 256        1000           89.55              89.54                0.366765
Conflicting CPU frequency values detected: 3599.761000 != 2280.212000
Test integrity may be harmed !
Warning: measured timestamp frequency 3500 differs from nominal 3599.76 MHz
 512        1000           179.65             179.64               0.367907
Conflicting CPU frequency values detected: 3599.761000 != 1200.347000
Test integrity may be harmed !
Warning: measured timestamp frequency 3499.99 differs from nominal 3599.76 MHz
 1024       1000           361.00             360.98               0.369639
Conflicting CPU frequency values detected: 3601.043000 != 1751.495000
Test integrity may be harmed !
Warning: measured timestamp frequency 3500.01 differs from nominal 3601.04 MHz
 2048       1000           492.15             491.42               0.251606
Conflicting CPU frequency values detected: 3698.028000 != 3601.470000
Test integrity may be harmed !
Warning: measured timestamp frequency 3500.01 differs from nominal 3698.03 MHz
 4096       1000           617.10             615.00               0.157440
Conflicting CPU frequency values detected: 3684.356000 != 3600.189000
Test integrity may be harmed !
Warning: measured timestamp frequency 3500 differs from nominal 3684.36 MHz
 8192       1000           679.31             679.30               0.086951
Conflicting CPU frequency values detected: 3646.759000 != 1877.532000
Test integrity may be harmed !
Warning: measured timestamp frequency 3499.98 differs from nominal 3646.76 MHz
 16384      1000           722.86             722.85               0.046262
Conflicting CPU frequency values detected: 3599.975000 != 2271.881000
Test integrity may be harmed !
Warning: measured timestamp frequency 3499.99 differs from nominal 3599.97 MHz
 32768      1000           742.08             742.08               0.023746
Conflicting CPU frequency values detected: 3602.966000 != 1933.929000
Test integrity may be harmed !
Warning: measured timestamp frequency 3499.97 differs from nominal 3602.97 MHz
 65536      1000           763.25             762.52               0.012200
mlx5: prv-0-18-roberttest.betterservers.com: got completion with error:
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00008813 10000135 4680fcd2
Problems with warm up

=== Router config ===

192.168.21.17 ------ 192.168.21.11 (Linux router) 192.168.22.11 ------ 192.168.22.18

#192.168.22.18
# ping 192.168.21.17
PING 192.168.21.17 (192.168.21.17) 56(84) bytes of data.
64 bytes from 192.168.21.17: icmp_seq=1 ttl=63 time=0.191 ms
^C
--- 192.168.21.17 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.191/0.191/0.191/0.000 ms

#192.168.21.17
# route -n | grep 168
192.168.21.0    0.0.0.0         255.255.255.0   U     0      0        0 eth2
192.168.22.0    192.168.21.11   255.255.255.0   UG    0      0        0 eth2

#192.168.22.18
# route -n | grep 168
192.168.21.0    192.168.22.11   255.255.255.0   UG    0      0        0 eth2
192.168.22.0    0.0.0.0         255.255.255.0   U     0      0        0 eth2

#192.168.22.18
# ib_read_bw -d mlx5_0 -F -a 192.168.21.17
---------------------------------------------------------------------------------------
Device not recognized to implement inline feature. Disabling it
---------------------------------------------------------------------------------------
                    RDMA_Read BW Test
 Dual-port       : OFF          Device         : mlx5_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 TX depth        : 128
 CQ Moderation   : 100
 Mtu             : 1024[B]
 Link type       : Ethernet
 Gid index       : 0
 Outstand reads  : 16
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0000 QPN 0x013a PSN 0x676912 OUT 0x10 RKey 0x00dfd3 VAddr 0x007fe67aee8000
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:22:18
 remote address: LID 0000 QPN 0x017a PSN 0x4256ce OUT 0x10 RKey 0x012985 VAddr 0x007f59de5bf000
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:21:17
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]   MsgRate[Mpps]
Problems with warm up

#192.168.21.17
# cat /sys/kernel/config/rdma_cm/mlx5_0/ports/1/default_roce_mode
RoCE v2

#192.168.22.18
# cat /sys/kernel/config/rdma_cm/mlx5_0/ports/1/default_roce_mode
RoCE v2

With routing, I'm not seeing any RoCE traffic with tcpdump on the
interfaces. With bridge mode, I do see the RoCE traffic, but it looks
like RoCE v1 traffic.
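
For reference, the two RoCE versions can be told apart in a capture by
their encapsulation: RoCE v1 frames use EtherType 0x8915, while RoCE v2
is carried in UDP with destination port 4791. What follows is only a
minimal Python sketch of that check, not something that was run as part
of these tests. The function name classify_roce is hypothetical, and the
sketch assumes untagged Ethernet frames (no VLAN header) and IPv6
packets without extension headers.

```python
import struct

ROCE_V1_ETHERTYPE = 0x8915   # RoCE v1: InfiniBand GRH directly over Ethernet
ROCE_V2_UDP_PORT = 4791      # RoCE v2: IANA-assigned UDP destination port
ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPV6 = 0x86DD
IPPROTO_UDP = 17


def classify_roce(frame: bytes) -> str:
    """Return 'RoCE v1', 'RoCE v2' or 'not RoCE' for a raw Ethernet frame.

    Assumption: no VLAN tag and no IPv6 extension headers.
    """
    if len(frame) < 14:
        return "not RoCE"
    (ethertype,) = struct.unpack("!H", frame[12:14])
    payload = frame[14:]

    if ethertype == ROCE_V1_ETHERTYPE:
        return "RoCE v1"

    udp_offset = None
    if ethertype == ETHERTYPE_IPV4 and len(payload) >= 20:
        header_len = (payload[0] & 0x0F) * 4      # IHL is in 32-bit words
        if payload[9] == IPPROTO_UDP:
            udp_offset = header_len
    elif ethertype == ETHERTYPE_IPV6 and len(payload) >= 40:
        if payload[6] == IPPROTO_UDP:             # Next Header field
            udp_offset = 40

    if udp_offset is not None and len(payload) >= udp_offset + 4:
        (dst_port,) = struct.unpack("!H", payload[udp_offset + 2:udp_offset + 4])
        if dst_port == ROCE_V2_UDP_PORT:
            return "RoCE v2"
    return "not RoCE"
```

Running something like this over raw frames (for example, read from a
saved pcap) would show whether the traffic on the wire matches the
default_roce_mode setting.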
[snip]
14:55:06.010682 0c:c4:7a:89:f7:06 > 0c:c4:7a:89:f6:f6, ethertype Unknown (0x8915), length 78:
        0x0000:  6010 0000 0018 1b40 0000 0000 0000 0000  `......@........
        0x0010:  0000 ffff c0a8 1511 0000 0000 0000 0000  ................
        0x0020:  0000 ffff c0a8 1512 1060 ffff 0000 013e  .........`.....>
        0x0030:  00e5 7b6c 0000 0411 0000 0000 60bb 6a87  ..{l........`.j.
[snip]

I can get iSER to kind of work... In bridge mode and running fio on the
iSER target, I'm getting messages in dmesg:

[Thu Nov 10 15:14:17 2016] mlx5_0:dump_cqe:263:(pid 0): dump error cqe
[Thu Nov 10 15:14:17 2016] 00000000 00000000 00000000 00000000
[Thu Nov 10 15:14:17 2016] 00000000 00000000 00000000 00000000
[Thu Nov 10 15:14:17 2016] 00000000 00000000 00000000 00000000
[Thu Nov 10 15:14:17 2016] 00000000 08007806 2500014f a7a758d2
[Thu Nov 10 15:14:17 2016] iser: iser_err_comp: memreg failure: memory management operation error (6) vend_err 78
[Thu Nov 10 15:14:17 2016] connection82:0: detected conn error (1011)
[Thu Nov 10 15:14:24 2016] mlx5_0:dump_cqe:263:(pid 0): dump error cqe
[Thu Nov 10 15:14:24 2016] 00000000 00000000 00000000 00000000
[Thu Nov 10 15:14:24 2016] 00000000 00000000 00000000 00000000
[Thu Nov 10 15:14:24 2016] 00000000 00000000 00000000 00000000
[Thu Nov 10 15:14:24 2016] 00000000 08007806 25000150 3471eed2
...

In routed mode I also get the same messages, but the device goes offline
and crashes fio:

[Thu Nov 10 15:09:13 2016] mlx5_0:dump_cqe:263:(pid 0): dump error cqe
[Thu Nov 10 15:09:13 2016] 00000000 00000000 00000000 00000000
[Thu Nov 10 15:09:13 2016] 00000000 00000000 00000000 00000000
[Thu Nov 10 15:09:13 2016] 00000000 00000000 00000000 00000000
[Thu Nov 10 15:09:13 2016] 00000000 08007806 25000149 5a524ad2
[Thu Nov 10 15:09:13 2016] iser: iser_err_comp: memreg failure: memory management operation error (6) vend_err 78
[Thu Nov 10 15:09:13 2016] connection80:0: detected conn error (1011)
[Thu Nov 10 15:09:18 2016] session80: session recovery timed out after 5 secs
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: rejecting I/O to offline device
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] killing request
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: rejecting I/O to offline device
[Thu Nov 10 15:09:18 2016] scsi_io_completion: 23 callbacks suppressed
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] CDB: Read(10) 28 00 09 9f 97 18 00 01 48 00
[Thu Nov 10 15:09:18 2016] blk_update_request: 23 callbacks suppressed
[Thu Nov 10 15:09:18 2016] blk_update_request: I/O error, dev sdab, sector 161453848
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] killing request
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: rejecting I/O to offline device
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] CDB: Read(10) 28 00 07 bf 98 60 00 00 a8 00
[Thu Nov 10 15:09:18 2016] blk_update_request: I/O error, dev sdab, sector 129996896
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] killing request
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: rejecting I/O to offline device
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] killing request
...
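
The error CQE dumps above can be picked apart mechanically. The sketch
below is a hedged illustration only: it assumes the 64-byte layout of
struct mlx5_err_cqe from the kernel's include/linux/mlx5/device.h
(vendor error syndrome at byte 54, syndrome at byte 55, WQE opcode plus
24-bit QPN in bytes 56-59). The function name decode_mlx5_err_cqe is
hypothetical. The layout assumption agrees with the log itself: the word
08007806 sits next to the iser message "error (6) vend_err 78".

```python
import re


def decode_mlx5_err_cqe(dump: str) -> dict:
    """Extract a few fields from an mlx5 error CQE printed as 16 hex words.

    Accepts the raw words or pasted dmesg lines; timestamps such as
    "[Thu Nov 10 15:14:17 2016]" are ignored because only 8-digit hex
    words are matched.

    Assumed layout (struct mlx5_err_cqe): byte 54 = vendor error
    syndrome, byte 55 = syndrome, bytes 56-59 = opcode (top byte) and
    QPN (low 24 bits).
    """
    words = re.findall(r"\b[0-9a-fA-F]{8}\b", dump)
    if len(words) != 16:
        raise ValueError(f"expected 16 hex words, found {len(words)}")
    raw = bytes.fromhex("".join(words))
    opcode_qpn = int.from_bytes(raw[56:60], "big")
    return {
        "vendor_err": raw[54],
        "syndrome": raw[55],
        "qpn": opcode_qpn & 0xFFFFFF,
    }


if __name__ == "__main__":
    # Last CQE from the bridge-mode ib_read_bw run.
    cqe = "00000000 " * 13 + "00008813 10000135 4680fcd2"
    fields = decode_mlx5_err_cqe(cqe)
    print({name: hex(value) for name, value in fields.items()})
```

Under that assumed layout, the ib_read_bw CQE decodes to QPN 0x135,
which matches the local QPN 0x0135 printed by the tool, with syndrome
0x13 and vendor error 0x88; the iSER CQEs decode to syndrome 0x06 and
vendor error 0x78.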

This is all using ConnectX-4 Lx cards on the target and initiator and
the 4.8.5 kernel.

Any ideas of what may be causing these issues?

Thank you,
Robert LeBlanc
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1


On Thu, Nov 3, 2016 at 11:38 AM, Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
> That box has a built-in ConnectX-3 card that we aren't using in this
> test so the mlx4 modules are loaded. I unloaded mlx4_ib, no luck. I
> also tried to unload the mlx5_ib driver and it also unloaded mlx5_core
> and my interfaces were gone. It seems like I can't unload only
> mlx5_ib.
>
> With mlx4_ib unloaded I still can't rping or ib_read_bw (connects, but
> get messages like:
> ethernet_read_keys: Couldn't read remote address
> Unable to read to socket/rdam_cm
> Failed to exchange data between server and clients
> Problems with warm up) same as before.
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>
>
> On Thu, Nov 3, 2016 at 11:16 AM, Parav Pandit <pandit.parav@xxxxxxxxx> wrote:
>> Hi Robert,
>>
>> Can you please unload the mlx4_ib module in the bridge/router box and
>> give it a quick try?
>>
>> Parav
>>
>> On Thu, Nov 3, 2016 at 10:32 PM, Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
>>> I'm trying to do some testing of RoCE v2 and so I put a Linux box
>>> between two RoCE machines. I think the ConnectX-4 Lx card in the
>>> bridge/router is intercepting the RoCE traffic and so it is not being
>>> bridged/routed. I don't see any traffic using tcpdump which seems to
>>> confirm this. I thought I could change the UDP port that the card is
>>> looking for RoCE traffic to something not in use [0], but rr_proto is
>>> not a valid parameter for the inbox mlx5_core module on 4.8.5. I can
>>> ping across the bridge/router so I know that it is set up correctly,
>>> just RDMA is not working.
>>>
>>> Any ideas on how to pass RoCE traffic like normal traffic?
>>> The reason we are using a Linux box is that we can use netem to
>>> understand how RoCE behaves in different situations.
>>>
>>> [0] https://community.mellanox.com/docs/DOC-1444
>>>
>>> Thank you
>>> ----------------
>>> Robert LeBlanc
>>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html