Re: Problems trying to bridge/route RoCE

I found a ConnectX-3 (non-pro) and wired it up. In bridge mode, it
seems like I can get ib_read_bw to work (still with a warm-up error
message), but as a router, I'm still having trouble.

192.168.21.17 ----- Linux bridge ------ 192.168.21.18

# ib_read_bw -d mlx5_0 -F -a 192.168.21.17
---------------------------------------------------------------------------------------
Device not recognized to implement inline feature. Disabling it
---------------------------------------------------------------------------------------
                    RDMA_Read BW Test
 Dual-port       : OFF          Device         : mlx5_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 TX depth        : 128
 CQ Moderation   : 100
 Mtu             : 1024[B]
 Link type       : Ethernet
 Gid index       : 0
 Outstand reads  : 16
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0000 QPN 0x0135 PSN 0x12f108 OUT 0x10 RKey
0x009f79 VAddr 0x007f1c82d1f000
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:21:18
 remote address: LID 0000 QPN 0x0175 PSN 0x37982e OUT 0x10 RKey
0x00eac9 VAddr 0x007f54c1405000
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:21:17
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]   MsgRate[Mpps]
Conflicting CPU frequency values detected: 3698.669000 != 3102.661000
Test integrity may be harmed !
Warning: measured timestamp frequency 3499.86 differs from nominal 3698.67 MHz
 2          1000             0.65               0.65               0.341088
Conflicting CPU frequency values detected: 3699.310000 != 1199.920000
Test integrity may be harmed !
Warning: measured timestamp frequency 3500.01 differs from nominal 3699.31 MHz
 4          1000             0.10               0.10               0.025750
Conflicting CPU frequency values detected: 3681.579000 != 1199.920000
Test integrity may be harmed !
Warning: measured timestamp frequency 3499.99 differs from nominal 3681.58 MHz
 8          1000             2.77               2.77               0.363689
Conflicting CPU frequency values detected: 3602.325000 != 3265.655000
Test integrity may be harmed !
Warning: measured timestamp frequency 3499.99 differs from nominal 3602.32 MHz
 16         1000             5.37               5.36               0.351569
Conflicting CPU frequency values detected: 3600.830000 != 3265.655000
Test integrity may be harmed !
Warning: measured timestamp frequency 3499.97 differs from nominal 3600.83 MHz
 32         1000             11.30              11.29              0.370062
Conflicting CPU frequency values detected: 3599.761000 != 3265.655000
Test integrity may be harmed !
Warning: measured timestamp frequency 3500.01 differs from nominal 3599.76 MHz
 64         1000             22.39              22.28              0.365108
Conflicting CPU frequency values detected: 3599.975000 != 3265.655000
Test integrity may be harmed !
Warning: measured timestamp frequency 3500.01 differs from nominal 3599.97 MHz
 128        1000             45.09              45.08              0.369316
Conflicting CPU frequency values detected: 3599.761000 != 3265.655000
Test integrity may be harmed !
Warning: measured timestamp frequency 3499.99 differs from nominal 3599.76 MHz
 256        1000             89.55              89.54              0.366765
Conflicting CPU frequency values detected: 3599.761000 != 2280.212000
Test integrity may be harmed !
Warning: measured timestamp frequency 3500 differs from nominal 3599.76 MHz
 512        1000             179.65             179.64             0.367907
 Conflicting CPU frequency values detected: 3599.761000 != 1200.347000
Test integrity may be harmed !
Warning: measured timestamp frequency 3499.99 differs from nominal 3599.76 MHz
 1024       1000             361.00             360.98             0.369639
Conflicting CPU frequency values detected: 3601.043000 != 1751.495000
Test integrity may be harmed !
Warning: measured timestamp frequency 3500.01 differs from nominal 3601.04 MHz
 2048       1000             492.15             491.42             0.251606
Conflicting CPU frequency values detected: 3698.028000 != 3601.470000
Test integrity may be harmed !
Warning: measured timestamp frequency 3500.01 differs from nominal 3698.03 MHz
 4096       1000             617.10             615.00             0.157440
Conflicting CPU frequency values detected: 3684.356000 != 3600.189000
Test integrity may be harmed !
Warning: measured timestamp frequency 3500 differs from nominal 3684.36 MHz
 8192       1000             679.31             679.30             0.086951
Conflicting CPU frequency values detected: 3646.759000 != 1877.532000
Test integrity may be harmed !
Warning: measured timestamp frequency 3499.98 differs from nominal 3646.76 MHz
 16384      1000             722.86             722.85             0.046262
Conflicting CPU frequency values detected: 3599.975000 != 2271.881000
Test integrity may be harmed !
Warning: measured timestamp frequency 3499.99 differs from nominal 3599.97 MHz
 32768      1000             742.08             742.08             0.023746
Conflicting CPU frequency values detected: 3602.966000 != 1933.929000
Test integrity may be harmed !
Warning: measured timestamp frequency 3499.97 differs from nominal 3602.97 MHz
 65536      1000             763.25             762.52             0.012200
mlx5: prv-0-18-roberttest.betterservers.com: got completion with error:
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00008813 10000135 4680fcd2
Problems with warm up


=== Router config ===
192.168.21.17 ------ 192.168.21.11 (Linux router) 192.168.22.11 ------
192.168.22.18

#192.168.22.18
# ping 192.168.21.17
PING 192.168.21.17 (192.168.21.17) 56(84) bytes of data.
64 bytes from 192.168.21.17: icmp_seq=1 ttl=63 time=0.191 ms
^C
--- 192.168.21.17 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.191/0.191/0.191/0.000 ms

#192.168.21.17
# route -n | grep 168
192.168.21.0    0.0.0.0         255.255.255.0   U     0      0        0 eth2
192.168.22.0    192.168.21.11   255.255.255.0   UG    0      0        0 eth2

#192.168.22.18
# route -n | grep 168
192.168.21.0    192.168.22.11   255.255.255.0   UG    0      0        0 eth2
192.168.22.0    0.0.0.0         255.255.255.0   U     0      0        0 eth2
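
As a sanity check on the two routing tables above, here is a minimal longest-prefix-match lookup. It is only a sketch: the ROUTES table is transcribed by hand from the `route -n` output, and `next_hop` is a hypothetical helper name, not part of any tool.

```python
import ipaddress

# Routes transcribed from the `route -n` output above:
# (destination network, gateway or None for a directly connected network).
ROUTES = {
    "192.168.21.17": [
        ("192.168.21.0/24", None),
        ("192.168.22.0/24", "192.168.21.11"),
    ],
    "192.168.22.18": [
        ("192.168.21.0/24", "192.168.22.11"),
        ("192.168.22.0/24", None),
    ],
}


def next_hop(host, dst):
    """Return the address `host` sends to in order to reach `dst`.

    The most specific matching route wins. An on-link destination returns
    `dst` itself; no matching route returns None.
    """
    dst_ip = ipaddress.ip_address(dst)
    best = None
    for network, gateway in ROUTES[host]:
        net = ipaddress.ip_network(network)
        if dst_ip in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway)
    if best is None:
        return None
    return best[1] or dst
```

With these tables, `next_hop("192.168.22.18", "192.168.21.17")` returns "192.168.22.11" and the reverse direction returns "192.168.21.11", so both hosts point at the router, which agrees with the working ping.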

#192.168.22.18
# ib_read_bw -d mlx5_0 -F -a 192.168.21.17
---------------------------------------------------------------------------------------
Device not recognized to implement inline feature. Disabling it
---------------------------------------------------------------------------------------
                    RDMA_Read BW Test
 Dual-port       : OFF          Device         : mlx5_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 TX depth        : 128
 CQ Moderation   : 100
 Mtu             : 1024[B]
 Link type       : Ethernet
 Gid index       : 0
 Outstand reads  : 16
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0000 QPN 0x013a PSN 0x676912 OUT 0x10 RKey
0x00dfd3 VAddr 0x007fe67aee8000
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:22:18
 remote address: LID 0000 QPN 0x017a PSN 0x4256ce OUT 0x10 RKey
0x012985 VAddr 0x007f59de5bf000
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:21:17
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]   MsgRate[Mpps]
Problems with warm up
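
The GIDs that ib_read_bw prints are 16 decimal bytes, and the ones above are IPv4-mapped addresses. The sketch below converts them back to IPv4 and checks whether two GIDs share a subnet; the /24 prefix is taken from the netmasks in the routing tables, and the function names are made up for illustration.

```python
import ipaddress


def gid_to_ipv4(gid):
    """Convert a perftest-style GID (16 decimal bytes joined by ':') to IPv4.

    Returns None if the GID is not an IPv4-mapped address (::ffff:a.b.c.d).
    """
    raw = bytes(int(part) for part in gid.split(":"))
    if len(raw) != 16:
        raise ValueError("a GID must contain exactly 16 bytes")
    return ipaddress.IPv6Address(raw).ipv4_mapped


def same_subnet(gid_a, gid_b, prefixlen=24):
    """True if both IPv4-mapped GIDs fall inside the same IPv4 network."""
    addr_a, addr_b = gid_to_ipv4(gid_a), gid_to_ipv4(gid_b)
    if addr_a is None or addr_b is None:
        raise ValueError("both GIDs must be IPv4-mapped")
    network = ipaddress.ip_network(f"{addr_a}/{prefixlen}", strict=False)
    return addr_b in network


# GIDs copied from the routed ib_read_bw run above.
ROUTED_LOCAL = "00:00:00:00:00:00:00:00:00:00:255:255:192:168:22:18"
ROUTED_REMOTE = "00:00:00:00:00:00:00:00:00:00:255:255:192:168:21:17"
```

For the routed run, `same_subnet(ROUTED_LOCAL, ROUTED_REMOTE)` is False, so every RDMA packet has to cross the router at layer 3, unlike the bridged run where both GIDs sit in 192.168.21.0/24.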


#192.168.21.17
# cat /sys/kernel/config/rdma_cm/mlx5_0/ports/1/default_roce_mode
RoCE v2

#192.168.22.18
# cat /sys/kernel/config/rdma_cm/mlx5_0/ports/1/default_roce_mode
RoCE v2
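
As far as I understand, default_roce_mode only applies to connections made through rdma_cm, while the ib_read_bw runs above report "rdma_cm QPs : OFF" and "Gid index : 0". It may therefore be worth checking which RoCE version each GID index maps to. The sketch below assumes the kernel exposes the per-index type under ports/<port>/gid_attrs/types/ in sysfs; `list_gids` is a hypothetical helper, and the `sysfs_root` parameter exists only so it can be pointed at a test directory.

```python
import os


def list_gids(device="mlx5_0", port=1, sysfs_root="/sys/class/infiniband"):
    """Return [(index, gid, type)] for the populated GID table entries.

    Assumes ports/<port>/gids/<n> and ports/<port>/gid_attrs/types/<n>
    exist for the device. Entries that cannot be read are skipped.
    """
    port_dir = os.path.join(sysfs_root, device, "ports", str(port))
    gids_dir = os.path.join(port_dir, "gids")
    types_dir = os.path.join(port_dir, "gid_attrs", "types")
    entries = []
    for name in sorted(os.listdir(gids_dir), key=int):
        try:
            with open(os.path.join(gids_dir, name)) as f:
                gid = f.read().strip()
            with open(os.path.join(types_dir, name)) as f:
                gid_type = f.read().strip()
        except OSError:
            continue  # empty GID slots may refuse to be read
        if set(gid) - set("0:"):  # skip all-zero (unused) GIDs
            entries.append((int(name), gid, gid_type))
    return entries
```

On a host with the card installed, `for entry in list_gids(): print(entry)` would show each index with its type. If a RoCE v2 entry shows up at a different index, ib_read_bw's -x option can select it on both ends.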

With routing, I'm not seeing any RoCE traffic with tcpdump on the
interfaces. With bridge mode, I do see the RoCE traffic, but it looks
like RoCE v1 traffic.

[snip]
14:55:06.010682 0c:c4:7a:89:f7:06 > 0c:c4:7a:89:f6:f6, ethertype
Unknown (0x8915), length 78:
        0x0000:  6010 0000 0018 1b40 0000 0000 0000 0000  `......@........
        0x0010:  0000 ffff c0a8 1511 0000 0000 0000 0000  ................
        0x0020:  0000 ffff c0a8 1512 1060 ffff 0000 013e  .........`.....>
        0x0030:  00e5 7b6c 0000 0411 0000 0000 60bb 6a87  ..{l........`.j.
[snip]
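
For reference, RoCE v1 frames use EtherType 0x8915 with an InfiniBand GRH directly after the Ethernet header, while RoCE v2 is carried in UDP with destination port 4791. The sketch below decodes the captured payload as a 40-byte GRH followed by a 12-byte BTH. The bytes are copied from the hex dump above; `parse_roce_v1` and the dictionary keys are names I made up for illustration.

```python
import ipaddress
import struct

ROCE_V1_ETHERTYPE = 0x8915  # RoCE v1: GRH follows the Ethernet header
ROCE_V2_UDP_PORT = 4791     # RoCE v2: UDP destination port

# Payload bytes copied from the tcpdump hex dump above (Ethernet header excluded).
CAPTURED = bytes.fromhex(
    "6010000000181b400000000000000000"
    "0000ffffc0a815110000000000000000"
    "0000ffffc0a815121060ffff0000013e"
    "00e57b6c000004110000000060bb6a87"
)


def parse_roce_v1(payload):
    """Decode the 40-byte GRH and the 12-byte BTH that follows it."""
    if len(payload) < 52:
        raise ValueError("need at least GRH (40 bytes) + BTH (12 bytes)")
    # GRH: version/traffic class/flow label, payload length, next header, hop limit.
    word0, pay_len, next_hdr, hop_limit = struct.unpack_from("!IHBB", payload, 0)
    sgid = ipaddress.IPv6Address(payload[8:24])
    dgid = ipaddress.IPv6Address(payload[24:40])
    # BTH: opcode, flags, partition key, destination QP word, PSN word.
    opcode, _flags, pkey, dest_qp, psn = struct.unpack_from("!BBHII", payload, 40)
    return {
        "ip_version": word0 >> 28,
        "payload_length": pay_len,
        "next_header": next_hdr,
        "hop_limit": hop_limit,
        "sgid": sgid,
        "dgid": dgid,
        "opcode": opcode,
        "pkey": pkey,
        "dest_qp": dest_qp & 0xFFFFFF,  # top byte is reserved
        "psn": psn & 0xFFFFFF,          # top byte holds the ack-request bit
    }
```

Decoding CAPTURED this way gives a source GID of ::ffff:192.168.21.17, a destination GID of ::ffff:192.168.21.18, and next header 0x1b, which is what I would expect for a GRH followed by a BTH rather than an IP/UDP encapsulation.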

I can get iSER to kind of work...

In bridge mode and running fio on the iSER target, I'm getting
messages in dmesg:
[Thu Nov 10 15:14:17 2016] mlx5_0:dump_cqe:263:(pid 0): dump error cqe
[Thu Nov 10 15:14:17 2016] 00000000 00000000 00000000 00000000
[Thu Nov 10 15:14:17 2016] 00000000 00000000 00000000 00000000
[Thu Nov 10 15:14:17 2016] 00000000 00000000 00000000 00000000
[Thu Nov 10 15:14:17 2016] 00000000 08007806 2500014f a7a758d2
[Thu Nov 10 15:14:17 2016] iser: iser_err_comp: memreg failure: memory
management operation error (6) vend_err 78
[Thu Nov 10 15:14:17 2016]  connection82:0: detected conn error (1011)
[Thu Nov 10 15:14:24 2016] mlx5_0:dump_cqe:263:(pid 0): dump error cqe
[Thu Nov 10 15:14:24 2016] 00000000 00000000 00000000 00000000
[Thu Nov 10 15:14:24 2016] 00000000 00000000 00000000 00000000
[Thu Nov 10 15:14:24 2016] 00000000 00000000 00000000 00000000
[Thu Nov 10 15:14:24 2016] 00000000 08007806 25000150 3471eed2
...
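
The interesting fields in these error CQE dumps are all in the last row. The sketch below pulls them out; the byte offsets are an assumption based on my reading of struct mlx5_err_cqe in the mlx5 driver (vendor syndrome at byte 54, syndrome at byte 55, WQE opcode and QPN at bytes 56-59), and `decode_error_cqe` is a hypothetical helper. The offsets do at least agree with the iser message, which reports error 6 and vend_err 78 for the same completion.

```python
def decode_error_cqe(dump_lines):
    """Pull the error fields out of a 64-byte mlx5 error CQE dump.

    `dump_lines` holds the four rows of hex words as printed in the log,
    without the timestamp prefix. Field offsets are assumed, see above.
    """
    raw = bytes.fromhex("".join("".join(line.split()) for line in dump_lines))
    if len(raw) != 64:
        raise ValueError("expected a 64-byte CQE")
    opcode_qpn = int.from_bytes(raw[56:60], "big")
    return {
        "vendor_err_synd": raw[54],
        "syndrome": raw[55],
        "wqe_opcode": opcode_qpn >> 24,
        "qpn": opcode_qpn & 0xFFFFFF,
        "wqe_counter": int.from_bytes(raw[60:62], "big"),
    }


# Rows copied from the first iSER dump above.
ISER_CQE = [
    "00000000 00000000 00000000 00000000",
    "00000000 00000000 00000000 00000000",
    "00000000 00000000 00000000 00000000",
    "00000000 08007806 2500014f a7a758d2",
]
```

`decode_error_cqe(ISER_CQE)` yields syndrome 0x06 and vendor syndrome 0x78. Applied to the ib_read_bw dump earlier in this message, the same offsets give syndrome 0x13 and QPN 0x135, which matches the local QPN 0x0135 printed by that run.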

In routed mode I get the same messages, but the device also goes
offline and fio crashes:

[Thu Nov 10 15:09:13 2016] mlx5_0:dump_cqe:263:(pid 0): dump error cqe
[Thu Nov 10 15:09:13 2016] 00000000 00000000 00000000 00000000
[Thu Nov 10 15:09:13 2016] 00000000 00000000 00000000 00000000
[Thu Nov 10 15:09:13 2016] 00000000 00000000 00000000 00000000
[Thu Nov 10 15:09:13 2016] 00000000 08007806 25000149 5a524ad2
[Thu Nov 10 15:09:13 2016] iser: iser_err_comp: memreg failure: memory
management operation error (6) vend_err 78
[Thu Nov 10 15:09:13 2016]  connection80:0: detected conn error (1011)
[Thu Nov 10 15:09:18 2016]  session80: session recovery timed out after 5 secs
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: rejecting I/O to offline device
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] killing request
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: rejecting I/O to offline device
[Thu Nov 10 15:09:18 2016] scsi_io_completion: 23 callbacks suppressed
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] FAILED Result:
hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] CDB: Read(10) 28 00 09
9f 97 18 00 01 48 00
[Thu Nov 10 15:09:18 2016] blk_update_request: 23 callbacks suppressed
[Thu Nov 10 15:09:18 2016] blk_update_request: I/O error, dev sdab,
sector 161453848
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] killing request
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: rejecting I/O to offline device
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] FAILED Result:
hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] CDB: Read(10) 28 00 07
bf 98 60 00 00 a8 00
[Thu Nov 10 15:09:18 2016] blk_update_request: I/O error, dev sdab,
sector 129996896
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] killing request
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: rejecting I/O to offline device
[Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] killing request
...

This is all using ConnectX-4 Lx cards on the target and initiator and
the 4.8.5 kernel.

Any ideas of what may be causing these issues?

Thank you,
Robert LeBlanc

----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Thu, Nov 3, 2016 at 11:38 AM, Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
> That box has a built-in ConnectX-3 card that we aren't using in this
> test, so the mlx4 modules are loaded. I unloaded mlx4_ib, no luck. I
> also tried to unload the mlx5_ib driver, but that also unloaded
> mlx5_core and my interfaces were gone. It seems like I can't unload
> only mlx5_ib.
>
> With mlx4_ib unloaded I still can't rping or ib_read_bw (connects, but
> get messages like:
> ethernet_read_keys: Couldn't read remote address
> Unable to read to socket/rdam_cm
> Failed to exchange data between server and clients
> Problems with warm up) same as before.
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>
>
> On Thu, Nov 3, 2016 at 11:16 AM, Parav Pandit <pandit.parav@xxxxxxxxx> wrote:
>> Hi Robert,
>>
>> Can you please unload the mlx4_ib module in the bridge/router box and
>> give it a quick try?
>>
>> Parav
>>
>> On Thu, Nov 3, 2016 at 10:32 PM, Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
>>> I'm trying to do some testing of RoCE v2 and so I put a Linux box
>>> between two RoCE machines. I think the ConnectX-4 Lx card in the
>>> bridge/router is intercepting the RoCE traffic and so it is not being
>>> bridged/routed. I don't see any traffic using tcpdump, which seems to
>>> confirm this. I thought I could change the UDP port that the card is
>>> looking for RoCE traffic on to something not in use [0], but rr_proto is
>>> not a valid parameter for the inbox mlx5_core module on 4.8.5. I can
>>> ping across the bridge/router so I know that it is set up correctly,
>>> just RDMA is not working.
>>>
>>> Any ideas on how to pass RoCE traffic like normal traffic? The reason
>>> we are using a Linux box is that we can use netem to understand how
>>> RoCE behaves in different situations.
>>>
>>> [0] https://community.mellanox.com/docs/DOC-1444
>>>
>>> Thank you
>>> ----------------
>>> Robert LeBlanc
>>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html