There were several loopback-related RXE bugs fixed in the 4.9/4.10/4.11 timeframe.

-Andrew

On 2/18/18, 12:38 PM, "linux-rdma-owner@xxxxxxxxxxxxxxx on behalf of Jason Gunthorpe" <linux-rdma-owner@xxxxxxxxxxxxxxx on behalf of jgg@xxxxxxxx> wrote:

Well, that really isn't supposed to be.. I wonder if you just hit a general rxe kernel bug, or if something incompatible actually slipped by..

Jason

On Sun, Feb 18, 2018 at 06:47:07PM +0800, Bairen Yi wrote:
> Hi folks,
>
> Just a quick update, upgrading to kernel v4.14 solves my issue.
>
> It looks like rdma-core 16 in Debian stretch-backports does not work with kernel v4.9 in Debian stretch.
>
> And yep, matching is important :P
>
> Best,
> Bairen
>
> > On 18 Feb 2018, at 05:23, Majd Dibbiny <majd@xxxxxxxxxxxx> wrote:
> >
> >>
> >> On Feb 17, 2018, at 3:03 PM, Yi Bairen <byron@xxxxxxxxxx> wrote:
> >>
> >> Hi,
> >>
> >>> On 15 Feb 2018, at 03:25, Yuval Shaia <yuval.shaia@xxxxxxxxxx> wrote:
> >>>
> >>> On Wed, Feb 14, 2018 at 05:13:07PM +0000, Yi Bairen wrote:
> >>>> Hi folks,
> >>>>
> >>>> Seems I can’t get a Linux 4.9 (Debian 9.3) VM working with single box SoftRoCE loopback. I installed rdma-core 16. The rdma_client & rdma_server in the same box seems working. However, when I use ib_write_bw in the same box, the QPs can’t be connected through the same rxe device.
> >>>
> >>> If the QPs connection is problematic then how come rdma_* is working?
> >>
> >> The rdma_client and rdma_server do work with `end 0`.
> >>
> >> The ib_write_bw seems to hang. If I kill the server side, the client does not respond. If I kill the client side, it shows the following log:
> >>
> >> $ ib_write_bw
> > Did you try to specify a gid index with -x param? For example ib_write_bw -x 0.
> >
> > Without gid I would expect init2rtr to fail.. which doesn’t seem the case though ..
> >>
> >> ************************************
> >> * Waiting for client to connect... *
> >> ************************************
> >>                     RDMA_Write BW Test
> >>  Dual-port       : OFF          Device         : rxe0
> >>  Number of qps   : 1            Transport type : IB
> >>  Connection type : RC           Using SRQ      : OFF
> >>  CQ Moderation   : 100
> >>  Mtu             : 1024[B]
> >>  Link type       : Ethernet
> >>  GID index       : 1
> >>  Max inline data : 0[B]
> >>  rdma_cm QPs     : OFF
> >>  Data ex. method : Ethernet
> >>  local address: LID 0000 QPN 0x0014 PSN 0x3787ca RKey 0x00052c VAddr 0x007f3dd4c42000
> >>  GID: 00:00:00:00:00:00:00:00:00:00:255:255:159:65:13:34
> >>  remote address: LID 0000 QPN 0x0015 PSN 0x5c8ba9 RKey 0x000658 VAddr 0x007f30e74b3000
> >>  GID: 00:00:00:00:00:00:00:00:00:00:255:255:159:65:13:34
> >>  #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]   MsgRate[Mpps]
> >> ethernet_read_keys: Couldn't read remote address
> >>  Unable to read to socket/rdam_cm
> >> Failed to exchange data between server and clients
> >>
> >>  #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]   MsgRate[Mpps]
> >> ethernet_read_keys: Couldn't read remote address
> >>  Unable to read to socket/rdam_cm
> >> Failed to exchange data between server and clients
> >>
> >> My RXE config:
> >>
> >> $ sudo rxe_cfg start
> >>   Name     Link  Driver      Speed  NMTU  IPv4_addr  RDEV  RMTU
> >>   docker0  no    bridge
> >>   eth0     yes   virtio_net                          rxe0  1024  (3)
> >>   eth1     yes   virtio_net                          rxe1  1024  (3)
> >>
> >> dmesg:
> >>
> >> [14441.336707] rdma_rxe: loaded
> >> [14441.375108] Loading iSCSI transport class v2.0-870.
> >> [14441.380930] rdma_rxe: set rxe0 active
> >> [14441.381824] rdma_rxe: added rxe0 to eth0
> >> [14441.389567] iscsi: registered transport (iser)
> >> [14441.403595] RPC: Registered named UNIX socket transport module.
> >> [14441.404618] RPC: Registered udp transport module.
> >> [14441.405390] RPC: Registered tcp transport module.
> >> [14441.406057] RPC: Registered tcp NFSv4.1 backchannel transport module.
> >> [14441.415816] RPC: Registered rdma transport module.
> >> [14441.416689] RPC: Registered rdma backchannel transport module.
> >> [14441.419979] rdma_rxe: set rxe1 active
> >> [14441.420657] rdma_rxe: added rxe1 to eth1
> >> [14470.422836] detected loopback device
> >> [14484.278022] rdma_rxe: qp#18 moved to error state
> >>
> >>>
> >>>>
> >>>> I’d like to know if rxe loopback is known to be broken, and how to fix it.
> >>>>
> >>>> Best,
> >>>> Bairen Yi
> >>
> >> Bairen Yi
> >>

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html