Re: regression: nvme rdma with bnxt_re0 broken

On Fri, Jul 12, 2019 at 8:19 AM Parav Pandit <parav@xxxxxxxxxxxx> wrote:
>
> Hi Yi Zhang,
>
> > -----Original Message-----
> > From: linux-rdma-owner@xxxxxxxxxxxxxxx <linux-rdma-
> > owner@xxxxxxxxxxxxxxx> On Behalf Of Yi Zhang
> > Sent: Friday, July 12, 2019 7:23 AM
> > To: Parav Pandit <parav@xxxxxxxxxxxx>; linux-nvme@xxxxxxxxxxxxxxxxxxx
> > Cc: Daniel Jurgens <danielj@xxxxxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx;
> > Devesh Sharma <devesh.sharma@xxxxxxxxxxxx>;
> > selvin.xavier@xxxxxxxxxxxx
> > Subject: Re: regression: nvme rdma with bnxt_re0 broken
> >
> > Hi Parav
> >
> > Here is the info, let me know if it's enough, thanks.
> >
> > [root@rdma-perf-07 ~]$ echo -n "module ib_core +p" > /sys/kernel/debug/dynamic_debug/control
> > [root@rdma-perf-07 ~]$ ifdown bnxt_roce
> > Device 'bnxt_roce' successfully disconnected.
> > [root@rdma-perf-07 ~]$ ifup bnxt_roce
> > Connection successfully activated (D-Bus active path:
> > /org/freedesktop/NetworkManager/ActiveConnection/16)
> > [root@rdma-perf-07 ~]$ sh a.sh
> > DEV       PORT  INDEX  GID                                      IPv4             VER  DEV
> > ---       ----  -----  ---                                      ------------     ---  ---
> > bnxt_re0  1     0      fe80:0000:0000:0000:020a:f7ff:fee3:6e32                   v1   bnxt_roce
> > bnxt_re0  1     1      fe80:0000:0000:0000:020a:f7ff:fee3:6e32                   v2   bnxt_roce
> > bnxt_re0  1     10     0000:0000:0000:0000:0000:ffff:ac1f:2bbb  172.31.43.187    v1   bnxt_roce.43
> > bnxt_re0  1     11     0000:0000:0000:0000:0000:ffff:ac1f:2bbb  172.31.43.187    v2   bnxt_roce.43
> > bnxt_re0  1     2      fe80:0000:0000:0000:020a:f7ff:fee3:6e32                   v1   bnxt_roce.45
> > bnxt_re0  1     3      fe80:0000:0000:0000:020a:f7ff:fee3:6e32                   v2   bnxt_roce.45
> > bnxt_re0  1     4      fe80:0000:0000:0000:020a:f7ff:fee3:6e32                   v1   bnxt_roce.43
> > bnxt_re0  1     5      fe80:0000:0000:0000:020a:f7ff:fee3:6e32                   v2   bnxt_roce.43
> > bnxt_re0  1     6      0000:0000:0000:0000:0000:ffff:ac1f:28bb  172.31.40.187    v1   bnxt_roce
> > bnxt_re0  1     7      0000:0000:0000:0000:0000:ffff:ac1f:28bb  172.31.40.187    v2   bnxt_roce
> > bnxt_re0  1     8      0000:0000:0000:0000:0000:ffff:ac1f:2dbb  172.31.45.187    v1   bnxt_roce.45
> > bnxt_re0  1     9      0000:0000:0000:0000:0000:ffff:ac1f:2dbb  172.31.45.187    v2   bnxt_roce.45
> > bnxt_re1  1     0      fe80:0000:0000:0000:020a:f7ff:fee3:6e33                   v1   lom_2
> > bnxt_re1  1     1      fe80:0000:0000:0000:020a:f7ff:fee3:6e33                   v2   lom_2
> > cxgb4_0   1     0      0007:433b:f5b0:0000:0000:0000:0000:0000                   v1
> > cxgb4_0   2     0      0007:433b:f5b8:0000:0000:0000:0000:0000                   v1
> > hfi1_0    1     0      fe80:0000:0000:0000:0011:7501:0109:6c60                   v1
> > hfi1_0    1     1      fe80:0000:0000:0000:0006:6a00:0000:0005                   v1
> > mlx5_0    1     0      fe80:0000:0000:0000:506b:4b03:00f3:8a38                   v1
> > n_gids_found=19
> >
> > [root@rdma-perf-07 ~]$ dmesg | tail -15
> > [   19.744421] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib0.8002: link becomes ready
> > [   19.758371] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib0.8004: link becomes ready
> > [   20.010469] hfi1 0000:d8:00.0: hfi1_0: Switching to NO_DMA_RTAIL
> > [   20.440580] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib0.8006: link becomes ready
> > [   21.098510] bnxt_en 0000:19:00.0 bnxt_roce: Too many traffic classes requested: 8. Max supported is 2.
> > [   21.324341] bnxt_en 0000:19:00.0 bnxt_roce: Too many traffic classes requested: 8. Max supported is 2.
> > [   22.058647] IPv6: ADDRCONF(NETDEV_CHANGE): hfi1_opa0: link becomes ready
> > [  211.407329] _ib_cache_gid_del: can't delete gid fe80:0000:0000:0000:020a:f7ff:fee3:6e32 error=-22
> > [  211.407334] _ib_cache_gid_del: can't delete gid fe80:0000:0000:0000:020a:f7ff:fee3:6e32 error=-22
> > [  211.425275] infiniband bnxt_re0: del_gid port=1 index=6 gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb
> > [  211.425280] infiniband bnxt_re0: free_gid_entry_locked port=1 index=6 gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb
> > [  211.425292] infiniband bnxt_re0: del_gid port=1 index=7 gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb
> > [  211.425461] infiniband bnxt_re0: free_gid_entry_locked port=1 index=7 gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb
> > [  225.474061] infiniband bnxt_re0: store_gid_entry port=1 index=6 gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb
> > [  225.474075] infiniband bnxt_re0: store_gid_entry port=1 index=7 gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb
> >
> >
> GID table looks fine.
>
The GID table has the fe80:0000:0000:0000:020a:f7ff:fee3:6e32 entry
repeated six times: two for each of the interfaces bnxt_roce,
bnxt_roce.43 and bnxt_roce.45. Is it expected that the vlan and base
interfaces carry identical GID entries? As you mentioned earlier, the
driver's assumption that at most two GID entries are identical (one
for RoCE v1 and one for RoCE v2) is breaking here.
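
To make that concrete, below is a small stand-alone sketch (hypothetical
structure and function names, not the bnxt_re code) of how a shadow GID
table that de-duplicates identical GIDs into a single HW slot behaves
once the same link-local GID is added for the base interface and both
vlans:

/*
 * Hypothetical sketch only -- not the bnxt_re driver code.
 * A shadow GID table that de-duplicates identical GIDs (one HW slot
 * plus a refcount per distinct value) collapses all six identical
 * link-local entries from bnxt_roce, bnxt_roce.43 and bnxt_roce.45
 * onto a single slot, which breaks any "at most two identical
 * entries (RoCE v1 + v2)" assumption.
 */
#include <stdio.h>
#include <string.h>

#define GID_LEN  16
#define TBL_SIZE 16

struct shadow_gid_tbl {
	unsigned char gid[TBL_SIZE][GID_LEN];
	int refcnt[TBL_SIZE];
	int used;
};

/* Return the HW slot for a GID, reusing the slot of an identical value. */
static int shadow_add_gid(struct shadow_gid_tbl *t, const unsigned char *gid)
{
	int i;

	for (i = 0; i < t->used; i++) {
		if (!memcmp(t->gid[i], gid, GID_LEN)) {
			t->refcnt[i]++;
			return i;
		}
	}
	memcpy(t->gid[t->used], gid, GID_LEN);
	t->refcnt[t->used] = 1;
	return t->used++;
}

int main(void)
{
	struct shadow_gid_tbl tbl = { .used = 0 };
	/* fe80::020a:f7ff:fee3:6e32, the link-local GID from the table above */
	unsigned char ll_gid[GID_LEN] = {
		0xfe, 0x80, 0, 0, 0, 0, 0, 0,
		0x02, 0x0a, 0xf7, 0xff, 0xfe, 0xe3, 0x6e, 0x32
	};
	int kernel_index;

	/* RoCE v1 + v2 for each of bnxt_roce, bnxt_roce.43, bnxt_roce.45 */
	for (kernel_index = 0; kernel_index < 6; kernel_index++) {
		int hw = shadow_add_gid(&tbl, ll_gid);

		printf("ib_core index %d -> hw slot %d (refcnt %d)\n",
		       kernel_index, hw, tbl.refcnt[hw]);
	}
	return 0;
}

All six ib_core indices land on the same HW slot, so any mapping that
expects at most one v1/v2 pair of identical entries per slot is already
off by the time the IPv4 GIDs at indices 6-11 are added.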

> > On 7/12/19 12:18 AM, Parav Pandit wrote:
> > > Sagi,
> > >
> > > This is better one to cc to linux-rdma.
> > >
> > > + Devesh, Selvin.
> > >
> > >> -----Original Message-----
> > >> From: Parav Pandit
> > >> Sent: Thursday, July 11, 2019 6:25 PM
> > >> To: Yi Zhang <yi.zhang@xxxxxxxxxx>; linux-nvme@xxxxxxxxxxxxxxxxxxx
> > >> Cc: Daniel Jurgens <danielj@xxxxxxxxxxxx>
> > >> Subject: RE: regression: nvme rdma with bnxt_re0 broken
> > >>
> > >> Hi Yi Zhang,
> > >>
> > >>> -----Original Message-----
> > >>> From: Yi Zhang <yi.zhang@xxxxxxxxxx>
> > >>> Sent: Thursday, July 11, 2019 3:17 PM
> > >>> To: linux-nvme@xxxxxxxxxxxxxxxxxxx
> > >>> Cc: Daniel Jurgens <danielj@xxxxxxxxxxxx>; Parav Pandit
> > >>> <parav@xxxxxxxxxxxx>
> > >>> Subject: regression: nvme rdma with bnxt_re0 broken
> > >>>
> > >>> Hello
> > >>>
> > >>> 'nvme connect' failed when use bnxt_re0 on latest upstream build[1],
> > >>> by bisecting I found it was introduced from v5.2.0-rc1 with [2], it
> > >>> works after I revert it.
> > >>> Let me know if you need more info, thanks.
> > >>>
> > >>> [1]
> > >>> [root@rdma-perf-07 ~]$ nvme connect -t rdma -a 172.31.40.125 -s 4420 -n testnqn
> > >>> Failed to write to /dev/nvme-fabrics: Bad address
> > >>>
> > >>> [root@rdma-perf-07 ~]$ dmesg
> > >>> [  476.320742] bnxt_en 0000:19:00.0: QPLIB: cmdq[0x4b9]=0x15 status 0x5
>
> Devesh, Selvin,
>
> What does this error mean?
> bnxt_qplib_create_ah() is failing.
>
We are passing a wrong GID index to the FW because of the assumption
mentioned earlier, and the FW is failing the command as a result.
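
As a rough illustration of how that can surface as a bad firmware
command (again hypothetical, with invented names, not the actual
create_ah path), a slot derivation that still assumes one HW slot per
RoCE v1/v2 pair computes an index the FW never programmed for that
GID:

/*
 * Hypothetical illustration, not the bnxt_re create_ah path.
 * From the GID table in the earlier mail, ib_core index 6 is the
 * ::ffff:172.31.40.187 entry.  A pair-based derivation puts it at
 * slot 3, while a table that de-duplicates the six identical
 * link-local GIDs would have programmed it much lower, which would
 * explain the FW rejecting the AH create (cmdq status 0x5 and the
 * "Failed to allocate HW AH" message seen below).
 */
#include <stdio.h>

/* "one HW slot per RoCE v1/v2 pair" assumption */
static int derive_hw_slot(int ib_core_index)
{
	return ib_core_index / 2;
}

int main(void)
{
	int ib_core_index = 6;	/* 172.31.40.187, RoCE v1 */

	printf("ib_core index %d -> derived hw slot %d\n",
	       ib_core_index, derive_hw_slot(ib_core_index));
	return 0;
}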

> > >>> [  476.327103] infiniband bnxt_re0: Failed to allocate HW AH
> > >>> [  476.332525] nvme nvme2: rdma_connect failed (-14).
> > >>> [  476.343552] nvme nvme2: rdma connection establishment failed (-14)
> > >>>
> > >>> [root@rdma-perf-07 ~]$ lspci  | grep -i Broadcom
> > >>> 01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
> > >>> 01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
> > >>> 18:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3008 [Fury] (rev 02)
> > >>> 19:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57412 NetXtreme-E 10Gb RDMA Ethernet Controller (rev 01)
> > >>> 19:00.1 Ethernet controller: Broadcom Inc. and subsidiaries BCM57412 NetXtreme-E 10Gb RDMA Ethernet Controller (rev 01)
> > >>>
> > >>>
> > >>> [2]
> > >>> commit 823b23da71132b80d9f41ab667c68b112455f3b6
> > >>> Author: Parav Pandit <parav@xxxxxxxxxxxx>
> > >>> Date:   Wed Apr 10 11:23:03 2019 +0300
> > >>>
> > >>>      IB/core: Allow vlan link local address based RoCE GIDs
> > >>>
> > >>>      IPv6 link local address for a VLAN netdevice has nothing to do with its
> > >>>      resemblance with the default GID, because VLAN link local GID is in
> > >>>      different layer 2 domain.
> > >>>
> > >>>      Now that RoCE MAD packet processing and route resolution consider the
> > >>>      right GID index, there is no need for an unnecessary check which prevents
> > >>>      the addition of vlan based IPv6 link local GIDs.
> > >>>
> > >>>      Signed-off-by: Parav Pandit <parav@xxxxxxxxxxxx>
> > >>>      Reviewed-by: Daniel Jurgens <danielj@xxxxxxxxxxxx>
> > >>>      Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> > >>>      Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxxxx>
> > >>>
> > >>>
> > >>>
> > >>> Best Regards,
> > >>>    Yi Zhang
> > >>>
> > >> I need some more information from you to debug this issue as I don’t
> > >> have the hw.
> > >> The highlighted patch added support for IPv6 link local address for
> > >> vlan. I am unsure how this can affect IPv4 AH creation, for which there
> > >> is a failure.
> > >>
> > >> 1. Before you assign the IP address to the netdevice, Please do, echo
> > >> -n "module ib_core +p" > /sys/kernel/debug/dynamic_debug/control
> > >>
> > >> Please share below output before doing nvme connect.
> > >> 2. Output of script [1]
> > >> $ show_gids script
> > >> If getting this script is problematic, share the output of,
> > >>
> > >> $ cat /sys/class/infiniband/bnxt_re0/ports/1/gids/*
> > >> $ cat /sys/class/infiniband/bnxt_re0/ports/1/gid_attrs/ndevs/*
> > >> $ ip link show
> > >> $ ip addr show
> > >> $ dmesg
> > >>
> > >> [1] https://community.mellanox.com/s/article/understanding-show-gids-script#jive_content_id_The_Script
> > >>
> > >> I suspect that the driver's assumption about GID indices might have gone
> > >> wrong here in drivers/infiniband/hw/bnxt_re/ib_verbs.c.
> > >> Let's see the results to confirm that.
> > > _______________________________________________
> > > Linux-nvme mailing list
> > > Linux-nvme@xxxxxxxxxxxxxxxxxxx
> > > http://lists.infradead.org/mailman/listinfo/linux-nvme



