Hi Yi Zhang,

> -----Original Message-----
> From: linux-rdma-owner@xxxxxxxxxxxxxxx <linux-rdma-owner@xxxxxxxxxxxxxxx> On Behalf Of Yi Zhang
> Sent: Friday, July 12, 2019 7:23 AM
> To: Parav Pandit <parav@xxxxxxxxxxxx>; linux-nvme@xxxxxxxxxxxxxxxxxxx
> Cc: Daniel Jurgens <danielj@xxxxxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx; Devesh Sharma <devesh.sharma@xxxxxxxxxxxx>; selvin.xavier@xxxxxxxxxxxx
> Subject: Re: regression: nvme rdma with bnxt_re0 broken
>
> Hi Parav
>
> Here is the info, let me know if it's enough, thanks.
>
> [root@rdma-perf-07 ~]$ echo -n "module ib_core +p" > /sys/kernel/debug/dynamic_debug/control
> [root@rdma-perf-07 ~]$ ifdown bnxt_roce
> Device 'bnxt_roce' successfully disconnected.
> [root@rdma-perf-07 ~]$ ifup bnxt_roce
> Connection successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/16)
> [root@rdma-perf-07 ~]$ sh a.sh
> DEV       PORT  INDEX  GID                                      IPv4            VER  DEV
> ---       ----  -----  ---                                      ------------    ---  ---
> bnxt_re0  1     0      fe80:0000:0000:0000:020a:f7ff:fee3:6e32                  v1   bnxt_roce
> bnxt_re0  1     1      fe80:0000:0000:0000:020a:f7ff:fee3:6e32                  v2   bnxt_roce
> bnxt_re0  1     10     0000:0000:0000:0000:0000:ffff:ac1f:2bbb  172.31.43.187   v1   bnxt_roce.43
> bnxt_re0  1     11     0000:0000:0000:0000:0000:ffff:ac1f:2bbb  172.31.43.187   v2   bnxt_roce.43
> bnxt_re0  1     2      fe80:0000:0000:0000:020a:f7ff:fee3:6e32                  v1   bnxt_roce.45
> bnxt_re0  1     3      fe80:0000:0000:0000:020a:f7ff:fee3:6e32                  v2   bnxt_roce.45
> bnxt_re0  1     4      fe80:0000:0000:0000:020a:f7ff:fee3:6e32                  v1   bnxt_roce.43
> bnxt_re0  1     5      fe80:0000:0000:0000:020a:f7ff:fee3:6e32                  v2   bnxt_roce.43
> bnxt_re0  1     6      0000:0000:0000:0000:0000:ffff:ac1f:28bb  172.31.40.187   v1   bnxt_roce
> bnxt_re0  1     7      0000:0000:0000:0000:0000:ffff:ac1f:28bb  172.31.40.187   v2   bnxt_roce
> bnxt_re0  1     8      0000:0000:0000:0000:0000:ffff:ac1f:2dbb  172.31.45.187   v1   bnxt_roce.45
> bnxt_re0  1     9      0000:0000:0000:0000:0000:ffff:ac1f:2dbb  172.31.45.187   v2   bnxt_roce.45
> bnxt_re1  1     0      fe80:0000:0000:0000:020a:f7ff:fee3:6e33                  v1   lom_2
> bnxt_re1  1     1      fe80:0000:0000:0000:020a:f7ff:fee3:6e33                  v2   lom_2
> cxgb4_0   1     0      0007:433b:f5b0:0000:0000:0000:0000:0000                  v1
> cxgb4_0   2     0      0007:433b:f5b8:0000:0000:0000:0000:0000                  v1
> hfi1_0    1     0      fe80:0000:0000:0000:0011:7501:0109:6c60                  v1
> hfi1_0    1     1      fe80:0000:0000:0000:0006:6a00:0000:0005                  v1
> mlx5_0    1     0      fe80:0000:0000:0000:506b:4b03:00f3:8a38                  v1
> n_gids_found=19
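For reference, a.sh above is presumably a show_gids-style script. A minimal sketch that produces a similar dump straight from sysfs, assuming the standard /sys/class/infiniband layout (the gid_attrs/types column is an extra not shown in the a.sh output above), could look like:

#!/bin/bash
# Sketch: show_gids-style dump of the RDMA GID tables from sysfs.
for port in /sys/class/infiniband/*/ports/*; do
    dev=$(basename "$(dirname "$(dirname "$port")")")
    p=$(basename "$port")
    for gidfile in "$port"/gids/*; do
        idx=$(basename "$gidfile")
        gid=$(cat "$gidfile" 2>/dev/null) || continue
        # Skip unused (all-zero) table entries.
        [ "$gid" = "0000:0000:0000:0000:0000:0000:0000:0000" ] && continue
        ndev=$(cat "$port/gid_attrs/ndevs/$idx" 2>/dev/null)
        type=$(cat "$port/gid_attrs/types/$idx" 2>/dev/null)
        printf '%-10s %-5s %-6s %-41s %-12s %s\n' "$dev" "$p" "$idx" "$gid" "$type" "$ndev"
    done
done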
> [root@rdma-perf-07 ~]$ dmesg | tail -15
> [ 19.744421] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib0.8002: link becomes ready
> [ 19.758371] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib0.8004: link becomes ready
> [ 20.010469] hfi1 0000:d8:00.0: hfi1_0: Switching to NO_DMA_RTAIL
> [ 20.440580] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib0.8006: link becomes ready
> [ 21.098510] bnxt_en 0000:19:00.0 bnxt_roce: Too many traffic classes requested: 8. Max supported is 2.
> [ 21.324341] bnxt_en 0000:19:00.0 bnxt_roce: Too many traffic classes requested: 8. Max supported is 2.
> [ 22.058647] IPv6: ADDRCONF(NETDEV_CHANGE): hfi1_opa0: link becomes ready
> [ 211.407329] _ib_cache_gid_del: can't delete gid fe80:0000:0000:0000:020a:f7ff:fee3:6e32 error=-22
> [ 211.407334] _ib_cache_gid_del: can't delete gid fe80:0000:0000:0000:020a:f7ff:fee3:6e32 error=-22
> [ 211.425275] infiniband bnxt_re0: del_gid port=1 index=6 gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb
> [ 211.425280] infiniband bnxt_re0: free_gid_entry_locked port=1 index=6 gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb
> [ 211.425292] infiniband bnxt_re0: del_gid port=1 index=7 gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb
> [ 211.425461] infiniband bnxt_re0: free_gid_entry_locked port=1 index=7 gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb
> [ 225.474061] infiniband bnxt_re0: store_gid_entry port=1 index=6 gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb
> [ 225.474075] infiniband bnxt_re0: store_gid_entry port=1 index=7 gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb

GID table looks fine.

> On 7/12/19 12:18 AM, Parav Pandit wrote:
> > Sagi,
> >
> > This is a better one to cc to linux-rdma.
> >
> > + Devesh, Selvin.
> >
> >> -----Original Message-----
> >> From: Parav Pandit
> >> Sent: Thursday, July 11, 2019 6:25 PM
> >> To: Yi Zhang <yi.zhang@xxxxxxxxxx>; linux-nvme@xxxxxxxxxxxxxxxxxxx
> >> Cc: Daniel Jurgens <danielj@xxxxxxxxxxxx>
> >> Subject: RE: regression: nvme rdma with bnxt_re0 broken
> >>
> >> Hi Yi Zhang,
> >>
> >>> -----Original Message-----
> >>> From: Yi Zhang <yi.zhang@xxxxxxxxxx>
> >>> Sent: Thursday, July 11, 2019 3:17 PM
> >>> To: linux-nvme@xxxxxxxxxxxxxxxxxxx
> >>> Cc: Daniel Jurgens <danielj@xxxxxxxxxxxx>; Parav Pandit <parav@xxxxxxxxxxxx>
> >>> Subject: regression: nvme rdma with bnxt_re0 broken
> >>>
> >>> Hello
> >>>
> >>> 'nvme connect' failed when using bnxt_re0 on the latest upstream build [1].
> >>> By bisecting I found it was introduced in v5.2.0-rc1 with [2]; it works
> >>> after I revert it.
> >>> Let me know if you need more info, thanks.
> >>>
> >>> [1]
> >>> [root@rdma-perf-07 ~]$ nvme connect -t rdma -a 172.31.40.125 -s 4420 -n testnqn
> >>> Failed to write to /dev/nvme-fabrics: Bad address
> >>>
> >>> [root@rdma-perf-07 ~]$ dmesg
> >>> [ 476.320742] bnxt_en 0000:19:00.0: QPLIB: cmdq[0x4b9]=0x15 status 0x5

Devesh, Selvin,
What does this error mean? bnxt_qplib_create_ah() is failing.

> >>> [ 476.327103] infiniband bnxt_re0: Failed to allocate HW AH
> >>> [ 476.332525] nvme nvme2: rdma_connect failed (-14).
> >>> [ 476.343552] nvme nvme2: rdma connection establishment failed (-14)
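One way to line up the GID cache messages with the failing connect is to retry it with dynamic debug already enabled. A rough sketch, reusing the address and NQN from the report above (the bnxt_re line is an assumption and only helps if that module has pr_debug call sites):

#!/bin/bash
# Sketch: re-run the failing connect with ib_core debug prints enabled and
# capture the surrounding kernel log in one window.
echo -n "module ib_core +p" > /sys/kernel/debug/dynamic_debug/control
echo -n "module bnxt_re +p" > /sys/kernel/debug/dynamic_debug/control  # assumption: only useful if bnxt_re has pr_debug sites

dmesg -C            # optional: start from a clean ring buffer
nvme connect -t rdma -a 172.31.40.125 -s 4420 -n testnqn
dmesg | tail -50    # expect the add_gid/del_gid and QPLIB/AH messages here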
> >>>
> >>> [root@rdma-perf-07 ~]$ lspci | grep -i Broadcom
> >>> 01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
> >>> 01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
> >>> 18:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3008 [Fury] (rev 02)
> >>> 19:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57412 NetXtreme-E 10Gb RDMA Ethernet Controller (rev 01)
> >>> 19:00.1 Ethernet controller: Broadcom Inc. and subsidiaries BCM57412 NetXtreme-E 10Gb RDMA Ethernet Controller (rev 01)
> >>>
> >>>
> >>> [2]
> >>> commit 823b23da71132b80d9f41ab667c68b112455f3b6
> >>> Author: Parav Pandit <parav@xxxxxxxxxxxx>
> >>> Date:   Wed Apr 10 11:23:03 2019 +0300
> >>>
> >>>     IB/core: Allow vlan link local address based RoCE GIDs
> >>>
> >>>     IPv6 link local address for a VLAN netdevice has nothing to do with its
> >>>     resemblance with the default GID, because VLAN link local GID is in
> >>>     different layer 2 domain.
> >>>
> >>>     Now that RoCE MAD packet processing and route resolution consider the
> >>>     right GID index, there is no need for an unnecessary check which prevents
> >>>     the addition of vlan based IPv6 link local GIDs.
> >>>
> >>>     Signed-off-by: Parav Pandit <parav@xxxxxxxxxxxx>
> >>>     Reviewed-by: Daniel Jurgens <danielj@xxxxxxxxxxxx>
> >>>     Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> >>>     Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxxxx>
> >>>
> >>>
> >>> Best Regards,
> >>> Yi Zhang
> >>>
> >> I need some more information from you to debug this issue as I don't have the hw.
> >> The highlighted patch added support for IPv6 link local addresses on vlan
> >> netdevices. I am unsure how this can affect IPv4 AH creation, which is where
> >> the failure is.
> >>
> >> 1. Before you assign the IP address to the netdevice, please do:
> >>    echo -n "module ib_core +p" > /sys/kernel/debug/dynamic_debug/control
> >>
> >> Please share the output below before doing nvme connect.
> >>
> >> 2. Output of the show_gids script [1].
> >>    If getting this script is problematic, share the output of:
> >>
> >>    $ cat /sys/class/infiniband/bnxt_re0/ports/1/gids/*
> >>    $ cat /sys/class/infiniband/bnxt_re0/ports/1/gid_attrs/ndevs/*
> >>    $ ip link show
> >>    $ ip addr show
> >>    $ dmesg
> >>
> >> [1] https://community.mellanox.com/s/article/understanding-show-gids-script#jive_content_id_The_Script
> >>
> >> I suspect that the driver's assumption about GID indices might have gone
> >> wrong in drivers/infiniband/hw/bnxt_re/ib_verbs.c.
> >> Let's see the results to confirm that.
> > _______________________________________________
> > Linux-nvme mailing list
> > Linux-nvme@xxxxxxxxxxxxxxxxxxx
> > http://lists.infradead.org/mailman/listinfo/linux-nvme
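The data requested above could be gathered in one pass with something along these lines (a sketch: the device and port are assumed to be bnxt_re0 port 1, as in this report, and the output file name is arbitrary):

#!/bin/bash
# Sketch: collect the requested debug data into a single file.
{
    echo "== GID table =="
    grep . /sys/class/infiniband/bnxt_re0/ports/1/gids/* 2>/dev/null
    echo "== GID netdevs =="
    grep . /sys/class/infiniband/bnxt_re0/ports/1/gid_attrs/ndevs/* 2>/dev/null
    echo "== links =="
    ip link show
    echo "== addresses =="
    ip addr show
    echo "== kernel log =="
    dmesg
} > bnxt_re0-debug.txt 2>&1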