Hi Lijun,

> -----Original Message-----
> From: Jason Gunthorpe <jgg@xxxxxxxx>
> Sent: Friday, November 1, 2019 8:06 AM
> To: oulijun <oulijun@xxxxxxxxxx>; Parav Pandit <parav@xxxxxxxxxxxx>
> Cc: Doug Ledford <dledford@xxxxxxxxxx>; linux-rdma <linux-rdma@xxxxxxxxxxxxxxx>
> Subject: Re: 【Ask for help】 A question for __ib_cache_gid_add()
>
> On Fri, Nov 01, 2019 at 05:36:36PM +0800, oulijun wrote:
> > Hi
> >   I am using the ubuntu system(5.0.0 kernel) to test the hip08 NIC
> > port,. When I modify the perr mac1 to mac2,then restore to mac1, it will
> > cause the gid0 and gid 1 of the roce to be unavailable, and check that the
> > /sys/class/infiniband/hns_0/ports/1/gid_attrs/ndevs/0 is show invalid.
> > the protocol stack print will appear.
> >
> > Oct 16 17:59:36 ubuntu kernel: [200635.496317] __ib_cache_gid_add: unable to add gid fe80:0000:0000:0000:4600:4dff:fea7:9599 error=-28
> > Oct 16 17:59:37 ubuntu kernel: [200636.705848] 8021q: adding VLAN 0 to HW filter on device enp189s0f0
> > Oct 16 17:59:37 ubuntu kernel: [200636.705854] __ib_cache_gid_add: unable to add gid fe80:0000:0000:0000:4600:4dff:fea7:9599 error=-28
> > Oct 16 17:59:39 ubuntu kernel: [200638.755828] hns3 0000:bd:00.0 enp189s0f0: link up
> > Oct 16 17:59:39 ubuntu kernel: [200638.755847] IPv6: ADDRCONF(NETDEV_CHANGE): enp189s0f0: link becomes ready
> > Oct 16 18:00:56 ubuntu kernel: [200715.699961] hns3 0000:bd:00.0 enp189s0f0: link down
> > Oct 16 18:00:56 ubuntu kernel: [200716.016142] __ib_cache_gid_add: unable to add gid fe80:0000:0000:0000:4600:4dff:fea7:95f4 error=-28
> > Oct 16 18:00:58 ubuntu kernel: [200717.229857] 8021q: adding VLAN 0 to HW filter on device enp189s0f0
> > Oct 16 18:00:58 ubuntu kernel: [200717.229863] __ib_cache_gid_add: unable to add gid fe80:0000:0000:0000:4600:4dff:fea7:95f4 error=-28
> >
> > Has anyone else encounterd a similar problem ? I wonder if the
> > _ib_cache_add_gid() is defective in 5.0 kernel?
>
> Maybe Parav knows?

I used the kernel from [1], which seems to be fine; it has the required
commits [2], [3], [4].

Are you running RDMA traffic/applications that use GID 0 and 1 while
changing the MAC? If so, an administrative operation such as a MAC address
change during active RDMA traffic is unsupported and can lead to this error.
Can you please confirm?

If you are not running RDMA traffic while changing the MAC, I need more
debug logs. Can you please enable ftrace and share the output file
mac_change_trace.txt using the steps below?

echo 0 > /sys/kernel/debug/tracing/tracing_on
echo function_graph > /sys/kernel/debug/tracing/current_tracer
echo > /sys/kernel/debug/tracing/trace
echo > /sys/kernel/debug/tracing/set_ftrace_filter
echo ':mod:ib*' > /sys/kernel/debug/tracing/set_ftrace_filter
echo ':mod:rdma*' >> /sys/kernel/debug/tracing/set_ftrace_filter
echo 1 > /sys/kernel/debug/tracing/tracing_on
ip link set <netdev> address <new_mac1>
ip link set <netdev> address <new_mac2>
cat /sys/kernel/debug/tracing/trace > mac_change_trace.txt

[1] git://git.launchpad.net/~ubuntu-kernel-test/ubuntu/+source/linux/+git/mainline-crack v5.0
[2] commit 5c5702e259dc ("RDMA/core: Set right entry state before releasing reference")
[3] commit be5914c124bc ("RDMA/core: Delete RoCE GID in hw when corresponding IP is deleted")
[4] commit d12e2eed2743 ("IB/core: Update GID entries for netdevice whose mac address changes")
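
If it is easier, the ftrace steps above can be wrapped in a small shell
function. This is only a sketch of the same sequence: the function name
capture_mac_change_trace and the TRACEFS variable are made up for
illustration, and it assumes debugfs is mounted at /sys/kernel/debug and
that you run it as root.

```shell
#!/bin/sh
# Sketch only: same command sequence as the manual steps, in the same order.
# TRACEFS and capture_mac_change_trace are illustrative names, not an
# existing tool. Assumes debugfs is mounted and the caller is root.
TRACEFS="${TRACEFS:-/sys/kernel/debug/tracing}"

# Usage: capture_mac_change_trace <netdev> <new_mac1> <new_mac2> [outfile]
capture_mac_change_trace() {
    netdev="$1"
    mac1="$2"
    mac2="$3"
    out="${4:-mac_change_trace.txt}"

    # Stop tracing, select the function_graph tracer, clear old state.
    echo 0 > "$TRACEFS/tracing_on" &&
    echo function_graph > "$TRACEFS/current_tracer" &&
    echo > "$TRACEFS/trace" &&
    echo > "$TRACEFS/set_ftrace_filter" &&
    # Trace only functions from the ib* and rdma* modules.
    echo ':mod:ib*' > "$TRACEFS/set_ftrace_filter" &&
    echo ':mod:rdma*' >> "$TRACEFS/set_ftrace_filter" &&
    # Start tracing, perform both MAC changes, then save the trace.
    echo 1 > "$TRACEFS/tracing_on" &&
    ip link set "$netdev" address "$mac1" &&
    ip link set "$netdev" address "$mac2" &&
    cat "$TRACEFS/trace" > "$out"
}
```

After sourcing the file, call it as
capture_mac_change_trace <netdev> <new_mac1> <new_mac2>; the trace is
written to mac_change_trace.txt in the current directory unless a fourth
argument names another file.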