On Tue, Jun 16, 2020 at 09:39:38PM +0800, Weihang Li wrote:
> From: Yangyang Li <liyangyang20@xxxxxxxxxx>
>
> If an IMP reset caused by some hardware errors and an hns RoCE driver reset
> occur at the same time, there is a possibility that the IMP will stop
> handling commands and users can't use the hardware. The logs are as
> follows:
>
> [17223.382506] hns3 0000:fd:00.1: cleaned 0, need to clean 1
> [17223.382515] hns3 0000:fd:00.1: firmware version query failed -11
> [17223.382516] hns3 0000:fd:00.1: Cmd queue init failed
> [17223.382523] hns3 0000:fd:00.1: Upgrade reset level
> [17223.382529] hns3 0000:fd:00.1: global reset interrupt
>
> The hns NIC driver divides the reset process into 3 statuses:
> initialization, hardware resetting and software resetting. The RoCE driver
> gets the reset status through interfaces provided by the NIC driver, and
> commands will not be sent to the IMP if the driver is in any of the above
> statuses. The main reason for this issue is that there is a time gap
> between status 1 and 2: if the RoCE driver sends commands to the IMP
> during this gap, the IMP will stop working because it is not ready.
>
> To eliminate the time gap, the hns NIC driver has added a new interface in
> commit a4de02287abb9 ("net: hns3: provide .get_cmdq_stat interface for the
> client"), so the RoCE driver can ensure that no commands will be sent
> during resetting.
>
> Signed-off-by: Yangyang Li <liyangyang20@xxxxxxxxxx>
> Signed-off-by: Weihang Li <liweihang@xxxxxxxxxx>
> ---
>  drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Applied to for-rc, thanks

Jason
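
For reference, the time gap described in the commit message can be modelled with a small standalone C sketch. This is not the driver code. The enum, struct and function names are hypothetical. The assumption that the command queue state covers the whole reset window, gap included, is inferred only from the commit message's statement that the new interface eliminates the gap.

```c
#include <stdbool.h>

/* Hypothetical model of a reset as seen by the RoCE client.
 * None of these names come from the hns3 or hns RoCE sources. */
enum reset_phase {
	PHASE_IDLE,         /* no reset in progress */
	PHASE_INIT,         /* status 1: reset initialization */
	PHASE_GAP,          /* window between status 1 and status 2 */
	PHASE_HW_RESETTING, /* status 2: hardware resetting */
	PHASE_SW_RESETTING, /* status 3: software resetting */
};

struct nic_state {
	enum reset_phase phase;
};

/* True only while one of the three reported statuses is active.
 * The gap between status 1 and status 2 is not covered. */
bool in_reported_reset_status(const struct nic_state *s)
{
	return s->phase == PHASE_INIT ||
	       s->phase == PHASE_HW_RESETTING ||
	       s->phase == PHASE_SW_RESETTING;
}

/* ASSUMPTION: the command queue is treated as disabled from the start of
 * the reset until it completes, so this check also covers the gap. */
bool cmdq_disabled(const struct nic_state *s)
{
	return s->phase != PHASE_IDLE;
}

/* Check based on the three statuses alone: a command can slip through
 * during the gap. */
bool may_send_cmd_statuses_only(const struct nic_state *s)
{
	return !in_reported_reset_status(s);
}

/* Check that also consults the command queue state: no command is sent
 * at any point during the reset. */
bool may_send_cmd_with_cmdq_stat(const struct nic_state *s)
{
	return !in_reported_reset_status(s) && !cmdq_disabled(s);
}
```

With `phase` set to `PHASE_GAP`, `may_send_cmd_statuses_only()` returns true while `may_send_cmd_with_cmdq_stat()` returns false, which is the window the commit message describes.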