[PATCH for-rc] RDMA/hns: Fix a cmd queue issue when resetting

From: Yangyang Li <liyangyang20@xxxxxxxxxx>

If an IMP reset caused by a hardware error and an hns RoCE driver reset
occur at the same time, the IMP may stop processing commands, leaving the
hardware unusable. The logs are as follows:

[17223.382506] hns3 0000:fd:00.1: cleaned 0, need to clean 1
[17223.382515] hns3 0000:fd:00.1: firmware version query failed -11
[17223.382516] hns3 0000:fd:00.1: Cmd queue init failed
[17223.382523] hns3 0000:fd:00.1: Upgrade reset level
[17223.382529] hns3 0000:fd:00.1: global reset interrupt

The hns NIC driver divides the reset process into 3 states: initialization,
hardware resetting and software resetting. The RoCE driver gets the reset
state through interfaces provided by the NIC driver, and does not send
commands to the IMP while the driver is in any of these states. The root
cause of this issue is a time gap between states 1 and 2: if the RoCE
driver sends commands to the IMP during this gap, the IMP stops working
because it is not ready.

To eliminate the time gap, the hns NIC driver has added a new interface in
commit a4de02287abb9 ("net: hns3: provide .get_cmdq_stat interface for the
client"). Use it in the RoCE driver to ensure that no commands are sent
during a reset.
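
For illustration only, the effect of the time gap can be modeled in a few
lines of standalone C. This is a sketch, not driver code: the reset_phase
enum, the mock_* helpers and may_send_cmd() are hypothetical names, and the
assumption that the cmdq status already reports "unavailable" during the
gap is inferred from the description above rather than taken from the
hns3 sources.

```c
#include <stdbool.h>

/* Hypothetical reset phases, modeled for illustration only. */
enum reset_phase {
	PHASE_IDLE,	/* no reset in progress */
	PHASE_GAP,	/* reset started, hardware not yet reporting it */
	PHASE_HW_RESET,	/* hardware resetting */
	PHASE_SW_RESET,	/* software resetting */
};

struct mock_handle {
	enum reset_phase phase;
};

/* Stand-in for the old query: true only once hardware reports the reset. */
static bool mock_get_hw_reset_stat(const struct mock_handle *h)
{
	return h->phase == PHASE_HW_RESET;
}

/*
 * Stand-in for the new query. ASSUMPTION: the command queue is reported
 * as unavailable during the gap as well as during the hardware reset.
 */
static bool mock_get_cmdq_stat(const struct mock_handle *h)
{
	return h->phase == PHASE_GAP || h->phase == PHASE_HW_RESET;
}

/* Stand-in for the software-reset query. */
static bool mock_ae_dev_resetting(const struct mock_handle *h)
{
	return h->phase == PHASE_SW_RESET;
}

typedef bool (*stat_fn)(const struct mock_handle *h);

/* Returns true if a command may be posted under the given status query. */
static bool may_send_cmd(const struct mock_handle *h, stat_fn hw_stat)
{
	bool hw_resetting = hw_stat(h);
	bool sw_resetting = mock_ae_dev_resetting(h);

	return !hw_resetting && !sw_resetting;
}
```

In this model, with phase set to PHASE_GAP, may_send_cmd() lets a command
through when given mock_get_hw_reset_stat but blocks it when given
mock_get_cmdq_stat, which mirrors the one-line change in this patch.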

Signed-off-by: Yangyang Li <liyangyang20@xxxxxxxxxx>
Signed-off-by: Weihang Li <liweihang@xxxxxxxxxx>
---
 drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index bb86754c..dd01a51 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -910,7 +910,7 @@ static int hns_roce_v2_rst_process_cmd(struct hns_roce_dev *hr_dev)
 	instance_stage = handle->rinfo.instance_state;
 	reset_stage = handle->rinfo.reset_state;
 	reset_cnt = ops->ae_dev_reset_cnt(handle);
-	hw_resetting = ops->get_hw_reset_stat(handle);
+	hw_resetting = ops->get_cmdq_stat(handle);
 	sw_resetting = ops->ae_dev_resetting(handle);
 
 	if (reset_cnt != hr_dev->reset_cnt)
-- 
2.8.1



