On Tue, Nov 23, 2021 at 10:24:02PM +0800, Wenpeng Liang wrote:
> From: Yangyang Li <liyangyang20@xxxxxxxxxx>
>
> When hns_roce_v2_destroy_qp() is called, the brief calling process of the
> driver is as follows:
>
> ......
> hns_roce_v2_destroy_qp
> hns_roce_v2_qp_modify
> hns_roce_cmd_mbox
> hns_roce_qp_destroy
>
> If hns_roce_cmd_mbox() detects that the hardware is being reset during
> the execution of the hns_roce_cmd_mbox(), the driver will not be able
> to get the return value from the hardware (the firmware cannot respond
> to the driver's mailbox during the hardware reset phase). The driver
> needs to wait for the hardware reset to complete before continuing to
> execute hns_roce_qp_destroy(), otherwise it may happen that the driver
> releases the resources but the hardware is still accessing. In order to
> fix this problem, HNS RoCE needs to add a piece of code to wait for the
> hardware reset to complete.
>
> The original interface get_hw_reset_stat() is the instantaneous state
> of the hardware reset, which cannot accurately reflect whether the
> hardware reset is completed, so it needs to be replaced with the
> ae_dev_reset_cnt interface.
>
> The sign that the hardware reset is complete is that the return value
> of the ae_dev_reset_cnt interface is greater than the original value
> reset_cnt recorded by the driver.
>
> Fixes: 6a04aed6afae ("RDMA/hns: Fix the chip hanging caused by sending mailbox&CMQ during reset")
> Signed-off-by: Yangyang Li <liyangyang20@xxxxxxxxxx>
> Signed-off-by: Wenpeng Liang <liangwenpeng@xxxxxxxxxx>
> ---
>  drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)

Applied to for-rc, thanks

Jason