Re: [PATCH 4/8] qla2xxx: Fix driver unload hang

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 11/7/19 9:55 AM, Himanshu Madhani wrote:
> 
> 
>> On Nov 7, 2019, at 10:54 AM, Bart Van Assche <bvanassche@xxxxxxx
>> <mailto:bvanassche@xxxxxxx>> wrote:
>>
>> On 11/5/19 7:06 AM, Himanshu Madhani wrote:
>>> From: Quinn Tran <qutran@xxxxxxxxxxx <mailto:qutran@xxxxxxxxxxx>>
>>> This patch fixes driver unload hang by removing msleep()
>>> Fixes: d74595278f4ab ("scsi: qla2xxx: Add multiple queue pair
>>> functionality.")
>>> Cc: stable@xxxxxxxxxxxxxxx <mailto:stable@xxxxxxxxxxxxxxx>
>>> Signed-off-by: Quinn Tran <qutran@xxxxxxxxxxx
>>> <mailto:qutran@xxxxxxxxxxx>>
>>> Signed-off-by: Himanshu Madhani <hmadhani@xxxxxxxxxxx
>>> <mailto:hmadhani@xxxxxxxxxxx>>
>>> ---
>>>  drivers/scsi/qla2xxx/qla_init.c | 2 --
>>>  1 file changed, 2 deletions(-)
>>> diff --git a/drivers/scsi/qla2xxx/qla_init.c
>>> b/drivers/scsi/qla2xxx/qla_init.c
>>> index bddb26baedd2..ff4528702b4e 100644
>>> --- a/drivers/scsi/qla2xxx/qla_init.c
>>> +++ b/drivers/scsi/qla2xxx/qla_init.c
>>> @@ -9009,8 +9009,6 @@ int qla2xxx_delete_qpair(struct scsi_qla_host
>>> *vha, struct qla_qpair *qpair)
>>>  struct qla_hw_data *ha = qpair->hw;
>>>    qpair->delete_in_progress = 1;
>>> -while (atomic_read(&qpair->ref_count))
>>> -msleep(500);
>>>    ret = qla25xx_delete_req_que(vha, qpair->req);
>>>  if (ret != QLA_SUCCESS)
>>
>> I think that an explanation is needed why that loop had been
>> introduced and also why it is safe not to wait until qpair->ref_count
>> drops to zero in qla2xxx_delete_qpair().
>>
> 
> commit d74595278f4ab had drawback in design for MQ implementation in
> qla2xxx. Now that we have been making this more stable with MQ being
> default on for 5x kernel. What we discovered that after heavy IO
> workload in a cluster environment, driver unload encountered hang and
> shows following stack trace
> 
> # ps -fax | grep rmmod
> 6029 pts/0 D+ 0:00 | \_ rmmod qla2xxx
> 
> [<0>] msleep+0x29/0x30 [<0>] qla2xxx_delete_qpair+0x2c/0x160 [qla2xxx]
> [<0>] qla25xx_delete_queues+0x14b/0x1d0 [qla2xxx] [<0>]
> qla2x00_free_device+0x31/0xe0 [qla2xxx] [<0>]
> qla2x00_remove_one+0x239/0x370 [qla2xxx] [<0>]
> pci_device_remove+0x3b/0xc0 [<0>]
> device_release_driver_internal+0x18c/0x250 [<0>] driver_detach+0x39/0x6d
> [<0>] bus_remove_driver+0x77/0xc9 [<0>] pci_unregister_driver+0x2d/0xb0
> [<0>] qla2x00_module_exit+0x2d/0x90 [qla2xxx] [<0>]
> __x64_sys_delete_module+0x139/0x270 [<0>] do_syscall_64+0x5b/0x1b0 [<0>]
> entry_SYSCALL_64_after_hwframe+0x65/0xca [<0>] 0xffffffffffffffff
> 
> Removing this msleep() help resolve this stack trace. 

Hi Himanshu,

Does your answer mean that this hang has not yet been root-caused fully
and hence that it is possible this patch is only a workaround but not a
fix of the root cause?

Thanks,

Bart.




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]

  Powered by Linux