Re: [RFC] blk-mq/scsi: deadlock found on usb driver

On 11/29/20 11:23 PM, Yufen Yu wrote:
> We recently reported an IO hang on a SCSI USB device: any IO issued to
> the device never completes. The USB driver has only **one** driver tag
> and **two** sched tags. After debugging, we found a deadlock race, as
> follows:
> 
> cpu0(scsi_eh)       cpu1                          cpu2
>                     get sched tag(internal_tag=0)
>                     get driver tag(tag=0)
>                                                   get sched tag(internal_tag=1)
>                                                   wait for driver tag
> scsi_error_handler try issue io
> wait for sched tag
>                     try to dispatch the request
>                     wait for setting shost state as SHOST_RUNNING
> //scsi_host_set_state(shost, SHOST_RUNNING)
> 
> The scsi_eh thread stack is as follows:
> PID: 945745  TASK: ffff950a8f2f0000  CPU: 42  COMMAND: "scsi_eh_15"
>   [ffffbbee8d5b3ce0] __schedule at ffffffffa506ebac
>   [ffffbbee8d5b3d00] sbitmap_get at ffffffffa4c4684f
>   [ffffbbee8d5b3d48] schedule at ffffffffa506f208
>   [ffffbbee8d5b3d50] io_schedule at ffffffffa506f5d2
>   [ffffbbee8d5b3d60] blk_mq_get_tag at ffffffffa4bf5277
>   [ffffbbee8d5b3d88] autoremove_wake_function at ffffffffa48ffe40
>   [ffffbbee8d5b3db8] autoremove_wake_function at ffffffffa48ffe40
>   [ffffbbee8d5b3e08] blk_mq_get_request at ffffffffa4bef14c
>   [ffffbbee8d5b3e20] eh_lock_door_done at ffffffffa4da5580
>   [ffffbbee8d5b3e38] blk_mq_alloc_request at ffffffffa4bef494
>   [ffffbbee8d5b3e80] blk_get_request at ffffffffa4be5042
>   [ffffbbee8d5b3e98] scsi_error_handler at ffffffffa4da8670
>   [ffffbbee8d5b3ea0] __schedule at ffffffffa506ebb4
>   [ffffbbee8d5b3f08] scsi_error_handler at ffffffffa4da8430
>   [ffffbbee8d5b3f10] kthread at ffffffffa48d6d7d
>   [ffffbbee8d5b3f20] kthread at ffffffffa48d6c70
>   [ffffbbee8d5b3f50] ret_from_fork at ffffffffa520023f
> 
> Since there are no more sched tags or driver tags available, all of
> the threads will wait forever. We found the bug on a 4.18 kernel, but
> the latest kernel code also has the problem.
> 
> I don't have a good idea of how to fix the bug, so any suggestions are
> welcome.
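
The circular wait described above can be checked with a small userspace
model. This is only a sketch under stated assumptions, not kernel code:
the function name stuck_tasks, the pool names, and the modelling of
"host state is SHOST_RUNNING" as a one-slot pool held by scsi_eh until
recovery finishes are all made up for illustration. A task is treated as
able to finish if the resource it waits for is free, and it releases
everything it holds once it finishes.

```python
from collections import Counter


def stuck_tasks(capacity, holds, waits):
    """Return the set of tasks that can never make progress.

    capacity: pool name -> total number of slots (hypothetical model)
    holds:    task name -> {pool name: slots currently held}
    waits:    task name -> pool name the task is blocked on (or None)
    """
    # Free slots are total capacity minus everything currently held.
    free = Counter(capacity)
    for held in holds.values():
        free.subtract(held)

    pending = set(holds)
    progressed = True
    while progressed:
        progressed = False
        for task in sorted(pending):
            want = waits.get(task)
            if want is None or free[want] > 0:
                # The task can get what it waits for, run to completion,
                # and release every slot it was holding.
                free.update(holds[task])
                pending.discard(task)
                progressed = True
    return pending


# State taken from the report: one driver tag, two sched tags.
# "host_running" is an assumption standing in for the host state that
# only scsi_eh restores to SHOST_RUNNING.
capacity = {"sched_tag": 2, "driver_tag": 1, "host_running": 1}
holds = {
    "scsi_eh": {"host_running": 1},
    "cpu1": {"sched_tag": 1, "driver_tag": 1},
    "cpu2": {"sched_tag": 1},
}
waits = {
    "scsi_eh": "sched_tag",     # blocked in blk_mq_get_tag
    "cpu1": "host_running",     # dispatch waits for SHOST_RUNNING
    "cpu2": "driver_tag",       # holds a sched tag, needs the driver tag
}

stuck = stuck_tasks(capacity, holds, waits)
print(sorted(stuck))
```

In this model all three tasks are reported as stuck, which matches the
hang in the report. As a purely hypothetical what-if, raising sched_tag
to 3 in capacity lets scsi_eh proceed and the other two drain after it;
that is only a property of the model and says nothing about how the
problem should be fixed in the kernel.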

Please take a look at
https://lore.kernel.org/linux-scsi/20201130024615.29171-6-bvanassche@xxxxxxx/T/#u.

Thanks,

Bart.



