Re: [PATCH 1/1] scsi: fix hang when device state is set via sysfs

On 10/12/21 10:50 AM, Mike Christie wrote:
> On 10/5/21 11:45 PM, Mike Christie wrote:
>> Cc'ing lee.
>>
>> On 10/5/21 11:31 PM, Mike Christie wrote:
>>> This fixes a regression added with:
>>>
>>> commit f0f82e2476f6 ("scsi: core: Fix capacity set to zero after
>>> offlinining device")
>>>
>>> The problem is that after iSCSI recovery, iscsid will call into the kernel
>>> to set the dev's state to running, and with that patch we now call
>>> scsi_rescan_device with the state_mutex held. If the scsi error handler
>>> thread is just starting to test the device in scsi_send_eh_cmnd then it's
>>> going to try to grab the state_mutex.
>>>
>>> We are then stuck, because when scsi_rescan_device tries to send its IO,
>>> scsi_queue_rq calls scsi_host_queue_ready -> scsi_host_in_recovery, which
>>> returns true (the host state is still in recovery), so the IO is just
>>> requeued. scsi_send_eh_cmnd will then never be able to grab the
>>> state_mutex to finish error handling.
>>>
>>> This just moves the scsi_rescan_device call to after we drop the
>>> state_mutex.
>>
>>
>> I want to maybe nak my own patch. There is still a problem where if one
>> of the rescan IOs hits an issue then userspace is stuck waiting for
>> however long it takes to perform recovery. For iscsid, this will cause
>> problems because it sets the device state from its main thread. So
>> while scsi_rescan_device is hung, iscsid can't do anything for
>> any session.
>>
>> I think we either want to:
>>
>> 1. Do the patch below, but Lee will need to change iscsid so it sets
>> the dev state from a worker thread.
>>
>> 2. Have the kernel kick off the rescan from a workqueue. This seems
>> easiest but I'm not sure if it will cause issues for lijinlin's use
>> case.
> 
> I have not heard from huawei, but I don't think we can do 2. The problem
> is that I think userspace will not assume once the write returns that the

Meant userspace will now assume.
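
For anyone following the thread, the locking change described in the
original patch can be illustrated with a small userspace sketch. This is
not the kernel code: struct fake_dev, fake_rescan and fake_store_state are
hypothetical stand-ins, and a pthread mutex plays the role of state_mutex.
The only point it tries to show is the pattern of deciding under the lock
that a rescan is needed, then doing the rescan after the lock is dropped.

```c
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace stand-in for a SCSI device; not the kernel struct. */
struct fake_dev {
	pthread_mutex_t state_mutex;
	int running;              /* 1 once the state has been set to "running" */
	int rescans;              /* number of rescans performed */
	int rescan_saw_lock_held; /* set if a rescan ran with state_mutex held */
};

/* Stand-in for the rescan; it only records whether state_mutex was held. */
static void fake_rescan(struct fake_dev *dev)
{
	/* trylock fails if the mutex is still held, including by this thread. */
	if (pthread_mutex_trylock(&dev->state_mutex) == 0)
		pthread_mutex_unlock(&dev->state_mutex);
	else
		dev->rescan_saw_lock_held = 1;
	dev->rescans++;
}

/* Stand-in for the sysfs state store: update under the lock, rescan after. */
static int fake_store_state(struct fake_dev *dev, const char *buf)
{
	bool rescan = false;
	int ret = -1;

	pthread_mutex_lock(&dev->state_mutex);
	if (strcmp(buf, "running") == 0) {
		dev->running = 1;
		rescan = true; /* remember the decision, but do not rescan yet */
		ret = 0;
	}
	pthread_mutex_unlock(&dev->state_mutex);

	/* The lock is dropped here, so another thread can take it meanwhile. */
	if (rescan)
		fake_rescan(dev);

	return ret;
}
```

Note that this sketch still runs the rescan synchronously in the caller, so
it does not address the concern raised above about userspace blocking for
as long as the rescan takes.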


