Re: Thinkpad E595 system deadlock on resume from S3

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/4/23 15:43, Petr Tesařík wrote:
> On Wed, 4 Oct 2023 10:25:47 +0900
> Damien Le Moal <dlemoal@xxxxxxxxxx> wrote:
> 
>> On 10/4/23 05:07, Petr Tesařík wrote:
>>> I just want to confirm that my understanding of the issue is correct
>>> now. After applying the patch below, my laptop has just survived half a
>>> dozen suspend/resume cycles with autosuspend enabled for the SATA SSD
>>> disk. Without the patch, it usually freezes on first attempt.
>>>
>>> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
>>> index a371b497035e..87000f89319e 100644
>>> --- a/drivers/ata/libata-scsi.c
>>> +++ b/drivers/ata/libata-scsi.c
>>> @@ -4768,6 +4768,14 @@ void ata_scsi_dev_rescan(struct work_struct *work)
>>>  		ata_for_each_dev(dev, link, ENABLED) {
>>>  			struct scsi_device *sdev = dev->sdev;
>>>  
>>> +			/*
>>> +			 * If the queue accepts only PM, bail out.
>>> +			 */
>>> +			if (blk_queue_pm_only(sdev->request_queue)) {
>>> +				ret = -EAGAIN;
>>> +				goto unlock;
>>> +			}
>>> +
>>>  			/*
>>>  			 * If the port was suspended before this was scheduled,
>>>  			 * bail out.  
>>
>> This seems sensible to me: scsi_rescan_device() only checks that the device is
>> running, without checking that the device queue is also resumed. So the right
>> place for this check is scsi_rescan_device():
>>
>> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
>> index 3db4d31a03a1..b05a55f498a2 100644
>> --- a/drivers/scsi/scsi_scan.c
>> +++ b/drivers/scsi/scsi_scan.c
>> @@ -1627,12 +1627,13 @@ int scsi_rescan_device(struct scsi_device *sdev)
>>         device_lock(dev);
>>
>>         /*
>> -        * Bail out if the device is not running. Otherwise, the rescan may
>> -        * block waiting for commands to be executed, with us holding the
>> -        * device lock. This can result in a potential deadlock in the power
>> -        * management core code when system resume is on-going.
>> +        * Bail out if the device or its queue are not running. Otherwise,
>> +        * the rescan may block waiting for commands to be executed, with us
>> +        * holding the device lock. This can result in a potential deadlock
>> +        * in the power management core code when system resume is on-going.
>>          */
>> -       if (sdev->sdev_state != SDEV_RUNNING) {
>> +       if (sdev->sdev_state != SDEV_RUNNING ||
>> +           blk_queue_pm_only(sdev->request_queue)) {
>>                 ret = -EWOULDBLOCK;
>>                 goto unlock;
>>         }
>>
>> Can you try the above to confirm it works for you ? It should, as that is
>> essentially the same as you did.
> 
> After revertung my patch and applying yours, my system can still resume
> from S3. Congrats!
> 
> Tested-by: Petr Tesarik <petr@xxxxxxxxxxx>

Thanks. I will send a proper patch asap.

> 
> Petr T

-- 
Damien Le Moal
Western Digital Research




[Index of Archives]     [Linux Filesystems]     [Linux SCSI]     [Linux RAID]     [Git]     [Kernel Newbies]     [Linux Newbie]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Samba]     [Device Mapper]

  Powered by Linux