Re: Thinkpad E595 system deadlock on resume from S3

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 4 Oct 2023 10:25:47 +0900
Damien Le Moal <dlemoal@xxxxxxxxxx> wrote:

> On 10/4/23 05:07, Petr Tesařík wrote:
> > I just want to confirm that my understanding of the issue is correct
> > now. After applying the patch below, my laptop has just survived half a
> > dozen suspend/resume cycles with autosuspend enabled for the SATA SSD
> > disk. Without the patch, it usually freezes on first attempt.
> > 
> > diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> > index a371b497035e..87000f89319e 100644
> > --- a/drivers/ata/libata-scsi.c
> > +++ b/drivers/ata/libata-scsi.c
> > @@ -4768,6 +4768,14 @@ void ata_scsi_dev_rescan(struct work_struct *work)
> >  		ata_for_each_dev(dev, link, ENABLED) {
> >  			struct scsi_device *sdev = dev->sdev;
> >  
> > +			/*
> > +			 * If the queue accepts only PM, bail out.
> > +			 */
> > +			if (blk_queue_pm_only(sdev->request_queue)) {
> > +				ret = -EAGAIN;
> > +				goto unlock;
> > +			}
> > +
> >  			/*
> >  			 * If the port was suspended before this was scheduled,
> >  			 * bail out.  
> 
> This seems sensible to me: scsi_rescan_device() only checks that the device is
> running, without checking that the device queue is also resumed. So the right
> place for this check is scsi_rescan_device():
> 
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index 3db4d31a03a1..b05a55f498a2 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -1627,12 +1627,13 @@ int scsi_rescan_device(struct scsi_device *sdev)
>         device_lock(dev);
> 
>         /*
> -        * Bail out if the device is not running. Otherwise, the rescan may
> -        * block waiting for commands to be executed, with us holding the
> -        * device lock. This can result in a potential deadlock in the power
> -        * management core code when system resume is on-going.
> +        * Bail out if the device or its queue are not running. Otherwise,
> +        * the rescan may block waiting for commands to be executed, with us
> +        * holding the device lock. This can result in a potential deadlock
> +        * in the power management core code when system resume is on-going.
>          */
> -       if (sdev->sdev_state != SDEV_RUNNING) {
> +       if (sdev->sdev_state != SDEV_RUNNING ||
> +           blk_queue_pm_only(sdev->request_queue)) {
>                 ret = -EWOULDBLOCK;
>                 goto unlock;
>         }
> 
> Can you try the above to confirm it works for you ? It should, as that is
> essentially the same as you did.

After revertung my patch and applying yours, my system can still resume
from S3. Congrats!

Tested-by: Petr Tesarik <petr@xxxxxxxxxxx>

Petr T




[Index of Archives]     [Linux Filesystems]     [Linux SCSI]     [Linux RAID]     [Git]     [Kernel Newbies]     [Linux Newbie]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Samba]     [Device Mapper]

  Powered by Linux