Re: [PATCH] scsi: sd: Revert "Rework asynchronous resume support"

Hi Hans,

On Sat, Aug 20, 2022 at 5:37 PM Hans de Goede <hdegoede@xxxxxxxxxx> wrote:
> On 8/16/22 19:26, Bart Van Assche wrote:
> > Although patch "Rework asynchronous resume support" eliminates the delay
> > for some ATA disks after resume, it causes resume of ATA disks to fail
> > on other setups. See also:
> > * "Resume process hangs for 5-6 seconds starting sometime in 5.16"
> >   (https://bugzilla.kernel.org/show_bug.cgi?id=215880).
> > * Geert's regression report
> >   (https://lore.kernel.org/linux-scsi/alpine.DEB.2.22.394.2207191125130.1006766@xxxxxxxxxxxxxx/).
> >
> > This is what I understand about this issue:
> > * During resume, ata_port_pm_resume() starts the SCSI error handler.
> >   This changes the SCSI host state into SHOST_RECOVERY and causes
> >   scsi_queue_rq() to return BLK_STS_RESOURCE.
> > * sd_resume() calls sd_start_stop_device() for ATA devices. That
> >   function in turn calls sd_submit_start() which tries to submit a START
> >   STOP UNIT command. That command can only be submitted after the SCSI
> >   error handler has changed the SCSI host state back to SHOST_RUNNING.
> > * The SCSI error handler runs on its own thread and calls
> >   schedule_work(&(ap->scsi_rescan_task)). That causes
> >   ata_scsi_dev_rescan() to be called from the context of a kernel
> >   workqueue. That call hangs in blk_mq_get_tag(). I'm not sure why -
> >   maybe because all available tags have been allocated by
> >   sd_submit_start() calls (this is a guess).
> >
> > Cc: Damien Le Moal <damien.lemoal@xxxxxxxxxxxxxxxxxx>
> > Cc: Hannes Reinecke <hare@xxxxxxx>
> > Cc: Geert Uytterhoeven <geert@xxxxxxxxxxxxxx>
> > Cc: gzhqyz@xxxxxxxxx
> > Reported-by: Geert Uytterhoeven <geert@xxxxxxxxxxxxxx>
> > Reported-by: gzhqyz@xxxxxxxxx
> > Fixes: 88f1669019bd ("scsi: sd: Rework asynchronous resume support"; v6.0-rc1~114^2~68)
> > Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx>
>
> As reported here, I've been seeing tasks block/hang on I/O
> to a SATA disk on a system with / on an NVMe drive (which keeps
> the system alive except for the tasks accessing the SATA disk):
>
> https://lore.kernel.org/regressions/dd6844e7-f338-a4e9-2dad-0960e25b2ca1@xxxxxxxxxx/
>
> I'm running 6.0-rc1 with this patch added now and so far
> I've not seen the problem re-occur.
>
> I was also seeing 6.0 suspend/resume issues on 2 laptops with
> SATA disks (rather than NVMe) which I did not yet get around
> to collecting logs from / reporting. I'm happy to report that
> those suspend/resume issues are also fixed by this:

It looks like there is a (different) regression in v6.1-rc1 related
to s2idle and s2ram, which is not fixed by this patch.  In fact, it
also happens on boards where SATA is not used; it is just less likely
to happen on the non-SATA boards.
I still have to bisect it, which may take some time, as the issue is
not 100% reproducible.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


