Re: [PATCH v7 00/23] Fix libata suspend/resume handling and code cleanup

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Damien,

On Fri, Sep 29, 2023 at 3:37 PM Damien Le Moal <dlemoal@xxxxxxxxxx> wrote:
> On 2023/09/28 14:26, Geert Uytterhoeven wrote:
> > On Tue, Sep 26, 2023 at 10:15 AM Damien Le Moal <dlemoal@xxxxxxxxxx> wrote:
> >> The first 9 patches of this series fix several issues with suspend/resume
> >> power management operations in scsi and libata. The most significant
> >> changes introduced are in patch 4 and 5, where the manage_start_stop
> >> flag of scsi devices is split into the manage_system_start_stop and
> >> manage_runtime_start_stop flags to allow keeping scsi runtime power
> >> operations for spining up/down ATA devices but have libata do its own
> >> system suspend/resume device power state management using EH.
> >>
> >> The remaining patches are code cleanup that do not introduce any
> >> significant functional change.
> >>
> >> This series was tested on qemu and on various PCs and servers. I am
> >> CC-ing people who recently reported issues with suspend/resume.
> >> Additional testing would be much appreciated.
> >
> > JFTR, with current libata/for-next[*], I saw the following with
> > rcar-sata, once (interesting lines marked with "!"):
> >
> >     PM: suspend entry (s2idle)
> >     Filesystems sync: 0.026 seconds
> >     Freezing user space processes
> >  !  ata1.00: qc timeout after 10000 msecs (cmd 0x40)
> >     Freezing user space processes completed (elapsed 0.007 seconds)
> >  !  ata1.00: VERIFY failed (err_mask=0x4)
> >     OOM killer disabled.
> >  !  ata1.00: failed to IDENTIFY (I/O error, err_mask=0x40)
> >     Freezing remaining freezable tasks
> >  !  ata1.00: revalidation failed (errno=-5)
> >     Freezing remaining freezable tasks completed (elapsed 0.002 seconds)
> >     sd 0:0:0:0: [sda] Synchronizing SCSI cache
> >     ata1: link resume succeeded after 1 retries
> >     ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> >     ata1.00: configured for UDMA/133
> >     ata1.00: Entering active power mode
> >     ata1.00: Entering standby power mode
> >     ravb e6800000.ethernet eth0: Link is Down
> >     Micrel KSZ9031 Gigabit PHY e6800000.ethernet-ffffffff:00: attached
> > PHY driver (mii_bus:phy_addr=e6800000.ethernet-ffffffff:00, irq=136)
> >     OOM killer enabled.
> >     Restarting tasks ... done.
> >     random: crng reseeded on system resumption
> >     PM: suspend exit
> >     ata1: link resume succeeded after 1 retries
> >     ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> >     ata1.00: Entering active power mode
> >     ata1.00: configured for UDMA/133
> >     ravb e6800000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off
> >
> > Regardless, the disk worked fine after resume.
> >
> > Note that I saw this only once.
>
> I think I found the reason for this, but to confirm, were you doing a suspend
> right after resuming the system ? If yes, that I think I exactly understand the
> issue and why you saw it only once (it is a subtle race with scheduling
> libata-EH suspend/resume operations). I will send a fix next week.

Now you ask that, yes there was a system suspend before.

Relevant log with timing info:

    [  130.177616] PM: suspend exit
    [  130.257981] ata1: link resume succeeded after 1 retries
    [  130.376714] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
    [  130.388525] ata1.00: Entering active power mode

so the drive should have been ready here.

    [  140.452669] PM: suspend entry (s2idle)
    [  140.488313] Filesystems sync: 0.026 seconds
    [  140.515957] Freezing user space processes
    [  140.518209] ata1.00: qc timeout after 10000 msecs (cmd 0x40)
    [  140.523384] Freezing user space processes completed (elapsed
0.007 seconds)
    [  140.527718] ata1.00: VERIFY failed (err_mask=0x4)
    [  140.532541] OOM killer disabled.
    [  140.537270] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x40)
    [  140.542069] Freezing remaining freezable tasks
    [  140.546784] ata1.00: revalidation failed (errno=-5)

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds



[Index of Archives]     [Linux Filesystems]     [Linux SCSI]     [Linux RAID]     [Git]     [Kernel Newbies]     [Linux Newbie]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Samba]     [Device Mapper]

  Powered by Linux