[linux-pm] [patch/rft 2.6.17-rc2] swsusp resume must not device_suspend()

rjw at sisk.pl (Rafael J. Wysocki) · Tue Apr 25 14:57:34 2006

On Tuesday 25 April 2006 23:04, David Brownell wrote:
> On Tuesday 25 April 2006 11:56 am, Rafael J. Wysocki wrote:
> > 
> > > I've begun thinking that calls like pm_should_I_spin_down_drives() would be a
> > > better structural approach than continually redefining this "freeze" thing so
> > > it makes less and less sense to all other drivers ... who nonetheless need to
> > > clutter themselves up with a growing list of special cases, to accommodate
> > > rotating media that may not even exist in the target system.
> > 
> > I think we should do something different to device_power_down(PMSG_FREEZE)
> > there, but I'm not sure it should be kernel_restart_prepare(NULL).
> > 
> > Actually spinning down disks during resume is a problem for some users (yes,
> > we've had such bug reports recently), so it's better to avoid this.
> 
> Well, if we had a pm_should_I_spin_down_drives() it would make sense to me
> that it return FALSE during kernel_restart_prepare() too ... surely kexec
> users have the same issues!
> 
> If you currently have users who object to spindown-during-resume, then it'd
> seem that my patch couldn't change anything except maybe details.

Fortunately this particular problem has been fixed in the driver. ;-)

> And that switching over to a call like pm_should_I_spin_down_drives() should
> fix it all.

Agreed, but I have to learn quite a bit to implement such a thing.
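
Purely as an illustration, such a helper could be modelled in user space
along the lines below.  Only the name pm_should_I_spin_down_drives() comes
from this thread; the enum, the global variable and disk_suspend() are
made up for the sketch, and the answers for the restart/kexec and
pre-restore cases simply follow the suggestion quoted above.

```c
#include <stdbool.h>

/* Hypothetical transition kinds; NOT the kernel's pm_message_t events. */
enum pm_transition {
	PM_TRANS_SUSPEND,	/* real suspend, power is about to go away */
	PM_TRANS_SNAPSHOT,	/* devices quiesced while the image is created */
	PM_TRANS_PRE_RESTORE,	/* boot kernel, just before the image is restored */
	PM_TRANS_RESTART,	/* kernel_restart_prepare() / kexec */
};

/* Assumption: the PM core would record which transition is in progress. */
static enum pm_transition pm_current_transition = PM_TRANS_SUSPEND;

/* Drivers of rotating media ask this instead of decoding "freeze". */
static bool pm_should_I_spin_down_drives(void)
{
	return pm_current_transition == PM_TRANS_SUSPEND;
}

/* Hypothetical disk driver suspend path; returns true if it spun down. */
static bool disk_suspend(void)
{
	/* Cache flushing and queue quiescing would happen here in every case. */
	return pm_should_I_spin_down_drives();
}
```

With something like this, drivers that have no rotating media would never
need to look at the transition type at all.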

> > > > OTOH I think at least some device driver writers assume that .resume() will
> > > > always be called after .suspend() which only is true for non-modular drivers
> > > > (or for modular drivers loaded from an initrd before resume). 
> > > 
> > > Say what?  Of _course_ resume() should only be called after suspend().  If
> > > that's not true in any case, the code wrongly issuing the resume() is buggy.
> > 
> > Well, suppose we have a modular driver that's not loaded before resume.
> 
> That's not the problem case though; it works correctly, since the device
> hardware is already being left in an appropriate (RESET) state.
> 
> 
> > Then it goes like that (approximately):
> > (1) We activate swsusp which calls .suspend() for all devices including our
> > driver (this is a real suspend).
> > (2) swsusp snapshots the system and creates the image.
> > (3) swsusp calls .resume() for all devices in order to be able to save the
> > image (.resume() for our driver is also called which is OK).
> > (4) swsusp turns off the system.
> > (5) (some time later) We start a new kernel and tell it to resume.
> > (6) It activates swsusp which reads the image.
> 
> And assuming this is an x86 PC, at this point every device is in one of three states:
> 
>   - initialized by BIOS.  This is a particular PITA for USB, but one that's
>     handled OK (mostly) except when BIOS bugs kick in.  There's some nasty
>     code that kicks in along with PCI quirk handling, which ensures that by
>     the time a Linux-USB driver could see this state (or the input subsystem
>     needs to care about it), the state has morphed to reset.  Video cards
>     have funky issues here too.
> 
>   - (powerup) reset.  This is the ideal state, in terms of "truth" to convey
>     to the image we're about to restore ... no ambiguity, every driver will
>     need to re-init.  As if there were (thank you!!) no BIOS.
> 
>   - initialized by Linux ... which leads to the case my patch addresses.
> 
> Those first two states are legit for any resume() call, and they apply in
> your scenario restriction.

IIRC, there were some ALSA problems with .resume() called on a reset
device, but they seem to be fixed now.

> The third state is the problem scenario, kicking in when the driver was
> statically linked (or modprobed from initramfs, etc), but not during
> your scenario.

The problem, as I see it, is that too many devices may be initialized at the
kernel startup.  I think we _can_ reset some of them before the image
is restored, but at least some of them need to be treated more carefully.

> > (7) (without your change) swsusp calls .suspend() for all device drivers that
> > are present at that time,
> 
> ... the current troublesome consequence of that third state ...
> 
> > but our driver is not there, so its .suspend() 
> > _won't_ be called.  [Of course with your change .suspend() won't be called
> > for any driver.]
> 
> Right:  the first two "safe" cases kick in.  This is the partial workaround I
> had identified:  dodging the code paths for that third state, where suspend()
> is being used to put the hardware into a broken suspend state.

So perhaps we should just make them enter a state that's not broken?
That may be reset for some devices (eg. USB) and something else for some
others (eg. storage).

> Note that with that third state, there are actually two suspend() calls, but
> only one resume() call.  (Suspend before snapshot, suspend before resume
> snapshot, resume after activating snapshot.)  Such an extra suspend() call is
> a small hint that something's odd, and maybe wrong.

Agreed.
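
The call counting can be checked with a small user-space model of the
sequence discussed in this thread.  This is only a sketch: the types and
the trace_resume() function are invented for illustration, the BIOS-
initialized state is ignored, and the "no suspend before restore" branch
assumes the device is simply left in reset, as the patch is described
as doing.

```c
#include <stdbool.h>

/* Simplified hardware states; the BIOS-initialized case is left out. */
enum hw_state { HW_RESET, HW_LINUX_INIT, HW_SUSPENDED };

struct resume_trace {
	int suspend_calls;		/* suspend() calls preceding the final resume() */
	int resume_calls;		/* resume() calls seen by the restored image */
	enum hw_state hw_at_resume;	/* hardware state when step (9) runs */
};

static struct resume_trace trace_resume(bool driver_in_boot_kernel,
					bool suspend_before_restore)
{
	struct resume_trace t = { 0, 0, HW_RESET };
	enum hw_state hw;

	/* (1)-(2): suspend() runs, then the snapshot is taken. */
	t.suspend_calls++;

	/* (3)-(4): the resume() used to save the image is not part of the
	 * image; the power cycle leaves the hardware in reset. */
	hw = HW_RESET;

	/* (5)-(6): a built-in or initrd-loaded driver initializes the device. */
	if (driver_in_boot_kernel)
		hw = HW_LINUX_INIT;

	/* (7): either suspend the device again or put it back into reset. */
	if (driver_in_boot_kernel) {
		if (suspend_before_restore) {
			hw = HW_SUSPENDED;
			t.suspend_calls++;
		} else {
			hw = HW_RESET;
		}
	}

	/* (8)-(9): the image is restored and its resume() is called. */
	t.hw_at_resume = hw;
	t.resume_calls++;
	return t;
}
```

In this model only the combination "driver present in the boot kernel" plus
"suspend before restore" gives two suspend() calls for one resume(); the
other combinations reach step (9) with the hardware in reset.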

> > (8) swsusp restores the image.
> > (9) swsusp calls .resume() for all devices _including_ our driver, because it
> > was in memory before suspend.  For our driver this .resume() is not
> > called after .suspend(), is it?
> 
> The suspend() was called from the kernel being resumed ... and the device hardware
> is in one of the states (reset) that it's allowed to be in when calling its
> matching resume().  No problem there.
> 
> 
> > You're saying that (9) is wrong, so could you please suggest what to do
> > instead of it?
> 
> The case in which (9) is wrong is the case you excluded:  where the pre-resume
> kernel loaded the driver and used the third state listed above, and then trashed
> the correct device hardware state (reset) and replaced it with a suspend state.
> 
> It may help to think of two distinct types of device hardware suspend states
> (only the first is real, the second is just a software bug):
> 
>  - Correct, with internal state corresponding to what the driver suspend() did;
>    what a normal hardware suspend/resume cycle (not powercycle!) could do.
> 
>  - Broken, with any other internal state (except reset).  This is what swsusp
>    currently forces, by adding **AND HIDING** a reset and reinit cycle, because
>    of the extra suspend() call in (7).

Still, there are drivers that have no problems with it, so why should we
forcibly reset their devices?

> My patch/suggestion just ensures that instead of that broken state, reset is used
> in all cases ... not just the "driver not initialized before snapshot resume" case.

As I said before, I generally agree with this, except I think a more
fine-grained approach is needed in this case.

Basically, it seems we need something like a .prepare_for_resume() routine
for each driver, separate from .suspend() and .resume(), that will be called
before the swsusp image is restored.  Your patch just assumes that
.prepare_for_resume() should be "reset" for all devices.
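
To make the idea a bit more concrete, here is a user-space sketch of such
a callback with "reset" as the fallback.  None of this is existing kernel
API: the structures, the HW_QUIESCED state and all function names are
hypothetical, and only the .prepare_for_resume() name is taken from the
paragraph above.

```c
#include <stddef.h>

/* Simplified hardware states for the model. */
enum hw_state { HW_RESET, HW_QUIESCED, HW_SUSPENDED };

struct model_device;

/* Hypothetical per-driver operations. */
struct model_driver_ops {
	int (*suspend)(struct model_device *dev);
	int (*resume)(struct model_device *dev);
	/* Optional: called in the boot kernel before the image is restored. */
	int (*prepare_for_resume)(struct model_device *dev);
};

struct model_device {
	const char *name;
	enum hw_state hw;
	const struct model_driver_ops *ops;
};

/* Fallback for drivers without the callback: put the device into reset. */
static int generic_reset(struct model_device *dev)
{
	dev->hw = HW_RESET;
	return 0;
}

/* Example override for storage: quiesce without resetting or spinning down. */
static int storage_prepare_for_resume(struct model_device *dev)
{
	dev->hw = HW_QUIESCED;
	return 0;
}

/* Would run instead of the extra suspend() pass before the image restore. */
static int prepare_devices_for_resume(struct model_device *devs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct model_device *dev = &devs[i];
		int err;

		if (dev->ops && dev->ops->prepare_for_resume)
			err = dev->ops->prepare_for_resume(dev);
		else
			err = generic_reset(dev);
		if (err)
			return err;
	}
	return 0;
}
```

A driver that is happy with reset would not have to change at all in this
scheme, while the ones that need more careful treatment could supply their
own routine.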

Greetings,
Rafael