[linux-pm] [PATCH 2/2] Fix console handling during suspend/resume

torvalds at osdl.org (Linus Torvalds) · Fri, 23 Jun 2006 20:12:38 -0700 (PDT)

On Fri, 23 Jun 2006, Adam Belay wrote:
> 
> Yeah, I think it could certainly be improved.  However, I think it's very
> important that we be careful to prevent a driver from attempting to access
> powered off hardware, even in the STR case.  Now that doesn't warrant a
> fullblown teardown of the driver stack.  In most cases this sort of thing
> can be handled by notifying the right higher-level subystems.  So the actual
> driver suspend() mechanisms can remain very simple. (see below)

Right. I think drivers do way too much, and that's part of the problem - 
not only do we have basically the same code repeated over and over, but 
they have a really hard time really doing the right thing.

For example, on the run-time management, if we shut things down not as a 
"pci_device" but as a "network device" (which just happens to be _bound_ 
to a pci device), we could very easily do the highlevel network device 
crap to make sure that we don't get entered that way _first_. And do it in 
just one place.

> > 
> > It's really only needed with the current setup, because the whole 
> > suspend() phase is so messy, and we try to solve everything in one single 
> > pass, and one single function call.
> 
> As an immediate incremental improvement, we could add a prepare_suspend()
> callback that would be called before userspace is stopped and a
> finish_resume() callback that would be called after userspace has been
> started again.

I basically have this patch finished. I'll post when I've tested the last 
version (I already tested my previous one and it worked, I just want to 
expand on it).

In fact, I did the second stage too, which is to do the "suspend_late" and 
"resume_early" parts too. It actually simplified a number of assumptions 
in the current power management code.

> Yes, most drivers, especially of the PCI variety, can do something pretty
> simple when suspending, but only if we have the right infrastructure in place.

Absolutely.

Which is what I'm trying to put in place, so that drivers don't have to do 
the extra work that really isn't on "their level" anyway.

Now, I'm not claiming that the rewrite will be perfect, but I've _already_ 
got a fairly small patch:

	[torvalds at macmini linux]$ git diff | wc -l
	338

that not only compiles, but actually implements the suspend as a 
five-stage process and _works_ (of course, it works mainly because most 
drivers only _use_ two of the five stages, but that's all part of the 
plan: I don't want to rewrite a million drivers, I want to prepare the 
infrastructure so that drivers can be written more simply and robustly in 
the future - and _fixed_ more simply when they don't work now).

So basically, instead of 

 - suspend
 - resume

in my current tree I have

 - suspend_prepare (I went with Ben's name, maybe that strokes his ego 
   enough that he'll admit it's better now)
 - suspend (same as old)
 - suspend_late
 - resume_early
 - resume (same as old)

(and I really wanted to do a "resume_finish()" too after user-land resume, 
just to have the "reverse" three phases of resume as I have of suspend, 
but I decided I didn't have any driver that I would make use of it 
personally)

> One thing that might help us get there is if we passed a suspend notification
> to the class devices (i.e. the higher level subsystems).

Good point. We probably should. That really really makes sense, and that 
also automagically solves the "network device" issue.

I'll do that too, it actually looks pretty simple (famous last words).

> I'm curious about your thoughts on runtime suspending of devices are, such as
> the resource rebalancing or cpufreq cases I suggested earlier. 

I really don't see that as my primary worry. Runtime suspend is "nice", 
but it's not a _primary_ goal for me.

I think it should be pretty easy to implement, and I think your subsystem 
suspend notification thing would help a lot (to basically guarantee that 
the subsystem doesn't try to use it).

> Do you have any opinions on how this might be handled?  So far, I've 
> been favoring usage of the same sort of freeze() mechanism used for 
> preparing for memory snapshots etc.

Let me reboot my current kernel to test my current five-phase thing, and 
I'll do the subsystem thing too.

My off-the-cuff plan for that is to just add a "suspend(dev, state)" 
callback to the subsystem structure, and have device_suspend() call the 
subsystem suspend function before it even calls the actual device suspend 
function (and in reverse order on resume, of course).

Again - I'm not actually planning on doing very many individual drivers 
(that's the point I _don't_ care about), I want the support infrastructure 
to be sane.

(That, btw, obviously indirectly means that I'm not willing to break 
existing drivers - my infrastructure is strictly a _superset_ of what they 
get now).

			Linus