[linux-pm] [PATCH 2/2] Fix console handling during suspend/resume

On Thu, 15 Jun 2006, Alan Stern wrote:
>
> Here's what you actually did say:
> ---------
>
> > To have DMAs stopped, you need to "freeze" the devices.
> 
> No you don't. 
> 
> You need to stop the high-level _queues_, but that's something totally 
> different from actually stopping the _devices_.

Right. 

What you _do_ need to do, is stop the user-level actions.

Ie by "higher-level queues", we're talking stuff that has nothing at all 
to do with device drivers any more.

Before you suspend, you need to make the machine quiescent, in other 
words. The devices are still working, but you really really don't want to 
do this while things are still _happening_.

Now, with suspend-to-RAM, I suspect we could even avoid that until the 
very last phase (ie the actual suspend code). But quite frankly, from a 
pure debuggability standpoint, I do think we want to basically try to make 
everything as quiet as humanly possible.

And from a suspend-to-disk standpoint, the act of starting to write to 
disk really requires that everything is "done", so you had better have 
_nothing_ else than the actual write-to-disk actually happening. That's 
also the thing where a "save_state()" may actually want to flush its 
queues entirely and replace them with a known-temporary thing.

But the point is, the devices really have to be able to handle things that 
can happen during suspend, even after their state has been "saved". They 
can't just stop. That would be a bug - or it would require totally insane 
special casing, which is effectively what we do now.

So think about what we do now: We special-case X, and we special-case the 
save-to-disk device, and we special-case the console printouts, and we 
special-case a lot of other things, AND WE STILL GOT IT WRONG. Try using 
netconsole, and see it blow up in your face without my changes (it _might_ 
work with some network drivers, but I looked at the sky2 driver, and I 
suspect that apart from the stupid bug where it didn't actually do a 
pci_save_state(), it's probably one of the _better_ ones).

And the thing is, all those special-cases are all really doing the same 
thing: "keep the device alive despite shutting it down". Really. I'm not 
making that up. In the case of X, we did it the other way around: there, 
the special case was not keeping the device alive, but instead just saving 
the state separately (and early) from all the other drivers. Which I'm 
just saying we should do for _everything_.

At some point, somebody just _has_ to realize that the problem was 
shutting the damn thing down in the first place! If you just save the hw 
state that you need to save, and let the device itself continue to work, 
suddenly all the special cases just go away.

Poof. They're gone.
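
A minimal sketch of that split, in plain userspace C rather than real 
kernel code. Every name in it (toy_dev, toy_save_state, toy_write, 
toy_power_off, toy_resume) is hypothetical and exists only for 
illustration; none of it is an actual driver-model interface. The point 
it tries to show is that saving state is just a snapshot, the device 
keeps accepting work afterwards (think of a late console message), and 
only the final power-off actually stops it:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_NREGS 4

/* Hypothetical toy device, not a kernel structure. */
struct toy_dev {
	uint32_t regs[TOY_NREGS];	/* "hardware" config registers */
	uint32_t saved[TOY_NREGS];	/* snapshot from toy_save_state() */
	bool state_saved;
	bool powered;
	size_t tx_bytes;		/* bytes the device has handled */
};

void toy_init(struct toy_dev *dev)
{
	assert(dev != NULL);
	memset(dev, 0, sizeof(*dev));
	dev->powered = true;
}

/* Snapshot the registers. Deliberately does NOT stop the device. */
void toy_save_state(struct toy_dev *dev)
{
	memcpy(dev->saved, dev->regs, sizeof(dev->saved));
	dev->state_saved = true;
}

/* Normal I/O path: works whether or not state has been saved.
 * Returns 0 on success, -1 if the device has been powered off. */
int toy_write(struct toy_dev *dev, const char *msg)
{
	if (!dev->powered)
		return -1;
	dev->tx_bytes += strlen(msg);
	return 0;
}

/* The very last phase: power goes away and register contents are lost. */
void toy_power_off(struct toy_dev *dev)
{
	memset(dev->regs, 0, sizeof(dev->regs));
	dev->powered = false;
}

/* Power back up and restore whatever snapshot was taken earlier. */
void toy_resume(struct toy_dev *dev)
{
	dev->powered = true;
	if (dev->state_saved) {
		memcpy(dev->regs, dev->saved, sizeof(dev->regs));
		dev->state_saved = false;
	}
}
```

In this toy model a caller would do toy_save_state() early, keep calling 
toy_write() for as long as it likes, and only call toy_power_off() at 
the very end; toy_resume() then puts the registers back. Nothing between 
the save and the power-off needs a special case.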

And yes, I admit (and I started off talking about this) that I care a lot 
more about suspend-to-ram than I do about suspend-to-disk. I seriously 
claim that STR _should_ be a lot simpler than suspend-to-disk, because it 
avoids all the memory management problems. The reason that we support 
suspend-to-disk but not STR is totally perverse - it's simply that it has 
been easier to debug, because unlike STR, we can do a "real boot" into a 
working system, and thus we don't have the debugging problems that the 
"easy" suspend/resume case has.

Wouldn't you agree?

Which is obviously also why patch 1/2 (and in many ways the more 
fundamental one) was about trying to make debugging much simpler. Or at 
least possible.

			Linus

