On Thursday 22 June 2006 12:23 pm, Linus Torvalds wrote:
>
> On Thu, 22 Jun 2006, David Brownell wrote:
>
> > On Thursday 22 June 2006 9:10 am, Linus Torvalds wrote:
> > >
> > > The fact that worries me is that suspend-to-ram DOES NOT WORK FOR PEOPLE.
> > > I have never _ever_ met a laptop or machine of mine that "just worked".
> > > I've always had to fix something, and people always end up having to do
> > > something ridiculous like unlink all modules etc.
> >
> > And when I've looked at the causes of such problems, they've been
> > either (a) driver bugs, or (b) ACPI bugs.  As you know, both of
> > them are hard to debug, especially when the symptom is on resume
> > paths with no console.  (Oooh, see $SUBJECT, this isn't offtopic!!)
>
> EXACTLY.
>
> We're back to square one.
>
> The #1 problem _by_far_ with suspend has absolutely ZERO to do with
> suspend being "hard", block device queues, or how to save driver state per
> se.
>
> Each individual driver tends to be fairly easy to fix, I'd say. I suspect
> that even USB in the end is just a "Small Matter Of Programming", but it's
> a total bitch to debug.

Actually, testing is more of a problem, given the 2^(about 8) different
configurations, with different fault paths in each.  That one is never
going away, while the "is printk available" issue has at least had some
system-specific workarounds.

> Our problem is that it's damn hard to debug the mess, AND A LARGE PART OF
> THAT IS THAT STUPID INTERFACE!

Specifically, that the interface de-facto includes "printk unavailable"
during interesting sequences like resume, so there's no way to see what
broke and when.

> Let's revisit why I want to do as much _independently_ of actually calling
> suspend() on a device again:
>
>  - debugging is basically impossible during the _actual_ suspend sequence.
>
> This is why we want to (nay, NEED) to split that "suspend()" function up,
> so that it doesn't do five different things. The more we can do _outside_
> of suspend(), the better. Exactly because suspend() is a total bitch to
> debug, and because in order to actually do things like printk() and use
> netconsole, we want to minimize the amount of code that gets run in that
> state.

Seriously, suspend() tends to be less of a problem than resume().  Which
is why I'm lukewarm to notions of refactoring suspend().

Going from a first-principles model based approach, the conceptual issue
is that providing a console has to date been purely a side effect of the
driver model suspend and resume sequences.  There are multiple sequences
of driver suspend/resume calls which observe the parent/child constraints,
but there's no effort to keep a console maximally active.

- Dave
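
To make the "multiple valid sequences" point concrete, here is a rough
sketch of one way to pick, among the orderings that respect the
parent/child constraints, one that keeps the console alive longest. It is
a toy model, not kernel code: the function names (console_path,
suspend_order) and the device names in TREE are all made up for
illustration, and it assumes the device hierarchy is a simple tree where
every child must suspend before its parent.

```python
def console_path(parent, consoles):
    """Return the console devices plus every ancestor they depend on."""
    keep = set()
    for dev in consoles:
        while dev is not None and dev not in keep:
            keep.add(dev)
            dev = parent.get(dev)
    return keep


def suspend_order(parent, consoles):
    """Order devices children-first, suspending the console path last.

    parent maps each device name to its parent (None for a root).
    Resume order is simply the reverse of the returned list.
    """
    # Count children that are still active for every device.
    active_children = {dev: 0 for dev in parent}
    for dev, par in parent.items():
        if par is not None:
            active_children[par] += 1

    keep = console_path(parent, consoles)
    ready = [dev for dev in parent if active_children[dev] == 0]
    order = []
    while ready:
        # Among devices that may legally suspend now, prefer the ones the
        # console does not need; sort by name to keep the result stable.
        ready.sort(key=lambda dev: (dev in keep, dev))
        dev = ready.pop(0)
        order.append(dev)
        par = parent[dev]
        if par is not None:
            active_children[par] -= 1
            if active_children[par] == 0:
                ready.append(par)
    return order


# Hypothetical device tree: a serial console hanging off a PCI bridge.
TREE = {
    "pci0": None,
    "usb-hcd": "pci0",
    "usb-disk": "usb-hcd",
    "serial0": "pci0",
    "eth0": "pci0",
}

if __name__ == "__main__":
    order = suspend_order(TREE, {"serial0"})
    print("suspend:", order)
    print("resume: ", list(reversed(order)))
```

In this toy tree the console and its bridge go down last and, since
resume is the reverse, come back first, so anything that breaks in the
other drivers at least happens while the console is still usable.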