Re: [PATCH] swsusp: Use platform mode by default

Bill Davidsen <davidsen@xxxxxxx> · Sun, 13 May 2007 20:36:30 -0400

Rafael J. Wysocki wrote:
On Friday, 11 May 2007 18:30, Linus Torvalds wrote:
On Fri, 11 May 2007, Rafael J. Wysocki wrote:
We're working on fixing the breakage, but currently it's difficult, because
none of my testboxes has problems with the 'platform' hibernation and I
cannot reproduce the reported issues.
The rule for anything ACPI-related has been: no regressions.

It doesn't matter if something fixes 10 boxes, if it breaks a single one, 
it's going to get reverted.

[Well, I think I should stop explaining decisions that weren't mine.  Yet, I
feel responsible for patches that I sign-off.]

Just to clarify, the change in question isn't new.  It was introduced by the
commit 9185cfa92507d07ac787bc73d06c42222eec7239 before 2.6.20, at Seife's
request and with Pavel's acceptance.

We had much too much of the "two steps forward, one step back" dance with 
ACPI a few years ago, which is the reason that rule got installed (and 
which is why it's ACPI-only: in some other subsystems we accept the fact 
that sometimes we don't know how to fix some hardware issue, but the new 
situation is at least better than the old one).

I agree that it can be aggravating to know that you can fix a problem for 
some people, but then being limited by the fact that it breaks for others. 
But beign able to *rely* on something that used to work is just too 
important, and with ACPI, you can never make a good judgement of which way 
works better (since it really just depends on some random firmware issues 
that we have zero visibility into).

Also, quite often, it may *seem* like something fixes more boxes than it 
breaks, but it's because people report *breakage* only, and then a few 
months later it turns out that it's exactly the other way around: now it's 
a hundred people who report breakage with the *new* code, and the reason 
people thought it fixed more than it broke was that the people for whom 
the old code worked fine obviously never reported it!

So this is why "a single regression is considered more important than ten 
fixes" - because a single regressionr report tends to actually be just the 
first indication of a lot of people who simply haven't tested the new code 
yet! People for whom the old code is broken are more likely to test new 
things.

So I'd just suggest changing the default back to PM_DISK_SHUTDOWN (but 
leave the "pm_ops->enter" testing in place - ie not reverting the other 
commits in the series).

The series actually preserves the 2.6.20/21 behavior.  By defaulting back to
PM_DISK_SHUTDOWN, we'll cause some users for whom 2.6.20 and 2.6.21 work to
report this change as a regression, so please let me avoid making this decision
(I'm not the maintainer of the hibernation code after all).

The problem is that we don't know about regressions until somebody reports them
and if that happens after two affected kernel releases, what should we do?

I think that one of the reasons people (guilty) don't report problems 
with suspend and hibernate is that it's been a problem on and off and 
when it breaks people don't bother to chase it, they just don't use it 
unless it's critical, or they install suspend2.

I only suggest that if 'platform' is more correct use that, don't change 
it again. Then fix platform.

--
Bill Davidsen <davidsen@xxxxxxx>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot
-
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html