RE: [PATCH] Workaround for a PCI restoring bug

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



>> the problem is that saving device state didn't 
>> work, because the device was somehow turned off at suspend 
>> time before we even saved it, so we saved crap.
>> Or more specifically, we saved 
>> all-0xffffffff values from the config space, because the 
>> devices were effectively not there any more! ]
>> 
>> On Wed, 16 May 2007, Lukas Hejtmanek wrote:
>> > 
>> > ...
>> > The problematic code is in drivers/acpi/hardware/hwsleep.c method
>> > acpi_enter_sleep_state_prep.
>> > 
>> > In particular code:
>> > /* Run the _PTS and _GTS methods */
>> > 

>> 
>> Ok, so to fix this bug, we really should do 
>> acpi_enter_sleep_state_prep() 
>> _after_ we have done our own suspend/save of the device information, 
>> rather than before. Because clearly acpi_enter_sleep_state_prep() can

>> corrupt the state wee *want* to save!

Right.

>> Of course, it has to be between the "suspend()" and the 
>> "suspend_late()" 
>> calls, since it wants to run with interrupts enabled and the 
>> system in a 
>> working state, and the final suspend_late() really does end 
>> up shutting 
>> things down and does *not* want to have things in a "working" state.
>> 
>> But that's fine. It is, after all, how suspend/suspend_late 
>> was designed 
>> to work: the "suspend()" part should allocate memory and 
>> save state, the 
>> "suspend_late()" part should actually suspend the device. So 
>> this does fit in the model.

>> - call "device_suspend()" before the "pm_ops->prepare()" function,
and 
>>    then call the "suspend_late()" thing just before the actual
suspend. 
>> 
>>    This *should* work, and I think it's the right thing to do, but
the 
>>    "change order of ACPI calls" is dangerous. Much more dangerous
than I'd 
>>    like. There's no way I can do it before 2.6.22 - it would have to
go 
>>    into 2.6.23-rc1 (and the ugly patch might be a candidate for
2.6.22, 
>>    however much I hate it)

While I know it breaks your own rule for taking risks after rc1,
I don't see a huge down-side to pushing this patch now.
Suspend on Linux has numerous problems on numerous boxes, and this fix
appears to be obvious and important -- so the reward is potentially
large.
It is a small text change to code that is not otherwise changing --
so if it causes a regression before 2.6.22, simply revert it.
Consider that if Rafael had found this test-case box, I'm sure he would
have
shipped you this patch before the rc1 cutoff.

Acked-by: Len Brown <len.brown@xxxxxxxxx>
If you want somebody to blame:-)

>>    However, the reason I think that's the correct thing to 
>>    do is that on *resume*, we actually already call "pm_finish()"
>>    before we call  "device_resume()", so _logically_ we should do
>>    the suspend in the reverse order too!

Agreed.

>> Len? What do you think? Does the ACPI spec say anything? Do 
>> you know what the other system (the one that BIOS vendors actually
test with) does?

>Well, I've raised exactly the same issue on linux-pm and 
>linux-acpi recently,
>but I haven't had a test case to support my observations.
>
>In fact, we're violating the spec by executing the _PTS and 
>_GTS control methods before suspending devices.

Agreed.  suspend_prepare() has this order wrong.
Linus' patch (attached) looks like it fixes it.

>The spec is clear in these points:
>_PTS should be executed *after* devices are placed in 
>low-power states and
>_GTS should be executed immediately before the sleep state is 
>*entered*.

Yep -- here's the quote from ACPI 3.0b:

OSPM will invoke _GTS, _PTS, _TTS, _WAK, and _BFS in the following
order:
1.OSPM decides (through a policy scheme) to place the system into a
sleeping state.
2._TTS(Sx) is run, where Sx is the desired sleep state to enter.
3. OSPM notifies all native device drivers of the sleep state transition
4._PTS is run
5.OSPM readies system for the sleep state transition
6._GTS is run
7.OSPM writes the sleep vector and the system enters the specified Sx
sleep state.
8.System Wakes up
9._BFS is run
10.OSPM readies system for the return from the sleep state transition
11._WAK is run
12. OSPM notifies all native device drivers of the return from the sleep
state transition
13._TTS(0) is run to indicate the return to the S0 state.

Technically we write the sleep vector too early as well.
And we don't evaluate _TTS at all -- though _TTS was
added only as of ACPI 3.0 and I've not seen it implemented on
any of the systems I've got.

>Thus I think the second patch is a better thing to do long 
>term.  Moreover, I'd
>also move the execution of _GTS to after disable_nonboot_cpus()

I'd agree to that on the grounds of symmetry.
However, I expect it will be a NOP -- since the BIOS
can't tell the difference between an enabled and a non-enabled CPU.
(The possible exception is if any BIOS hooks -- such as magic traps
 into SMM -- assume that they're run in cpu0).

In the resume case, the order is important.  Per our discussion
on this last November, _WAK must follow _INIT of the non boot cpus
so that it can patch up stuff in firmware that got cleared on reset.
This continues to be true in Linux.

>(and, symmetrically, the execution of _BFS to before 
>enable_nonboot_cpus(), which
>also would be more in line with the spec, but that's a separate issue).

True, _BFS is probably later than the spec wants.
But like _GTS, it was added with ACPI 2.0 and it is optional
for a BIOS to implement it.  Poking around, I don't see any systems
that actually implement _BFS or _GTS.  So that tweak probably isn't
a big deal like the problem at hand.

thanks,
-Len

Attachment: diff-better
Description: diff-better


[Index of Archives]     [Linux IBM ACPI]     [Linux Power Management]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux