Re: [PATCH v2 0/5] acpi/x86: s2idle: move Display off/on calls outside suspend (fixes ROG Ally suspend)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Antheas,

On 5-Oct-24 6:24 PM, Hans de Goede wrote:
> Hi Antheas,
> 
> On 5-Oct-24 5:10 PM, Antheas Kapenekakis wrote:
>> Hi Hans,
>>
>>> Thank you for your work on this and thank you for the comprehensive write-up
>>> on how Windows does modern standby.
>>>
>>> First of all may I suggest that you take the above write-up, minus the ROG
>>> Ally specific bits and turn this into a new documentation file under
>>> Documentation/power ?  And also document at which point Linux currently
>>> makes the various transitions.
>>
>> I will try to move some of that documentation there, this is a great idea.
>>
>>> And then in patches where you move the transitions, also update the docs
>>> on what Linux does to match.
>>>
>>> I have read the discussion about tying the display on/off calls to CRTC state
>>> and/or exposing a userspace knob for that. I think that this needs more
>>> discussion / design work.
>>
>> Yes, you are right. To become a knob this would require a much bigger
>> discussion. I would also like to move Sleep calls as part of that. The
>> Legion Go and OneXPlayer devices turn off their controllers as part of
>> that and other modern standby devices limit their power envelope
>> (perhaps the Legion Go too). I think the Sleep call is where most of
>> the userspace usability will come from. Display On/Off is a bit of a
>> NOOP on most devices.
>>
>> As for the LSB0 enter and exit, I do not know where the correct place
>> for those would be, and perhaps someone from Microsoft needs to be
>> consulted on that. The documentation is very vague. However it is
>> clear to me that they should be close to where they are right now, so
>> they very likely do not need to move.
>>
>> There is also the new _DSM intent to turn display on 9 call. Which
>> meshes with the sleep call. That call is made before Sleep Exit, if
>> the kernel knows that the wake-up will cause the display to turn on,
>> to boost the thermal envelope of the device and help it wake up
>> quicker. If the Sleep call is moved then we would also have to
>> introduce that somewhere to avoid wake-up time regressions on devices
>> that support it, which also raises the question of how would the
>> kernel decide if an interrupt will cause the display to turn on before
>> unfreezing userspace (bulk of resume) (or should it be done after
>> unfreezing?).
>>
>>> OTOH IMHO it would be good to take patches 1 - 3 . Certainly 1 + 2 would
>>> be good to have. 3 is a bit unfortunate and not necessary with the current
>>> special ROG Ally handling in the asus-wmi driver. It might be better to
>>> just keep the quirks there.
>>
>> From what I know Luke plans to remove that quirk ASAP due to new
>> firmware. I would keep it around until this patch series merges
>> personally and remove it as part of that.
> 
> Ack I replied to Luke to say something like that.
> 
>> As it will probably cause regressions if both are in place
> 
> I don't see how having both this patch-sets + the asus-wmi
> quirks will cause regressions?  The suspend delay will be done
> twice, but that is harmless. 
> 
>> and require manual intervention if
>> either is not. I will also note that the quirk in asus-wmi calls the
>> Display On/Off calls a second time and during the suspend sequence,
>> which is not in any way proper.
> 
> AFAICT asus-wmi does not call the display on / off callbacks instead
> it directly calls "\\_SB.PCI0.SBRG.EC0.CSEE" to control the power ?
> 
>> So if future devices need this kind of
>> quirk, it really does not seem like a good idea to me to paper over
>> their problems by calling the notifications a second time in random
>> platform drivers. There is the question of where that quirk should be
>> placed, that is true, but I IMO it should be a pm problem.
>>
>> Perhaps not in the location where I put it though and perhaps it
>> should be done with LSB0 callbacks instead. Although, being done this
>> way allows for it to blend with the suspend sequence. Ideally, the
>> Display Off delay would be blended with userspace going down such that
>> if e.g., there is heavy userspace activity that requires ~2s to
>> freeze, the quirk would add no delay. Instead, it would only add delay
>> if userspace freezes quickly (less than .5s). Same can be said with
>> Sleep Entry and beginning prepare_late, which blocks the EC interrupts
>> (that would need a lot of investigation though).
>>
>> On that note, it seems to me that the Ally has 2 bugs related to the
>> _DSM calls 3 and 4. First bug is that Display On is gated on current
>> firmware and only works when the USB subsystem is powered on.
>> Allegedly, this is fixed on the upcoming firmware but it is not
>> something I have verified personally. I will verify it in 10 days or
>> so, if the new firmware does not fail QA I guess.
>>
>> However, there is a second bug with Display Off in _DSM 4. The
>> controller of the Ally needs time to power off, around 500ms.
>> Otherwise it gets its power clipped and/or does not power off
>> correctly. This causes the issues mentioned in the discussion and I
>> have no indication that this is fixed with newer controller firmware.
>> It is also my understanding that most of the testing of the new
>> firmware happened with the asus-wmi quirk in place, which papers over
>> that issue, so removing the quirk might be premature in any case.
> 
> Ok.
> 
>> We have currently released this patch series in Bazzite and I am happy
>> to report that it completely fixes all controller related issues in
>> the Ally devices and makes them behave exactly as they do in Windows,
>> regardless of firmware and bug for bug.
>>
>> So we will be keeping it around and extending it as appropriate to
>> include the Sleep calls. I am reminded multiple times per week that
>> the Ally has TDP suspend bugs, where if the user is playing a heavy
>> game, the EC of the device tends to get stuck at 6W and fail to
>> respond after waking the device. So moving calls 7, 8 is the natural
>> next step in this investigation. I already have a draft patch on
>> standby, that we plan to AB test soon.
>>
>>> IMHO it would be good to submit a v2 of just patches 1 - 3 run through
>>> checkpatch. Also the commit message of patch 3 should point to the existing
>>> quirk code in asus-wmi.c and mention that then is no longer necessary after
>>> patch 3, then we can discuss what is the best place for these quirks.
>>
>> I did run it through before sending the patch. However, some of the
>> warnings were a bit cryptic to me... I will run it again.
>>
>> I will add a note for asus-wmi on future patch series.
>>
>> First 3 patches of the series are designed to NOOP before patch 4. Did
>> you mean patch 3 (which adds the delay) instead of 4?
> 
> Ah I misread the series and failed to notice that patch 4 actually hooks
> things up, I was under the impression that patch 4 hooks things up.
> 
> But I did mean that patch 3 might lead to discussion not patch 4.
> 
> Now that I have taken a better look some more detailed review comments:
> 
> * Patches 1 and 2 should be squashed into a single patch IMHO.
> 
> * Patch 3 adds the quirks to kernel/power/suspend.c but this
> really should be added to drivers/acpi/x86/s2idle.c and then do
> the sleep as part of the display_off callback.
> 
> * Given my comment on patch 3 I would swap the order of patch 3 and 4
>   and only add the quirks after moving the display off ACPI call to
>   the new callback
> 
> * Patch 4 does too much in a single patch, specifically besides
> actually implementing the new callbacks it also does s/SCREEN/DISPLAY
> on all the ACPI_LPS0_* defines. This renaming of the defines must
> be done in a separate patch.

Thinking some more about this I am having second doubts about
moving the LPS0 display power off call to before devices are suspended,
doing so would mean that the display might still be on when that call
is made and that call could disable power-resources which are necessary
for the display causing issues when the display driver's suspend method
runs.

So I think that we need something closer to Mario's POC from:

https://git.kernel.org/pub/scm/linux/kernel/git/superm1/linux.git/log/?h=superm1/dsm-screen-on-off

here where the call is made when the last display is turned off.

IOW have the drm modesetting core call this.

Maybe have something like a enabled_displays counter in the
drm-core which gets increased / decreased by helpers and
have the drm-core call platform_suspend_screen_off() /
platform_suspend_screen_on() when the counter goes from 1 -> 0
resp. 0 -> 1, ignoring the very first 0 -> 1 transition
which will be done when the first GPU with an enabled
output is found ?

The idea being that the first increase() call gets made when
a drm/kms driver probes a display and finds outputs which are
light up during probe() and then further increase / decrease
calls are made either when all displays go off; or maybe
per crtc when the crtc gets enabled / disabled.

Anyways how best to do this at display off time should be
discussed with the drm/kms community on the dri-devel list.

Regards,

Hans













[Index of Archives]     [Linux Kernel Development]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux