Re: [PATCH v8 06/17] pwm: lpss: Use pwm_lpss_restore() when restoring state on resume

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On 8/31/20 3:15 PM, Thierry Reding wrote:
On Mon, Aug 31, 2020 at 01:46:28PM +0200, Hans de Goede wrote:
Hi,

On 8/31/20 1:10 PM, Thierry Reding wrote:
On Sun, Aug 30, 2020 at 02:57:42PM +0200, Hans de Goede wrote:
Before this commit a suspend + resume of the LPSS PWM controller
would result in the controller being reset to its defaults of
output-freq = clock/256, duty-cycle=100%, until someone changes
to the output-freq and/or duty-cycle are made.

This problem has been masked so far because the main consumer
(the i915 driver) was always making duty-cycle changes on resume.
With the conversion of the i915 driver to the atomic PWM API the
driver now only disables/enables the PWM on suspend/resume leaving
the output-freq and duty as is, triggering this problem.

Doesn't this imply that there's another bug at play here? At the PWM API
level you're applying a state and it's up to the driver to ensure that
the hardware state after ->apply() is what the software has requested.

If you only switch the enable state and that doesn't cause period and
duty cycle to be updated it means that your driver isn't writing those
registers when it should be.

Right, the driver was not committing those as it should *on resume*,
that and it skips setting the update bit on the subsequent enable,
which is an optimization which gets removed in 7/17.

Before switching the i915 driver over to atomic, when the LPSS-PWM
was used for the backlight we got the following order on suspend/resume

1. Set duty-cycle to 0%
2. Set enabled to 0
3. Save ctrl reg
4. Power-off PWM controller, it now looses all its state
5. Power-on PWM ctrl
6. Restore ctrl reg (as a single reg write)
7. Set enabled to 1, at this point one would expect the
duty/freq from the restored ctrl-reg to apply, but:
a) The resume code never sets the update bit (which this commit fixes); and
b) On applying the pwm_state with enabled=1 the code applying the
state does this (before setting the enabled bit in the ctrl reg):

	if (orig_ctrl != ctrl) {
		pwm_lpss_write(pwm, ctrl);
		pwm_lpss_write(pwm, ctrl | PWM_SW_UPDATE);
	}
and since the restore of the ctrl reg set the old duty/freq the
writes are skipped, so the update bit never gets set.

8. Set duty-cycle to the pre-suspend value (which is not 0)
this does cause a change in the ctrl-reg, so now the update flag
does get set.

Note that 1-2 and 7-8 are both done by the non atomic i915 code,
when moving the i915 code to atomic I decided that having these
2 separate steps here is non-sense, so the new i915 code just
toggles the enable bit. So in essence the new atomic PWM
i915 code drops step 1 and 8.

Dropping steps 8 means that the update bit never gets set and we
end up with the PWM running at its power-on-reset duty cycle.

You are correct in your remark to patch 7/17 that since that removes
the if (orig_ctrl != ctrl) for the writes that now step 7 will be
sufficient to get the PWM to work again. But that only takes the i915
usage into account.

What if the PWM is used through the sysfs userspace API?
Then only steps 3-6 will happen on suspend-resume and without
fixing step 6 to properly restore the PWM controller in its
pre-resume state (this patch) it will once again be running at
its power-on-reset defaults instead of the values from the
restored control register.

Actually PWM's sysfs code has suspend/resume callbacks that basically
make sysfs just a regular consumer of PWMs. So they do end up doing a
pwm_apply_state() on the PWM as well on suspend and restore the state
from before suspend on resume.

This was done very specifically because the suspend/resume order can be
unexpected under some circumstances, so for PWM we really want for the
consumer to always have ultimate control over when precisely the PWM is
restored on resume.

The reason why we did this was because people observed weird glitches on
suspend/resume with different severity. In some cases a backlight would
be resumed before the display controller had had a chance to start
sending frames, causing on-screen corruption in some cases (such as
smart displays) and in other cases a PWM-controller regulator would be
resumed too late or too early, which I think was causing some issue with
the CPUs not working properly on resume.

So I'd prefer not to have any PWM driver save and restore its own
context on suspend/resume, because that's inevitably going to cause
unexpected behaviour at some point. If it's absolutely necessary we can
of course still do that, but I think in that case we need to at least
add a comment in the code about why context save/restore is needed in
this particular case and make it clear that this is not something that
other drivers should copy because they most likely won't be needing it.

Given the above it also doesn't sound to me like there's a real problem,
or at least that the bug is somewhere else. A consumer should always be
responsible for applying the pre-suspend state upon resume and it sounds
like that would be true after patch 7. Since sysfs is just a regular
consumer, the same should apply for sysfs-controlled PWMs as well.

Ok, I was not aware that for PWM the consumer is supposed to always
be the one to restore the state. If that is the rule then we should probably
just drop the save/restore suspend/resume code from pwm-lpss.

It seems that I'm actually responsible for adding that suspend/resume
code in the first place, see commit 1d375b58c12f ("pwm: lpss: platform:
Save/restore the ctrl register over a suspend/resume") although the
ctrl-register was already being saved/restored before that commit
but then by the acpi/acpi_lpss.c code.

One worry after dropping the suspend/resume save/restore code is what
happens if the controller was enabled at the moment the system suspends
and the consumers first post resume apply() call has pwm_state.enabled
set too.

Currently pwm_lpss_apply() looks like this:

        if (state->enabled) {
                if (!pwm_is_enabled(pwm)) {
                        pm_runtime_get_sync(chip->dev);
                        ret = pwm_lpss_prepare_enable(lpwm, pwm, state, true);
                        if (ret)
                                pm_runtime_put(chip->dev);
                } else {
                        ret = pwm_lpss_prepare_enable(lpwm, pwm, state, false);
                }
        } else if (pwm_is_enabled(pwm)) {

Where the true / false parameter to pwm_lpss_prepare_enable()
decides if pwm_lpss_prepare_enable() sets the enable bit in the controllers
ctrl register, or if it skips that.

If we then come from a full system suspend (controller loses state,
comes up with enable bit cleared) we will still enter the:

                        ret = pwm_lpss_prepare_enable(lpwm, pwm, state, false);

Path since the !pwm_is_enabled(pwm) check checks the pwm_state struct,
not the actual hw-enabled bit and then we do not (re)set the enabled after
resume as we should when apply() is called with pwm_state.enabled set.

Fixing this is easy though, we still need to check for the disabled ->
enabled transition for runtime pm refcounting, but we can also tell
pwm_lpss_prepare_enable() to set the enable bit in the other path, this
will be a no-op in case it is already set.

So then the new apply() code would become:

        if (state->enabled) {
                if (!pwm_is_enabled(pwm)) {
                        pm_runtime_get_sync(chip->dev);
                        ret = pwm_lpss_prepare_enable(lpwm, pwm, state, true);
                        if (ret)
                                pm_runtime_put(chip->dev);
                } else {
                        ret = pwm_lpss_prepare_enable(lpwm, pwm, state, true);
                }
        } else if (pwm_is_enabled(pwm)) {

(and we can even optimize out the enable parameter to pwm_lpss_prepare_enable
then and always make it set the enable bit).

Together with patch 07/16 will make apply() always work independent of
the state the controller was in before it got called. Which in light of
all the subtle issues we have seen surrounding this is likely a good thing.

And with the fix to make apply() fully independent of the previous state
of the controller, I'm all for dropping the suspend/resume state
save/restore code.  Doing that makes things more KISS so I like it :)

So at step 6, if the PWM was enabled before, we must set the update
bit, and then wait for it to clear again so the controller is
ready for subsequent updates. The waiting for it to clear again
needs to happen before or after setting the enable bit depending
on the hw generation, which leads to this patch.

But all of that should be happening as part of the call to
pwm_apply_state(), right? That path should be taken for all consumers on
resume, including sysfs.

Ack.

<snip>

See above, apply() was trying to be smart but the restore of ctrl
on resume without setting the update bit was tricking it. That
being too smart for its own good is removed in 7/16 as you
rightfully point out. But this patch is still necessary for the
PWM controller to be in the expected state between resume and the
first apply() after resume (which may be quite a long time in
the future when using e.g. the sysfs API).

Like I said, the sysfs code should be resuming any exported PWMs on
resume just like any other consumer.

Obviously it's always up to the consumer to call pwm_apply_state() at
the right time. If that's "too late" for some reason, then that's a bug
in the consumer driver. But as I explained above there are a number of
cases where restoring context in the PWM driver itself doesn't work
because it can cause sequencing issues.

Ack, I was not aware that PWM consumers are responsible for restoring
their own state on resume. If that is the case then:

TL;DR:

1. We (I) should make apply() work independent of the current
hardware state, instead of having it make various assumptions
about that state as it now does.

2. We (I) should drop the suspend/resume save/restore state
handlers from pwm-lpss completely.

So I believe that I should prepare yet another version of this
patch-set replacing 06/17 and 07/17 with 2 patches doing these
2 things.

Thierry, Andy, does that sound good to you ?

Regards,

Hans




[Index of Archives]     [Linux IBM ACPI]     [Linux Power Management]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux