On Mon, Aug 31, 2020 at 01:46:28PM +0200, Hans de Goede wrote: > Hi, > > On 8/31/20 1:10 PM, Thierry Reding wrote: > > On Sun, Aug 30, 2020 at 02:57:42PM +0200, Hans de Goede wrote: > > > Before this commit a suspend + resume of the LPSS PWM controller > > > would result in the controller being reset to its defaults of > > > output-freq = clock/256, duty-cycle=100%, until someone changes > > > to the output-freq and/or duty-cycle are made. > > > > > > This problem has been masked so far because the main consumer > > > (the i915 driver) was always making duty-cycle changes on resume. > > > With the conversion of the i915 driver to the atomic PWM API the > > > driver now only disables/enables the PWM on suspend/resume leaving > > > the output-freq and duty as is, triggering this problem. > > > > Doesn't this imply that there's another bug at play here? At the PWM API > > level you're applying a state and it's up to the driver to ensure that > > the hardware state after ->apply() is what the software has requested. > > > > If you only switch the enable state and that doesn't cause period and > > duty cycle to be updated it means that your driver isn't writing those > > registers when it should be. > > Right, the driver was not committing those as it should *on resume*, > that and it skips setting the update bit on the subsequent enable, > which is an optimization which gets removed in 7/17. > > Before switching the i915 driver over to atomic, when the LPSS-PWM > was used for the backlight we got the following order on suspend/resume > > 1. Set duty-cycle to 0% > 2. Set enabled to 0 > 3. Save ctrl reg > 4. Power-off PWM controller, it now looses all its state > 5. Power-on PWM ctrl > 6. Restore ctrl reg (as a single reg write) > 7. Set enabled to 1, at this point one would expect the > duty/freq from the restored ctrl-reg to apply, but: > a) The resume code never sets the update bit (which this commit fixes); and > b) On applying the pwm_state with enabled=1 the code applying the > state does this (before setting the enabled bit in the ctrl reg): > > if (orig_ctrl != ctrl) { > pwm_lpss_write(pwm, ctrl); > pwm_lpss_write(pwm, ctrl | PWM_SW_UPDATE); > } > and since the restore of the ctrl reg set the old duty/freq the > writes are skipped, so the update bit never gets set. > > 8. Set duty-cycle to the pre-suspend value (which is not 0) > this does cause a change in the ctrl-reg, so now the update flag > does get set. > > Note that 1-2 and 7-8 are both done by the non atomic i915 code, > when moving the i915 code to atomic I decided that having these > 2 separate steps here is non-sense, so the new i915 code just > toggles the enable bit. So in essence the new atomic PWM > i915 code drops step 1 and 8. > > Dropping steps 8 means that the update bit never gets set and we > end up with the PWM running at its power-on-reset duty cycle. > > You are correct in your remark to patch 7/17 that since that removes > the if (orig_ctrl != ctrl) for the writes that now step 7 will be > sufficient to get the PWM to work again. But that only takes the i915 > usage into account. > > What if the PWM is used through the sysfs userspace API? > Then only steps 3-6 will happen on suspend-resume and without > fixing step 6 to properly restore the PWM controller in its > pre-resume state (this patch) it will once again be running at > its power-on-reset defaults instead of the values from the > restored control register. Actually PWM's sysfs code has suspend/resume callbacks that basically make sysfs just a regular consumer of PWMs. So they do end up doing a pwm_apply_state() on the PWM as well on suspend and restore the state from before suspend on resume. This was done very specifically because the suspend/resume order can be unexpected under some circumstances, so for PWM we really want for the consumer to always have ultimate control over when precisely the PWM is restored on resume. The reason why we did this was because people observed weird glitches on suspend/resume with different severity. In some cases a backlight would be resumed before the display controller had had a chance to start sending frames, causing on-screen corruption in some cases (such as smart displays) and in other cases a PWM-controller regulator would be resumed too late or too early, which I think was causing some issue with the CPUs not working properly on resume. So I'd prefer not to have any PWM driver save and restore its own context on suspend/resume, because that's inevitably going to cause unexpected behaviour at some point. If it's absolutely necessary we can of course still do that, but I think in that case we need to at least add a comment in the code about why context save/restore is needed in this particular case and make it clear that this is not something that other drivers should copy because they most likely won't be needing it. Given the above it also doesn't sound to me like there's a real problem, or at least that the bug is somewhere else. A consumer should always be responsible for applying the pre-suspend state upon resume and it sounds like that would be true after patch 7. Since sysfs is just a regular consumer, the same should apply for sysfs-controlled PWMs as well. > So at step 6, if the PWM was enabled before, we must set the update > bit, and then wait for it to clear again so the controller is > ready for subsequent updates. The waiting for it to clear again > needs to happen before or after setting the enable bit depending > on the hw generation, which leads to this patch. But all of that should be happening as part of the call to pwm_apply_state(), right? That path should be taken for all consumers on resume, including sysfs. > I hope that helps explain why this patch is the correct thing > to do. > > > > > The LPSS PWM controller has a mechanism where the ctrl register value > > > and the actual base-unit and on-time-div values used are latched. When > > > software sets the SW_UPDATE bit then at the end of the current PWM cycle, > > > the new values from the ctrl-register will be latched into the actual > > > registers, and the SW_UPDATE bit will be cleared. > > > > > > The problem is that before this commit our suspend/resume handling > > > consisted of simply saving the PWM ctrl register on suspend and > > > restoring it on resume, without setting the PWM_SW_UPDATE bit. > > > When the controller has lost its state over a suspend/resume and thus > > > has been reset to the defaults, just restoring the register is not > > > enough. We must also set the SW_UPDATE bit to tell the controller to > > > latch the restored values into the actual registers. > > > > > > Fixing this problem is not as simple as just or-ing in the value which > > > is being restored with SW_UPDATE. If the PWM was enabled before we must > > > write the new settings + PWM_SW_UPDATE before setting PWM_ENABLE. > > > We must also wait for PWM_SW_UPDATE to become 0 again and depending on the > > > model we must do this either before or after the setting of PWM_ENABLE. > > > > > > All the necessary logic for doing this is already present inside > > > pwm_lpss_apply(), so instead of duplicating this inside the resume > > > handler, this commit adds a new pwm_lpss_restore() helper which mirrors > > > pwm_lpss_apply() minus the runtime-pm reference handling (which we should > > > not change on resume). > > > > If this is all already implemented in pwm_lpss_apply(), why isn't it > > working for the suspend/resume case? It shouldn't matter that the > > consumer only changes the enable/disable state. After ->apply() > > successfully returns your hardware should be programmed with exactly > > the state that the consumer requested. > > See above, apply() was trying to be smart but the restore of ctrl > on resume without setting the update bit was tricking it. That > being too smart for its own good is removed in 7/16 as you > rightfully point out. But this patch is still necessary for the > PWM controller to be in the expected state between resume and the > first apply() after resume (which may be quite a long time in > the future when using e.g. the sysfs API). Like I said, the sysfs code should be resuming any exported PWMs on resume just like any other consumer. Obviously it's always up to the consumer to call pwm_apply_state() at the right time. If that's "too late" for some reason, then that's a bug in the consumer driver. But as I explained above there are a number of cases where restoring context in the PWM driver itself doesn't work because it can cause sequencing issues. Thierry
Attachment:
signature.asc
Description: PGP signature