On Sat, 2025-01-18 at 01:07 +0200, Ville Syrjälä wrote: > On Fri, Jan 17, 2025 at 10:20:17PM +0200, Ville Syrjälä wrote: > > On Thu, Jan 09, 2025 at 09:31:37AM +0200, Jouni Högander wrote: > > > Now as we have correct PSR2_MAN_TRK_CTL handling in place we can > > > allow DSB > > > usage also when PSR is enabled for LunarLake onwards. > > > > We seem to still lack an answer as to when the PSR wakes, when it > > latches the update, and how does all that guarantee that the DSB > > interrupt fires after the update has been latched? > > > > Some thoughts as to how to figure this out: > > 1. make sure we're in PSR > > 2. sample TIMESTAMP_CTR > > 3. start DSB in which > > write PLANE_SURF with a new value > > send push > > wait for vblank > > poll PLANE_SURFLIVE == new value > > fire interrupt > > > > in the interrupt handler: > > sample TIMESTAMP_CTR again > > > > And then compare flip timestmap vs. frame timestamp vs. the > > manually sampled timestamps. And then repeat without the SURFLIVE > > poll to make sure nothing has changed. You'll need to be careful > > to make sure it will actually poll for long enough to make a real > > difference (if the poll actuall is needed), but tweaking the poll > > interval+count suitably. I don't remeber what the max poll > > count was, but IIRC it wasn't too high so the duration will have > > to get bumped for long polls. > > > > I guess one could also try to poll for the actual PSR status, > > but dunno how well that'll work. > > > > And we could also try to come up with different ideas on where > > to sample timestamps. Unfortunately we only have the single > > pipe flip timestamp register so we can only sample one timestamp > > from the DSB itself per frame. If we had more we could much more > > easily figure things out :/ > > > > I pushed my latest DSB selftest stuff to > > https://github.com/vsyrjala/linux.git dsb_selftests_7 > > which has a bunch of stuff for this kind of experimentation. > > It's in a somewhat sorry state at the moment since I last used > > to hunt for various DSB bugs, but at least it still builds :) > > > > The way I use that is that I run igt 'testdisplay -o ...' > > to make sure nothing else is actively poking the hardware > > and then I trigger the DSB tests via debugfs. > > I poked around a bit, though only on a TGL+PSR1 system (what I had > at hand), so some of this might not apply to PSR2 and/or more > modern platforms. > > General notes: > - PSR1 exit is triggered by any pipe/plane register write (even the > non-arming ones) This is same for PSR2 as well. > > We basically have three cases to consider here: > 1. PSR1 is currently inactive > Obviously everything should be with the current code, > vblank evasion works, wait for vblank+interrupt scheme > for flip completion works > > 2. PSR1 is active, but DC states are not > The wakeup latency here is super quick (it's < 1 scanline, how > much below? I've not yet measured), and arming registers do latch > nearly immediately. The scanline counter starts counting > accordingly > from vblank_start-1. However the hardware still considers PSR to be > active for that short duration so DSB_SKIP_WAITS_EN _will_ skip the > waits. > > Unfortunately being this quick I'm not convinced we have enough > time > to write all the registers atomically before the hardware latches > something. So I'm thinking we may need to remove DSB_SKIP_WAITS_EN, > in which case the vblank evasion will push the arming register > writes into the next frame. This will mean the wakeup will take > one full frame. To my understanding DSB_SKIP_WAITS_EN have impact only when in SRD (PSR)/DEEP_SLEEP(PSR2). I.e. In this scenario we still do have all waits as in commit without PSR. > > 3. PSR1 is active and so are DC states > The wakeup latency is ~5ms. During that time scanline counter reads > 0, PSR is active for the purposes of DSB_SKIP_WAITS_EN. Again we > pretty much need DSB_SKIP_WAITS_EN=0 here to make sure the > interrupt > gets signalled after the wait for vblank. vblank evasion will get > skipped on account the scanline being 0. Somewhat ironically this > would give us ~5ms total wakeup latency which is now faster than > the > previous case. Again my understanding is that DSB_SKIP_WAITS_EN=0/1 have impact only when in SRD/DEEP_SLEEP. There is SRD_CTRL/PSR2_CTRL[Idle Frames] to control when entering SRD/DEEP_SLEEP. So wait for vblank is supposed to work normally in this scenario as well because there has to be at least one idle frame before entering sleep. > > So everything here would be fine if we know that the wakeup has > just > started since we have all of that 5ms to write all the registers. > But I guess we can't really know when the wakeup started, so we > might be doing the vblank evasion just before the scanline counter > starts to read its proper value, and then we have that < 1 scanline > to write all the arming registers. Not sure if its enough. If not, > then we could also explicitly evade scanline 0 as well, which would > again force all arming register writes into the next frame giving > us the same kind of single frame wakeup latency as in case 2. hmm, what else could trigger the wakeup than the commit that is on hand ? Frontbuffer flush/legacy cursor update? It begins to look like we still need to take that mutex over DSB commit... > > IIRC you said that you had stuff get stuck with DSB_SKIP_WAITS_EN=0. > Was that with hardware that has TRANS_PUSH for PSR? It has TRANS_PUSH, but not using it for PSR. I.e. On LunarLake and still using register writes as a trigger for Frame Change event towards PSR. It happened with DSB commit where scanline wait was the first thing. I.e. only cursor plane updating. > I suppose that > could happen if the scanline counter already reads something > (eg. again vblank_start-1) before we've done the push, that would > cause the vblank evasion to wait forever. I could see two ways to > perhaps handle that: > - if DSB_SKIP_WAITS_EN becomes inactive immediately after > TRANS_PUSH then we could just keep DSB_SKIP_WAITS_EN=1 all > the time and things should be fine > - if DSB_SKIP_WAITS_EN stays active for some time after TRANS_PUSH > then we'll perhaps need to poke that bit from the DSB itself > dynamically so that we will skip the vblank evasion, but > not the wait for blank prior to generating the interrupt. > But this needs a bit more reverse engineering for sure. TRAN_PUSH is still not enabled in this patch set. BR, Jouni Högander > > > > > > > > > Signed-off-by: Jouni Högander <jouni.hogander@xxxxxxxxx> > > > --- > > > drivers/gpu/drm/i915/display/intel_display.c | 3 ++- > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c > > > b/drivers/gpu/drm/i915/display/intel_display.c > > > index e448ff64660a..58575800fad2 100644 > > > --- a/drivers/gpu/drm/i915/display/intel_display.c > > > +++ b/drivers/gpu/drm/i915/display/intel_display.c > > > @@ -7631,6 +7631,7 @@ static void intel_atomic_dsb_finish(struct > > > intel_atomic_state *state, > > > intel_atomic_get_old_crtc_state(state, crtc); > > > struct intel_crtc_state *new_crtc_state = > > > intel_atomic_get_new_crtc_state(state, crtc); > > > + struct intel_display *display = to_intel_display(crtc); > > > > > > if (!new_crtc_state->hw.active) > > > return; > > > @@ -7643,7 +7644,7 @@ static void intel_atomic_dsb_finish(struct > > > intel_atomic_state *state, > > > new_crtc_state->update_planes && > > > !new_crtc_state->vrr.enable && > > > !new_crtc_state->do_async_flip && > > > - !new_crtc_state->has_psr && > > > + (DISPLAY_VER(display) >= 20 || !new_crtc_state- > > > >has_psr) && > > > !new_crtc_state->scaler_state.scaler_users && > > > !old_crtc_state->scaler_state.scaler_users && > > > !intel_crtc_needs_modeset(new_crtc_state) && > > > -- > > > 2.43.0 > > > > -- > > Ville Syrjälä > > Intel >