On Mon, 2025-01-20 at 16:39 +0200, Ville Syrjälä wrote: > On Mon, Jan 20, 2025 at 07:28:52AM +0000, Hogander, Jouni wrote: > > On Sat, 2025-01-18 at 01:07 +0200, Ville Syrjälä wrote: > > > On Fri, Jan 17, 2025 at 10:20:17PM +0200, Ville Syrjälä wrote: > > > > On Thu, Jan 09, 2025 at 09:31:37AM +0200, Jouni Högander wrote: > > > > > Now as we have correct PSR2_MAN_TRK_CTL handling in place we > > > > > can > > > > > allow DSB > > > > > usage also when PSR is enabled for LunarLake onwards. > > > > > > > > We seem to still lack an answer as to when the PSR wakes, when > > > > it > > > > latches the update, and how does all that guarantee that the > > > > DSB > > > > interrupt fires after the update has been latched? > > > > > > > > Some thoughts as to how to figure this out: > > > > 1. make sure we're in PSR > > > > 2. sample TIMESTAMP_CTR > > > > 3. start DSB in which > > > > write PLANE_SURF with a new value > > > > send push > > > > wait for vblank > > > > poll PLANE_SURFLIVE == new value > > > > fire interrupt > > > > > > > > in the interrupt handler: > > > > sample TIMESTAMP_CTR again > > > > > > > > And then compare flip timestmap vs. frame timestamp vs. the > > > > manually sampled timestamps. And then repeat without the > > > > SURFLIVE > > > > poll to make sure nothing has changed. You'll need to be > > > > careful > > > > to make sure it will actually poll for long enough to make a > > > > real > > > > difference (if the poll actuall is needed), but tweaking the > > > > poll > > > > interval+count suitably. I don't remeber what the max poll > > > > count was, but IIRC it wasn't too high so the duration will > > > > have > > > > to get bumped for long polls. > > > > > > > > I guess one could also try to poll for the actual PSR status, > > > > but dunno how well that'll work. > > > > > > > > And we could also try to come up with different ideas on where > > > > to sample timestamps. Unfortunately we only have the single > > > > pipe flip timestamp register so we can only sample one > > > > timestamp > > > > from the DSB itself per frame. If we had more we could much > > > > more > > > > easily figure things out :/ > > > > > > > > I pushed my latest DSB selftest stuff to > > > > https://github.com/vsyrjala/linux.git dsb_selftests_7 > > > > which has a bunch of stuff for this kind of experimentation. > > > > It's in a somewhat sorry state at the moment since I last used > > > > to hunt for various DSB bugs, but at least it still builds :) > > > > > > > > The way I use that is that I run igt 'testdisplay -o ...' > > > > to make sure nothing else is actively poking the hardware > > > > and then I trigger the DSB tests via debugfs. > > > > > > I poked around a bit, though only on a TGL+PSR1 system (what I > > > had > > > at hand), so some of this might not apply to PSR2 and/or more > > > modern platforms. > > > > > > General notes: > > > - PSR1 exit is triggered by any pipe/plane register write (even > > > the > > > non-arming ones) > > > > This is same for PSR2 as well. > > > > > > > > We basically have three cases to consider here: > > > 1. PSR1 is currently inactive > > > Obviously everything should be with the current code, > > > vblank evasion works, wait for vblank+interrupt scheme > > > for flip completion works > > > > > > 2. PSR1 is active, but DC states are not > > > The wakeup latency here is super quick (it's < 1 scanline, how > > > much below? I've not yet measured), and arming registers do > > > latch > > > nearly immediately. > > Actually it's only ~1usec (based on the timestamps). I also used the > following DSB batch to test how many registers we can write there: > > for (i = 0; ...; i++) > dsb_write(PLANE_SURF, i << 12) > dsb_interrupt() > > and then in the interrupt handler I read PLANE_SURFLIVE, and it > always > shows 0xa000, meaning we only have time to write ten registers. So > definitely not enough to guarantee that all arming registers get > written. > > So, for PSR1 at least I think we'd have two options: > 1) do a manual PSR wake around the whole commit, which doesn't > sound very nice > 2) evade hw_scaline==0 so that we wait until we've woken up > from the DC state, and then proceed with the normal vblank > evasion and arming register writes. And obviously all that > only works in DSB_SKIP_WAITS_EN is disabled. Also if all > the pipes are doing the full update using the DSB then we > could perhaps also remove the explicit DC_OFF dance around > the whole commit. > > > The scanline counter starts counting > > > accordingly > > > from vblank_start-1. However the hardware still considers PSR > > > to be > > > active for that short duration so DSB_SKIP_WAITS_EN _will_ skip > > > the > > > waits. > > > > > > Unfortunately being this quick I'm not convinced we have enough > > > time > > > to write all the registers atomically before the hardware > > > latches > > > something. So I'm thinking we may need to remove > > > DSB_SKIP_WAITS_EN, > > > in which case the vblank evasion will push the arming register > > > writes into the next frame. This will mean the wakeup will take > > > one full frame. > > > > To my understanding DSB_SKIP_WAITS_EN have impact only when in SRD > > (PSR)/DEEP_SLEEP(PSR2). I.e. In this scenario we still do have all > > waits as in commit without PSR. > > The PSR state machine is already in link off mode in this case, > meaning the pipe has been halted, and the display has been > signalled to scan out from its RFB (confirmed by checking the > PSR status register in DPCD), and DSB_SKIP_WAITS_EN is already > active. But the link is still actually up (I'm guessing it might > be transmitting the idle pattern, but I've not confirmed that. > Can't remeber if we even have any kind of status register that > could show this...). So looks like the link only gets actually > turned off by the DMC when entering the DC state. > > > > > > > > > 3. PSR1 is active and so are DC states > > > The wakeup latency is ~5ms. During that time scanline counter > > > reads > > > 0, PSR is active for the purposes of DSB_SKIP_WAITS_EN. Again > > > we > > > pretty much need DSB_SKIP_WAITS_EN=0 here to make sure the > > > interrupt > > > gets signalled after the wait for vblank. vblank evasion will > > > get > > > skipped on account the scanline being 0. Somewhat ironically > > > this > > > would give us ~5ms total wakeup latency which is now faster > > > than > > > the > > > previous case. > > > > Again my understanding is that DSB_SKIP_WAITS_EN=0/1 have impact > > only > > when in SRD/DEEP_SLEEP. There is SRD_CTRL/PSR2_CTRL[Idle Frames] to > > control when entering SRD/DEEP_SLEEP. > > The PSR state machine is what matters. That stops the pipe, and > causes > DSP_SKIP_WAITS_EN to become active. And the PSR state machine is also > what is affected by the idle frames stuff/etc. AFAIK the only stuff > that is actually done by the DMC is turning the link/PLL/etc on and > off. And obviously link training I guess gets triggered from there > somehow when waking up from PSR, hence why the wakeup takes that 5ms > longer than when DC States are disabled. Ok, I need do some investigation on this. The failure I was referring was this one: https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-142521v2/shard-lnl-8/igt@kms_psr2_sf@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx I.e. there is that scanline wait at the begin in DSB buffer. It was hanging in waiting the scanline. I verified this by checking if preceding write to 0x700300 completes and it never did. This was with PSR2 but not with Panel Replay. Also if I disable DEEP_SLEEP by setting PSR2_CTL[idle frames] as 0 it didn't hang. Also waking it up by writing e.g. CURSURFLIVE helped. BR, Jouni Högander > > > So wait for vblank is supposed to > > work normally in this scenario as well because there has to be at > > least > > one idle frame before entering sleep. > > > > > > > > So everything here would be fine if we know that the wakeup has > > > just > > > started since we have all of that 5ms to write all the > > > registers. > > > But I guess we can't really know when the wakeup started, so we > > > might be doing the vblank evasion just before the scanline > > > counter > > > starts to read its proper value, and then we have that < 1 > > > scanline > > > to write all the arming registers. Not sure if its enough. If > > > not, > > > then we could also explicitly evade scanline 0 as well, which > > > would > > > again force all arming register writes into the next frame > > > giving > > > us the same kind of single frame wakeup latency as in case 2. > > > > hmm, what else could trigger the wakeup than the commit that is on > > hand > > ? Frontbuffer flush/legacy cursor update? > > Could be any register access really. So yeah those, interrupts, > sysfs/debugfs accesses. Ie. all kinds of stuff we can't possibly > keep track of IMO. > > > It begins to look like we > > still need to take that mutex over DSB commit... > > I don't think we need that mutex unless we go with the > "let's manually wake from PSR and wait for the exit" approach. > > > > > > > > > IIRC you said that you had stuff get stuck with > > > DSB_SKIP_WAITS_EN=0. > > > Was that with hardware that has TRANS_PUSH for PSR? > > > > It has TRANS_PUSH, but not using it for PSR. I.e. On LunarLake and > > still using register writes as a trigger for Frame Change event > > towards > > PSR. > > > > It happened with DSB commit where scanline wait was the first > > thing. > > I.e. only cursor plane updating. > > > > > I suppose that > > > could happen if the scanline counter already reads something > > > (eg. again vblank_start-1) before we've done the push, that would > > > cause the vblank evasion to wait forever. I could see two ways to > > > perhaps handle that: > > > - if DSB_SKIP_WAITS_EN becomes inactive immediately after > > > TRANS_PUSH then we could just keep DSB_SKIP_WAITS_EN=1 all > > > the time and things should be fine > > > - if DSB_SKIP_WAITS_EN stays active for some time after > > > TRANS_PUSH > > > then we'll perhaps need to poke that bit from the DSB itself > > > dynamically so that we will skip the vblank evasion, but > > > not the wait for blank prior to generating the interrupt. > > > But this needs a bit more reverse engineering for sure. > > > > TRAN_PUSH is still not enabled in this patch set. > > > > BR, > > > > Jouni Högander > > > > > > > > > > > > > > > > > > > Signed-off-by: Jouni Högander <jouni.hogander@xxxxxxxxx> > > > > > --- > > > > > drivers/gpu/drm/i915/display/intel_display.c | 3 ++- > > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c > > > > > b/drivers/gpu/drm/i915/display/intel_display.c > > > > > index e448ff64660a..58575800fad2 100644 > > > > > --- a/drivers/gpu/drm/i915/display/intel_display.c > > > > > +++ b/drivers/gpu/drm/i915/display/intel_display.c > > > > > @@ -7631,6 +7631,7 @@ static void > > > > > intel_atomic_dsb_finish(struct > > > > > intel_atomic_state *state, > > > > > intel_atomic_get_old_crtc_state(state, > > > > > crtc); > > > > > struct intel_crtc_state *new_crtc_state = > > > > > intel_atomic_get_new_crtc_state(state, > > > > > crtc); > > > > > + struct intel_display *display = > > > > > to_intel_display(crtc); > > > > > > > > > > if (!new_crtc_state->hw.active) > > > > > return; > > > > > @@ -7643,7 +7644,7 @@ static void > > > > > intel_atomic_dsb_finish(struct > > > > > intel_atomic_state *state, > > > > > new_crtc_state->update_planes && > > > > > !new_crtc_state->vrr.enable && > > > > > !new_crtc_state->do_async_flip && > > > > > - !new_crtc_state->has_psr && > > > > > + (DISPLAY_VER(display) >= 20 || > > > > > !new_crtc_state- > > > > > > has_psr) && > > > > > !new_crtc_state->scaler_state.scaler_users > > > > > && > > > > > !old_crtc_state->scaler_state.scaler_users > > > > > && > > > > > !intel_crtc_needs_modeset(new_crtc_state) && > > > > > -- > > > > > 2.43.0 > > > > > > > > -- > > > > Ville Syrjälä > > > > Intel > > > > > >