Re: [PATCH v3 10/10] drm/i915/psr: Allow DSB usage when PSR is enabled

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 2025-01-20 at 16:39 +0200, Ville Syrjälä wrote:
> On Mon, Jan 20, 2025 at 07:28:52AM +0000, Hogander, Jouni wrote:
> > On Sat, 2025-01-18 at 01:07 +0200, Ville Syrjälä wrote:
> > > On Fri, Jan 17, 2025 at 10:20:17PM +0200, Ville Syrjälä wrote:
> > > > On Thu, Jan 09, 2025 at 09:31:37AM +0200, Jouni Högander wrote:
> > > > > Now as we have correct PSR2_MAN_TRK_CTL handling in place we
> > > > > can
> > > > > allow DSB
> > > > > usage also when PSR is enabled for LunarLake onwards.
> > > > 
> > > > We seem to still lack an answer as to when the PSR wakes, when
> > > > it
> > > > latches the update, and how does all that guarantee that the
> > > > DSB
> > > > interrupt fires after the update has been latched?
> > > > 
> > > > Some thoughts as to how to figure this out:
> > > > 1. make sure we're in PSR
> > > > 2. sample TIMESTAMP_CTR
> > > > 3. start DSB in which
> > > >    write PLANE_SURF with a new value
> > > >    send push
> > > >    wait for vblank
> > > >    poll PLANE_SURFLIVE == new value
> > > >    fire interrupt
> > > > 
> > > > in the interrupt handler:
> > > >  sample TIMESTAMP_CTR again
> > > > 
> > > > And then compare flip timestmap vs. frame timestamp vs. the
> > > > manually sampled timestamps. And then repeat without the
> > > > SURFLIVE
> > > > poll to make sure nothing has changed. You'll need to be
> > > > careful
> > > > to make sure it will actually poll for long enough to make a
> > > > real
> > > > difference (if the poll actuall is needed), but tweaking the
> > > > poll
> > > > interval+count suitably. I don't remeber what the max poll
> > > > count was, but IIRC it wasn't too high so the duration will
> > > > have
> > > > to get bumped for long polls.
> > > > 
> > > > I guess one could also try to poll for the actual PSR status,
> > > > but dunno how well that'll work.
> > > > 
> > > > And we could also try to come up with different ideas on where
> > > > to sample timestamps. Unfortunately we only have the single
> > > > pipe flip timestamp register so we can only sample one
> > > > timestamp
> > > > from the DSB itself per frame. If we had more we could much
> > > > more
> > > > easily figure things out :/
> > > > 
> > > > I pushed my latest DSB selftest stuff to
> > > > https://github.com/vsyrjala/linux.git dsb_selftests_7
> > > > which has a bunch of stuff for this kind of experimentation.
> > > > It's in a somewhat sorry state at the moment since I last used
> > > > to hunt for various DSB bugs, but at least it still builds :)
> > > > 
> > > > The way I use that is that I run igt 'testdisplay -o ...' 
> > > > to make sure nothing else is actively poking the hardware
> > > > and then I trigger the DSB tests via debugfs.
> > > 
> > > I poked around a bit, though only on a TGL+PSR1 system (what I
> > > had
> > > at hand), so some of this might not apply to PSR2 and/or more
> > > modern platforms.
> > > 
> > > General notes:
> > > - PSR1 exit is triggered by any pipe/plane register write (even
> > > the
> > >   non-arming ones)
> > 
> > This is same for PSR2 as well.
> > 
> > > 
> > > We basically have three cases to consider here:
> > > 1. PSR1 is currently inactive
> > >   Obviously everything should be with the current code,
> > >   vblank evasion works, wait for vblank+interrupt scheme
> > >   for flip completion works
> > > 
> > > 2. PSR1 is active, but DC states are not
> > >   The wakeup latency here is super quick (it's < 1 scanline, how
> > >   much below? I've not yet measured), and arming registers do
> > > latch
> > >   nearly immediately.
> 
> Actually it's only ~1usec (based on the timestamps). I also used the
> following DSB batch to test how many registers we can write there:
> 
> for (i = 0; ...; i++)
> 	dsb_write(PLANE_SURF, i << 12)
> dsb_interrupt()
> 
> and then in the interrupt handler I read PLANE_SURFLIVE, and it
> always
> shows 0xa000, meaning we only have time to write ten registers. So
> definitely not enough to guarantee that all arming registers get
> written.
> 
> So, for PSR1 at least I think we'd have two options:
> 1) do a manual PSR wake around the whole commit, which doesn't
>    sound very nice
> 2) evade hw_scaline==0 so that we wait until we've woken up
>    from the DC state, and then proceed with the normal vblank
>    evasion and arming register writes. And obviously all that
>    only works in DSB_SKIP_WAITS_EN is disabled. Also if all 
>    the pipes are doing the full update using the DSB then we
>    could perhaps also remove the explicit DC_OFF dance around
>    the whole commit.
> 
> > The scanline counter starts counting
> > > accordingly
> > >   from vblank_start-1. However the hardware still considers PSR
> > > to be
> > >   active for that short duration so DSB_SKIP_WAITS_EN _will_ skip
> > > the
> > >   waits.
> > > 
> > >   Unfortunately being this quick I'm not convinced we have enough
> > > time
> > >   to write all the registers atomically before the hardware
> > > latches
> > >   something. So I'm thinking we may need to remove
> > > DSB_SKIP_WAITS_EN,
> > >   in which case the vblank evasion will push the arming register
> > >   writes into the next frame. This will mean the wakeup will take
> > >   one full frame.
> > 
> > To my understanding DSB_SKIP_WAITS_EN have impact only when in SRD
> > (PSR)/DEEP_SLEEP(PSR2). I.e. In this scenario we still do have all
> > waits as in commit without PSR.
> 
> The PSR state machine is already in link off mode in this case,
> meaning the pipe has been halted, and the display has been
> signalled to scan out from its RFB (confirmed by checking the
> PSR status register in DPCD), and DSB_SKIP_WAITS_EN is already
> active. But the link is still actually up (I'm guessing it might
> be transmitting the idle pattern, but I've not confirmed that.
> Can't remeber if we even have any kind of status register that
> could show this...). So looks like the link only gets actually
> turned off by the DMC when entering the DC state.
> 
> > 
> > > 
> > > 3. PSR1 is active and so are DC states
> > >   The wakeup latency is ~5ms. During that time scanline counter
> > > reads
> > >   0, PSR is active for the purposes of DSB_SKIP_WAITS_EN. Again
> > > we
> > >   pretty much need DSB_SKIP_WAITS_EN=0 here to make sure the
> > > interrupt
> > >   gets signalled after the wait for vblank. vblank evasion will
> > > get
> > >   skipped on account the scanline being 0. Somewhat ironically
> > > this
> > >   would give us ~5ms total wakeup latency which is now faster
> > > than
> > > the
> > >   previous case.
> > 
> > Again my understanding is that DSB_SKIP_WAITS_EN=0/1 have impact
> > only
> > when in SRD/DEEP_SLEEP. There is SRD_CTRL/PSR2_CTRL[Idle Frames] to
> > control when entering SRD/DEEP_SLEEP.
> 
> The PSR state machine is what matters. That stops the pipe, and
> causes
> DSP_SKIP_WAITS_EN to become active. And the PSR state machine is also
> what is affected by the idle frames stuff/etc. AFAIK the only stuff
> that is actually done by the DMC is turning the link/PLL/etc on and
> off. And obviously link training I guess gets triggered from there
> somehow when waking up from PSR, hence why the wakeup takes that 5ms
> longer than when DC States are disabled.

Ok, I need do some investigation on this. The failure I was referring
was this one:

https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-142521v2/shard-lnl-8/igt@kms_psr2_sf@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

I.e. there is that scanline wait at the begin in DSB buffer. It was
hanging in waiting the scanline. I verified this by checking if
preceding write to 0x700300 completes and it never did.

This was with PSR2 but not with Panel Replay. Also if I disable
DEEP_SLEEP by setting PSR2_CTL[idle frames] as 0 it didn't hang. Also
waking it up by writing e.g. CURSURFLIVE helped.

BR,

Jouni Högander
 
> 
> > So wait for vblank is supposed to
> > work normally in this scenario as well because there has to be at
> > least
> > one idle frame before entering sleep.
> > 
> > > 
> > >   So everything here would be fine if we know that the wakeup has
> > > just
> > >   started since we have all of that 5ms to write all the
> > > registers.
> > >   But I guess we can't really know when the wakeup started, so we
> > >   might be doing the vblank evasion just before the scanline
> > > counter
> > >   starts to read its proper value, and then we have that < 1
> > > scanline
> > >   to write all the arming registers. Not sure if its enough. If
> > > not,
> > >   then we could also explicitly evade scanline 0 as well, which
> > > would
> > >   again force all arming register writes into the next frame
> > > giving
> > >   us the same kind of single frame wakeup latency as in case 2.
> > 
> > hmm, what else could trigger the wakeup than the commit that is on
> > hand
> > ? Frontbuffer flush/legacy cursor update?
> 
> Could be any register access really. So yeah those, interrupts,
> sysfs/debugfs accesses. Ie. all kinds of stuff we can't possibly
> keep track of IMO.
> 
> > It begins to look like we
> > still need to take that mutex over DSB commit...
> 
> I don't think we need that mutex unless we go with the 
> "let's manually wake from PSR and wait for the exit" approach.
> 
> > 
> > > 
> > > IIRC you said that you had stuff get stuck with
> > > DSB_SKIP_WAITS_EN=0.
> > > Was that with hardware that has TRANS_PUSH for PSR?
> > 
> > It has TRANS_PUSH, but not using it for PSR. I.e. On LunarLake and
> > still using register writes as a trigger for Frame Change event
> > towards
> > PSR.
> > 
> > It happened with DSB commit where scanline wait was the first
> > thing.
> > I.e. only cursor plane updating.
> > 
> > > I suppose that
> > > could happen if the scanline counter already reads something
> > > (eg. again vblank_start-1) before we've done the push, that would
> > > cause the vblank evasion to wait forever. I could see two ways to
> > > perhaps handle that:
> > > - if DSB_SKIP_WAITS_EN becomes inactive immediately after
> > >   TRANS_PUSH then we could just keep DSB_SKIP_WAITS_EN=1 all
> > >   the time and things should be fine
> > > - if DSB_SKIP_WAITS_EN stays active for some time after
> > > TRANS_PUSH
> > >   then we'll perhaps need to poke that bit from the DSB itself
> > >   dynamically so that we will skip the vblank evasion, but
> > >   not the wait for blank prior to generating the interrupt.
> > > But this needs a bit more reverse engineering for sure.
> > 
> > TRAN_PUSH is still not enabled in this patch set.
> > 
> > BR,
> > 
> > Jouni Högander
> > 
> > > 
> > > > 
> > > > > 
> > > > > Signed-off-by: Jouni Högander <jouni.hogander@xxxxxxxxx>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/display/intel_display.c | 3 ++-
> > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> > > > > b/drivers/gpu/drm/i915/display/intel_display.c
> > > > > index e448ff64660a..58575800fad2 100644
> > > > > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > > > > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > > > > @@ -7631,6 +7631,7 @@ static void
> > > > > intel_atomic_dsb_finish(struct
> > > > > intel_atomic_state *state,
> > > > >  		intel_atomic_get_old_crtc_state(state,
> > > > > crtc);
> > > > >  	struct intel_crtc_state *new_crtc_state =
> > > > >  		intel_atomic_get_new_crtc_state(state,
> > > > > crtc);
> > > > > +	struct intel_display *display =
> > > > > to_intel_display(crtc);
> > > > >  
> > > > >  	if (!new_crtc_state->hw.active)
> > > > >  		return;
> > > > > @@ -7643,7 +7644,7 @@ static void
> > > > > intel_atomic_dsb_finish(struct
> > > > > intel_atomic_state *state,
> > > > >  		new_crtc_state->update_planes &&
> > > > >  		!new_crtc_state->vrr.enable &&
> > > > >  		!new_crtc_state->do_async_flip &&
> > > > > -		!new_crtc_state->has_psr &&
> > > > > +		(DISPLAY_VER(display) >= 20 ||
> > > > > !new_crtc_state-
> > > > > > has_psr) &&
> > > > >  		!new_crtc_state->scaler_state.scaler_users
> > > > > &&
> > > > >  		!old_crtc_state->scaler_state.scaler_users
> > > > > &&
> > > > >  		!intel_crtc_needs_modeset(new_crtc_state) &&
> > > > > -- 
> > > > > 2.43.0
> > > > 
> > > > -- 
> > > > Ville Syrjälä
> > > > Intel
> > > 
> > 
> 





[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux