Re: [PATCH 2/3] drm: Add basic helper to allow precise pageflip timestamps in vrr.

"Kazlauskas, Nicholas" <Nicholas.Kazlauskas@xxxxxxx> · Wed, 13 Feb 2019 18:34:31 +0000

On 2/13/19 1:10 PM, Mario Kleiner wrote:
> On Wed, Feb 13, 2019 at 5:03 PM Daniel Vetter <daniel@xxxxxxxx> wrote:
>>
>> On Wed, Feb 13, 2019 at 4:46 PM Kazlauskas, Nicholas
>> <Nicholas.Kazlauskas@xxxxxxx> wrote:
>>>
>>> On 2/13/19 10:14 AM, Daniel Vetter wrote:
>>>> On Wed, Feb 13, 2019 at 3:33 PM Kazlauskas, Nicholas
>>>> <Nicholas.Kazlauskas@xxxxxxx> wrote:
>>>>>
>>>>> On 2/13/19 4:50 AM, Daniel Vetter wrote:
>>>>>> On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote:
>>>>>>> On Mon, Feb 11, 2019 at 6:04 PM Daniel Vetter <daniel@xxxxxxxx> wrote:
>>>>>>>>
>>>>>>>> On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas
>>>>>>>> <Nicholas.Kazlauskas@xxxxxxx> wrote:
>>>>>>>>>
>>>>>>>>> On 2/11/19 3:35 AM, Daniel Vetter wrote:
>>>>>>>>>> On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote:
>>>>>>>>>>> The pageflip completion timestamps transmitted to userspace
>>>>>>>>>>> via pageflip completion events are supposed to describe the
>>>>>>>>>>> time at which the first pixel of the new post-pageflip scanout
>>>>>>>>>>> buffer leaves the video output of the gpu. This time is
>>>>>>>>>>> identical to end of vblank, when active scanout starts.
>>>>>>>>>>>
>>>>>>>>>>> For a crtc in standard fixed refresh rate, the end of vblank
>>>>>>>>>>> is identical to the vblank timestamps calculated by
>>>>>>>>>>> drm_update_vblank_count() at each vblank interrupt, or each
>>>>>>>>>>> vblank dis-/enable. Therefore pageflip events just carry
>>>>>>>>>>> that vblank timestamp as their pageflip timestamp.
>>>>>>>>>>>
>>>>>>>>>>> For a crtc switched to variable refresh rate mode (vrr), the
>>>>>>>>>>> pageflip completion timestamps are identical to the vblank
>>>>>>>>>>> timestamps iff the pageflip was executed early in vblank,
>>>>>>>>>>> before the minimum vblank duration elapsed. In this case
>>>>>>>>>>> the time of display onset is identical to when the crtc
>>>>>>>>>>> is running in fixed refresh rate.
>>>>>>>>>>>
>>>>>>>>>>> However, if a pageflip completes later in the vblank, inside
>>>>>>>>>>> the "extended front porch" in vrr mode, then the vblank will
>>>>>>>>>>> terminate at a fixed (back porch) duration after flip, so
>>>>>>>>>>> the display onset time is delayed correspondingly. In this
>>>>>>>>>>> case the vblank timestamp computed at vblank irq time would
>>>>>>>>>>> be too early, and we need a way to calculate an estimated
>>>>>>>>>>> pageflip timestamp that will be later than the vblank timestamp.
>>>>>>>>>>>
>>>>>>>>>>> How a driver determines such a "late flip" timestamp is hw
>>>>>>>>>>> and driver specific, but this patch adds a new helper function
>>>>>>>>>>> that allows the driver to propose such an alternate "late flip"
>>>>>>>>>>> timestamp for use in pageflip events:
>>>>>>>>>>>
>>>>>>>>>>> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp);
>>>>>>>>>>>
>>>>>>>>>>> When sending out pageflip events, we now compare that proposed
>>>>>>>>>>> flip_timestamp against the vblank timestamp of the current
>>>>>>>>>>> vblank of flip completion and choose to send out the greater/
>>>>>>>>>>> later timestamp as flip completion timestamp.
>>>>>>>>>>>
>>>>>>>>>>> The most simple way for a kms driver to supply a suitable
>>>>>>>>>>> flip_timestamp in vrr mode would be to simply take a timestamp
>>>>>>>>>>> at start of the pageflip completion handler, e.g., pageflip
>>>>>>>>>>> irq handler: flip_timestamp = ktime_get(); and then set that
>>>>>>>>>>> as proposed "late" alternative timestamp via ...
>>>>>>>>>>> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp);
>>>>>>>>>>>
>>>>>>>>>>> More clever approaches could try to add some corrective offset
>>>>>>>>>>> for fixed back porch duration, or ideally use hardware features
>>>>>>>>>>> like hw timestamps to calculate the exact end time of vblank.
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Mario Kleiner <mario.kleiner.de@xxxxxxxxx>
>>>>>>>>>>> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@xxxxxxx>
>>>>>>>>>>> Cc: Harry Wentland <harry.wentland@xxxxxxx>
>>>>>>>>>>> Cc: Alex Deucher <alexander.deucher@xxxxxxx>
>>>>>>>>>>
>>>>>>>>>> Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only give us
>>>>>>>>>> the right timestampe, once? With this I guess if you do a vblank query in
>>>>>>>>>> between the wrong and the right vblank you'll get the bogus value. Not
>>>>>>>>>> really great for userspace.
>>>>>>>>>> -Daniel
>>>>>>>>>
>>>>>>>>> I think we calculate the timestamp and send the vblank event both within
>>>>>>>>> the pageflip IRQ handler so calculating the right pageflip timestamp
>>>>>>>>> once could probably be done. I'm not sure if it's easier than proposing
>>>>>>>>> a later flip time with an API like this though.
>>>>>>>>>
>>>>>>>>> The actual scanout time should be known from the page-flip handler so
>>>>>>>>> the semantics for VRR on/off remain the same. This is because the
>>>>>>>>> page-flip triggers entering the back porch if we're in the extended
>>>>>>>>> front porch.
>>>>>>>>>
>>>>>>>>> But scanout time from vblank events for something like
>>>>>>>>> DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are only
>>>>>>>>> treated as estimates. If we're in the regular front porch then the
>>>>>>>>> timing to scanout is based on the fixed duration front porch for the
>>>>>>>>> current mode. If we're in the extended back porch then it's technically
>>>>>>>>> driver defined but the most reasonable guess is to assume that the front
>>>>>>>>> porch is going to end at any moment, so just return the length of the
>>>>>>>>> back porch for getting the scanout time.
>>>>>>>>>
>>>>>>>>> Proposing the late timestamp shouldn't affect vblank event in the
>>>>>>>>> DRM_IOCTL_WAIT_VBLANK case and should only be used in the page-flip
>>>>>>>>> event case. I'm not sure if that's what's guaranteed to happen with this
>>>>>>>>> patch though. There doesn't seem to be any locking on either
>>>>>>>>> dev->vblank_time_lock or the vblank->seqlock so while it's likely to get
>>>>>>>>> the same vblank event back as the one just stored I don't think it's
>>>>>>>>> guaranteed.
>>>>>>>>
>>>>>>>> That's the inconsistency I mean to highlight - the timestamp for the
>>>>>>>> same frame as observed through flip complete and through the
>>>>>>>> wait_vblank ioctl can differ. Which they really shouldn't.
>>>>>>>>
>>>>>>>
>>>>>>> Ideally they shouldn't differ. The kernel docs for drm_crtc_state say
>>>>>>> that vblank and pageflip timestamps should always match. But then the
>>>>>>> kernel docs for "Variable refresh properties" in drm_connector.c for
>>>>>>> vblank timestamps were changed for the VRR implementation in Linux
>>>>>>> 5.0-rc to redefine them when in VRR mode. They are defined, but
>>>>>>> probably rather useless for any practical purpose, like this:
>>>>>>>
>>>>>>> "The semantics for the vertical blank timestamp differ when
>>>>>>> variable refresh rate is active. The vertical blank timestamp
>>>>>>> is defined to be an estimate using the current mode's fixed
>>>>>>> refresh rate timings. The semantics for the page-flip event
>>>>>>> timestamp remain the same."
>>>>>>
>>>>>> Uh I missed that. That sounds like nonsense tbh.
>>>>>>
>>>>>>> So our docs contradict each other as of Linux 5.0-rc. Certainly having
>>>>>>> useful vblank timetamps would be useful.
>>>>>>
>>>>>> Yup, imo vblank should still match page_flip. Otherwise I expect a lot of
>>>>>> hilarity will ensue.
>>>>>
>>>>> I would imagine you would see more breakage by changing userspace
>>>>> expectations for when the event is actually received.
>>>>>
>>>>> The IOCTL is "wait for vblank", but if we start sending the event near
>>>>> the end of vblank it becomes more "wait for pageflip" (near scanout) or
>>>>> "wait 15ms while the VRR front porch times out and causes flickering".
>>>>>
>>>>> I don't speak for all userspace but it seems more useful to me to have
>>>>> the event sent at the start of vblank.
>>>>
>>>> Maybe we need to things. The problem is that the timestampe is defined
>>>> to be correct for start-of-frame. And right now the amdgpu
>>>> implementation of VRR doesn't even give an accurate timestampe for the
>>>> flip, so most likely no one yet thought about this properly.
>>>> Definitely no testcases (or it would have been caught), whereas we
>>>> have tons of testcases for fixed refresh.
>>>
>>> I agree that it would be good to have documentation for what waiting for
>>> vblank is supposed to mean here. There's a big difference between
>>> getting the event at the start of vsync vs the start of the back porch
>>> near the end of it.
>>>
>>>>
>>>>> VRR in its current state is useful for applications that aren't
>>>>> completely timing sensitive (like most games). This is because the
>>>>> driver has free reign in restricting the min/max bounds of the range -
>>>>> allowing for enhancements like low framerate compensation / BTR.
>>>>>
>>>>> For a concrete example, imagine you're not doing frontbuffer rendering
>>>>> but you're still trying to target a specific scanout time. So even if we
>>>>> were to send the vblank event at vpos 0 you'd get an accurate scanout
>>>>> timestamp for the next scanout after the vback porch ends. You're not
>>>>> going to be doing any rendering or preparation in this time because it's
>>>>> simply too short. So you'd target the scanout period after this one,
>>>>> which you'd need to calculate somehow in userspace. Taking the deltas
>>>>> between two vblank events is completely useless here because you'll get
>>>>> back the vrr min refresh delta. So you can take your source content rate
>>>>> or some fixed rate you'd like to target within the range but even then
>>>>> you're still not guaranteed to have that flip end up at that scanout time.
>>>>>
>>>>> The current solution is to just return the scanout pos at vrr max, or
>>>>> the current mode's fixed refresh rate. If you're flip before the fixed
>>>>> duration front porch ends then that scanout time will likely be accurate
>>>>> since you can't flip faster than vrr max.
>>>>>
>>>>> Is there actually userspace that cares that these timestamps need to
>>>>> match? Like said below, it seems that wait for vblank is mostly about
>>>>> scheduling and for that I think having the event sent back at the start
>>>>> of vblank is more important here.
>>>>
>>>> There's 3 cases:
>>>> - DRI3: Waits for vblank for frontbuffer blits. If you combine that
>>>> with VRR then imo you deserve to keep all the pieces.
>>>> - generic compositors. Use VRR for scheduling, expect that a page_flip
>>>> queued immediately after a page_flip will hit the next frame, and not
>>>> the current frame. They also expect that this will not give you the
>>>> slowest possible refresh rate, but I guess VRR plus frame scheduling
>>>> is currently out of scope. It's kinda just for as-fast-as-possible
>>>> games.
>>>> - flip timestamp. Currently the entire codebase assumes that vblank ts
>>>> = flip ts. I don't really see a point in breaking that.
>>>> - better scheduling of VRR: what compositors actually do isn't wait
>>>> for the vblank, that only latches the page_flip. They set up a
>>>> hrtimer, with a fixed/auto-tuned head-start, from which they start
>>>> their rendering. The same could be done for VRR, if we expose the
>>>> earliest/latest vblank end times (and how much they can shift). vblank
>>>> ioctl is only used for syncing that up and making sure we're hitting
>>>> the right frame. This is all for pageflipping compositors. This is
>>>> also what they'll probably keep doing when we give them a desired flip
>>>> time (instead of a target frame), they still need an hrtimer to fire a
>>>> little bit ahead to get the frame rendered.
>>>>
>>>> So taking all together, we have 2 use-cases I think:
>>>> - frontbuffer rendering. Kinda wants vblank event to fire at the
>>>> beginning of vblank. Won't ever do VRR
>>>> - pageflipping compositors. Don't care about the vblank event, either
>>>> they have an hrtimer timer (which gets synced with the vblank ioctl),
>>>> or just flip as fast as possible. They don't care one bit when the
>>>> vblank event happens, as long as a pageflip scheduled immediately
>>>> after the event shows up only hits the next frame (which is the delay
>>>> code that currently stretches vblank unecessarily with VRR if you're
>>>> already a bit late).
>>>>
>>>> Making sure that vblank and page flip timestamp match the real
>>>> userspace seems like the cleanest semantics, and the least surprising
>>>> to userspace. E.g. what happens if you implement VRR in a compositor,
>>>> but then tune your hrtimer within the vblank, when the vblank
>>>> timestamp is totally not reflecting your real pageflip? I just don't
>>>> see what's the upside of having a vblank timestamp that's not actually
>>>> reflecting the real flip.
>>>>
>>>> Also, if your app wants the start of vblank, that's very easy to
>>>> compute: Just add the scanout time to the last vblank timestamp.
>>>> Figuring out the start of frame otoh isn't doable if you can't query,
>>>> and constantly pageflipping just to have an accurate start-of-frame
>>>> (for tuning your hrtimer) seems rather wasteful - it outright defeats
>>>> some of the use-cases for VRR.
>>>>
>>>> So what benefit do you see in having vblank ts and pageflip ts not
>>>> agree? What's the use-case this solves - aside from "less typing" :-)
>>>> If the only use-case is VRR for frontbuffer rendering, then that
>>>> doesn't even work right now (aside from that I think it's something we
>>>> shouldn't support and definitely not encourage).
>>>> -Daniel
>>>
>>> I'm actually fine with moving the vblank event back to the back porch
>>> but I'd just be worried about userspace breakages that relied on it
>>> happening the other way.
>>>
>>> It'd be especially confusing if the VLINE was different for fixed vs
>>> variable refresh. I wouldn't want to move it to vpos 0 unless the event
>>> is also getting moved for fixed refresh as well.
>>
>> That would break DRI3. So we can pick between two options:
>> - always send out the vblank early, accept that vblank timestamps change.
>> - for VRR only, send out the vblank only when we know the correct
>> timestamp. For non-VRR we have to send out the vblank early, because
>> that's somewhat become api (but not all drivers do it that way, so
>> it's about as hopeless as frontbuffer rendering in general).
>>
>> I still don't see the benefit of the first option, or the exact use case.
>>
> 
> Still haven't slept so i don't know if i'm adding anything to the
> discussion that Daniel hasn't stated already, but anyway:
> 
> I think i agree with Daniel on this. Having a vblank ts that truly
> reflects the end of last vblank/start of scanout should make
> time-based scheduling easier. Then we know when scanout of the active
> frame starts and calculate from known fixed scanout duration when next
> vblank starts and when it ends the latest. On X11 you can
> glXGetSyncValuesOML() query that start time to sync up a compositors
> update cycle or start of an animation or timed presentation sequence
> with vblank. E.g., Waylands Weston does this when switching from idle
> desktop to redraw loop. A movie player or my own software could use
> this info in the same way, whereas i don't see the current vblank
> timestams being useful. If you get the current vblank event with
> timestamp at start of vblank ie. a bit into the short fixed duration
> front porch, you'd have to render and flip within what small amount of
> time is left unti the end of the fixed front-porch, because once you
> get into variable extended front-porch your vblank timestamp is
> already meaningless.

Yeah, the timestamp is more useful for preparing animation/content/etc 
for the scanout after next. I agree that it should be the same, but 
there are the challenges in not breaking userspace or the driver.

Wait for vblank is really poorly defined as is. Defining it explicitly 
for VRR as the start of back porch is fine as long as it is documented 
somewhere. Implementation wise it's going to be a pain since we'd need 
separate interrupts for vpos=vfront and vpos=0 for proper BTR support, 
but I think all the ASICs are capable of this.

This is simple enough to enforce in IGT testing as well, wait for 
vblank, grab the timestamp, then do the flip targeting something within 
the vrr range and grab the timestamp and see if the values are within 
some sane threshold.

> 
> I also think if we want to implement something like the ability to
> flip for a specific target scanout time -- which i'd really like to
> have -- having the true end of last vblank gives us a baseline
> timestamp that is easier to reason about inside the kernel, similar to
> how a compositor can reason/adjust to when to start composing the next
> frame or submitting a flip to hit some consistent framerate, or in
> turn to better obey a presentation request of a client for a specific
> target time.
Yes, the target presentation timestamp is ideal here. It probably should 
have been part of the original proposal but there wasn't anything in 
userspace that would have been using it. It would have been nice for IGT 
testing though.

> 
> At the same time we need to keep current behaviour for non-VRR mode,
> with early vblanks, so all the target_msc vblank count based
> scheduling keeps working. I don't think clients that are written to
> use VRR should expect exactly the same timing behaviour wrt. when
> vblank events get sent or timestamps updated. I certainly expect to
> have separate code-paths in my app for optimal VRR vs. non-VRR
> presentation scheduling, a compositor or movie player might adapt in
> the same way, and fullscreen apps can opt-out of VRR if they want.

FWIW most compositors/movie players/web browsers are blocked from using 
VRR right not in a blacklist in mesa. So maybe we don't have to worry 
about this yet at least.

> 
> Clients just need to know what the behavior is in VRR mode. I wonder
> what we should do for the kernel docs for Linux 5.0-rc? I assume we
> won't have the time to fix this all before release, would it make
> sense to have some kernel-docs updates via drm-fixes that declare the
> vblank/pageflip timestamps as "undefined"/"not to be relied
> on"/"subject to future change" for this release - essentially declare
> vblank timestamps as unstable api for the moment, so nobody relies on
> the current quite broken timestamp behaviour?
> 
> I do now have hacked up proof-of-concept patches here that after
> torturing them for a few hours with my own tests now seem to be able
> to calculate correct pageflip timestamps regardless which of the two
> options we choose, but they need refinement/cleanup/rebasing onto
> somethign else than Linux 5.0-rc5, and help from Nicholas, because i
> tend to get lost in all the indirections and file hierarchy inside the
> DisplayCore code etc., and probably also a few kinks worked out. And
> see if shifting the vblank irq handling doesn't cause other weird
> interactions? I haven't looked how Nicholas below-the-range/framerate
> compensation works in detail and if it matter that it happens at start
> of front-porch. I can only say that it doesn't seem to interfere with
> correct timestamping if i run with framerates intentionally below
> vrr-min, as far as my limited testing goes. Another thing that may be
> problematic is disabling/reenabling vblank during idle times, given
> that we can only reliably timestamp from within pageflip irq handler
> or vblank irq ~ back-porch/active scanout. Maybe we need to
> drm_crtc_vblank_get() hold a reference while VRR is active to simplify
> things for the moment.

BTR support has the bit in the vblank IRQ handler that adjusts the 
min/max duration for the display and if we're doing that in the back 
porch it's already too late.

Another vline interrupt could be installed here and only enabled when 
VRR goes on. If VRR is on then that should be the source for updating 
the vblank count. This would probably have some sort of messy 
interaction with the amdgpu kms counter bit I mentioned in another email 
though, but I haven't really looked into this in depth yet.

I think just calling the wait for vblank IOCTL will grab the reference 
to the vblank and it should enable the interrupts. I think holding an 
extra reference could be a simple solution if that's a problem though.

> 
> One thing one could merge is that other VRR friendly throttling patch
> i sent, given that that works and is independent of the timestamp
> stuff. I also have a backport for Linux 5.0-rc if that makes sense?
> That VRR throttling also needs to get reworked if/once we wanted to
> emit matching vblank/pflip timestamps, as that doesn't like the
> shifted point inside vblank when vblank counter gets updated.
> 
> Basically i don't know when the deadlines for merging this kind of
> work are/were wrt. Linux 5.0, 5.1, 5.2?
> 
> I'll look more into cleaning/testing those patches tomorrow after lots
> of sleep and then see what Nicholas thinks of them.
> 
> One problem with the VRR stuff is that the hardware testing equipment,
> that i'd usually use to verify if timestamps are really correct and
> accurate, doesn't work with DisplayPort or HDMI signals, so i can only
> test for self-consistency of timestamps atm., and look if plots of
> timestamps look how i would expect them to look if stuff is mostly
> correct.
> 
> -mario

The throttling patch is probably fine to merge. There are a few emails 
on that chain that I need to read through, however.

Thanks for looking into this.

Nicholas Kazlauskas

> 
>>> But like you said, the only thing I can think of that should need the
>>> event right away is low latency / framebuffer rendering.
>>>
>>> I can think at least one potential use case where you would want to do
>>> front buffer rendering for low latency with VRR - emulation of old
>>> console hardware. There's not really any modern display that has fixed
>>> refresh modes that match the render frequency of some consoles. But this
>>> use case isn't out there in the wild right now at least.
>>
>> Even on an emulator I'd expect it'd just do regular flips to the
>> frontbuffer. Especially on mobile there's really no such thing as
>> frontbuffer rendering, you always do a flip. The DIRTYFB ioctl (which
>> you need to call instead of doing a page flip) simply does a pageflip
>> behind your back for you. Which I think is worse than actually
>> flipping.
>>
>> And for emulating exact refresh rates your then again go back to
>> hrtimer + issuing the flip approach, but using your desired refresh
>> rate to tune the hrtimer instead of the fixed one. But that needs a
>> correct vblank timestamp for the current frame to work correctly (or
>> you force compositor has to always flip, which defeats the power
>> saving aspects of VRR allowing to stretch refresh rates when nothing
>> changes).
>>
>>> Regarding compositors: I don't really recommend VRR for this use case in
>>> general since you'll want a fixed refresh rate frequency for input to
>>> feel consistent and to avoid any flickering / luminance changes. The
>>> fastest refresh rate will be the most responsive to the user and you can
>>> just used the fixed refresh rate mode in this case. Lowering the refresh
>>> rate to save power sounds nice in theory but I'd imagine it'd feel
>>> terrible in practice.
>>
>> Why is a compositor any different from a game? You aim to render at
>> 60fps, or whatever the refresh rate is. If you're one frame late it's
>> better to stretch the vblank slightly than a full on 1-frame freeze.
>> And on mobile it's unfortunately fairly common that the system fails
>> to clock up quickly enough when starting from an idle state, VRR
>> sounds like really useful here.
>>
>> But like in a game if you don't at least aim to render at a consistent
>> fps, and handle your input at a consistent fps, then the end result
>> will indeed be terrible.
>> -Daniel
>>
>>> Nicholas Kazlauskas
>>>
>>>>
>>>>>>>> Now added complication is that amdgpu sends out vblank events really
>>>>>>>> early, which is used by userspace to do frontbuffer rendering in the
>>>>>>>> vblank time. But I don't think anyone wants to do both VRR and
>>>>>>>
>>>>>>> I think all kms drivers try to call drm_crtc_handle_vblank() at start
>>>>>>> of vblank to give Mesa the most time for frontbuffer rendering for
>>>>>>> classic X. But vblank events are also used for scheduling bufferswaps
>>>>>>> or other stuff for redirected windowed rendering, or via api's like
>>>>>>> OML_sync_controls glXWaitForMscOML, so there might be other things
>>>>>>> affected by a more delayed vblank handling.
>>>>>>
>>>>>> The frontbuffer rendering is very much X driver specific, and I think
>>>>>> -amdgpu/radeon is the only one that requires this. No i915 driver ever
>>>>>> used the vblank interrupt to schedule frontbuffer blits, we use some
>>>>>> CS-side stalls.
>>>>>>
>>>>>> Wrt scheduling pageflips: The rule is that right after you've received the
>>>>>> vblank for frame X, then an immediately schedule pageflip should hit X+1,
>>>>>> but not X. amdgpu had this broken, but it's fixed since a while.
>>>>>>
>>>>>>>> frontbuffer rendering, hence I think it should be possible to create
>>>>>>>> correct vblank timestamps for the VRR case, while leaving the current
>>>>>>>> logic in place for everything else. But that means moving the entire
>>>>>>>> vblank processing to where you know the next frame's start time for
>>>>>>>> VRR, both for page flips and for all other vblank events.
>>>>>>>> -Daniel
>>>>>>>>
>>>>>>>
>>>>>>> I think for amdgpu it would be doable by calling drm_handle_vblank
>>>>>>> from the pageflip irq handler in VRR mode, and then additionally from
>>>>>>> the vblank interrupt handler like now. Vblank irq's are triggered via
>>>>>>> vline interrupts with programmable vertical trigger line position, so
>>>>>>> we could program the trigger line to be start of back-porch instead of
>>>>>>> start of front-porch. In the pageflip case we do vblank handling +
>>>>>>> timestamping + pflip completion from pflip irq, and have the regular
>>>>>>> back-porch vblank handling for refresh cycles in which no
>>>>>>> pageflip/pflip irq happened.
>>>>>>
>>>>>> Hm, can't we simply program the vblank to occur in the back porch region
>>>>>> for VRR? The vblank timestamping should automatically correct the
>>>>>> timestamp in that case. Would require a lot less code changes.
>>>>>>
>>>>>> Plus update the documentation ofc.
>>>>>>
>>>>>>> Not sure if/how that would mess with below-vrr-refresh-rate-range
>>>>>>> support for stutter reduction at low fps though? Or with performance,
>>>>>>> given vblank event dispatch could be delayed a lot in VRR compared to
>>>>>>> normal mode? Or with all the vblank accounting during modesets or crtc
>>>>>>> en/disable?
>>>>>
>>>>> I think we have extra VLINEs that can be programmed at vpos 0 that could
>>>>> resolve this but I'm still not really sold on the idea.
>>>>>
>>>>>>
>>>>>> Currently VRR and explicit frame scheduling don't mix. We've discussed
>>>>>> this, and consens seems to be that we need a new property for page_flip,
>>>>>> similar to the target frame, but instead of frames a timestamp. Then the
>>>>>> driver can hit a specific time for the next frame.
>>>>>>
>>>>>> And since compositors use the vblank ioctl just to do frame scheduling, I
>>>>>> don't think we have to worry hugely about interactions there. Much better
>>>>>> to aim for clean semantics (i.e. vblank and flip complete timestamps
>>>>>> better match).
>>>>>>
>>>>>>> Also, could other drivers like Intel easily do this delayed vblank
>>>>>>> processing in the non-pageflip case?
>>>>>>
>>>>>> Intel's problem :-) But yeah we have page flip timestamp registers, so
>>>>>> worst case all we need to do is shuffle code around a bit. And we have
>>>>>> interrupts for both start of vblank and and of vblank. It might be a bit
>>>>>> more work because we use the vblank interrupt to avoid races in atomic
>>>>>> updates, but that's not too terrible to fix.
>>>>>>
>>>>>>>    From my user perspective as a developer of a neuroscience research
>>>>>>> application that has a couple of exciting/important use cases for VRR
>>>>>>> mode, i absolutely do need trustworthy and precise pageflip event
>>>>>>> timestamps that represent reality, otherwise VRR mode would be less
>>>>>>> than useless for me. I can do without meaningful vblank timestamps in
>>>>>>> VRR mode though, and expected to do without them, as any scheduling
>>>>>>> has to happen time-based instead of based on vblank counts anyway,
>>>>>>> given that vblank counts are quite meaningless if every frame can have
>>>>>>> a different duration. I'd expect that situation to be similar for
>>>>>>> other apps that would want to use timestamps in VRR mode like video
>>>>>>> games, movie players or VR/AR applications.
>>>>>>
>>>>>> Yeah for that you want to schedule your frames with a target frame number.
>>>>>> Would still be nice if vblank timestamps wouldn't be an outright lie (and
>>>>>> the timestamp is still useful information, even if the frame counter
>>>>>> isn't - many displays have min/max frame rates for VRR, which is
>>>>>> information we could perhaps expose).
>>>>>> -Daniel >
>>>>>>>
>>>>>>> -mario
>>>>>>>
>>>>>>>>>
>>>>>>>>> Nicholas Kazlauskas
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> ---
>>>>>>>>>>>      drivers/gpu/drm/drm_vblank.c | 49 +++++++++++++++++++++++++++++++++++-
>>>>>>>>>>>      include/drm/drm_vblank.h     |  8 ++++++
>>>>>>>>>>>      2 files changed, 56 insertions(+), 1 deletion(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
>>>>>>>>>>> index 98e091175921..4b3a4c38fabe 100644
>>>>>>>>>>> --- a/drivers/gpu/drm/drm_vblank.c
>>>>>>>>>>> +++ b/drivers/gpu/drm/drm_vblank.c
>>>>>>>>>>> @@ -814,10 +814,21 @@ static void send_vblank_event(struct drm_device *dev,
>>>>>>>>>>>                 u64 seq, ktime_t now)
>>>>>>>>>>>      {
>>>>>>>>>>>         struct timespec64 tv;
>>>>>>>>>>> +    ktime_t alt_flip_time;
>>>>>>>>>>>
>>>>>>>>>>>         switch (e->event.base.type) {
>>>>>>>>>>> -    case DRM_EVENT_VBLANK:
>>>>>>>>>>>         case DRM_EVENT_FLIP_COMPLETE:
>>>>>>>>>>> +            /*
>>>>>>>>>>> +             * For flip completion events, override "now" time
>>>>>>>>>>> +             * with alt_flip_time provided by the driver via
>>>>>>>>>>> +             * drm_crtc_set_vrr_pageflip_timestamp() in VRR mode
>>>>>>>>>>> +             * if that time is later than given "now" vblank time.
>>>>>>>>>>> +             */
>>>>>>>>>>> +            alt_flip_time = dev->vblank[e->pipe].alt_flip_time;
>>>>>>>>>>> +            if (alt_flip_time > now)
>>>>>>>>>>> +                    now = alt_flip_time;
>>>>>>>>>>> +            /* Fallthrough */
>>>>>>>>>>> +    case DRM_EVENT_VBLANK:
>>>>>>>>>>>                 tv = ktime_to_timespec64(now);
>>>>>>>>>>>                 e->event.vbl.sequence = seq;
>>>>>>>>>>>                 /*
>>>>>>>>>>> @@ -916,11 +927,47 @@ void drm_crtc_send_vblank_event(struct drm_crtc *crtc,
>>>>>>>>>>>
>>>>>>>>>>>                 now = ktime_get();
>>>>>>>>>>>         }
>>>>>>>>>>> +
>>>>>>>>>>>         e->pipe = pipe;
>>>>>>>>>>>         send_vblank_event(dev, e, seq, now);
>>>>>>>>>>>      }
>>>>>>>>>>>      EXPORT_SYMBOL(drm_crtc_send_vblank_event);
>>>>>>>>>>>
>>>>>>>>>>> +/**
>>>>>>>>>>> + * drm_crtc_set_vrr_pageflip_timestamp - helper to set alternate pageflip time
>>>>>>>>>>> + * @crtc: the source CRTC of the pageflip completion event
>>>>>>>>>>> + * @flip_time: The alternate pageflip completion timestamp in VRR mode
>>>>>>>>>>> + *
>>>>>>>>>>> + * In variable refresh rate mode (VRR), a pageflip completion timestamp carried
>>>>>>>>>>> + * by the pageflip event can never be earlier than the vblank timestamp of the
>>>>>>>>>>> + * vblank of flip completion, as that vblank timestamp defines the end of the
>>>>>>>>>>> + * shortest possible vblank duration. In case of a delayed flip completion
>>>>>>>>>>> + * inside the extended VRR front porch however, the end of vblank can be much
>>>>>>>>>>> + * later, so the driver must assign an estimated timestamp of that later end of
>>>>>>>>>>> + * vblank. For a CRTC in VRR mode, the driver should use this helper function to
>>>>>>>>>>> + * set an alternate flip completion timestamp in case of late flip completions
>>>>>>>>>>> + * in extended vblank. In the most simple case, this @flip_time timestamp could
>>>>>>>>>>> + * simply be a ktime_get() timestamp taken at the start of the pageflip
>>>>>>>>>>> + * completion routine, with some constant duration of the back porch interval
>>>>>>>>>>> + * added, although more precise estimates may be possible on some hardware if
>>>>>>>>>>> + * the hardware provides some means of timestamping the true end of vblank.
>>>>>>>>>>> + *
>>>>>>>>>>> + * When sending out pageflip events, e.g., via drm_crtc_send_vblank_event(), it
>>>>>>>>>>> + * will use either the standard vblank timestamp, calculated for a minimum
>>>>>>>>>>> + * duration vblank, or the provided @flip_time if that time is later than the
>>>>>>>>>>> + * vblank timestamp, to get the best possible estimate of start of display of
>>>>>>>>>>> + * the new post-pageflip scanout buffer.
>>>>>>>>>>> + */
>>>>>>>>>>> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc,
>>>>>>>>>>> +                                     ktime_t flip_time)
>>>>>>>>>>> +{
>>>>>>>>>>> +    struct drm_device *dev = crtc->dev;
>>>>>>>>>>> +    struct drm_vblank_crtc *vblank = &dev->vblank[drm_crtc_index(crtc)];
>>>>>>>>>>> +
>>>>>>>>>>> +    vblank->alt_flip_time = flip_time;
>>>>>>>>>>> +}
>>>>>>>>>>> +EXPORT_SYMBOL(drm_crtc_set_vrr_pageflip_timestamp);
>>>>>>>>>>> +
>>>>>>>>>>>      static int __enable_vblank(struct drm_device *dev, unsigned int pipe)
>>>>>>>>>>>      {
>>>>>>>>>>>         if (drm_core_check_feature(dev, DRIVER_MODESET)) {
>>>>>>>>>>> diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h
>>>>>>>>>>> index 6ad9630d4f48..aacf44694ab6 100644
>>>>>>>>>>> --- a/include/drm/drm_vblank.h
>>>>>>>>>>> +++ b/include/drm/drm_vblank.h
>>>>>>>>>>> @@ -117,6 +117,12 @@ struct drm_vblank_crtc {
>>>>>>>>>>>          * @time: Vblank timestamp corresponding to @count.
>>>>>>>>>>>          */
>>>>>>>>>>>         ktime_t time;
>>>>>>>>>>> +    /**
>>>>>>>>>>> +     * @alt_flip_time: Vblank timestamp for end of extended vblank due to
>>>>>>>>>>> +     * a late pageflip completion in variable refresh rate mode. Pageflip
>>>>>>>>>>> +     * events will carry the later one of @time and @alt_flip_time.
>>>>>>>>>>> +     */
>>>>>>>>>>> +    ktime_t alt_flip_time;
>>>>>>>>>>>
>>>>>>>>>>>         /**
>>>>>>>>>>>          * @refcount: Number of users/waiters of the vblank interrupt. Only when
>>>>>>>>>>> @@ -179,6 +185,8 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs);
>>>>>>>>>>>      u64 drm_crtc_vblank_count(struct drm_crtc *crtc);
>>>>>>>>>>>      u64 drm_crtc_vblank_count_and_time(struct drm_crtc *crtc,
>>>>>>>>>>>                                    ktime_t *vblanktime);
>>>>>>>>>>> +void drm_crtc_set_vrr_pageflip_timestamp(struct drm_crtc *crtc,
>>>>>>>>>>> +                                     ktime_t flip_time);
>>>>>>>>>>>      void drm_crtc_send_vblank_event(struct drm_crtc *crtc,
>>>>>>>>>>>                                struct drm_pending_vblank_event *e);
>>>>>>>>>>>      void drm_crtc_arm_vblank_event(struct drm_crtc *crtc,
>>>>>>>>>>> --
>>>>>>>>>>> 2.17.1
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> dri-devel mailing listjj
>>>>>>>>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Daniel Vetter
>>>>>>>> Software Engineer, Intel Corporation
>>>>>>>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
>>>>>>
>>>>>
>>>>> Nicholas Kazlauskas
>>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> Daniel Vetter
>> Software Engineer, Intel Corporation
>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel