Re: [PATCH v4 3/4] drm: Document variable refresh properties

Michel Dänzer <michel@xxxxxxxxxxx> · Thu, 1 Nov 2018 11:58:24 +0100

On 2018-10-31 6:54 p.m., Kazlauskas, Nicholas wrote:
> On 10/31/18 12:20 PM, Michel Dänzer wrote:
>> On 2018-10-31 3:41 p.m., Kazlauskas, Nicholas wrote:
>>> On 10/31/18 10:12 AM, Michel Dänzer wrote:
>>>> On 2018-10-31 2:38 p.m., Kazlauskas, Nicholas wrote:
>>>>> On 10/30/18 11:34 AM, Kazlauskas, Nicholas wrote:
>>>>>>
>>>>>> I understand the issue you're describing now. The timestamp is supposed
>>>>>> to signify the end of the current vblank. The call to get scanout
>>>>>> position is supposed to return the number of lines until scanout (a
>>>>>> negative value) or the number of lines since the next scanout began (a
>>>>>> positive value).
>>>>>>
>>>>>> The amdgpu driver calculates the number of lines based on a hardware
>>>>>> register status position which returns an increasing value from 0 that
>>>>>> indicates the current vpos/hpos for the display.
>>>>>>
>>>>>> For any vpos below vbl_start we know the value is correct since the next
>>>>>> vblank hasn't begun yet. But for anythign about vbl_start it's probably
>>>>>> wrong since it applies a corrective offset based on the fixed value of
>>>>>> crtc_vtotal. It's even worse when the value is above crtc_vtotal since
>>>>>> it'll be calculating the number of lines since the last scanout since
>>>>>> it'll be a positive value.
>>>>>>
>>>>>> So the issue becomes determining when the vfront porch will end.
>>>>>>
>>>>>> When the flip address gets written the vfront porch will end at the
>>>>>> start of the next line leaving only the back porch plus part of the
>>>>>> line. But we don't know when the flip will occur, if at all. It hasn't
>>>>>> occurred yet in this case.
>>>>>>
>>>>>> Waiting for the wrap around to 0 might be the best choice here since
>>>>>> there's no guarantee the flip will occur.
>>>>>
>>>>> I put some more thought into this and I don't think this is as bad as I
>>>>> had originally thought.
>>>>>
>>>>> I think the vblank timestamp is supposed to be for the first active
>>>>> pixel of the next scanout. The usage of which is for clients to time
>>>>> their content/animation/etc.
>>>>>
>>>>> The client likely doesn't care when they issue their flip, just that
>>>>> their content is matched for that vblank timestamp. The fixed refresh
>>>>> model works really well for that kind of application.
>>>>>
>>>>> Utilizing variable refresh rate would be a mistake in that case since
>>>>> the client would then have to time based on when they flip which is a
>>>>> lot harder to "predict" precisely.
>>>>
>>>> It's only a "mistake" as long as the timestamps are misleading. :) As
>>>> discussed before, accurate presentation timestamps are one requirement
>>>> for achieving perfectly smooth animation.
>>>
>>> For most 3D games the game world/rendering can advance by an arbitrary
>>> timestep - and faster will be smoother, which is what the native mode
>>> would give you.
>>>
>>> For fixed interval content/animation where you can't do that variable
>>> refresh rate could vastly improve the output. But like discussed before
>>> that would need a way to specify the interval/presentation timestamp to
>>> control that front porch duration. The timestamp will be misleading in
>>> any other case.
>>
>> I don't agree that an accurate timestamp can ever be more "misleading"
>> than an inaccurate one. But yeah, accurate timestamps and time-based
>> presentation are two sides of the same coin which can buy the holy grail
>> of perfectly smooth animation. :)
>>
>>
>>>>> I did some more investigation into when amdgpu gets the scanout position
>>>>> and what values we get back out of it. The timestamp is updated shortly
>>>>> after the crtc irq vblank which is typically within a few lines of
>>>>> vbl_start. This makes sense, since we can provide the prediction
>>>>> timestamp early. Waiting for the vblank to wrap around to 0 doesn't
>>>>> really make sense here since that would mean we already hit timeout or
>>>>> the flip occurred
>>>>
>>>> Sounds like you're mixing up the two cases of "actual" vblank events
>>>> (triggered by the "vblank" interrupt => drm_(crtc_)handle_vblank) and
>>>> flip completion events (triggered by the PFLIP interrupt with our
>>>> hardware => drm_crtc_send_vblank_event).
>>>>
>>>> Actual vblank events need to be delivered to userspace at the start of
>>>> vblank, so we indeed can't wait until the timestamp is accurate for
>>>> them. We just need to document the exact semantics of their timestamp
>>>> with VRR.
>>>>
>>>> For page flip completion events though, the timestamp needs to be
>>>> accurate and correspond to when the flipped frame starts being scanned
>>>> out, otherwise we'll surely break at least some userspace relying on
>>>> this information.
>>>>
>>> Yeah, I was. I guess what's sent is the estimated vblank timestamp
>>> calculated at the start of the interrupt.
>>
>> s/interrupt/vblank/, yeah.
>>
>>
>>> And since that's just a guess rather than what's actually going to
>>> happen it's going to be wrong in a lot of cases.
>>>
>>> I could see the wrap-around method working if the vblank timestamp was
>>> somehow updated in amdgpu or in drm_crtc_send_vblank_event.
>>
>> DC already calls drm_crtc_accurate_vblank_count before
>> drm_crtc_send_vblank_event, we "just" need to make sure that results in
>> an accurate timestamp.
>>
>>
>>> This would be a relatively simple fix but would break anything in
>>> userspace that relied on the timestamp for vblank interrupt and the
>>> flip completion being the same value.
>>
>> Hmm, that's a good point. So while VRR is enabled, maybe it is safer to
>> defer delivery of vblank events until the accurate timestamp is known as
>> well, at least by default. If there is userspace which needs the events
>> earlier even with VRR but can live with the guessed timestamp, a flag or
>> something could be added for that.
> 
> I was under the impression that the vblank timestamp was reused but it's 
> already going to differ since we call drm_crtc_accurate_vblank_count 
> before drm_crtc_send_vblank_event. Thanks for pointing that out.
> 
> Since that works for updating timestamp what's left is making sure that 
> it waits for the wrap around if it's above crtc_vtotal. It might make 
> sense to add a new flag for this that's only used within 
> amdgpu_get_crtc_scanout_position so the other call sites aren't affected.
> 
> There isn't a way to get an accurate timestamp with VRR enabled until 
> after the flip happens. So deferring it kind of defeats the purpose of a 
> client using it to make predictions before the flip for displaying their 
> content.

That's generally not a valid use-case. The KMS API is defined such that
if userspace receives the vblank event for vblank period N and then
submits a page flip, the page flip cannot (normally[0]) take effect
before vblank period N+1.

There are mainly two valid use-cases for waiting for vblank:

1. Wait for vblank N, then submit page flip for vblank N+1.

2. Wait for vblank N, then try to do something before vblank N ends,
e.g. draw directly to the front buffer, as a poor man's way of avoiding
tearing or similar artifacts.

Use-case 1 is used by Xorg (both for Present and DRI2) and at least
Weston, but presumably most if not all Wayland compositors. From Pekka's
description of how Weston uses it (thanks!), it's clear that the
timestamp of vblank events does need to accurately correspond to the end
of the vblank period.

Use-case 2 is also used by Xorg, but only when not flipping, which rules
out VRR. Also, I'd argue that blocking waits for vblank are more
suitable for this use-case than vblank events.

Therefore, in summary I propose this is how it could work:

* Without VRR, timestamps continue to work the same way they do now.

* With VRR, delivery of both vblank and page flip completion events must
be deferred until the timestamp accurately corresponds to the actual end
of vblank. For blocking vblank waits, I'm not sure, but I'm leaning
towards continuing to return to userspace at the beginning of vblank by
default, even if the timestamp is inaccurate.

Deferring delivery of events should avoid busy-loops. Bonus points for
minimizing the number of interrupts needed. :)

Note that both blocking vblank waits and vblank events are currently
triggered from drm_(crtc_)handle_vblank, so some rework would be
required for this.

It's possible that there's other userspace which doesn't care about the
accuracy of the timestamps, but cares about getting events as early as
possible. In that case, flags might need to be added.

[0] Our drivers can complete the flip in the same vblank period under
some circumstances if userspace specifies it as the target using the
DRM_MODE_PAGE_FLIP_TARGET_ABSOLUTE/RELATIVE flag, but there's no
guarantee for it to work. And no other driver supports this yet. Without
those flags, the flip cannot complete in the same vblank period, or it
would break existing userspace using use-case 1 above.

-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx