Re: [RFC 0/4] dma-fence: Deadline awareness

Pekka Paalanen <ppaalanen@xxxxxxxxx> · Thu, 29 Jul 2021 12:15:42 +0300

On Thu, 29 Jul 2021 10:43:16 +0200
Christian König <christian.koenig@xxxxxxx> wrote:

> Am 29.07.21 um 10:23 schrieb Pekka Paalanen:
> > On Wed, 28 Jul 2021 16:30:13 +0200
> > Christian König <christian.koenig@xxxxxxx> wrote:
> >  
> >> Am 28.07.21 um 15:57 schrieb Pekka Paalanen:  
> >>> On Wed, 28 Jul 2021 15:31:41 +0200
> >>> Christian König <christian.koenig@xxxxxxx> wrote:
> >>>     
> >>>> Am 28.07.21 um 15:24 schrieb Michel Dänzer:  
> >>>>> On 2021-07-28 3:13 p.m., Christian König wrote:  
> >>>>>> Am 28.07.21 um 15:08 schrieb Michel Dänzer:  
> >>>>>>> On 2021-07-28 1:36 p.m., Christian König wrote:  
> >>>>>>>> At least AMD hardware is already capable of flipping frames on GPU events like finishing rendering (or uploading etc).
> >>>>>>>>
> >>>>>>>> By waiting in userspace on the CPU before sending the frame to the hardware you are completely killing off such features.
> >>>>>>>>
> >>>>>>>> For composing use cases that makes sense, but certainly not for full screen applications as far as I can see.  
> >>>>>>> Even for fullscreen, the current KMS API only allows queuing a single page flip per CRTC, with no way to cancel or otherwise modify it. Therefore, a Wayland compositor has to set a deadline for the next refresh cycle, and when the deadline passes, it has to select the best buffer available for the fullscreen surface. To make sure the flip will not miss the next refresh cycle, the compositor has to pick an idle buffer. If it picks a non-idle buffer, and the pending rendering does not finish in time for vertical blank, the flip will be delayed by at least one refresh cycle, which results in visible stuttering.
> >>>>>>>
> >>>>>>> (Until the deadline passes, the Wayland compositor can't even know if a previously fullscreen surface will still be fullscreen for the next refresh cycle)  
> >>>>>> Well then let's extend the KMS API instead of hacking together workarounds in userspace.  
> >>>>> That's indeed a possible solution for the fullscreen / direct scanout case.
> >>>>>
> >>>>> Not for the general compositing case though, since a compositor does not want to composite multiple output frames per display refresh cycle, so it has to make sure the one frame hits the target.  
> >>>> Yeah, that's true as well.
> >>>>
> >>>> At least as long as nobody invents a mechanism to make this decision on
> >>>> the GPU instead.  
> >>> That would mean putting the whole window manager into the GPU.  
> >> Not really. You only need to decide if you want to use the new backing
> >> store or the old one based on if the new surface is ready or not.  
> > Except that a window content update in Wayland must be synchronised with
> > all the possible and arbitrary other window system state changes, that
> > will affect how and where other windows will get drawn *this frame*,
> > how input events are routed, and more.
> >
> > But, if the window manager made sure that *only* window contents are
> > about to change and *all* other state remains as it was, then it would
> > be possible to let the GPU decide which frame it uses. As long as it
> > also reports back which one it actually used, so that presentation
> > feedback etc. can trigger the right Wayland events.
> >
> > Wayland has "atomic commits" to windows, and arbitrary protocol
> > extensions can add arbitrary state to be tracked with it. A bit like KMS
> > properties. Even atomic commits affecting multiple windows together are
> > a thing, and they must be latched either all or none.
> >
> > So it's quite a lot of work to determine if one can allow the GPU to
> > choose the buffer it will texture from, or not.  
> 
> But how does it then help to wait on the CPU instead?

A compositor does not literally "wait". It only checks which state
sets are ready to be used, and uses the most recent one that is ready.
Any state sets that are not ready are ignored and reconsidered the next
time the compositor updates the screen.

Depending on which state sets are selected for a screen update, the
global window manager state may be updated accordingly, before the
drawing commands for the composition can be created.
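
To make that concrete, here is a rough sketch of the selection step for a
single window. This is not code from any real compositor: StateSet,
is_ready and latch_latest_ready are made-up names, is_ready stands in for
a non-blocking fence poll, and dropping the state sets older than the
chosen one is my assumption about how superseded commits would be handled.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class StateSet:
    """One pending commit for a window (hypothetical structure)."""
    serial: int                    # commit order, higher means newer
    is_ready: Callable[[], bool]   # stand-in for a non-blocking fence poll


def latch_latest_ready(pending: List[StateSet]) -> Optional[StateSet]:
    """Pick the newest ready state set without ever blocking.

    Returns None if nothing is ready, in which case the compositor keeps
    using whatever state it applied last time. State sets newer than the
    chosen one stay in `pending` and are reconsidered on the next screen
    update; older ones are dropped as superseded (an assumption).
    """
    ready = [s for s in pending if s.is_ready()]
    if not ready:
        return None
    chosen = max(ready, key=lambda s: s.serial)
    pending[:] = [s for s in pending if s.serial > chosen.serial]
    return chosen
```

The point is only that the check is a poll done once per screen update,
so a state set that is not ready costs nothing but being skipped.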

> See, what I'm proposing is to either render the next state of the window
> or compose from the old state (including all atomic properties).

Yes, that's exactly how it would work. It's just that state for a
window is not an independent thing, it can affect how unrelated windows
are managed.

A simplified example would be two windows side by side where the
resizing of one causes the other to move. The compositor can't resize
the first window or move the second until the buffer with the new size
is ready. Until then the compositor uses the old state.
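
As a toy illustration of that coupling (a sketch only; Layout and
next_layout are hypothetical names, and the rule that the right window
starts exactly where the left one ends is assumed for the example):

```python
from dataclasses import dataclass


@dataclass
class Layout:
    left_width: int   # width of the left window, in pixels
    right_x: int      # x position of the right window, in pixels


def next_layout(current: Layout, requested_left_width: int,
                new_buffer_ready: bool) -> Layout:
    """Return the layout to use for this screen update.

    The resize of the left window and the move of the right window are
    applied together or not at all: until the buffer with the new size
    is ready, the whole old layout is kept.
    """
    if not new_buffer_ready:
        return current
    # Assumed tiling rule: the right window begins where the left one ends.
    return Layout(left_width=requested_left_width,
                  right_x=requested_left_width)
```

With current = Layout(400, 400) and a requested width of 600, the result
stays Layout(400, 400) for every update where the buffer is not ready,
and becomes Layout(600, 600) on the first update where it is.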

> E.g. what do you do if you time out and can't have the new window content
> on time? What's the fallback here?

As there is no wait, there is no timeout either.

If the app happens to be frozen (e.g. some weird bug in fence handling
that makes its fence never become ready, or maybe the app itself is
buggy and never draws again), then the app is frozen, and all the rest
of the desktop continues running normally without a glitch.

Thanks,
pq