> -----Original Message-----
> From: Daniel Vetter <daniel@xxxxxxxx>
> Sent: Friday, July 30, 2021 6:26 PM
> To: Kasireddy, Vivek <vivek.kasireddy@xxxxxxxxx>
> Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx; Daniel Vetter <daniel@xxxxxxxx>;
> Gerd Hoffmann <kraxel@xxxxxxxxxx>; Pekka Paalanen <ppaalanen@xxxxxxxxx>;
> Simon Ser <contact@xxxxxxxxxxx>; Michel Dänzer <michel@xxxxxxxxxxx>;
> Zhang, Tina <tina.zhang@xxxxxxxxx>; Kim, Dongwon <dongwon.kim@xxxxxxxxx>
> Subject: Re: [RFC v1 0/4] drm: Add support for
> DRM_CAP_DEFERRED_OUT_FENCE capability
>
> On Thu, Jul 29, 2021 at 01:16:55AM -0700, Vivek Kasireddy wrote:
> > Separating the OUT_FENCE signalling from pageflip completion allows a
> > Guest compositor to start a new repaint cycle with a new buffer
> > instead of waiting for the old buffer to be free.
> >
> > This work is based on the idea/suggestion from Simon and Pekka.
> >
> > This capability can be a solution for this issue:
> > https://gitlab.freedesktop.org/wayland/weston/-/issues/514
> >
> > Corresponding Weston MR:
> > https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/668
>
> Uh I kinda wanted to discuss this a bit more before we jump into typing
> code, but well I guess not that much work yet.
>
> So maybe I'm not understanding the problem, but I think the fundamental
> underlying issue is that with KMS you can have at most 2 buffers
> in-flight, due to our queue depth limit of 1 pending flip.
>
> Unfortunately that means for virtual hw, where it takes a few more
> steps/vblanks until the framebuffer actually shows up on screen and is
> scanned out, we suffer deeply. The usual fix for that is to drop the
> latency and increase throughput, and have more buffers in-flight. Which
> this patch tries to do.
>
> Now I think where we go wrong here is that we're trying to hack this up
> by defining different semantics for the out-fence and for the drm-event.
> Imo that's wrong, they're both meant to show exactly the same thing:
> - when is the new frame actually visible to the user (as in, eyeballs
>   in a human head, preferably, not the time when we've handed the
>   buffer off to the virtual hw)
> - when is the previous buffer no longer being used by the scanout hw
>
> We do cheat a bit right now insofar as we assume they're both the same,
> as in, panel-side latency is currently the compositor's problem to
> figure out.
>
> So for virtual hw I think the timestamp and event completion really
> need to happen only when the buffer has been pushed through the entire
> virtualization chain, i.e. ideally we get the timestamp from the kms
> driver on the host side. Currently that's not done, so this is most
> likely quite broken already (virtio relies on the no-vblank auto event
> sending, which definitely doesn't wait for anything, or I'm completely
> missing something).

Agree. One lesson we learned from previous direct-display related work is
that using the host hardware event is kind of a "must". Otherwise,
problems like flickering, tearing, or frame drops will show up. Besides,
since the wayland-ui works as a Weston client, it needs more than 2
buffers to support full-frame redraws. I tried weston-simple-dmabuf-egl
with 2 buffers and it was bad.

BR,
Tina

> I think instead of hacking up some ill-defined 1.5 queue depth support,
> what we should do is support queue depth > 1 properly. So:
>
> - Change atomic to support queue depth > 1, this needs to be a
>   per-driver thing due to a bunch of issues in driver code. Essentially
>   drivers must never look at obj->state pointers, and only ever look up
>   state through the passed-in drm_atomic_state * update container.
>
> - Aside: virtio should lose all its empty hooks, there's no point in
>   that.
>
> - We fix virtio to send out the completion event at the end of this
>   entire pipeline, i.e.
> virtio code needs to take care of sending out the crtc_state->event
> correctly.
>
> - We probably also want some kind of (maybe per-crtc) recommended queue
>   depth property so compositors know how many buffers to keep in
>   flight. Not sure about that.
>
> It's a bit more work, but also a lot less hacking around infrastructure
> in dubious ways.
>
> Thoughts?
>
> Cheers, Daniel
>
> > Cc: Daniel Vetter <daniel@xxxxxxxx>
> > Cc: Gerd Hoffmann <kraxel@xxxxxxxxxx>
> > Cc: Pekka Paalanen <ppaalanen@xxxxxxxxx>
> > Cc: Simon Ser <contact@xxxxxxxxxxx>
> > Cc: Michel Dänzer <michel@xxxxxxxxxxx>
> > Cc: Tina Zhang <tina.zhang@xxxxxxxxx>
> > Cc: Dongwon Kim <dongwon.kim@xxxxxxxxx>
> >
> > Vivek Kasireddy (4):
> >   drm: Add a capability flag to support deferred out_fence signalling
> >   virtio-gpu uapi: Add VIRTIO_GPU_F_OUT_FENCE feature
> >   drm/virtio: Add VIRTIO_GPU_CMD_RESOURCE_OUT_FENCE cmd
> >   drm/virtio: Probe and implement VIRTIO_GPU_F_OUT_FENCE feature
> >
> >  drivers/gpu/drm/drm_file.c               | 11 +++---
> >  drivers/gpu/drm/drm_ioctl.c              |  3 ++
> >  drivers/gpu/drm/virtio/virtgpu_debugfs.c |  1 +
> >  drivers/gpu/drm/virtio/virtgpu_drv.c     |  1 +
> >  drivers/gpu/drm/virtio/virtgpu_drv.h     |  6 ++++
> >  drivers/gpu/drm/virtio/virtgpu_fence.c   |  9 +++++
> >  drivers/gpu/drm/virtio/virtgpu_kms.c     | 10 ++++--
> >  drivers/gpu/drm/virtio/virtgpu_plane.c   | 44 +++++++++++++++++++++++-
> >  drivers/gpu/drm/virtio/virtgpu_vq.c      | 17 +++++++++
> >  include/drm/drm_mode_config.h            |  9 +++++
> >  include/uapi/drm/drm.h                   |  1 +
> >  include/uapi/linux/virtio_gpu.h          | 12 +++++++
> >  12 files changed, 117 insertions(+), 7 deletions(-)
> >
> > --
> > 2.30.2
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch