Re: [PATCH 4/5] drm/amdgpu: Wait for end of last waited-for vblank before programming flip

Daniel Vetter <daniel@xxxxxxxx> · Tue, 14 Jun 2016 10:06:04 +0200

On Tue, Jun 14, 2016 at 04:25:28PM +0900, Michel Dänzer wrote:
> On 14.06.2016 14:53, Daniel Vetter wrote:
> > On Tue, Jun 14, 2016 at 11:09:10AM +0900, Michel Dänzer wrote:
> >> On 06/13/16 23:06, Daniel Vetter wrote:
> >>> On Mon, Jun 13, 2016 at 05:58:29PM +0900, Michel Dänzer wrote:
> >>>> On 06/13/16 17:06, Daniel Vetter wrote:
> >>>>> On Mon, Jun 13, 2016 at 10:54:37AM +0900, Michel Dänzer wrote:
> >>>>>> On 10.06.2016 23:43, Daniel Vetter wrote:
> >>>>>>> On Fri, Jun 10, 2016 at 05:57:12PM +0900, Michel Dänzer wrote:
> >>>>>>>> From: Michel Dänzer <michel.daenzer@xxxxxxx>
> >>>>>>>>
> >>>>>>>> If userspace wants a page flip to take effect during vblank sequence n,
> >>>>>>>> it has to wait for vblank seqno n-1 before calling the
> >>>>>>>> DRM_IOCTL_MODE_PAGE_FLIP ioctl.
> >>>>>>>>
> >>>>>>>> This change makes sure that we do not program the flip to the hardware
> >>>>>>>> before the end of vblank seqno n-1 in this case, to prevent the flip
> >>>>>>>> from taking effect too early.
> >>>>>>>>
> >>>>>>>> On the other hand, if the DRM_IOCTL_MODE_PAGE_FLIP ioctl is called
> >>>>>>>> during vblank, but userspace didn't wait for the current vblank seqno
> >>>>>>>> before, this change would still allow the flip to be programmed during
> >>>>>>>> the current vblank seqno.
> >>>>>>>
> >>>>>>> This just sounds like you're sending vblank events out a bit too early.
> >>>>>>> And watching vblank waits that userspace does works, but it's fragile,
> >>>>>>> add-hoc and I don't really jump in joy about adding that to the vblank
> >>>>>>> core. Is there no way you can adjust sending out the vblank events
> >>>>>>> similarly, to make sure userspace can never sneak in a pageflip too early?
> >>>>>>
> >>>>>> What you call "too early" is actually "during the vertical blank period
> >>>>>> waited for". IMHO only notifying userspace of a vertical blank period
> >>>>>> when it's already over would defeat the purpose.
> >>>>>
> >>>>> Afaiui the rules are:
> >>>>> - The timestamp for vblank event needs to agree with whatever oml_sync
> >>>>>   requries.
> >>>>> - The event delivery itself needs to be consistent with what page_flip
> >>>>>   takes, i.e. if userspace sees an event and immediately issues a
> >>>>>   page_flip then it should not be able to hit the same vblank with that
> >>>>>   pageflip.
> >>>>> - The event needs to be after the old buffers are not longer used and can
> >>>>>   be reused for rendering.
> >>>>
> >>>> That's only relevant for DRM_IOCTL_MODE_PAGE_FLIP, not
> >>>> DRM_IOCTL_WAIT_VBLANK.
> >>>
> >>> Yup, mixed that up.
> >>>
> >>>>> - There's no requirement at all that the event gets delivered at a
> >>>>>   specific point in the vblank, hardware is too different for that to work
> >>>>
> >>>> As the name implies, the purpose of DRM_IOCTL_WAIT_VBLANK is to wait for
> >>>> a vertical blank period. If that doesn't work as intended with some
> >>>> hardware, that's tough luck but not really my problem. :)
> >>>>
> >>>>>   - that kind of precision is why we have a separate timestamp.
> >>>>
> >>>> I'm afraid this last item gives away that you're relatively new to this
> >>>> code. ;) The timestamp was originally literally just the current
> >>>> gettimeofday when the wait finished (the original DRM_IOCTL_WAIT_VBLANK
> >>>> ioctl didn't have any asynchronous notification functionality). It was
> >>>> relatively recently that Mario changed the timestamp to correspond to
> >>>> the end of the vertical blank period / start of scanout of the next
> >>>> frame, presumably due to your first rule above.
> >>>
> >>> Most hw just seems to give you a vblank interrupt somewhere in the vblank
> >>> are, or sometimes even slightly before that.
> >>
> >> Our hardware tends to trigger the vblank interrupt early, but it's still
> >> useful in that drawing operations submitted after it cannot affect the
> >> previously scanned out frame.
> >>
> >>> Also there's scheduling jitter.
> >>
> >> Sure, but there's a big difference between "no guarantee that we're
> >> still in vblank" vs "guarantee that we're no longer in vblank".
> >>
> >>
> >>>>> I assume you're goal is to not delay page_flips unecessarily, without
> >>>>> breaking requirement 2 here. Imo a simpler fix would be to delay the
> >>>>> vblank handling to end of vblank. Fixes everything without hacks, [...]
> >>>>
> >>>> Except it breaks the original purpose of the wait for vblank
> >>>> functionality, which is to wait for the beginning of a vertical blank
> >>>> period. [0] You're focusing too much on page flips and suggesting to
> >>>> throw out the vblank baby with the bathwater. I really don't see the big
> >>>> issue which would justify that.
> >>>>
> >>>>
> >>>> [0] As an analogy, how useful would e.g. calendar notifications be if
> >>>> they arrived at the end of the events they're about? "Hey, that meeting
> >>>> you were supposed to attend? It just finished!"
> >>>
> >>> Ok, what exactly is the use-case for waiting for vblanks _without_
> >>> scheduling a flip afterwards? At least in drm the rule is that ABI is what
> >>> userspace observes and actually cares about.
> >>
> >> E.g.: In cases where page flipping cannot be used, Xorg / the DDX driver
> >> waits for the target vertical blank period before emitting the drawing
> >> commands for a buffer swap operation. If the vblank notification only
> >> arrives when the vertical blank period is already over, this is very
> >> likely to result in tearing.
> >>
> >> Some X compositors and AFAIK even applications such as media players can
> >> use DRM_IOCTL_WAIT_VBLANK similarly. Obviously it's not intended to be
> >> used directly like that, but nonetheless it is.
> > 
> > Is there really anything using it like that outside of -ati?
> 
> Yes. With DRI3/Present, it's driver independent code in
> xserver/present/. With DRI2, it's theoretically up to the DDX driver,
> but all drivers seem to have basically the same logic; the modesetting
> driver certainly does.

Hm, didn't know that everyone does that. Seemed to silly an idea to waste
all that gpu bandwidth by waiting for vblank ...

> > I didn't know that we pass vblank waits to X clients. Either way annoying,
> > since it means you need to keep things working like this for amd drivers
> > forever.
> 
> Please don't try to single out our drivers, it seems like rather your
> driver is the odd man out in this case.

Hm, why/where is i915 special? It sounds like it's very much not, it's
just that normal page_flips (i.e. neither async nor variable refresh rate)
get committed atomically by the hw at the exact same time the vblank irq
fires. And if you look at the metric pile of atomic drivers we now have,
they at least seem to all work like that. Afaik amd hw seems to be the
exception with being able to make a flip still happen after the vblank
fired. I didn't check nouveau, maybe that one is similar.

> > Afaik others don't use it like that, at least not on intel.
> 
> UXA has the same DRI2 logic, not sure about SNA.
> 
> 
> > Anyway, I still don't like adding hacks to drm core like this for
> > single-use in just one driver.
> 
> The driver changes will be ported to radeon of course, and why couldn't
> it be used by other drivers for the same purpose?

Because other drivers work already, see above. At least all the atomic
drivers will work like that, or at least seem to be written to work like
that. So afaics it's amd drivers only.

> > drm_irq.c is already really complex and suffering badly from this, and
> > we're pretty close to always accidentally breaking something when touching it.
> 
> I just don't see how my change makes this any worse.

It's death by a thousand cuts. Of course none of the individual ones are
bad alone.

> > How bad would it really be to just always delay the page_flip past vblank?
> 
> In the variable refresh rate case it could be pretty bad, basically
> dropping to the minimum refresh rate.

Variable refresh rate is entirely undefined right now in kms. We probably
need an entirely new set of flags to make that work smoothly. And I think
even for variable rate refresh you want the current vblank stuff to tick
as regularly as possible. And then your page_flip (without special flags
would only need to make sure it's not faster than that). Variable refresh
rate page_flips probably need a special flag, maybe even the same
dont-tear-but-flip-asap flag.

> > If that's not good enough I'd say we should add a
> > faster-than-vblank-but-still-synced page_flip flag. Then userspace could
> > tell you exactly whether you should always wait (no flags), or never wait
> > (with this new flag).
> 
> That would be an inferior solution compared to my series, e.g.: If
> userspace calls DRM_IOCTL_MODE_PAGE_FLIP before the target vblank seqno
> is reached, it cannot use the new flag, otherwise the flip might take
> effect too early. However, if we are then already in the target vblank
> period when the fences have signalled and we are ready to program the
> flip, we have to wait for the end of vblank first, and the flip will be
> delayed by one frame.
> 
> If we're going to change the userspace interface, it would be better to
> re-purpose the reserved field of struct drm_mode_crtc_page_flip for
> explicitly specifying the target vblank seqno (via a new cap and flag).
> Then the kernel and userspace would no longer need to second-guess each
> other.
> 
> But even if we take that route, this series would be desirable for
> getting us most of the way there for existing userspace.

Well that's what I mean. You'r patches here shoehorn what you want into
existing api, trying to second-guess userspaces intentions inferred from
what it does. Ime that tends to end in trouble. It would be much better to
have a clear uabi for this. And my flag was just a suggestion.

I just had a discussion on irc with Dave about another topic were Dave
complained about some of the inferred abi rules SNA uses (entirely
differently, in probe code). And I thought it was a nice idea, since hey
it works and its easier. But really it's not, since in the end kms is
supposed to be somewhat generic. And usually your nice idea for inferring
behaviour then tends to break down somewhere. At least I think the generic
kms rules are:

- vblank events fire at lockstep (not everyone gets this right, but you
  end up with either lagging desktop or 100% cpu usuage on weston if you
  don't). Which also means for variable refresh rate you need to keep this
  vblank running at full refresh rate (which is annoying, but just means
  we need a flag for vblank waits).
- pageflip immediately after a vblank wait needs to hit the next vblank.
- pageflip in a loop needs to result in at most 1 flip per vblank, and if
  you're too fast then the kernel should return -EBUSY. Latest atomic
  heleprs even enforce this, to standardized the uabi more.

Imo everything else (in this case: make the flip complete on the same
frame as the vblank, if you hit the vblank window) needs special flags,
with clear meaning of what they do. The specific flip target sounds like a
good idea, except that current userspace can't be fixed, so we need to
make it work without any flag. And the new flag would be for flip-asap, or
flip-variable-refresh or something like that.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel