Re: [RFC 0/9] nuclear pageflip

Rob Clark <rob.clark@xxxxxxxxxx> · Fri, 14 Sep 2012 12:34:59 -0500

On Fri, Sep 14, 2012 at 12:02 PM, Ville Syrjälä
<ville.syrjala@xxxxxxxxxxxxxxx> wrote:
> On Fri, Sep 14, 2012 at 11:29:04AM -0500, Rob Clark wrote:
>> On Fri, Sep 14, 2012 at 10:48 AM, Ville Syrjälä
>> <ville.syrjala@xxxxxxxxxxxxxxx> wrote:
>> > On Fri, Sep 14, 2012 at 09:45:18AM -0500, Rob Clark wrote:
>> >> On Fri, Sep 14, 2012 at 8:58 AM, Ville Syrjälä
>> >> <ville.syrjala@xxxxxxxxxxxxxxx> wrote:
>> >> > On Fri, Sep 14, 2012 at 08:25:53AM -0500, Rob Clark wrote:
>> >> >> On Fri, Sep 14, 2012 at 7:50 AM, Ville Syrjälä
>> >> >> <ville.syrjala@xxxxxxxxxxxxxxx> wrote:
>> >> >> > On Thu, Sep 13, 2012 at 11:35:59AM -0500, Rob Clark wrote:
>> >> >> >> On Thu, Sep 13, 2012 at 9:29 AM, Ville Syrjälä
>> >> >> >> <ville.syrjala@xxxxxxxxxxxxxxx> wrote:
>> >> >> >> > On Thu, Sep 13, 2012 at 08:39:54AM -0500, Rob Clark wrote:
>
>> > I wonder if you've though about omap's FIFO merge. It can cause similar
>> > issues, that is some operations may need two vblanks to complete. And it
>> > looks like I'll get to worry about this stuff too since there are some
>> > watermark related wait_for_vblank() workarounds in the IVB sprite code,
>> > sigh.
>>
>> yeah, FIFO merge is a nice big headache.. and not really ideal for
>> latency unless you have some advanced warning to disable FIFO merge
>> before userspace wants to switch on an extra overlay.
>>
>> I think the best way to deal is just start switching off FIFO merge
>> when userspace first does test w/ overlay, but return EBUSY.  It means
>> we'll use the gpu for rendering for one frame, but I think that is
>> better than blocking the compositor for a vblank or two.  Thou shalt
>> not block the compositor.
>
> Yeah, I suppose it could be handled through another property as well.
> Perhaps some kind of "LOW_POWER_OPTIMIZATIONS" property that you'd
> disable one vblank in advance. But then it's starting to be a bit
> hardware specific ie. you more or less have to know the circumstances
> when the property must be disabled. On omap it would be when more than
> one plane is used, on IVB it appears that you need it when you want to
> enable scaling.
>
> I'm not too thrilled about the idea that the test phase would actually
> touch the hardware. What happens if you do multiple test steps between
> commits? But I can't immediately think of a good solution that would
> avoid the need for hardware specific knowledge.

yeah, that is the problem..

But then again, I suppose we could make fifo-merge disabled by default
and explicitly controlled by userspace via property.  A generic
userspace, and the property would never be set.  A userspace a bit
more optimized/customized for the hw, however, is in a better position
to know if there is likely to be changes in the next frame (ie. based
on user input, etc) and could make perhaps some more intelligent
decisions about when to enable it.

>> >> And if we do support multiple crtc's w/ pageflip, I'm not sure if
>> >> there is a good way to enforce two-steps.  Having a standardized way
>> >> to tell userspace to try later seems like a good thing.
>> >
>> > Sure, for that it seems reasonable.
>> >
>> >> >> >> >> Also, if you pageflip on multiple CRTC's, should the be multiple
>> >> >> >> >> vblank events, and multiple userdata's?
>> >> >> >> >
>> >> >> >> > That's a bit of an open question. I was considering several options:
>> >> >> >>
>> >> >> >> the thing I like about one ioctl per crtc is that it avoids this whole
>> >> >> >> question..
>> >> >> >>
>> >> >> >> And, I think as long as you have to update multiple different scanout
>> >> >> >> address registers, there is always going to be a race in multi-crtc
>> >> >> >> flipping.  Having a single ioctl does make the race smaller.  I'm not
>> >> >> >> sure how important that point is.
>> >> >> >
>> >> >> > Which race?
>> >> >>
>> >> >> ie. if you set REG_CRTC1_ADDR just immediately before vblank and
>> >> >> REG_CRTC2_ADDR just after
>> >> >
>> >> > Well, with unsynced crtcs I wouldn't call that any kind of meaningful race.
>> >> > The same problem after all exists even with a single crtc. You either make
>> >> > the deadline and write the register before vblank, or you don't make it
>> >> > and end up with a repeated frame.
>> >>
>> >> I meant w/ sync'd crtc's, there is still no 100% guarantee that the
>> >> two flip at the same time.
>> >
>> > Sure there is. That's what the vblank evade stuff gives you. I just
>> > happen to need it even when using just one crtc because the hardware
>> > doesn't have the necessary mechanism to flip several planes atomically.
>>
>> hmm, I guess I don't quite follow then.  But I guess I don't know the
>> intel hw well enough.  It seemed like you weren't atomically updating
>> scanout registers.
>
> I guarantee the atomicity by making sure I'm not too close to the start
> of vblank when I write the registers. It's a very generic solution that
> will work on any hardware with double buffered registers that get
> flipped on vblank. Even if some of the registers would get flipped at
> slightly different times (eg. plane A flips at vbl_start+1, plane B at
> vbl_start+10) you could still use this method by extending the range of
> scanlines to be avoided.

ahh, ok, double-buffered..  well, if they are double buffered you
should be able to tolerate two ioctl() calls, because you have a
relatively large window to update all the registers ;-)

> I suppose the most difficult bit is determining how long you need to
> write all the necessary registers. You want to make it long enough to
> guarantee atomic operation, but short enough to avoid blocking
> needlessly. Currently I just have a hardcoded number there. If the
> driver is going to be deployed on a specific device, it's easy enough
> to measure it and tweak the number, but it would be nice to have the
> driver calibrate itself. Or you just have a sysfs knob so that it
> could tweaked more easily for specific systems by non-developers.
>
>> But anyways, I think it is probably ok to not need the crtc up-front.
>> We can catch issues w/ pending vblank at the atomic_test() stage.
>> Still not sure what to do about userdata.
>
> I think that if the event carries the crtc id, then the user_data could
> be shared across crtcs. User space can ref count stuff if necessary.
> When done this way you can think if the user_data as a cookie that
> identifies the flip event(s) originating from a specific flip request.
>
>> Although I suppose we could
>> make userdata a property attached to crtc and/or plane and that gives
>> userspace plenty of flexibility about how many events it wants back.
>> (Ie. no event if userdata==0.. or maybe separate send-event property.)
>
> Hmm. That's an interesting idea. I'll have to give it some thought.

It does give userspace plenty of control about how it want's it's
events, which is nice.. although I'm not quite sure yet about the
implementation.  The point at which you create the vblank evt, you
want to know the 'struct drm_file *'..  and it seems a bit strange to
have to pass that in to set_property() ;-)

BR,
-R

> --
> Ville Syrjälä
> Intel OTC
> _______________________________________________
> dri-devel mailing list
> dri-devel@xxxxxxxxxxxxxxxxxxxxx
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel