On Wed, Nov 25, 2015 at 08:04:26PM +0100, Mario Kleiner wrote: > On 11/25/2015 06:46 PM, Ville Syrjälä wrote: > > On Wed, Nov 25, 2015 at 06:24:13PM +0100, Mario Kleiner wrote: > >> On 11/23/2015 09:24 PM, Ville Syrjälä wrote: > >>> On Mon, Nov 23, 2015 at 06:58:34PM +0100, Mario Kleiner wrote: > >>>> > >>>> > >>>> On 11/23/2015 04:51 PM, Ville Syrjälä wrote: > >>>>> On Mon, Nov 23, 2015 at 04:23:21PM +0100, Mario Kleiner wrote: > >>>>>> On 11/20/2015 04:34 PM, Ville Syrjälä wrote: > >>>>>>> On Fri, Nov 20, 2015 at 04:24:50PM +0100, Mario Kleiner wrote: > >>>>>> > >>>>>> ... > >>>>>> Ok, but why would that be a bad thing? I think we want it to think it is > >>>>>> in the previous frame if it is called outside the vblank irq context. > >>>>>> The only reason we fudge it to the next frames vblank if i vblank irq is > >>>>>> because we know the vblank irq handler we are executing atm. was meant > >>>>>> to execute within the upcoming vblank for the next frame, so we fudge > >>>>>> the scanout positions and thereby timestamp to correspond to that new > >>>>>> frame. But if something called outside irq context it should get a > >>>>>> scanout position/timestamp that corresponds to "reality". > >>>>> > >>>>> It would be a bad thing since it would cause the timestamp to jump > >>>>> backwards, and that would also cause the frame count guesstimate to go > >>>>> backwards. > >>>>> > >>>> > >>>> But only if we don't use the dev->driver->get_vblank_counter() method, > >>>> which we try to use on AMD. > >>> > >>> Well, if you do it that way then you have the problem of the hw counter > >>> seeming to jump forward by one after crossing the start of vblank (when > >>> compared to the value you sampled when you processed the early vblank > >>> interrupt). > >>> > >> > >> Ok, finally i see the bad scenario that wouldn't get prevented by our > >> current locking with the new vblank counting in the core. The vblank > >> enable path is safe due to locking and discounting of redundant > >> timestamps etc. But the disable path could go wrong: > >> > >> 1. Vblank irq fires, drm_handle_vblank() -> drm_update_vblank_count(), > >> updates timestamps and counts "as if" in vblank -> incremented vblank > >> count and timestamp now set in the future. > >> > >> 2. After vblank irq finishes, but just before leading edge of vblank, > >> vblank_disable_and_save() executes, doesn't get bumped timestamp or > >> count because before vblank and not in vblank irq. Now > >> drm_update_vblank_count() would process a > >> "new" timestamp and count from the past and we'd have time and counts > >> going backwards, and bad things would happen. > >> > >> I haven't observed such a thing happening during testing so far, > >> probably because the time window in which it could happen is tiny, but > >> given how awfully bad it would be, it needs to be prevented. > >> > >> I had a look at the description of the Vblank irq in the "M76 Register > >> Reference Guide" for older asics and the description suggests that the > >> vblank irq fires when the crtc's line buffer is finished reading pixel > >> data from the scanout buffer in memory for a frame, ie., when the line > >> buffer read "enters" vblank. That would explain why the irq happens a > >> few scanlines before actual vblank, because line buffer refills must > >> obviously happen before the crtc can send pixel data from the line > >> buffer to the encoders, so it would lead a bit in time. That also means > >> we can't delay the vblank irq to actually happen at start of vblank and > >> have to deal with the early vblank irq. > >> > >>> I guess one silly idea would be to defer the vblank interrupt processing > >>> to a timer, and just schedule it a bit into the future from the actual > >>> interrupt handler. > >>> > >> > >> Timer would be bad because then we get problems with the pageflip > >> completion irq sometimes being processed before the vblank irq, > > > > You you'd need to move page flip completion to happen from the vblank > > timer too I suppose. > > > >> and > >> because we want to be fast in vblank irq handling, delivering vblank > >> events etc. I wouldn't trust a timer to be reliable enough for such > >> short waits. > > > > hrtimers should be accurate. But maybe more expensive than the timer > > wheel. > > > > Sounds all a bit complex and fraught with new possible complications. I > can't spend much more time on this than i did so far, and we need to get > this into 4.4-rc, so i am aiming for the most simple solution that would > work. > > >> Busy waiting wouldn't be great either in irq. > >> > >> So what about this relatively simple one? > >> > >> 1. In radeon_get_crtc_scanoutpos() we artifically define the > >> vblank_start line to be, e.g, 5 scanlines before the true start of > >> vblank, so for the purpose of vblank counter queries and timestamping > >> our "vblank" would start a bit earlier and the vblank irq would always > >> execute in "vblank". Non-Irq invocations like vblank_disable_and_save() > >> would also be treated to this early vblank start and so what the DRM > >> core observes would always be consistent. > >> > >> 2. At least in radeon-kms we also use radeon_get_crtc_scanoutpos() > >> internally for "dynpm" dynamic power management/reclocking, and to > >> implement pageflip completion detection on asics older than DCE3 which > >> don't have pageflip interrupts. For those cases we need to use the true > >> start of vblank, so for this internal use we pass in some special flag > >> into radeon_get_crtc_scanoutpos() to tell it to not shift the vblank > >> start around. > >> > >> 3. I've added another check to my patch for drm_update_vblank_count() to > >> catch timestamps going backwards and discounting such cases, for a bit > >> of extra robustness against driver trouble. > >> > >> Any ideas why this would be a stupid idea? I'll try this now and just > >> hope meanwhile nobody finds a reason why this would be bad. > > > > What I don't like is leaking any of this into the core code. All the > > hacks should live in the hw specific code. We've managed to keep all > > the i915 hacks hidden that way. > > > > So if you would: > > - fudge your get_scanout_position as you suggested > > - _not_ expose the hw counter to the vblank core since it > > disagrees with the scanout position > > - keep the internal get_scanout_position variant/flag > > purely internal > > > > then I think I might like it. That way working hardware doesn't have to > > be exposed to these hacks, and there's possible a bit less danger that it > > all gets broken again next time the core needs some work. > > > > Ok. Exposing the fudged hw counter shouldn't be a problem though, given > that it would be fudged in a consistent way with the fudged scanout > positions and to have incremented at time of drm_handle_vblank(). We'd > bump the hw counter to the count of the new vblank at the same time when > the scanout positions would start counting backwards from minus > something to 0, showing how many vblank lines to go until start of > scanout, etc. The only difference to reality would be that this > simulated vblank would start 5 scanlines earlier than the true hw > vblank, but i can't see how the core would care about that. > > > > One problem with that approach could be that the vblank event and page > > flip event would be delievered at different times if you keep using the > > page flip interrupt, but I'm not sure that would be a problem. At least > > they should both have the same timestamp and counter value. > > > > That's the same now, and the timestamps and counts be the same. I'll > check with my measurement equipment that the timestamps will be still > accurate wrt. to reality. > > Attached is my current patch i wanted to submit for the drm core's > drm_update_vblank_count(). I think it's good to make the core somewhat > robust against potential kms driver bugs or glitches. But if you > wouldn't like that patch, there wouldn't be much of a point sending it > out at all. > > thanks, > -mario > > >From 2d5d58a1c575ad002ce2cb643f395d0e4757d959 Mon Sep 17 00:00:00 2001 > From: Mario Kleiner <mario.kleiner.de@xxxxxxxxx> > Date: Wed, 25 Nov 2015 18:48:31 +0100 > Subject: [PATCH] drm/irq: Make drm_update_vblank_count() more robust. > > The changes to drm_update_vblank_count() for Linux 4.4-rc > made the function more fragile wrt. some hw quirks. E.g., > at dev->driver->enable_vblank(), AMD gpu's fire a spurious > redundant vblank irq shortly after enabling vblank irqs, not > locked to vblank. This causes a redundant call which needs > to be suppressed to avoid miscounting. > > To increase robustness, shuffle things around a bit: > > On drivers with high precision vblank timestamping always > evaluate the timestamp difference between current timestamp > and previously recorded timestamp to detect such redundant > invocations and no-op in that case. > > Also detect and warn about timestamps going backwards to > catch potential kms driver bugs. > > This patch is meant for Linux 4.4-rc and later. > > Signed-off-by: Mario Kleiner <mario.kleiner.de@xxxxxxxxx> > --- > drivers/gpu/drm/drm_irq.c | 53 ++++++++++++++++++++++++++++++++++++----------- > 1 file changed, 41 insertions(+), 12 deletions(-) > > diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c > index 819b8c1..8728c3c 100644 > --- a/drivers/gpu/drm/drm_irq.c > +++ b/drivers/gpu/drm/drm_irq.c > @@ -172,9 +172,11 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe, > unsigned long flags) > { > struct drm_vblank_crtc *vblank = &dev->vblank[pipe]; > - u32 cur_vblank, diff; > + u32 cur_vblank, diff = 0; > bool rc; > struct timeval t_vblank; > + const struct timeval *t_old; > + u64 diff_ns; > int count = DRM_TIMESTAMP_MAXRETRIES; > int framedur_ns = vblank->framedur_ns; > > @@ -195,13 +197,15 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe, > rc = drm_get_last_vbltimestamp(dev, pipe, &t_vblank, flags); > } while (cur_vblank != dev->driver->get_vblank_counter(dev, pipe) && --count > 0); > > - if (dev->max_vblank_count != 0) { > - /* trust the hw counter when it's around */ > - diff = (cur_vblank - vblank->last) & dev->max_vblank_count; > - } else if (rc && framedur_ns) { > - const struct timeval *t_old; > - u64 diff_ns; > - > + /* > + * Always use vblank timestamping based method if supported to reject > + * redundant vblank irqs. E.g., AMD hardware needs this to not screw up > + * due to some irq handling quirk. > + * > + * This also sets the diff value for use as fallback below in case the > + * hw does not support a suitable hw vblank counter. > + */ > + if (rc && framedur_ns) { If you fudged everything properly why do you still need this? With working hw counter there should be no need to do this stuff. > t_old = &vblanktimestamp(dev, pipe, vblank->count); > diff_ns = timeval_to_ns(&t_vblank) - timeval_to_ns(t_old); > > @@ -212,11 +216,36 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe, > */ > diff = DIV_ROUND_CLOSEST_ULL(diff_ns, framedur_ns); > > - if (diff == 0 && flags & DRM_CALLED_FROM_VBLIRQ) > + /* Catch driver timestamping bugs and prevent worse. */ > + if ((s32) diff < 0) { > + DRM_DEBUG_VBL("crtc %u: Timestamp going backward!" > + " diff_ns = %lld, framedur_ns = %d)\n", > + pipe, (long long) diff_ns, framedur_ns); > + return; > + } > + > + if (diff == 0 && flags & DRM_CALLED_FROM_VBLIRQ) { > DRM_DEBUG_VBL("crtc %u: Redundant vblirq ignored." > - " diff_ns = %lld, framedur_ns = %d)\n", > - pipe, (long long) diff_ns, framedur_ns); > - } else { > + " diff_ns = %lld, framedur_ns = %d)\n", > + pipe, (long long) diff_ns, framedur_ns); > + > + /* Treat this redundant invocation as no-op. */ > + WARN_ON_ONCE(cur_vblank != vblank->last); > + return; > + } > + } > + > + /* > + * hw counters, if marked as supported via max_vblank_count != 0, > + * *must* increment at leading edge of vblank or at least before > + * the firing of the hw vblank irq, otherwise we have a race here > + * between gpu and us, where small variations in our execution > + * timing can cause off-by-one miscounting of vblanks. > + */ > + if (dev->max_vblank_count != 0) { > + /* trust the hw counter when it's around */ > + diff = (cur_vblank - vblank->last) & dev->max_vblank_count; > + } else if (!(rc && framedur_ns)) { > /* some kind of default for drivers w/o accurate vbl timestamping */ > diff = (flags & DRM_CALLED_FROM_VBLIRQ) != 0; > } > -- > 1.9.1 > -- Ville Syrjälä Intel OTC _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/dri-devel