On Thu, Nov 19, 2015 at 06:46:28PM +0100, Mario Kleiner wrote: > Hi Alex and Michel and Ville, > > it's "fix vblank stuff" time again ;-) > > Ville's changes to the DRM's drm_handle_vblank() / > drm_update_vblank_count() code in Linux 4.4 not only made that code more > elegant, but also removed the robustness against the vblank irq quirks > in AMD hw and similar hardware. So now i get tons of off-by-one errors and > > "[ 432.345] (WW) RADEON(1): radeon_dri2_flip_event_handler: Pageflip > completion event has impossible msc 24803 < target_msc 24804" XOrg > messages from that kernel. Argh. Sorry about that. > > One of the reasons for trouble is that AMD hw quirk where the hw fires > an extra vblank irq shortly after vblank irq's get enabled, not > synchronized to vblank, but typically in the middle of active scanout, > so we get a redundant call to drm_handle_vblank in the middle of scanout. I think that should be fine as such. The code should ignore redudntant vbl irqs. Well, assuming you have a reliable hw counter or you use the timestamp guesstimate mechanism and your scanout position is reported accurately. But I guess you have a bit of problem with both. > > To fix that i have a minor patch to make drm_update_vblank_count() again > robust against such redundant calls, which i will send out later to the > mailing list. Diff attached for reference. > > The second quirk of AMD hw is that the vblank interrupt fires a few > scanlines before start of vblank, so drm_handle_vblank -> > drm_update_vblank_count() -> dev->driver->get_vblank_counter() gets > called before the start of the vblank for which the new vblank count > should be queried. Does it fire too soon, or is the scanout position register value(s) just offset by a few lines perhaps? We have that with i915 and I simply fix up the value when reading it out. Fortunately for us the offset is constant (or at least seems to be) for a given platform/connector combo. > > The third problem is that the DRM vblank handling always had the > assumption that hardware vblank counters would essentially increment at > leading edge of vblank - basically in sync with the firing of the vblank > irq, so that a hw counter readout from within the vblank irq handler > would always deliver the new incremented value. If this assumption is > violated then the counting by use of the hw counter gets unreliable, > because depending on random small delays in irq handling the code may > end up sampling the hw counter pre- or post-increment, leading to > inconsistent updating and funky bugs. It just so happens that AMD > hardware doesn't increment the hw counter at leading edge of vblank, so > stuff falls apart. > > So to fix those two problems i'm tinkering with cooking the hw vblank > counter value returned by radeon_get_vblank_counter_kms() to make it > appear as if the counter incremented at leading edge of vblank in sync > with vblank irq. Yeah, I do that in i915 too. Well, on some platforms where the counter increments at the leading edge of active. > > It almost sort of works on the rs600 code path, but i need a bit of info > from you: > > 1. There's this register from the old specs for m76.pdf, which is not > part of the current register defines for radeon-kms: > > "D1CRTC_STATUS_VF_COUNT - RW - 32 bits - [GpuF0MMReg:0x60A8]" > > It contains the lower 16 bits of framecounter and the 13 bits of > vertical scanout position. It seems to give the same readings as the 24 > bit R_0060A4_D1CRTC_STATUS_FRAME_COUNT we use for the hw counter. This > would come handy. > > Does Evergreen and later have a same/similar register and where is it? > > 2. The hw framecounter seems to increment when the vertical scanout > position wraps back from (VTOTAL - 1) to 0, at least on the one DCE-3 > gpu i tested so far. Is this so on all asics? And is the hw counter > increment happening exactly at the moment that vertical scanout position > jumps back to zero, ie. both events are driven by the same signal? Or is > the framecounter increment just happening somewhere inside either > scanline VTOTAL-1 or scanline 0? > > > If we can fix this and get it into rc2 or rc3 then we could avoid a bad > regression and with a bit of luck at the same time improve by being able > to set dev->vblank_disable_immediate = true then and allow vblank irqs > to get turned off more aggressively for a bit of extra power saving. > > thanks, > -mario > -- a/drivers/gpu/drm/drm_irq.c > +++ b/drivers/gpu/drm/drm_irq.c > @@ -172,9 +172,11 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe, > unsigned long flags) > { > struct drm_vblank_crtc *vblank = &dev->vblank[pipe]; > - u32 cur_vblank, diff; > + u32 cur_vblank, diff = 0; > bool rc; > struct timeval t_vblank; > + const struct timeval *t_old; > + u64 diff_ns; > int count = DRM_TIMESTAMP_MAXRETRIES; > int framedur_ns = vblank->framedur_ns; > > @@ -195,13 +197,15 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe, > rc = drm_get_last_vbltimestamp(dev, pipe, &t_vblank, flags); > } while (cur_vblank != dev->driver->get_vblank_counter(dev, pipe) && --count > 0); > > - if (dev->max_vblank_count != 0) { > - /* trust the hw counter when it's around */ > - diff = (cur_vblank - vblank->last) & dev->max_vblank_count; > - } else if (rc && framedur_ns) { > - const struct timeval *t_old; > - u64 diff_ns; > - > + /* > + * Always use vblank timestamping based method if supported to reject > + * redundant vblank irqs. E.g., AMD hardware needs this to not screw up > + * due to some irq handling quirk. Hmm. I'm thinking it would be better to simply not claim that there's a hw counter if it isn't reliable. > + * > + * This also sets the diff value for use as fallback below in case the > + * hw does not support a suitable hw vblank counter. > + */ > + if (rc && framedur_ns) { > t_old = &vblanktimestamp(dev, pipe, vblank->count); > diff_ns = timeval_to_ns(&t_vblank) - timeval_to_ns(t_old); > > @@ -212,11 +216,28 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe, > */ > diff = DIV_ROUND_CLOSEST_ULL(diff_ns, framedur_ns); > > - if (diff == 0 && flags & DRM_CALLED_FROM_VBLIRQ) > - DRM_DEBUG_VBL("crtc %u: Redundant vblirq ignored." > - " diff_ns = %lld, framedur_ns = %d)\n", > - pipe, (long long) diff_ns, framedur_ns); > - } else { > + if (diff == 0 && flags & DRM_CALLED_FROM_VBLIRQ) { > + DRM_DEBUG_VBL("crtc %u: Redundant vblirq ignored." > + " diff_ns = %lld, framedur_ns = %d)\n", > + pipe, (long long) diff_ns, framedur_ns); > + > + /* Treat this redundant invocation as no-op. */ > + WARN_ON_ONCE(cur_vblank != vblank->last); > + return; > + } > + } > + > + /* > + * hw counters, if marked as supported via max_vblank_count != 0, > + * *must* increment at leading edge of vblank and in sync with > + * the firing of the hw vblank irq, otherwise we have a race here > + * between gpu and us, where small variations in our execution > + * timing can cause off-by-one miscounting of vblanks. > + */ > + if (dev->max_vblank_count != 0) { > + /* trust the hw counter when it's around */ > + diff = (cur_vblank - vblank->last) & dev->max_vblank_count; > + } else if (!(rc && framedur_ns)) { > /* some kind of default for drivers w/o accurate vbl timestamping */ > diff = (flags & DRM_CALLED_FROM_VBLIRQ) != 0; > } -- Ville Syrjälä Intel OTC _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/dri-devel