On Tue, Mar 11, 2014 at 07:37:34PM +0200, ville.syrjala@xxxxxxxxxxxxxxx wrote: > From: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> > > Starting from ILK, mmio flips also cause a flip done interrupt to be > signalled. This means if we first do a set_base and follow it > immediately with the CS flip, we might mistake the flip done interrupt > caused by the set_base as the flip done interrupt caused by the CS > flip. > > The hardware has a flip counter which increments every time a mmio or > CS flip is issued. It basically counts the number of DSPSURF register > writes. This means we can sample the counter before we put the CS > flip into the ring, and then when we get a flip done interrupt we can > check whether the CS flip has actually performed the surface address > update, or if the interrupt was caused by a previous but yet > unfinished mmio flip. > > Even with the flip counter we still have a race condition of the CS flip > base address update happens after the mmio flip done interrupt was > raised but not yet processed by the driver. When the interrupt is > eventually processed, the flip counter will already indicate that the > CS flip has been executed, but it would not actually complete until the > next start of vblank. We can use the DSPSURFLIVE register to check > whether the hardware is actually scanning out of the buffer we expect, > or if we managed hit this race window. > > This covers all the cases where the CS flip actually changes the base > address. If the base address remains unchanged, we might still complete > the CS flip before it has actually completed. But since the address > didn't change anyway, the premature flip completion can't result in > userspace overwriting data that's still being scanned out. > > CTG already has the flip counter and DSPSURFLIVE registers, and > although the flip done interrupt is still limited to CS flips alone, > the code now also checks the flip counter on CTG as well. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73027 > Signed-off-by: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> > --- > drivers/gpu/drm/i915/i915_reg.h | 1 + > drivers/gpu/drm/i915/intel_display.c | 72 ++++++++++++++++++++++++++++++++---- > drivers/gpu/drm/i915/intel_drv.h | 2 + > 3 files changed, 68 insertions(+), 7 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h > index 146609a..0d82924 100644 > --- a/drivers/gpu/drm/i915/i915_reg.h > +++ b/drivers/gpu/drm/i915/i915_reg.h > @@ -3533,6 +3533,7 @@ enum punit_power_well { > #define _PIPEA_FRMCOUNT_GM45 (dev_priv->info.display_mmio_offset + 0x70040) > #define _PIPEA_FLIPCOUNT_GM45 (dev_priv->info.display_mmio_offset + 0x70044) > #define PIPE_FRMCOUNT_GM45(pipe) _PIPE(pipe, _PIPEA_FRMCOUNT_GM45, _PIPEB_FRMCOUNT_GM45) > +#define PIPE_FLIPCOUNT_GM45(pipe) _PIPE(pipe, _PIPEA_FLIPCOUNT_GM45, _PIPEB_FLIPCOUNT_GM45) > > /* Cursor A & B regs */ > #define _CURACNTR (dev_priv->info.display_mmio_offset + 0x70080) > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c > index 37bc041..aabc90c 100644 > --- a/drivers/gpu/drm/i915/intel_display.c > +++ b/drivers/gpu/drm/i915/intel_display.c > @@ -8610,6 +8610,47 @@ void intel_finish_page_flip_plane(struct drm_device *dev, int plane) > do_intel_finish_page_flip(dev, crtc); > } > > +/* Is 'a' after or equal to 'b'? */ > +static bool flip_count_after_eq(u32 a, u32 b) > +{ > + return !((a - b) & 0x80000000); > +} Couldn't we just use the idiom of casting the result to an int, please? > +static bool page_flip_finished(struct intel_crtc *crtc) > +{ > + struct drm_device *dev = crtc->base.dev; > + struct drm_i915_private *dev_priv = dev->dev_private; > + > + /* > + * The relevant registers doen't exist on pre-ctg. > + * As the flip done interrupt doesn't trigger for mmio > + * flips on gmch platforms, a flip count check isn't > + * really needed there. But since ctg has the registers, > + * include it in the check anyway. > + */ > + if (INTEL_INFO(dev)->gen < 5 && !IS_G4X(dev)) > + return true; > + > + /* > + * A DSPSURFLIVE check isn't enough in case the mmio and CS flips > + * used the same base address. In that case the mmio flip might > + * have completed, but the CS hasn't even executed the flip yet. > + * > + * A flip count check isn't enough as the CS might have updated > + * the base address just after start of vblank, but before we > + * managed to process the interrupt. This means we'd complete the > + * CS flip too soon. > + * > + * Combining both checks should get us a good enough result. It may > + * still happen that the CS flip has been executed, but has not > + * yet actually completed. But in case the base address is the same > + * anyway, we don't really care. > + */ > + return (I915_READ(DSPSURFLIVE(crtc->plane)) & ~0xfff) == crtc->unpin_work->dspsurf && > + flip_count_after_eq(I915_READ(PIPE_FLIPCOUNT_GM45(crtc->pipe)), > + crtc->unpin_work->flip_count); Instead of dspsurf, I'd prefer if this was just a more generic gtt_offset so that it isn't peculiar to gen4+, and can be used for the gen3 stall checks. Other than those two nitpicks, Reviewed-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> A follow-on request would be to update the pageflip stall logic, but I can rebase my patches on to this. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx