On Thu, 4 Apr 2013 21:31:03 +0100 Chris Wilson <chris at chris-wilson.co.uk> wrote: > In order to fully serialize access to the fenced region and the update > to the fence register we need to take extreme measures on SNB+, and > manually flush writes to memory prior to writing the fence register in > conjunction with the memory barriers placed around the register write. > > Fixes i-g-t/gem_fence_thrash > > v2: Bring a bigger gun > v3: Switch the bigger gun for heavier bullets (Arjan van de Ven) > v4: Remove changes for working generations. > v5: Reduce to a per-cpu wbinvd() call prior to updating the fences. > v6: Rewrite comments to ellide forgotten history. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62191 > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk> > Cc: Jon Bloomfield <jon.bloomfield at intel.com> > Tested-by: Jon Bloomfield <jon.bloomfield at intel.com> (v2) > Cc: stable at vger.kernel.org > --- > drivers/gpu/drm/i915/i915_gem.c | 28 +++++++++++++++++++++++----- > 1 file changed, 23 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index fa4ea1a..8f7739e 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -2689,17 +2689,35 @@ static inline int fence_number(struct drm_i915_private *dev_priv, > return fence - dev_priv->fence_regs; > } > > +static void i915_gem_write_fence__ipi(void *data) > +{ > + wbinvd(); > +} > + > static void i915_gem_object_update_fence(struct drm_i915_gem_object *obj, > struct drm_i915_fence_reg *fence, > bool enable) > { > - struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > - int reg = fence_number(dev_priv, fence); > - > - i915_gem_write_fence(obj->base.dev, reg, enable ? obj : NULL); > + struct drm_device *dev = obj->base.dev; > + struct drm_i915_private *dev_priv = dev->dev_private; > + int fence_reg = fence_number(dev_priv, fence); > + > + /* In order to fully serialize access to the fenced region and > + * the update to the fence register we need to take extreme > + * measures on SNB+. In theory, the write to the fence register > + * flushes all memory transactions before, and coupled with the > + * mb() placed around the register write we serialise all memory > + * operations with respect to the changes in the tiler. Yet, on > + * SNB+ we need to take a step further and emit an explicit wbinvd() > + * on each processor in order to manually flush all memory > + * transactions before updating the fence register. > + */ > + if (HAS_LLC(obj->base.dev)) > + on_each_cpu(i915_gem_write_fence__ipi, NULL, 1); > + i915_gem_write_fence(dev, fence_reg, enable ? obj : NULL); > > if (enable) { > - obj->fence_reg = reg; > + obj->fence_reg = fence_reg; > fence->obj = obj; > list_move_tail(&fence->lru_list, &dev_priv->mm.fence_list); > } else { I almost wish x86 just gave up and went fully weakly ordered. At least then we'd know we need barriers everywhere, rather than just in mystical places. Reviewed-by: Jesse Barnes <jbarnes at virtuousgeek.org> -- Jesse Barnes, Intel Open Source Technology Center