On 04/08/2013 11:54 AM, Daniel Vetter wrote: > On Fri, Apr 05, 2013 at 09:03:27AM -0700, Jesse Barnes wrote: >> On Thu, 4 Apr 2013 21:31:03 +0100 >> Chris Wilson <chris at chris-wilson.co.uk> wrote: >> >>> In order to fully serialize access to the fenced region and the update >>> to the fence register we need to take extreme measures on SNB+, and >>> manually flush writes to memory prior to writing the fence register in >>> conjunction with the memory barriers placed around the register write. >>> >>> Fixes i-g-t/gem_fence_thrash >>> >>> v2: Bring a bigger gun >>> v3: Switch the bigger gun for heavier bullets (Arjan van de Ven) >>> v4: Remove changes for working generations. >>> v5: Reduce to a per-cpu wbinvd() call prior to updating the fences. >>> v6: Rewrite comments to ellide forgotten history. >>> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62191 >>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk> >>> Cc: Jon Bloomfield <jon.bloomfield at intel.com> >>> Tested-by: Jon Bloomfield <jon.bloomfield at intel.com> (v2) >>> Cc: stable at vger.kernel.org >>> --- >>> drivers/gpu/drm/i915/i915_gem.c | 28 +++++++++++++++++++++++----- >>> 1 file changed, 23 insertions(+), 5 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c >>> index fa4ea1a..8f7739e 100644 >>> --- a/drivers/gpu/drm/i915/i915_gem.c >>> +++ b/drivers/gpu/drm/i915/i915_gem.c >>> @@ -2689,17 +2689,35 @@ static inline int fence_number(struct drm_i915_private *dev_priv, >>> return fence - dev_priv->fence_regs; >>> } >>> >>> +static void i915_gem_write_fence__ipi(void *data) >>> +{ >>> + wbinvd(); >>> +} >>> + >>> static void i915_gem_object_update_fence(struct drm_i915_gem_object *obj, >>> struct drm_i915_fence_reg *fence, >>> bool enable) >>> { >>> - struct drm_i915_private *dev_priv = obj->base.dev->dev_private; >>> - int reg = fence_number(dev_priv, fence); >>> - >>> - i915_gem_write_fence(obj->base.dev, reg, enable ? obj : NULL); >>> + struct drm_device *dev = obj->base.dev; >>> + struct drm_i915_private *dev_priv = dev->dev_private; >>> + int fence_reg = fence_number(dev_priv, fence); >>> + >>> + /* In order to fully serialize access to the fenced region and >>> + * the update to the fence register we need to take extreme >>> + * measures on SNB+. In theory, the write to the fence register >>> + * flushes all memory transactions before, and coupled with the >>> + * mb() placed around the register write we serialise all memory >>> + * operations with respect to the changes in the tiler. Yet, on >>> + * SNB+ we need to take a step further and emit an explicit wbinvd() >>> + * on each processor in order to manually flush all memory >>> + * transactions before updating the fence register. >>> + */ >>> + if (HAS_LLC(obj->base.dev)) >>> + on_each_cpu(i915_gem_write_fence__ipi, NULL, 1); >>> + i915_gem_write_fence(dev, fence_reg, enable ? obj : NULL); >>> >>> if (enable) { >>> - obj->fence_reg = reg; >>> + obj->fence_reg = fence_reg; >>> fence->obj = obj; >>> list_move_tail(&fence->lru_list, &dev_priv->mm.fence_list); >>> } else { >> >> I almost wish x86 just gave up and went fully weakly ordered. At least >> then we'd know we need barriers everywhere, rather than just in >> mystical places. >> >> Reviewed-by: Jesse Barnes <jbarnes at virtuousgeek.org> > > Queued for -next, thanks for the patch. I am a bit unhappy that the > testcase doesn't work too well and can't reliably reproduce this on all > affected machines. But I've tried a bit to improve things and it's very > fickle. I guess we just have to accept this :( Under real-time conditions when we expect deterministic response to external and internal events at any time within a couple of microseconds, invalidating and flushing the entire cache in a running system is unacceptable. This is introducing latencies of several milliseconds which was clearly shown in RT regression tests on a kernel with this patch applied (https://www.osadl.org/?id=1543#c7602). We therefore have to revert it in the RT patch queue - kind of a workaround of a workaround. Would simply be wonderful, if we could get rid of the hateful wbinvd(). -Carsten.