We need to treat the GPU core as a distinct processor and so apply the
same SMP memory barriers that we would use between CPUs. In addition to
flushing the chipset cache, which is a no-op on LLC platforms, apply a
write barrier beforehand. Likewise, when we invalidate the CPU cache,
make sure the memory is coherent (again, this is a no-op on LLC
platforms).

Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
---
 drivers/char/agp/intel-gtt.c    | 1 +
 drivers/gpu/drm/i915/i915_gem.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/char/agp/intel-gtt.c b/drivers/char/agp/intel-gtt.c
index 8b0f6d19..1223128 100644
--- a/drivers/char/agp/intel-gtt.c
+++ b/drivers/char/agp/intel-gtt.c
@@ -1706,6 +1706,7 @@ EXPORT_SYMBOL(intel_gtt_get);
 
 void intel_gtt_chipset_flush(void)
 {
+	wmb();
 	if (intel_private.driver->chipset_flush)
 		intel_private.driver->chipset_flush();
 }
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ed8d21a..b1ebb88 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3528,6 +3528,7 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write)
 	/* Flush the CPU cache if it's still invalid. */
 	if ((obj->base.read_domains & I915_GEM_DOMAIN_CPU) == 0) {
 		i915_gem_clflush_object(obj);
+		mb(); /* in case the clflush above is optimised away */
 
 		obj->base.read_domains |= I915_GEM_DOMAIN_CPU;
 	}
-- 
1.7.10.4