On Tue, 9 Oct 2012 19:24:40 +0100
Chris Wilson <chris at chris-wilson.co.uk> wrote:

> We need to treat the GPU core as a distinct processor and so apply the
> same SMP memory barriers. In this case, in addition to flushing the
> chipset cache, which is a no-op on LLC platforms, apply a write barrier
> beforehand. And then when we invalidate the CPU cache, make sure the
> memory is coherent (again this was a no-op on LLC platforms).
>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> ---
>  drivers/char/agp/intel-gtt.c    |    1 +
>  drivers/gpu/drm/i915/i915_gem.c |    1 +
>  2 files changed, 2 insertions(+)
>
> diff --git a/drivers/char/agp/intel-gtt.c b/drivers/char/agp/intel-gtt.c
> index 8b0f6d19..1223128 100644
> --- a/drivers/char/agp/intel-gtt.c
> +++ b/drivers/char/agp/intel-gtt.c
> @@ -1706,6 +1706,7 @@ EXPORT_SYMBOL(intel_gtt_get);
>
>  void intel_gtt_chipset_flush(void)
>  {
> +	wmb();
>  	if (intel_private.driver->chipset_flush)
>  		intel_private.driver->chipset_flush();
>  }
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index ed8d21a..b1ebb88 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3528,6 +3528,7 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write)
>  	/* Flush the CPU cache if it's still invalid. */
>  	if ((obj->base.read_domains & I915_GEM_DOMAIN_CPU) == 0) {
>  		i915_gem_clflush_object(obj);
> +		mb(); /* in case the clflush above is optimised away */
>
>  		obj->base.read_domains |= I915_GEM_DOMAIN_CPU;
>  	}

These need more comments too.

I think the first is to make sure any previous loads have completed
before we start using the new object?  If so, don't we want reads to
complete first too?

The second one looks unnecessary.  If the object isn't in the CPU
domain, there should be no loads/stores against it right?

-- 
Jesse Barnes, Intel Open Source Technology Center
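
For reference, the ordering under discussion can be sketched in userspace
C11. This is only an analogy and not the driver code: the release fence
stands in for the wmb() placed before the chipset flush, and the acquire
fence stands in for the barrier on the CPU side. The kernel barriers are
not identical to C11 fences, and struct mailbox, mailbox_publish() and
mailbox_consume() are hypothetical names invented for the illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared state: a payload plus a flag that a second agent
 * (standing in for the GPU) polls before reading the payload. */
struct mailbox {
	uint32_t payload;
	atomic_bool ready;
};

/* Producer side: store the payload, fence, then raise the flag.
 * The release fence keeps the payload store ordered before the flag
 * store, which is the role a write barrier plays before a flush. */
static void mailbox_publish(struct mailbox *box, uint32_t value)
{
	box->payload = value;
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&box->ready, true, memory_order_relaxed);
}

/* Consumer side: check the flag, fence, then read the payload.
 * Returns false (leaving *out untouched) if nothing was published. */
static bool mailbox_consume(struct mailbox *box, uint32_t *out)
{
	if (!atomic_load_explicit(&box->ready, memory_order_relaxed))
		return false;
	atomic_thread_fence(memory_order_acquire);
	*out = box->payload;
	return true;
}
```

Without the release fence, the flag store could become visible to the
other agent before the payload store; without the acquire fence, the
payload read could be satisfied before the flag was observed.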