On Tue, Aug 06, 2013 at 01:17:03PM +0100, Chris Wilson wrote:
> The LLC is a fun device. The cache is a distinct functional block within
> the SA that arbitrates access from both the CPU and GPU cores. As such
> all writes to memory land first in the LLC before further action is
> taken. For example, an uncached write from either the CPU or GPU will
> then proceed to memory and evict the cacheline from the LLC. This means
> that a read from the LLC always returns the correct information even if
> the PTE bit in the GPU differs from the PAT bit in the CPU. For the older
> snooping architecture on non-LLC, the fundamental principle still holds
> except that some coordination is required between the CPU and GPU to
> explicitly perform the snooping (which is handled by our request
> tracking).
>
> The upshot of this is that we know that we can issue a read from either
> LLC devices or snoopable memory and trust the contents of the cache -
> i.e. we can forgo a clflush before a read in these circumstances.
> Writing to memory from the CPU is a little more tricky as we have to
> consider that the scanout does not read from the CPU cache at all, but
> from main memory. So we have to currently treat all requests to write to
> uncached memory as having to be flushed to main memory for coherency
> with all consumers.

Since this is a behavioural change wrt cache coherency, can we please have
an igt testcase to exercise pwrite/pread coherency on uncached buffers on
LLC platforms? With the set_caching ioctl it should be fairly easy to add
another subtest to the relevant existing igts.

/me is simply too paranoid about this stuff

Cheers, Daniel

> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Cc: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 22 ++++++++++++++--------
>  1 file changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index eec6fcb..5671dab 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -60,6 +60,12 @@ static long i915_gem_purge(struct drm_i915_private *dev_priv, long target);
>  static void i915_gem_shrink_all(struct drm_i915_private *dev_priv);
>  static void i915_gem_object_truncate(struct drm_i915_gem_object *obj);
>
> +static bool cpu_cache_is_coherent(struct drm_device *dev,
> +				  enum i915_cache_level level)
> +{
> +	return HAS_LLC(dev) || level != I915_CACHE_NONE;
> +}
> +
>  static inline void i915_gem_object_fence_lost(struct drm_i915_gem_object *obj)
>  {
>  	if (obj->tiling_mode)
> @@ -510,8 +516,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
> 	 * read domain and manually flush cachelines (if required). This
> 	 * optimizes for the case when the gpu will dirty the data
> 	 * anyway again before the next pread happens. */
> -	if (obj->cache_level == I915_CACHE_NONE)
> -		needs_clflush = 1;
> +	needs_clflush = !cpu_cache_is_coherent(dev, obj->cache_level);
>  	if (i915_gem_obj_ggtt_bound(obj)) {
>  		ret = i915_gem_object_set_to_gtt_domain(obj, false);
>  		if (ret)
> @@ -835,11 +840,11 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
>  			return ret;
>  		}
>  	}
> -	/* Same trick applies for invalidate partially written cachelines before
> -	 * writing. */
> -	if (!(obj->base.read_domains & I915_GEM_DOMAIN_CPU)
> -	    && obj->cache_level == I915_CACHE_NONE)
> -		needs_clflush_before = 1;
> +	/* Same trick applies to invalidate partially written cachelines read
> +	 * before writing. */
> +	if ((obj->base.read_domains & I915_GEM_DOMAIN_CPU) == 0)
> +		needs_clflush_before =
> +			!cpu_cache_is_coherent(dev, obj->cache_level);
>
>  	ret = i915_gem_object_get_pages(obj);
>  	if (ret)
> @@ -3650,7 +3655,8 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write)
>
>  	/* Flush the CPU cache if it's still invalid. */
>  	if ((obj->base.read_domains & I915_GEM_DOMAIN_CPU) == 0) {
> -		i915_gem_clflush_object(obj);
> +		if (!cpu_cache_is_coherent(obj->base.dev, obj->cache_level))
> +			i915_gem_clflush_object(obj);
>
>  		obj->base.read_domains |= I915_GEM_DOMAIN_CPU;
>  	}
> --
> 1.8.4.rc1
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
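[Editor's note] The subtest Daniel asks for could look roughly like the sketch
below: create a bo, move it to I915_CACHING_NONE with the set_caching ioctl,
pwrite a pattern into it, pread it back, and fail on any mismatch. This is only
a rough sketch against the raw DRM ioctls so it stays self-contained; an actual
igt subtest would instead use drm_open_any() and the gem_* wrappers from the
igt library, and the device node path, buffer size and skip exit code used here
are assumptions, not part of the patch under review.

/* gem_uncached_pwrite_pread.c - sketch of an uncached pwrite/pread
 * coherency check.  Header paths may differ depending on how the
 * libdrm/kernel uapi headers are installed on the build machine.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <drm/drm.h>
#include <drm/i915_drm.h>

#define OBJECT_SIZE (1024 * 1024)	/* arbitrary test size */

int main(void)
{
	int fd, i;

	/* igt proper would use drm_open_any(); the fixed node is a shortcut. */
	fd = open("/dev/dri/card0", O_RDWR);
	if (fd < 0) {
		perror("open");
		return 77;	/* conventional "skip" exit code */
	}

	struct drm_i915_gem_create create = { .size = OBJECT_SIZE };
	if (ioctl(fd, DRM_IOCTL_I915_GEM_CREATE, &create))
		return 1;

	/* Force the object to be uncached, i.e. the case the patch changes. */
	struct drm_i915_gem_caching caching = {
		.handle = create.handle,
		.caching = I915_CACHING_NONE,
	};
	if (ioctl(fd, DRM_IOCTL_I915_GEM_SET_CACHING, &caching))
		return 77;	/* skip if set_caching is not supported */

	unsigned char *out = malloc(OBJECT_SIZE);
	unsigned char *in = malloc(OBJECT_SIZE);
	if (!out || !in)
		return 1;
	for (i = 0; i < OBJECT_SIZE; i++)
		out[i] = i & 0xff;

	/* pwrite the pattern into the uncached bo ... */
	struct drm_i915_gem_pwrite wr = {
		.handle = create.handle,
		.offset = 0,
		.size = OBJECT_SIZE,
		.data_ptr = (unsigned long)out,
	};
	if (ioctl(fd, DRM_IOCTL_I915_GEM_PWRITE, &wr))
		return 1;

	/* ... and pread it straight back out. */
	struct drm_i915_gem_pread rd = {
		.handle = create.handle,
		.offset = 0,
		.size = OBJECT_SIZE,
		.data_ptr = (unsigned long)in,
	};
	if (ioctl(fd, DRM_IOCTL_I915_GEM_PREAD, &rd))
		return 1;

	/* Any mismatch means pwrite/pread lost coherency on an uncached bo. */
	if (memcmp(out, in, OBJECT_SIZE)) {
		fprintf(stderr, "uncached pwrite/pread mismatch\n");
		return 1;
	}

	close(fd);
	return 0;
}

A fuller version would also interleave GPU access (e.g. a blit) between the
pwrite and the pread so the clflush-avoidance paths touched by the patch are
actually exercised on both LLC and non-LLC machines.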