Quoting Francisco Jerez (2019-07-24 21:37:24) > Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes: > > > Since userspace has the ability to bypass the CPU cache from within its > > unprivileged command stream, we have to flush the CPU cache to memory > > in order to overwrite the previous contents on creation. > > > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx> > > Cc: stablevger.kernel.org > > --- > > drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 26 ++++++----------------- > > 1 file changed, 7 insertions(+), 19 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c > > index d2a1158868e7..f752b326d399 100644 > > --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c > > @@ -459,7 +459,6 @@ i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size) > > { > > struct drm_i915_gem_object *obj; > > struct address_space *mapping; > > - unsigned int cache_level; > > gfp_t mask; > > int ret; > > > > @@ -498,24 +497,13 @@ i915_gem_object_create_shmem(struct drm_i915_private *i915, u64 size) > > obj->write_domain = I915_GEM_DOMAIN_CPU; > > obj->read_domains = I915_GEM_DOMAIN_CPU; > > > > - if (HAS_LLC(i915)) > > - /* On some devices, we can have the GPU use the LLC (the CPU > > - * cache) for about a 10% performance improvement > > - * compared to uncached. Graphics requests other than > > - * display scanout are coherent with the CPU in > > - * accessing this cache. This means in this mode we > > - * don't need to clflush on the CPU side, and on the > > - * GPU side we only need to flush internal caches to > > - * get data visible to the CPU. > > - * > > - * However, we maintain the display planes as UC, and so > > - * need to rebind when first used as such. > > - */ > > - cache_level = I915_CACHE_LLC; > > - else > > - cache_level = I915_CACHE_NONE; > > - > > - i915_gem_object_set_cache_coherency(obj, cache_level); > > + /* > > + * Note that userspace has control over cache-bypass > > + * via its command stream, so even on LLC architectures > > + * we have to flush out the CPU cache to memory to > > + * clear previous contents. > > + */ > > + i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE); > > > > I'm not sure if you've seen my comments about this in an internal thread > you were CC'ed to: I don't think this patch will have the intended > effect. The various clflushes this could trigger before the first use > of the buffer are conditional on "obj->cache_dirty & > ~obj->cache_coherent", which will always be false on LLC platforms > AFAICT, because on those platforms i915_gem_object_set_cache_coherency() > will always set bit 0 of obj->cache_coherent. You only need to flush the stale written-to cachelines, so that the page content is correct on reuse of the foreign struct page. After that point, the CPU cache is managed by the client. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx