On Tue, Aug 11, 2015 at 04:59:14PM +0300, ville.syrjala@xxxxxxxxxxxxxxx wrote: > From: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> > > Currently we don't clflush on pin_to_display if the bo is already > UC/WT and is not in the CPU write domain. This causes problems with > pwrite since pwrite doesn't change the write domain, and it avoids > clflushing on UC/WT buffers on LLC platforms unless the buffer is > currently being scanned out. > > Fix the problem by marking the cache dirty and adjusting > i915_gem_object_set_cache_level() to clflush when the cache is dirty > even if the cache_level doesn't change. > > My last attempt [1] at fixing this via write domain frobbing was shot > down, but now with the cache_dirty flag we can do things in a nicer way. > > [1] http://lists.freedesktop.org/archives/intel-gfx/2014-November/055390.html > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86422 > Testcase: igt/kms_pwrite_crc > Signed-off-by: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> > --- > drivers/gpu/drm/i915/i915_gem.c | 10 +++++++--- > 1 file changed, 7 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 73293b4..73eff2e 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -1005,12 +1005,15 @@ out: > if (!needs_clflush_after && > obj->base.write_domain != I915_GEM_DOMAIN_CPU) { > if (i915_gem_clflush_object(obj, obj->pin_display)) > - i915_gem_chipset_flush(dev); > + needs_clflush_after = true; > } > } > > if (needs_clflush_after) > i915_gem_chipset_flush(dev); > + else if (obj->cache_level == I915_CACHE_NONE || > + obj->cache_level == I915_CACHE_WT) > + obj->cache_dirty = true; > > intel_fb_obj_flush(obj, false, ORIGIN_CPU); > return ret; Ok, this seems reasonable. > @@ -3639,10 +3642,10 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > { > struct drm_device *dev = obj->base.dev; > struct i915_vma *vma, *next; > - int ret; > + int ret = 0; > > if (obj->cache_level == cache_level) > - return 0; > + goto out; > > if (i915_gem_obj_is_pinned(obj)) { > DRM_DEBUG("can not change the cache level of pinned objects\n"); > @@ -3687,6 +3690,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > vma->node.color = cache_level; > obj->cache_level = cache_level; > > +out: > if (obj->cache_dirty && > obj->base.write_domain != I915_GEM_DOMAIN_CPU && > cpu_write_needs_clflush(obj)) { This we can do better (and I know I am guilty of the original sin). It just feels a little too loose that we expect pin-to-display-plane should always call set-cache-level (yes, it has to, but it feels like we are putting pin-to-display-plane specifics into set-cache-level). I think this chunk is much more understandable if we did: diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 5e7fcf77e57a..fc6bcc19cca1 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3656,13 +3656,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, vma->node.color = cache_level; obj->cache_level = cache_level; - if (obj->cache_dirty && - obj->base.write_domain != I915_GEM_DOMAIN_CPU && - cpu_write_needs_clflush(obj)) { - if (i915_gem_clflush_object(obj, true)) - i915_gem_chipset_flush(obj->base.dev); - } - return 0; } @@ -3795,6 +3788,10 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, WARN_ON(obj->pin_display > vma->pin_count); i915_gem_object_flush_cpu_write_domain(obj); + if (obj->cache_dirty) { + if (i915_gem_clflush_object(obj, true)) + i915_gem_chipset_flush(obj->base.dev); + } /* It should now be out of any other write domains, and we can update * the domain values for our changes. Maybe even /* Whilst the object was away from the scanout we may have been eliding the coherent * writes into the CPU cache. However, the moment it has to be read by the display * engine, those writes into the CPU cache become inaccessible and so we have to * clflush them out to main memory. We track elided flushes with obj->cache_dirty * and hope they are rare. */ The other user of set-cache-level (set_caching_ioctl) should not care about the clflush side-effect and the clflush elision should work just fine instead. Either way, Reviewed-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> but I'd prefer a v2 with the comments :) -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx