On Tue, Aug 11, 2015 at 03:28:27PM +0100, Chris Wilson wrote: > On Tue, Aug 11, 2015 at 04:59:14PM +0300, ville.syrjala@xxxxxxxxxxxxxxx wrote: > > From: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> > > > > Currently we don't clflush on pin_to_display if the bo is already > > UC/WT and is not in the CPU write domain. This causes problems with > > pwrite since pwrite doesn't change the write domain, and it avoids > > clflushing on UC/WT buffers on LLC platforms unless the buffer is > > currently being scanned out. > > > > Fix the problem by marking the cache dirty and adjusting > > i915_gem_object_set_cache_level() to clflush when the cache is dirty > > even if the cache_level doesn't change. > > > > My last attempt [1] at fixing this via write domain frobbing was shot > > down, but now with the cache_dirty flag we can do things in a nicer way. > > > > [1] http://lists.freedesktop.org/archives/intel-gfx/2014-November/055390.html > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86422 > > Testcase: igt/kms_pwrite_crc > > Signed-off-by: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> > > --- > > drivers/gpu/drm/i915/i915_gem.c | 10 +++++++--- > > 1 file changed, 7 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > > index 73293b4..73eff2e 100644 > > --- a/drivers/gpu/drm/i915/i915_gem.c > > +++ b/drivers/gpu/drm/i915/i915_gem.c > > @@ -1005,12 +1005,15 @@ out: > > if (!needs_clflush_after && > > obj->base.write_domain != I915_GEM_DOMAIN_CPU) { > > if (i915_gem_clflush_object(obj, obj->pin_display)) > > - i915_gem_chipset_flush(dev); > > + needs_clflush_after = true; > > } > > } > > > > if (needs_clflush_after) > > i915_gem_chipset_flush(dev); > > + else if (obj->cache_level == I915_CACHE_NONE || > > + obj->cache_level == I915_CACHE_WT) > > + obj->cache_dirty = true; > > > > intel_fb_obj_flush(obj, false, ORIGIN_CPU); > > return ret; > > Ok, this seems reasonable. > > > @@ -3639,10 +3642,10 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > { > > struct drm_device *dev = obj->base.dev; > > struct i915_vma *vma, *next; > > - int ret; > > + int ret = 0; > > > > if (obj->cache_level == cache_level) > > - return 0; > > + goto out; > > > > if (i915_gem_obj_is_pinned(obj)) { > > DRM_DEBUG("can not change the cache level of pinned objects\n"); > > @@ -3687,6 +3690,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > > vma->node.color = cache_level; > > obj->cache_level = cache_level; > > > > +out: > > if (obj->cache_dirty && > > obj->base.write_domain != I915_GEM_DOMAIN_CPU && > > cpu_write_needs_clflush(obj)) { > > This we can do better (and I know I am guilty of the original sin). It > just feels a little too loose that we expect pin-to-display-plane should > always call set-cache-level (yes, it has to, but it feels like we are > putting pin-to-display-plane specifics into set-cache-level). > > I think this chunk is much more understandable if we did: > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 5e7fcf77e57a..fc6bcc19cca1 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -3656,13 +3656,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, > vma->node.color = cache_level; > obj->cache_level = cache_level; > > - if (obj->cache_dirty && > - obj->base.write_domain != I915_GEM_DOMAIN_CPU && > - cpu_write_needs_clflush(obj)) { > - if (i915_gem_clflush_object(obj, true)) > - i915_gem_chipset_flush(obj->base.dev); > - } > - > return 0; > } > > @@ -3795,6 +3788,10 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, > WARN_ON(obj->pin_display > vma->pin_count); > > i915_gem_object_flush_cpu_write_domain(obj); > + if (obj->cache_dirty) { > + if (i915_gem_clflush_object(obj, true)) > + i915_gem_chipset_flush(obj->base.dev); > + } > > /* It should now be out of any other write domains, and we can update > * the domain values for our changes. > > Maybe even > > /* Whilst the object was away from the scanout we may have been eliding the coherent > * writes into the CPU cache. However, the moment it has to be read by the display > * engine, those writes into the CPU cache become inaccessible and so we have to > * clflush them out to main memory. We track elided flushes with obj->cache_dirty > * and hope they are rare. > */ > > > The other user of set-cache-level (set_caching_ioctl) should not care > about the clflush side-effect and the clflush elision should work just > fine instead. Hmm. So what would happen on !LLC if we start with a cached bo, then pwrite it and afterwards make it uncached? > > Either way, > Reviewed-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > but I'd prefer a v2 with the comments :) > -Chris > > -- > Chris Wilson, Intel Open Source Technology Centre -- Ville Syrjälä Intel OTC _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx