Re: [RFC 7/8] drm/i915: Lift the user PAT restriction from use_cpu_reloc

Matt Roper <matthew.d.roper@xxxxxxxxx> · Thu, 27 Jul 2023 17:09:48 -0700

On Thu, Jul 27, 2023 at 03:55:03PM +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> 
> Now that i915 understands the caching modes behind PAT indices, we can
> refine the check in use_cpu_reloc() to not reject the uncached PAT if it
> was set by userspace.
> 
> Instead it can decide based on the presence of full coherency which
> should be functionally equivalent on legacy platforms. We can ignore WT
> since it is only used by the display, and we can ignore Meteorlake since
> it will fail on the existing "has_llc" condition before the object cache
> mode check.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> Cc: Fei Yang <fei.yang@xxxxxxxxx>
> Cc: Matt Roper <matthew.d.roper@xxxxxxxxx>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 9 +--------
>  1 file changed, 1 insertion(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index 9d6e49c8a4c6..f74b33670bad 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -640,16 +640,9 @@ static inline int use_cpu_reloc(const struct reloc_cache *cache,
>  	if (DBG_FORCE_RELOC == FORCE_GTT_RELOC)
>  		return false;
>  
> -	/*
> -	 * For objects created by userspace through GEM_CREATE with pat_index
> -	 * set by set_pat extension, i915_gem_object_has_cache_level() always
> -	 * return true, otherwise the call would fall back to checking whether
> -	 * the object is un-cached.
> -	 */
>  	return (cache->has_llc ||
>  		obj->cache_dirty ||
> -		!(obj->pat_set_by_user ||
> -		  i915_gem_object_has_cache_mode(obj, I915_CACHE_MODE_UC)));
> +		i915_gem_object_has_cache_flag(obj, I915_CACHE_FLAG_COH2W));

My understanding of relocations is minimal, but does 2W actually matter
here (CPU snooping GPU caches)?  I would have expected only 1W coherency
to be necessary (GPU snooping CPU caches)?

Matt

>  }
>  
>  static int eb_reserve_vma(struct i915_execbuffer *eb,
> -- 
> 2.39.2
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation