Re: [RFC 4/8] drm/i915: Refactor PAT/object cache handling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 28/07/2023 08:14, Yang, Fei wrote:
[snip]
@@ -41,14 +42,17 @@ static bool gpu_write_needs_clflush(struct drm_i915_gem_object *obj)
               return false;

       /*
-      * For objects created by userspace through GEM_CREATE with pat_index
-      * set by set_pat extension, i915_gem_object_has_cache_level() will
-      * always return true, because the coherency of such object is managed

i915_gem_object_has_cache_level() always return true means this function
always return false.

-      * by userspace. Othereise the call here would fall back to checking
-      * whether the object is un-cached or write-through.
+      * Always flush cache for UMD objects with PAT index set.

(obj->pat_set_by_user == true) indicates UMD knows how to handle the coherency,
forcing clflush in KMD would be redundant.

For Meteorlake I made gpu_write_needs_clflush() always return false anyway.

Could you please submit a patch with kerneldoc for i915_drm.h explaining what the set domain ioctl is expected to do when set pat extension is used? With the focus on the use cases of how userspace is managing coherency using it, or it isn't, or what.

        */
-     return !(i915_gem_object_has_cache_level(obj, I915_CACHE_NONE) ||
-              i915_gem_object_has_cache_level(obj, I915_CACHE_WT));
+     if (obj->pat_set_by_user)
+             return true;

return false;

Oops, thank you! I did warn in the cover letter I was getting confused by boolean logic conversions, cross-referencing three versions, and extracting the pat_set_by_user to call sites. :)

+
+     /*
+      * Fully coherent cached access may end up with data in the CPU cache
+      * which hasn't hit memory yet.
+      */
+     return i915_gem_object_has_cache_mode(obj, I915_CACHE_MODE_WB) &&
+            i915_gem_object_has_cache_flag(obj, I915_CACHE_FLAG_COH2W);

Why checking COH2W here? The logic was, if UC or WT return false, otherwise
return true. So, as long as cache_mode is WB, it's sufficient to say true
here, right?

I was trying to penetrate the reason behind the check.

Original code was:

       return !(obj->cache_level == I915_CACHE_NONE ||
                obj->cache_level == I915_CACHE_WT);

Which is equivalent to "is it WB", right? (Since it matches on both old LLC flavours.)

Which I thought, in the context of this function, is supposed to answer the question of "can there be data in the shared cache written by the GPU but not committed to RAM yet".

And then I thought that can only ever happen with 2-way coherency. Otherwise GPU writes never end up in the CPU cache.

Did I get that wrong? Maybe I have..

Regards,

Tvrtko



[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux