Re: [PATCH] drm/i915: Do not set cache_dirty for DGFX

Niranjana Vishwanathapura <niranjana.vishwanathapura@xxxxxxxxx> · Wed, 2 Nov 2022 09:35:05 -0700

On Wed, Nov 02, 2022 at 05:09:27PM +0100, Das, Nirmoy wrote:

On 11/2/2022 11:36 AM, Matthew Auld wrote:
On 02/11/2022 07:39, Das, Nirmoy wrote:

On 11/2/2022 6:14 AM, Niranjana Vishwanathapura wrote:
Currently on DG1, which does not have LLC, we hit the below
warning while rebinding a userptr invalidated object.

WARNING: CPU: 4 PID: 13008 at drivers/gpu/drm/i915/gem/i915_gem_pages.c:34 __i915_gem_object_set_pages+0x296/0x2d0 [i915]
...
RIP: 0010:__i915_gem_object_set_pages+0x296/0x2d0 [i915]
...
Call Trace:
  <TASK>
  i915_gem_userptr_get_pages+0x175/0x1a0 [i915]
  ____i915_gem_object_get_pages+0x32/0xb0 [i915]
  i915_gem_object_userptr_submit_init+0x286/0x470 [i915]
  eb_lookup_vmas+0x2ff/0xcf0 [i915]
  ? __intel_wakeref_get_first+0x55/0xb0 [i915]
  i915_gem_do_execbuffer+0x785/0x21d0 [i915]
  i915_gem_execbuffer2_ioctl+0xe7/0x3d0 [i915]

We shouldn't be setting the obj->cache_dirty for DGFX,
fix it.

With Fixes: d70af57944 ("drm/i915/shmem: ensure flush during swap-in on non-LLC")


Ok, will add.

Acked-by: Nirmoy Das <nirmoy.das@xxxxxxxxx>

Any idea why this escaped our testing in CI? Perhaps something to 
improve.


I ran some userptr-related IGT tests and none hit
__i915_gem_object_release_shmem, so I think we are missing
coverage here or I/CI isn't running such a test.

Niranjana, what test did you run to hit this WARN?


I hit this issue with a modified gem_userptr_blits@vma-merge, where
I added an additional execbuf call after the userptr invalidation, as below,
to test that the rebind happens properly after a userptr invalidation.

        igt_spin_end(spin);
+       igt_spin_reset(spin);
+
+       gem_execbuf_wr(i915, &spin->execbuf);
+       igt_spin_end(spin);
+
        gem_close(i915, handle);

        munmap(addr, sz);

Note that the vma-merge subtest fails due to some other issue, but it is
still good enough to reproduce this issue and test the fix.

Niranjana


Regards,

Nirmoy



Reviewed-by: Matthew Auld <matthew.auld@xxxxxxxxx>


Suggested-by: Matthew Auld <matthew.auld@xxxxxxxxx>
Reported-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@xxxxxxxxx>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@xxxxxxxxx>
---
  drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 11125c32dd35..2f7804492cd5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -369,14 +369,14 @@ __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj,
      __start_cpu_write(obj);
      /*
-     * On non-LLC platforms, force the flush-on-acquire if this is ever
+     * On non-LLC igfx platforms, force the flush-on-acquire if this is ever
       * swapped-in. Our async flush path is not trust worthy enough yet(and
       * happens in the wrong order), and with some tricks it's conceivable
       * for userspace to change the cache-level to I915_CACHE_NONE after the
       * pages are swapped-in, and since execbuf binds the object before doing
       * the async flush, we have a race window.
       */
-    if (!HAS_LLC(i915))
+    if (!HAS_LLC(i915) && !IS_DGFX(i915))
          obj->cache_dirty = true;
  }
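
For readers following the condition change, here is a minimal standalone
sketch of the predicate before and after the patch. It is not driver code:
struct fake_device and the marks_dirty_*() helpers are hypothetical stand-ins
for what the driver queries through HAS_LLC() and IS_DGFX(), and the DG1
entry only encodes what the commit message states (discrete, no LLC).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the device capabilities that the driver
 * checks via HAS_LLC() and IS_DGFX(); not the real i915 structure.
 */
struct fake_device {
	bool has_llc;
	bool is_dgfx;
};

/* Before the patch: every non-LLC device sets cache_dirty on release. */
static bool marks_dirty_before(const struct fake_device *dev)
{
	return !dev->has_llc;
}

/* After the patch: discrete (DGFX) devices are excluded as well. */
static bool marks_dirty_after(const struct fake_device *dev)
{
	return !dev->has_llc && !dev->is_dgfx;
}

/* Self-check for the DG1-like case described in the commit message. */
static void check_dg1_case(void)
{
	const struct fake_device dg1 = { .has_llc = false, .is_dgfx = true };

	assert(marks_dirty_before(&dg1));  /* old condition set cache_dirty */
	assert(!marks_dirty_after(&dg1));  /* new condition leaves it clear */
}
```

Under these assumptions only the discrete, non-LLC combination changes
behaviour; integrated non-LLC devices still set the flag, and LLC devices
were never affected.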