On Wed, Nov 02, 2022 at 05:09:27PM +0100, Das, Nirmoy wrote:
On 11/2/2022 11:36 AM, Matthew Auld wrote:
On 02/11/2022 07:39, Das, Nirmoy wrote:
On 11/2/2022 6:14 AM, Niranjana Vishwanathapura wrote:
Currently on DG1, which does not have LLC, we hit the below
warning while rebinding a userptr invalidated object.
WARNING: CPU: 4 PID: 13008 at
drivers/gpu/drm/i915/gem/i915_gem_pages.c:34
__i915_gem_object_set_pages+0x296/0x2d0 [i915]
...
RIP: 0010:__i915_gem_object_set_pages+0x296/0x2d0 [i915]
...
Call Trace:
<TASK>
i915_gem_userptr_get_pages+0x175/0x1a0 [i915]
____i915_gem_object_get_pages+0x32/0xb0 [i915]
i915_gem_object_userptr_submit_init+0x286/0x470 [i915]
eb_lookup_vmas+0x2ff/0xcf0 [i915]
? __intel_wakeref_get_first+0x55/0xb0 [i915]
i915_gem_do_execbuffer+0x785/0x21d0 [i915]
i915_gem_execbuffer2_ioctl+0xe7/0x3d0 [i915]
We shouldn't be setting the obj->cache_dirty for DGFX,
fix it.
With Fixes: d70af57944 ("drm/i915/shmem: ensure flush during
swap-in on non-LLC")
Ok, will add.
Acked-by: Nirmoy Das <nirmoy.das@xxxxxxxxx>
Any idea why this escaped our testing in CI? Perhaps something to
improve.
I ran some userptr-related IGT tests, but none hit
__i915_gem_object_release_shmem. So I think we are either missing
coverage here or I/CI isn't running such a test.
Niranjana, what test did you run to hit this WARN?
I hit this issue with a modified gem_userptr_blits@vma-merge, where
I added an additional execbuf call after userptr invalidation, as below,
to test that rebind happens properly after a userptr invalidation.
igt_spin_end(spin);
+ igt_spin_reset(spin);
+
+ gem_execbuf_wr(i915, &spin->execbuf);
+ igt_spin_end(spin);
+
gem_close(i915, handle);
munmap(addr, sz);
Note that the vma-merge subtest fails due to some other issue, but it is
still good enough to reproduce this issue and test the fix.
Niranjana
Regards,
Nirmoy
Reviewed-by: Matthew Auld <matthew.auld@xxxxxxxxx>
Suggested-by: Matthew Auld <matthew.auld@xxxxxxxxx>
Reported-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@xxxxxxxxx>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@xxxxxxxxx>
---
drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 11125c32dd35..2f7804492cd5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -369,14 +369,14 @@ __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj,
 	__start_cpu_write(obj);
 	/*
-	 * On non-LLC platforms, force the flush-on-acquire if this is ever
+	 * On non-LLC igfx platforms, force the flush-on-acquire if this is ever
 	 * swapped-in. Our async flush path is not trust worthy enough yet(and
 	 * happens in the wrong order), and with some tricks it's conceivable
 	 * for userspace to change the cache-level to I915_CACHE_NONE after the
 	 * pages are swapped-in, and since execbuf binds the object before doing
 	 * the async flush, we have a race window.
 	 */
-	if (!HAS_LLC(i915))
+	if (!HAS_LLC(i915) && !IS_DGFX(i915))
 		obj->cache_dirty = true;
 }
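
For anyone following the thread, the effect of the one-line change can be
sketched outside the kernel. This is only an illustration under stated
assumptions, not driver code: struct fake_device and the two helpers are
hypothetical stand-ins for the HAS_LLC(i915) and IS_DGFX(i915) checks in the
hunk above, and fake_dg1 assumes only what the commit message states (DG1 is
a discrete part without LLC).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the device properties that the driver
 * queries through HAS_LLC(i915) and IS_DGFX(i915).
 */
struct fake_device {
	bool has_llc;
	bool is_dgfx;
};

/* Condition before the patch: every non-LLC device marks cache_dirty. */
static bool cache_dirty_before(const struct fake_device *dev)
{
	return !dev->has_llc;
}

/* Condition after the patch: only non-LLC integrated devices do. */
static bool cache_dirty_after(const struct fake_device *dev)
{
	return !dev->has_llc && !dev->is_dgfx;
}

/* Assumed DG1-like configuration: discrete graphics, no LLC. */
static const struct fake_device fake_dg1 = {
	.has_llc = false,
	.is_dgfx = true,
};

/* Self-check: the two conditions differ only for non-LLC discrete parts. */
static void check_policy(void)
{
	struct fake_device dev;
	int llc, dgfx;

	for (llc = 0; llc < 2; llc++) {
		for (dgfx = 0; dgfx < 2; dgfx++) {
			dev.has_llc = llc;
			dev.is_dgfx = dgfx;
			if (!llc && dgfx)
				assert(cache_dirty_before(&dev) &&
				       !cache_dirty_after(&dev));
			else
				assert(cache_dirty_before(&dev) ==
				       cache_dirty_after(&dev));
		}
	}
}
```

Calling check_policy() walks all four combinations. The only one whose
result changes is the non-LLC discrete case, which is the configuration
where the WARN was reported.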