On 08/10/2021 13:19, Christian König wrote:
Am 08.10.21 um 12:49 schrieb Tvrtko Ursulin:
On 08/10/2021 11:21, Christian König wrote:
Am 08.10.21 um 11:50 schrieb Tvrtko Ursulin:
From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Cache the count of shared fences in the iterator to avoid dereferencing
the dma_resv_object outside the RCU protection. Otherwise iterator
and its
users can observe an incosistent state which makes it impossible to use
safely.
Ah, of course! I've been staring at the code the whole morning and
couldn't see it.
Going to write a testcase to cover that.
Such as:
<6> [187.517041] [IGT] gem_sync: executing
<7> [187.536343] i915 0000:00:02.0:
[drm:i915_gem_context_create_ioctl [i915]] HW context 1 created
<7> [187.536793] i915 0000:00:02.0:
[drm:i915_gem_context_create_ioctl [i915]] HW context 1 created
<6> [187.551235] [IGT] gem_sync: starting subtest basic-many-each
<1> [188.935462] BUG: kernel NULL pointer dereference, address:
0000000000000010
<1> [188.935485] #PF: supervisor write access in kernel mode
<1> [188.935495] #PF: error_code(0x0002) - not-present page
<6> [188.935504] PGD 0 P4D 0
<4> [188.935512] Oops: 0002 [#1] PREEMPT SMP NOPTI
<4> [188.935521] CPU: 2 PID: 1467 Comm: gem_sync Not tainted
5.15.0-rc4-CI-Patchwork_21264+ #1
<4> [188.935535] Hardware name: /NUC6CAYB, BIOS
AYAPLCEL.86A.0049.2018.0508.1356 05/08/2018
<4> [188.935546] RIP: 0010:dma_resv_get_fences+0x116/0x2d0
<4> [188.935560] Code: 10 85 c0 7f c9 be 03 00 00 00 e8 15 8b df ff
eb bd e8 8e c6 ff ff eb b6 41 8b 04 24 49 8b 55 00 48 89 e7 8d 48 01
41 89 0c 24 <4c> 89 34 c2 e8 41 f2 ff ff 49 89 c6 48 85 c0 75 8c 48
8b 44 24 10
<4> [188.935583] RSP: 0018:ffffc900011dbcc8 EFLAGS: 00010202
<4> [188.935593] RAX: 0000000000000000 RBX: 00000000ffffffff RCX:
0000000000000001
<4> [188.935603] RDX: 0000000000000010 RSI: ffffffff822e343c RDI:
ffffc900011dbcc8
<4> [188.935613] RBP: ffffc900011dbd48 R08: ffff88812d255bb8 R09:
00000000fffffffe
<4> [188.935623] R10: 0000000000000001 R11: 0000000000000000 R12:
ffffc900011dbd44
<4> [188.935633] R13: ffffc900011dbd50 R14: ffff888113d29cc0 R15:
0000000000000000
<4> [188.935643] FS: 00007f68d17e9700(0000)
GS:ffff888277900000(0000) knlGS:0000000000000000
<4> [188.935655] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [188.935665] CR2: 0000000000000010 CR3: 000000012d0a4000 CR4:
00000000003506e0
<4> [188.935676] Call Trace:
<4> [188.935685] i915_gem_object_wait+0x1ff/0x410 [i915]
<4> [188.935988] i915_gem_wait_ioctl+0xf2/0x2a0 [i915]
<4> [188.936272] ? i915_gem_object_wait+0x410/0x410 [i915]
<4> [188.936533] drm_ioctl_kernel+0xae/0x140
<4> [188.936546] drm_ioctl+0x201/0x3d0
<4> [188.936555] ? i915_gem_object_wait+0x410/0x410 [i915]
<4> [188.936820] ? __fget_files+0xc2/0x1c0
<4> [188.936830] ? __fget_files+0xda/0x1c0
<4> [188.936839] __x64_sys_ioctl+0x6d/0xa0
<4> [188.936848] do_syscall_64+0x3a/0xb0
<4> [188.936859] entry_SYSCALL_64_after_hwframe+0x44/0xae
If the shared object has changed during the RCU unlocked period
callers will correctly handle the restart on the next iteration.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Fixes: 96601e8a4755 ("dma-buf: use new iterator in
dma_resv_copy_fences")
Fixes: d3c80698c9f5 ("dma-buf: use new iterator in
dma_resv_get_fences v3")
Closes:
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fdrm%2Fintel%2F-%2Fissues%2F4274&data=04%7C01%7Cchristian.koenig%40amd.com%7Cc22feea06a3f4285cdac08d98a495984%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637692870805160909%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=6oPR30sWnJe04I4GlhhvJWX3QvwKFIOMW1uOIyWZFOE%3D&reserved=0
Cc: Christian König <christian.koenig@xxxxxxx>
Cc: Daniel Vetter <daniel.vetter@xxxxxxxx>
Cc: Sumit Semwal <sumit.semwal@xxxxxxxxxx>
Cc: linux-media@xxxxxxxxxxxxxxx
Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx
Cc: linaro-mm-sig@xxxxxxxxxxxxxxxx
Maybe we should remove cursor->fences altogether, but either way the
patch is Reviewed-by: Christian König <christian.koenig@xxxxxxx>
Please push to drm-misc-next ASAP.
Not sure I can or if my push permissions are limited to Intel
branches. I can try once CI gives a green light.
If it doesn't work just ping me and I will push it.
It finally passed CI but it looks like you'll need to push it:
tursulin@tursulin-mobl2:~/wc/dim/src$ dim push-branch drm-misc-next
Enumerating objects: 15, done.
Counting objects: 100% (15/15), done.
Delta compression using up to 8 threads
Compressing objects: 100% (8/8), done.
Writing objects: 100% (8/8), 2.32 KiB | 593.00 KiB/s, done.
Total 8 (delta 7), reused 0 (delta 0), pack-reused 0
error: remote unpack failed: unable to create temporary object directory
To ssh://git.freedesktop.org/git/drm/drm-misc
! [remote rejected] drm-misc-next -> drm-misc-next (unpacker error)
error: failed to push some refs to 'ssh://git.freedesktop.org/git/drm/drm-misc'
Regards,
Tvrtko