Yeah, that is a well-known issue, but it is actually completely harmless.
What happens is that a trace function accesses a stale pointer to print some additional value into the trace log.
That memory might have been reused and the information is now outdated, but the worst thing that can happen is that the value in the logs is nonsense.
I have a patch in the queue to fix this; it should be upstream and backported in the next few weeks.
Regards,
Christian.
On 29.04.24 at 04:15, Joonkyo Jung wrote:
Hi,
Thank you for patching two of the bugs we have reported!
I was just wondering if there's any news on the one other bug we have reported:
BUG: KASAN: slab-use-after-free in amdgpu_bo_move+0x1479/0x1550.
I see that there is a GitLab issue (https://gitlab.freedesktop.org/drm/amd/-/issues/3171) created for this bug,
and there is also a patch (https://lists.freedesktop.org/archives/amd-gfx/2024-March/105680.html) that Christian made for this.
Though, it seems that the issue is not resolved yet, and the patch is not yet pushed to mainstream branches.
So I was wondering, do you have any plans for pushing this patch? If so, would it be possible for us to get a Reported-by tag on the patch?
Best,
Joonkyo
On Fri, Mar 8, 2024 at 4:32 PM Joonkyo Jung <joonkyoj@xxxxxxxxxxxx> wrote:
Hi Vitaly,
No worries, thank you for working on the patches!
I have also confirmed that with the inflight patch, issue No.1 (use-after-free) seems to be resolved.
However, I have reproduced issue No.3 (slab-use-after-free) even with the patch for issue No.1 applied - if it's the first program tested after reboot.
(i.e., if any other bugs are tested before the slab-use-after-free, it does not reproduce).
Could you check if the bug reproduces in this condition for you too?
I will check and see why this is happening and update you if I have something new.
Thank you!
Best,
Joonkyo
On Fri, Mar 8, 2024 at 12:45 PM vitaly prosyak <vprosyak@xxxxxxx> wrote:
Hi Joonkyo,
Sorry for the delay.
Yes, sure, I reproduced issue 2 (null-ptr-deref in amdgpu) and I will provide the fix soon.
However, issue No. 3 is no longer reproducible once the recent in-flight patch that fixes issue No. 1 is applied. Do you see the same behavior?
Thanks in advance, Vitaly
On 2024-03-07 20:18, Joonkyo Jung wrote:
Hello, thank you for patching the first bug we have sent!
Just a quick touch base with you, to ask if there has been any update on our other two bugs. They were sent in emails titled "Reporting a slab-use-after-free in amdgpu" (this one) and "Reporting a null-ptr-deref in amdgpu".
Thank you!
Best,
Joonkyo
On Fri, Feb 16, 2024 at 6:22 PM Joonkyo Jung <joonkyoj@xxxxxxxxxxxx> wrote:
Hello,
We would like to report a slab-use-after-free bug in the AMDGPU DRM driver in Linux kernel v6.8-rc4, which we found with our customized Syzkaller.
The bug can be triggered by sending two ioctls to the AMDGPU DRM driver in succession.
In amdgpu_bo_move, the old resource pointer is saved with struct ttm_resource *old_mem = bo->resource.
As the allocation and free stacks in the report show, within the same call to amdgpu_bo_move,
amdgpu_move_blit ends up freeing bo->resource in ttm_bo_move_accel_cleanup via ttm_bo_wait_free_node(bo, man->use_tt).
But amdgpu_bo_move continues after that and reaches trace_amdgpu_bo_move(abo, new_mem->mem_type, old_mem->mem_type) at the end, which reads old_mem->mem_type from the freed resource and causes the use-after-free.
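To illustrate the pattern described above, here is a minimal userspace sketch. It is not the driver code and not the pending patch; mock_resource, mock_bo, mock_move and move_and_get_old_type are hypothetical names made up for this example. It only shows one way to avoid reading through a dangling pointer: copy the field needed for logging before the step that frees the old object.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; these are not kernel types. */
struct mock_resource {
    uint32_t mem_type;
};

struct mock_bo {
    struct mock_resource *resource;
};

/* Mimics the move step: frees the old resource and installs the new one. */
static void mock_move(struct mock_bo *bo, struct mock_resource *new_mem)
{
    free(bo->resource);
    bo->resource = new_mem;
}

/*
 * Safe ordering: snapshot the value needed for logging before the move.
 * Reading old_mem->mem_type after mock_move() would be the
 * use-after-free pattern from the report.
 */
static uint32_t move_and_get_old_type(struct mock_bo *bo,
                                      struct mock_resource *new_mem)
{
    struct mock_resource *old_mem = bo->resource;
    uint32_t old_type;

    assert(old_mem != NULL);
    old_type = old_mem->mem_type;  /* old_mem is still valid here */
    mock_move(bo, new_mem);        /* old_mem is dangling from here on */
    return old_type;
}
```

A caller would log the returned value instead of dereferencing the saved pointer after the move.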
Steps to reproduce are as below.
union drm_amdgpu_gem_create *arg1;
arg1 = malloc(sizeof(union drm_amdgpu_gem_create));
arg1->in.bo_size = 0x8;
arg1->in.alignment = 0x0;
arg1->in.domains = 0x4;
arg1->in.domain_flags = 0x9;
ioctl(fd, 0xc0206440, arg1);
arg1->in.bo_size = 0x7fffffff;
arg1->in.alignment = 0x0;
arg1->in.domains = 0x4;
arg1->in.domain_flags = 0x9;
ioctl(fd, 0xc0206440, arg1);
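For readers who want to check what the raw request number 0xc0206440 in the reproducer stands for, the sketch below splits it into its fields. It assumes the generic Linux ioctl encoding used on x86 (8-bit nr, 8-bit type, 14-bit size, 2-bit direction); ioc_fields, decode_ioctl and encode_ioctl are hypothetical helper names, not kernel or libdrm APIs. The decoded fields (read/write direction, 32-byte argument, type 'd', number 0x40) appear consistent with DRM_IOCTL_AMDGPU_GEM_CREATE, and as far as the uapi header goes, domains 0x4 should correspond to the VRAM domain.

```c
#include <assert.h>
#include <stdint.h>

/* Fields packed into a Linux ioctl request number (generic layout assumed). */
struct ioc_fields {
    unsigned dir;   /* bits 30-31: 1 = write, 2 = read, 3 = both */
    unsigned size;  /* bits 16-29: size of the argument in bytes */
    unsigned type;  /* bits 8-15: subsystem "magic" character */
    unsigned nr;    /* bits 0-7: command number within the subsystem */
};

/* Split a request number into its four fields. */
static struct ioc_fields decode_ioctl(uint32_t cmd)
{
    struct ioc_fields f;

    f.nr = cmd & 0xff;
    f.type = (cmd >> 8) & 0xff;
    f.size = (cmd >> 16) & 0x3fff;
    f.dir = (cmd >> 30) & 0x3;
    return f;
}

/* Inverse of decode_ioctl(), useful for a round-trip sanity check. */
static uint32_t encode_ioctl(struct ioc_fields f)
{
    assert(f.nr < 256 && f.type < 256 && f.size < (1u << 14) && f.dir < 4);
    return ((uint32_t)f.dir << 30) | ((uint32_t)f.size << 16) |
           ((uint32_t)f.type << 8) | (uint32_t)f.nr;
}
```

Calling decode_ioctl(0xc0206440) yields dir 3, size 32, type 'd' and nr 0x40.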
The KASAN report is as follows:
==================================================================
BUG: KASAN: slab-use-after-free in amdgpu_bo_move+0x1479/0x1550
Read of size 4 at addr ffff88800f5bee80 by task syz-executor/219
Call Trace:
<TASK>
amdgpu_bo_move+0x1479/0x1550
ttm_bo_handle_move_mem+0x4d0/0x700
ttm_mem_evict_first+0x945/0x1230
ttm_bo_mem_space+0x6c7/0x940
ttm_bo_validate+0x286/0x650
ttm_bo_init_reserved+0x34c/0x490
amdgpu_bo_create+0x94b/0x1610
amdgpu_bo_create_user+0xa3/0x130
amdgpu_gem_create_ioctl+0x4bc/0xc10
drm_ioctl_kernel+0x300/0x410
drm_ioctl+0x648/0xb30
amdgpu_drm_ioctl+0xc8/0x160
</TASK>
Allocated by task 219:
kmalloc_trace+0x211/0x390
amdgpu_vram_mgr_new+0x1d6/0xbe0
ttm_resource_alloc+0xfd/0x1e0
ttm_bo_mem_space+0x255/0x940
ttm_bo_validate+0x286/0x650
ttm_bo_init_reserved+0x34c/0x490
amdgpu_bo_create+0x94b/0x1610
amdgpu_bo_create_user+0xa3/0x130
amdgpu_gem_create_ioctl+0x4bc/0xc10
drm_ioctl_kernel+0x300/0x410
drm_ioctl+0x648/0xb30
amdgpu_drm_ioctl+0xc8/0x160
Freed by task 219:
kfree+0x111/0x2d0
ttm_resource_free+0x17e/0x1e0
ttm_bo_move_accel_cleanup+0x77e/0x9b0
amdgpu_move_blit+0x3db/0x670
amdgpu_bo_move+0xfa2/0x1550
ttm_bo_handle_move_mem+0x4d0/0x700
ttm_mem_evict_first+0x945/0x1230
ttm_bo_mem_space+0x6c7/0x940
ttm_bo_validate+0x286/0x650
ttm_bo_init_reserved+0x34c/0x490
amdgpu_bo_create+0x94b/0x1610
amdgpu_bo_create_user+0xa3/0x130
amdgpu_gem_create_ioctl+0x4bc/0xc10
drm_ioctl_kernel+0x300/0x410
drm_ioctl+0x648/0xb30
amdgpu_drm_ioctl+0xc8/0x160
The buggy address belongs to the object at ffff88800f5bee70
which belongs to the cache kmalloc-96 of size 96
The buggy address is located 16 bytes inside of
freed 96-byte region [ffff88800f5bee70, ffff88800f5beed0)
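The address arithmetic in the last lines of the report can be double-checked with a small sketch. offset_in_object and mock_ttm_resource_head are hypothetical names; the mock struct is only an assumption about the leading fields of struct ttm_resource on a 64-bit build (start, size, mem_type), used to show why a 4-byte read at offset 16 would line up with old_mem->mem_type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical mock; assumed to mirror the first fields of
 * struct ttm_resource on a 64-bit kernel. Not the real definition.
 */
struct mock_ttm_resource_head {
    uint64_t start;
    uint64_t size;
    uint32_t mem_type;
};

/* Offset of a faulting address inside the slab object that contains it. */
static uint64_t offset_in_object(uint64_t addr, uint64_t base,
                                 uint64_t obj_size)
{
    assert(addr >= base && addr - base < obj_size);
    return addr - base;
}
```

With the values from the report, offset_in_object(0xffff88800f5bee80, 0xffff88800f5bee70, 96) gives 16, which matches both the "16 bytes inside" line and offsetof(struct mock_ttm_resource_head, mem_type) under the assumed layout.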
Should you need any more information, please do not hesitate to contact us.
Best regards,
Joonkyo Jung