Re: Reporting a slab-use-after-free in amdgpu

Hi guys,

yeah, that is a well-known issue, but it is actually completely harmless.

What happens is that a trace function accesses a stale pointer to print some additional value into the trace log.

That memory might have been reused and the information is now outdated, but the worst thing that can happen is that the value in the logs is nonsense.

I have a patch in the queue to fix this; it should be upstream and backported in the next few weeks.

Regards,
Christian.

On 29.04.24 at 04:15, Joonkyo Jung wrote:
Hi,

Thank you for patching two of the bugs we have reported!
I was just wondering if there's any news on the one other bug we have reported:
BUG: KASAN: slab-use-after-free in amdgpu_bo_move+0x1479/0x1550.

I see that a GitLab issue (https://gitlab.freedesktop.org/drm/amd/-/issues/3171) was created for this bug,
and that Christian has written a patch for it (https://lists.freedesktop.org/archives/amd-gfx/2024-March/105680.html).
However, the issue does not seem to be resolved yet, and the patch has not yet been pushed to the mainline branches.
So I was wondering, do you have any plans to push this patch? If so, would it be possible for us to get a Reported-by tag on it?

Best,
Joonkyo

On Fri, Mar 8, 2024 at 4:32 PM Joonkyo Jung <joonkyoj@xxxxxxxxxxxx> wrote:
Hi Vitaly,

No worries, thank you for working on the patches!

I have also confirmed that with the inflight patch, issue No.1 (use-after-free) seems to be resolved.
However, I have reproduced issue No.3 (slab-use-after-free) even with the patch for issue No.1 applied - if it's the first program tested after reboot.
(i.e., if any other bugs are tested before the slab-use-after-free, it does not reproduce).

Could you check if the bug reproduces in this condition for you too?
I will check and see why this is happening and update you if I have something new.

Thank you!

Best,
Joonkyo



On Fri, Mar 8, 2024 at 12:45 PM vitaly prosyak <vprosyak@xxxxxxx> wrote:

Hi Joonkyo,
Sorry for the delay.
Yes, sure, I reproduced issue 2 (null-ptr-deref in amdgpu) and I will provide the fix soon.
However, issue No. 3 is no longer reproducible once the recent in-flight patch that fixes issue No. 1 is applied.

Do you see the same behavior?

Thanks in advance, Vitaly

On 2024-03-07 20:18, Joonkyo Jung wrote:
Hello,
Thank you for patching the first bug we sent!

Just touching base to ask whether there has been any update on our other two bugs.
They were sent in emails titled
"Reporting a slab-use-after-free in amdgpu" (this one)
"Reporting a null-ptr-deref in amdgpu".

Thank you! 

Best, 
Joonkyo


On Fri, Feb 16, 2024 at 6:22 PM Joonkyo Jung <joonkyoj@xxxxxxxxxxxx> wrote:
Hello,

We would like to report a slab-use-after-free bug in the AMDGPU DRM driver in Linux kernel v6.8-rc4, which we found with our customized Syzkaller.
The bug can be triggered by sending two ioctls to the AMDGPU DRM driver in succession.

At the start of amdgpu_bo_move, the pointer struct ttm_resource *old_mem = bo->resource is saved.
As the allocation and free stacks below show, within the same call to amdgpu_bo_move,
amdgpu_move_blit eventually frees bo->resource in ttm_bo_move_accel_cleanup via ttm_bo_wait_free_node(bo, man->use_tt).
amdgpu_bo_move nevertheless continues and reaches trace_amdgpu_bo_move(abo, new_mem->mem_type, old_mem->mem_type) at the end, which reads the already freed old_mem and causes the use-after-free.
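The pattern described above can be illustrated in user space. The following is only a minimal sketch under stated assumptions: struct resource, struct buffer, move_buffer and move_and_report are hypothetical stand-ins, not the kernel's types or functions, and the approach shown (copying the old memory type before the move) is one common way to avoid this kind of stale read, not necessarily what the queued patch does.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's resource and buffer object types. */
struct resource { uint32_t mem_type; };
struct buffer   { struct resource *res; };

/* Stand-in for the move path: installs a new resource and frees the old one. */
static int move_buffer(struct buffer *bo, uint32_t new_type)
{
    struct resource *new_res = malloc(sizeof(*new_res));

    if (!new_res)
        return -1;
    new_res->mem_type = new_type;
    free(bo->res);              /* the old resource must not be read after this */
    bo->res = new_res;
    return 0;
}

/* Reports the old memory type without dereferencing the freed resource. */
static int move_and_report(struct buffer *bo, uint32_t new_type,
                           uint32_t *old_type_out)
{
    uint32_t old_type = bo->res->mem_type;  /* snapshot taken before the move */
    int ret = move_buffer(bo, new_type);

    if (ret)
        return ret;
    *old_type_out = old_type;   /* uses the copy, not the stale pointer */
    return 0;
}
```

Reading bo->res->mem_type through a pointer saved before move_buffer() would be the user-space equivalent of the reported read; the copy taken up front keeps the logged value valid.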

The steps to reproduce are as follows:
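/* fd is an open file descriptor for the AMDGPU DRM device (opening it is not shown in this report). */
union drm_amdgpu_gem_create *arg1;

arg1 = malloc(sizeof(union drm_amdgpu_gem_create));

/* First buffer object: 8 bytes in VRAM. */
arg1->in.bo_size = 0x8;
arg1->in.alignment = 0x0;
arg1->in.domains = 0x4;        /* AMDGPU_GEM_DOMAIN_VRAM */
arg1->in.domain_flags = 0x9;   /* CPU_ACCESS_REQUIRED | VRAM_CLEARED */
ioctl(fd, 0xc0206440, arg1);   /* DRM_IOCTL_AMDGPU_GEM_CREATE */

/* Second buffer object: same settings, but just under 2 GiB. */
arg1->in.bo_size = 0x7fffffff;
arg1->in.alignment = 0x0;
arg1->in.domains = 0x4;
arg1->in.domain_flags = 0x9;
ioctl(fd, 0xc0206440, arg1);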
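For readers who want to check the raw request number 0xc0206440 used above, here is a small sketch that splits it into its fields. It assumes the generic Linux _IOC encoding (nr in bits 0-7, type in bits 8-15, size in bits 16-29, direction in bits 30-31), which is what x86 uses; the helper names are hypothetical.

```c
#include <stdint.h>

/* Field extraction for the generic Linux _IOC request encoding (assumed). */
static uint32_t ioc_nr(uint32_t cmd)   { return cmd & 0xffu; }            /* command number */
static uint32_t ioc_type(uint32_t cmd) { return (cmd >> 8) & 0xffu; }     /* subsystem type */
static uint32_t ioc_size(uint32_t cmd) { return (cmd >> 16) & 0x3fffu; }  /* argument size  */
static uint32_t ioc_dir(uint32_t cmd)  { return (cmd >> 30) & 0x3u; }     /* 1=write 2=read */
```

Under that assumption, 0xc0206440 decodes to direction 3 (read and write), type 'd' (0x64, the DRM type), number 0x40 (the first driver-specific DRM command) and size 0x20, which matches the 32-byte union passed in arg1.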

The KASAN report is as follows:
==================================================================
BUG: KASAN: slab-use-after-free in amdgpu_bo_move+0x1479/0x1550
Read of size 4 at addr ffff88800f5bee80 by task syz-executor/219
Call Trace:
 <TASK>
 amdgpu_bo_move+0x1479/0x1550
 ttm_bo_handle_move_mem+0x4d0/0x700
 ttm_mem_evict_first+0x945/0x1230
 ttm_bo_mem_space+0x6c7/0x940
 ttm_bo_validate+0x286/0x650
 ttm_bo_init_reserved+0x34c/0x490
 amdgpu_bo_create+0x94b/0x1610
 amdgpu_bo_create_user+0xa3/0x130
 amdgpu_gem_create_ioctl+0x4bc/0xc10
 drm_ioctl_kernel+0x300/0x410
 drm_ioctl+0x648/0xb30
 amdgpu_drm_ioctl+0xc8/0x160
 </TASK>

Allocated by task 219:
 kmalloc_trace+0x211/0x390
 amdgpu_vram_mgr_new+0x1d6/0xbe0
 ttm_resource_alloc+0xfd/0x1e0
 ttm_bo_mem_space+0x255/0x940
 ttm_bo_validate+0x286/0x650
 ttm_bo_init_reserved+0x34c/0x490
 amdgpu_bo_create+0x94b/0x1610
 amdgpu_bo_create_user+0xa3/0x130
 amdgpu_gem_create_ioctl+0x4bc/0xc10
 drm_ioctl_kernel+0x300/0x410
 drm_ioctl+0x648/0xb30
 amdgpu_drm_ioctl+0xc8/0x160

Freed by task 219:
 kfree+0x111/0x2d0
 ttm_resource_free+0x17e/0x1e0
 ttm_bo_move_accel_cleanup+0x77e/0x9b0
 amdgpu_move_blit+0x3db/0x670
 amdgpu_bo_move+0xfa2/0x1550
 ttm_bo_handle_move_mem+0x4d0/0x700
 ttm_mem_evict_first+0x945/0x1230
 ttm_bo_mem_space+0x6c7/0x940
 ttm_bo_validate+0x286/0x650
 ttm_bo_init_reserved+0x34c/0x490
 amdgpu_bo_create+0x94b/0x1610
 amdgpu_bo_create_user+0xa3/0x130
 amdgpu_gem_create_ioctl+0x4bc/0xc10
 drm_ioctl_kernel+0x300/0x410
 drm_ioctl+0x648/0xb30
 amdgpu_drm_ioctl+0xc8/0x160

The buggy address belongs to the object at ffff88800f5bee70
 which belongs to the cache kmalloc-96 of size 96
The buggy address is located 16 bytes inside of
 freed 96-byte region [ffff88800f5bee70, ffff88800f5beed0)
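The address arithmetic in the report can be checked with a short sketch. The helpers and struct resource_head are hypothetical; the struct only mirrors what the leading fields of the freed object are assumed to look like (two 8-byte fields followed by a 32-bit memory type), and the real kernel layout may differ between versions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the assumed leading fields of the freed object. */
struct resource_head {
    uint64_t start;
    uint64_t size;
    uint32_t mem_type;
};

/* Byte offset of a faulting address inside its slab object. */
static uint64_t offset_in_object(uint64_t addr, uint64_t obj_start)
{
    return addr - obj_start;
}

/* Exclusive end address of a slab object of the given size. */
static uint64_t object_end(uint64_t obj_start, uint64_t obj_size)
{
    return obj_start + obj_size;
}
```

With the values from the report, ffff88800f5bee80 minus ffff88800f5bee70 gives the 16-byte offset, and ffff88800f5bee70 plus 96 gives the region end ffff88800f5beed0. If the assumed layout holds, a 4-byte read at offset 16 lines up with the mem_type field that the trace call reads.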

Should you need any more information, please do not hesitate to contact us.

Best regards,
Joonkyo Jung

