RE: [PATCH] drm/amdkfd: Fix shift out-of-bounds issue

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[AMD Official Use Only - General]

-----Original Message-----
From: Chen, Xiaogang <Xiaogang.Chen@xxxxxxx>
Sent: Thursday, October 19, 2023 12:57 AM
To: Yang, Philip <Philip.Yang@xxxxxxx>; Zhang, Yifan <Yifan1.Zhang@xxxxxxx>; Zhang, Jesse(Jie) <Jesse.Zhang@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Yang, Philip <Philip.Yang@xxxxxxx>; Kuehling, Felix <Felix.Kuehling@xxxxxxx>
Subject: Re: [PATCH] drm/amdkfd: Fix shift out-of-bounds issue


On 10/18/2023 9:53 AM, Philip Yang wrote:
>
>
> Caution: This message originated from an External Source. Use proper
> caution when opening attachments, clicking links, or responding.
>
>
> The 255 granularity is from recent Thunk change to increase CWSR area
> granularity.
>
I think we also need to do same restriction at correspondent parts in Thunk when set range granularity. Setting 255 granularity is far big as granularity here is log value of page number.

Regards

Xiaogang

> Thanks for catching this with kernel debug option
> CC_HAS_UBSAN_ARRAY_BOUNDS enabled. Because 1<<prange->granularity is
> used in many places, I think the proper fix should be in function
> svm_range_apply_attrs
>
>         case KFD_IOCTL_SVM_ATTR_GRANULARITY:
> -            prange->granlarity = attrs[i].value;
>
> +            prange->granlarity = attrs[i].value & 0x3F;
>
> BTW, function svm_range_split_by_granularity() is not used anymore,
> forgot the remove it, maybe you are testing on older source code?
>[Zhang, Jesse(Jie)] Thanks Philip, yes, my code is a bit older and  I will remove the unused code.
        Thanks
        Jesse
> Regards,
>
> Philip
>
> On 2023-10-18 09:36, Zhang, Yifan wrote:
>> [AMD Official Use Only - General]
>>
>> Hi Jesse,
>>
>> This patch is only a WA for the error log. How is this issue reproduced ? 255 looks like an invalid value for a prange->granularity, it is better to root cause who set it in the first place.
>>
>> BRs,
>> Yifan
>>
>> -----Original Message-----
>> From: Jesse Zhang<jesse.zhang@xxxxxxx>
>> Sent: Wednesday, October 18, 2023 3:50 PM
>> To:amd-gfx@xxxxxxxxxxxxxxxxxxxxx
>> Cc: Deucher, Alexander<Alexander.Deucher@xxxxxxx>; Kuehling,
>> Felix<Felix.Kuehling@xxxxxxx>; Yang, Philip<Philip.Yang@xxxxxxx>;
>> Zhang, Yifan<Yifan1.Zhang@xxxxxxx>; Zhang,
>> Jesse(Jie)<Jesse.Zhang@xxxxxxx>; Zhang,
>> Jesse(Jie)<Jesse.Zhang@xxxxxxx>
>> Subject: [PATCH] drm/amdkfd: Fix shift out-of-bounds issue
>>
>> [  567.613292] shift exponent 255 is too large for 64-bit type 'long unsigned int'
>> [  567.614498] CPU: 5 PID: 238 Comm: kworker/5:1 Tainted: G           OE      6.2.0-34-generic #34~22.04.1-Ubuntu
>> [  567.614502] Hardware name: AMD Splinter/Splinter-RPL, BIOS WS43927N_871 09/25/2023 [  567.614504] Workqueue: events send_exception_work_handler [amdgpu] [  567.614748] Call Trace:
>> [  567.614750]  <TASK>
>> [  567.614753]  dump_stack_lvl+0x48/0x70 [  567.614761]
>> dump_stack+0x10/0x20 [  567.614763]
>> __ubsan_handle_shift_out_of_bounds+0x156/0x310
>> [  567.614769]  ? srso_alias_return_thunk+0x5/0x7f [  567.614773]  ?
>> update_sd_lb_stats.constprop.0+0xf2/0x3c0
>> [  567.614780]  svm_range_split_by_granularity.cold+0x2b/0x34
>> [amdgpu] [  567.615047]  ? srso_alias_return_thunk+0x5/0x7f [
>> 567.615052]  svm_migrate_to_ram+0x185/0x4d0 [amdgpu] [  567.615286]
>> do_swap_page+0x7b6/0xa30 [  567.615291]  ?
>> srso_alias_return_thunk+0x5/0x7f [  567.615294]  ?
>> __free_pages+0x119/0x130 [  567.615299]  handle_pte_fault+0x227/0x280
>> [  567.615303]  __handle_mm_fault+0x3c0/0x720 [  567.615311]
>> handle_mm_fault+0x119/0x330 [  567.615314]  ?
>> lock_mm_and_find_vma+0x44/0x250 [  567.615318]
>> do_user_addr_fault+0x1a9/0x640 [  567.615323]
>> exc_page_fault+0x81/0x1b0 [  567.615328]
>> asm_exc_page_fault+0x27/0x30 [  567.615332] RIP:
>> 0010:__get_user_8+0x1c/0x30
>>
>> Signed-off-by: Jesse Zhang<Jesse.Zhang@xxxxxxx>
>> ---
>>   drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> index 7b81233bc9ae..f5e0bccc6d71 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> @@ -1169,7 +1169,7 @@ svm_range_split_by_granularity(struct kfd_process *p, struct mm_struct *mm,
>>           * PTE will be used for whole range, this reduces the number of PTE
>>           * updated and the L1 TLB space used for translation.
>>           */
>> -       size = 1UL << prange->granularity;
>> +       size = 1UL << (prange->granularity & 0x3f);
>>          start = ALIGN_DOWN(addr, size);
>>          last = ALIGN(addr + 1, size) - 1;
>>
>> --
>> 2.25.1
>>




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux