[AMD Public Use]
Thanks Lyude for testing the patch.
Are you referring to this issue [1]?
Is it reproducible after applying this patch as well?
From: Lyude Paul <lyude@xxxxxxxxxx>
Sent: Friday, March 5, 2021 6:08 PM
To: Jacob, Anson <Anson.Jacob@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx <amd-gfx@xxxxxxxxxxxxxxxxxxxxx>
Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Kuehling, Felix <Felix.Kuehling@xxxxxxx>
Subject: Re: [PATCH] drm/amdkfd: Fix UBSAN shift-out-of-bounds warning
Tested-by: Lyude Paul <lyude@xxxxxxxxxx>
That just leaves the KASAN error from read_indirect_azalia_reg, thanks for the
fix!
On Thu, 2021-03-04 at 15:08 -0500, Anson Jacob wrote:
> If get_num_sdma_queues or get_num_xgmi_sdma_queues is 0, we end up
> doing a shift operation where the number of bits shifted equals the
> number of bits in the operand. This behaviour is undefined.
>
> Set sdma_bitmap or xgmi_sdma_bitmap to ULLONG_MAX if the
> count is >= the number of bits in the operand.
>
> Bug: https://nam11.safelinks.protection.outlook.com/?url=
> Reported-by: Lyude Paul <lyude@xxxxxxxxxx>
> Signed-off-by: Anson Jacob <Anson.Jacob@xxxxxxx>
> Reviewed-by: Alex Deucher <alexander.deucher@xxxxxxx>
> Reviewed-by: Felix Kuehling <Felix.Kuehling@xxxxxxx>
> ---
> .../drm/amd/amdkfd/kfd_device_queue_manager.c | 17 +++++++++++++++--
> 1 file changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index c37e9c4b1fb4..e7a3c496237f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -1128,6 +1128,9 @@ static int set_sched_resources(struct device_queue_manager *dqm)
>
>  static int initialize_cpsch(struct device_queue_manager *dqm)
>  {
> +	uint64_t num_sdma_queues;
> +	uint64_t num_xgmi_sdma_queues;
> +
>  	pr_debug("num of pipes: %d\n", get_pipes_per_mec(dqm));
>
>  	mutex_init(&dqm->lock_hidden);
> @@ -1136,8 +1139,18 @@ static int initialize_cpsch(struct device_queue_manager *dqm)
>  	dqm->active_cp_queue_count = 0;
>  	dqm->gws_queue_count = 0;
>  	dqm->active_runlist = false;
> -	dqm->sdma_bitmap = ~0ULL >> (64 - get_num_sdma_queues(dqm));
> -	dqm->xgmi_sdma_bitmap = ~0ULL >> (64 - get_num_xgmi_sdma_queues(dqm));
> +
> +	num_sdma_queues = get_num_sdma_queues(dqm);
> +	if (num_sdma_queues >= BITS_PER_TYPE(dqm->sdma_bitmap))
> +		dqm->sdma_bitmap = ULLONG_MAX;
> +	else
> +		dqm->sdma_bitmap = (BIT_ULL(num_sdma_queues) - 1);
> +
> +	num_xgmi_sdma_queues = get_num_xgmi_sdma_queues(dqm);
> +	if (num_xgmi_sdma_queues >= BITS_PER_TYPE(dqm->xgmi_sdma_bitmap))
> +		dqm->xgmi_sdma_bitmap = ULLONG_MAX;
> +	else
> +		dqm->xgmi_sdma_bitmap = (BIT_ULL(num_xgmi_sdma_queues) - 1);
>
>  	INIT_WORK(&dqm->hw_exception_work, kfd_process_hw_exception);
>
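
For reference, the clamped mask computation in the quoted patch can be
sketched as standalone userspace C. This is only an illustration under
stated assumptions: low_bits_mask is a hypothetical helper name, not a
function in the driver, and UINT64_MAX and UINT64_C(1) << n stand in for
the kernel's ULLONG_MAX and BIT_ULL(n).

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of amdkfd): return a 64-bit mask with
 * the low n bits set.
 *
 * The old expression, ~0ULL >> (64 - n), shifts by 64 when n == 0.
 * Shifting a 64-bit operand by 64 or more is undefined behaviour in C,
 * which is what UBSAN reported.
 */
static uint64_t low_bits_mask(uint64_t n)
{
	/* Clamp: a count of 64 or more means every bit is set. */
	if (n >= 64)
		return UINT64_MAX;

	/* Here n is in 0..63, so the shift is always well defined. */
	return (UINT64_C(1) << n) - 1;
}
```

With this helper, a count of 0 yields an empty mask (0), a count of 4
yields 0xf, and a count of 64 or more yields all ones, so a shift by the
full operand width never occurs.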
--
Sincerely,
Lyude Paul (she/her)
Software Engineer at Red Hat
Note: I deal with a lot of emails and have a lot of bugs on my plate. If you've
asked me a question, are waiting for a review/merge on a patch, etc. and I
haven't responded in a while, please feel free to send me another email to check
on my status. I don't bite!
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx