On 11/28/2024 8:12 PM, Felix Kuehling wrote: > > > On 2024-11-27 22:45, Lazar, Lijo wrote: >> >> >> On 11/28/2024 5:43 AM, Felix Kuehling wrote: >>> >>> On 2024-11-18 00:34, Lijo Lazar wrote: >>>> Write pointer could be 32-bit or 64-bit. Use the correct size during >>>> initialization. >>>> >>>> Signed-off-by: Lijo Lazar <lijo.lazar@xxxxxxx> >>>> --- >>>> drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c b/drivers/ >>>> gpu/drm/amd/amdkfd/kfd_kernel_queue.c >>>> index 4843dcb9a5f7..d6037577c532 100644 >>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c >>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c >>>> @@ -125,7 +125,7 @@ static bool kq_initialize(struct kernel_queue *kq, >>>> struct kfd_node *dev, >>>> memset(kq->pq_kernel_addr, 0, queue_size); >>>> memset(kq->rptr_kernel, 0, sizeof(*kq->rptr_kernel)); >>>> - memset(kq->wptr_kernel, 0, sizeof(*kq->wptr_kernel)); >>>> + memset(kq->wptr_kernel, 0, dev->kfd->device_info.doorbell_size); >>> >>> It would be safer and cleaner to just initialize kq->wptr64_kernel, >>> which is always 64 bit and aliases kq->wptr_kernel. >>> >> >> It's done this way because of aliasing. The size requested is >> doorbell_size which could be 4 or 8 bytes. >> >> By safer, do you mean to have an if..else check in case the aliasing is >> taken out? > > Cleaner because the size of the memset would match the size of the variable. And safer because it clears the entire wptr, regardless of the doorbell size. > > That said, it doesn't really matter because the whole kq structure is allocated with kzalloc just before calling kq_initialize. So maybe we should just remove all those redundant memset(kq->field, 0, size) calls. > This is not related to kzalloc of kq structure. This is actually initializing write pointer value to 0. The sequence is - Allocate memory for write pointer (kfd_gtt_sa_allocate) Assign kq wptr to the allocated pointer Set wptr to 0 (Initial write pointer value). The last step should actually be based on 4 byte or 8byte, this patch is for that. If gtt_sa_allocate is already getting a cleared memory, then this can be skipped as well. Thanks, Lijo > Regards, > Felix > >> >> Thanks, >> Lijo >> >>> Regards, >>> Felix >>> >>> >>> >>>> prop.queue_size = queue_size; >>>> prop.is_interop = false; >> >