[AMD Official Use Only - AMD Internal Distribution Only] Reviewed-by: Mukul Joshi <mukul.joshi@xxxxxxx> > -----Original Message----- > From: SHANMUGAM, SRINIVASAN <SRINIVASAN.SHANMUGAM@xxxxxxx> > Sent: Monday, October 28, 2024 11:12 AM > To: Koenig, Christian <Christian.Koenig@xxxxxxx>; Deucher, Alexander > <Alexander.Deucher@xxxxxxx>; Joshi, Mukul <Mukul.Joshi@xxxxxxx> > Cc: amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Kasiviswanathan, Harish > <Harish.Kasiviswanathan@xxxxxxx>; Kuehling, Felix <Felix.Kuehling@xxxxxxx> > Subject: Re: [PATCH v4] drm/amdkfd: Use dynamic allocation for CU occupancy > array in 'kfd_get_cu_occupancy()' > > Ping? > > On 10/25/2024 8:13 AM, Srinivasan Shanmugam wrote: > > The `kfd_get_cu_occupancy` function previously declared a large > > `cu_occupancy` array as a local variable, which could lead to stack > > overflows due to excessive stack usage. This commit replaces the > > static array allocation with dynamic memory allocation using > > `kcalloc`, thereby reducing the stack size. > > > > This change avoids the risk of stack overflows in kernel space, in > > scenarios where `AMDGPU_MAX_QUEUES` is large. The allocated memory is > > freed using `kfree` before the function returns to prevent memory > > leaks. > > > > Fixes the below with gcc W=1: > > drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c: In function > ‘kfd_get_cu_occupancy’: > > drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c:322:1: warning: the frame > size of 1056 bytes is larger than 1024 bytes [-Wframe-larger-than=] > > 322 | } > > | ^ > > > > Fixes: 6fc91266b798 ("drm/amdkfd: Update logic for CU occupancy > > calculations") > > Cc: Harish Kasiviswanathan <Harish.Kasiviswanathan@xxxxxxx> > > Cc: Felix Kuehling <felix.kuehling@xxxxxxx> > > Cc: Christian König <christian.koenig@xxxxxxx> > > Cc: Alex Deucher <alexander.deucher@xxxxxxx> > > Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@xxxxxxx> > > Suggested-by: Mukul Joshi <mukul.joshi@xxxxxxx> > > --- > > v4: > > - Allocation is moved just before it's needed (Mukul) > > > > drivers/gpu/drm/amd/amdkfd/kfd_process.c | 9 ++++++--- > > 1 file changed, 6 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c > > b/drivers/gpu/drm/amd/amdkfd/kfd_process.c > > index d4aa843aacfd..6bab6fc6a35d 100644 > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c > > @@ -271,11 +271,9 @@ static int kfd_get_cu_occupancy(struct attribute *attr, > char *buffer) > > struct kfd_process *proc = NULL; > > struct kfd_process_device *pdd = NULL; > > int i; > > - struct kfd_cu_occupancy cu_occupancy[AMDGPU_MAX_QUEUES]; > > + struct kfd_cu_occupancy *cu_occupancy; > > u32 queue_format; > > > > - memset(cu_occupancy, 0x0, sizeof(cu_occupancy)); > > - > > pdd = container_of(attr, struct kfd_process_device, attr_cu_occupancy); > > dev = pdd->dev; > > if (dev->kfd2kgd->get_cu_occupancy == NULL) @@ -293,6 +291,10 @@ > > static int kfd_get_cu_occupancy(struct attribute *attr, char *buffer) > > wave_cnt = 0; > > max_waves_per_cu = 0; > > > > + cu_occupancy = kcalloc(AMDGPU_MAX_QUEUES, > sizeof(*cu_occupancy), GFP_KERNEL); > > + if (!cu_occupancy) > > + return -ENOMEM; > > + > > /* > > * For GFX 9.4.3, fetch the CU occupancy from the first XCC in the partition. > > * For AQL queues, because of cooperative dispatch we multiply the > > wave count @@ -318,6 +320,7 @@ static int kfd_get_cu_occupancy(struct > > attribute *attr, char *buffer) > > > > /* Translate wave count to number of compute units */ > > cu_cnt = (wave_cnt + (max_waves_per_cu - 1)) / max_waves_per_cu; > > + kfree(cu_occupancy); > > return snprintf(buffer, PAGE_SIZE, "%d\n", cu_cnt); > > } > >