There is a function get_mec_num that uses the field, but it seems no one calls it; maybe remove it as well.

Regards
Shaoyun.liu

-----Original Message-----
From: amd-gfx [mailto:amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx] On Behalf Of Andres Rodriguez
Sent: Thursday, July 13, 2017 3:54 PM
To: Kuehling, Felix; Jay Cornwall; amd-gfx at lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Fix KFD oversubscription by tracking queues correctly

On 2017-07-13 03:35 PM, Felix Kuehling wrote:
> On 17-07-13 03:15 PM, Jay Cornwall wrote:
>> On Thu, Jul 13, 2017, at 13:36, Andres Rodriguez wrote:
>>> On 2017-07-12 02:26 PM, Jay Cornwall wrote:
>>>> The number of compute queues available to the KFD was erroneously
>>>> calculated as 64. Only the first MEC can execute compute queues and
>>>> it has 32 queue slots.
>>>>
>>>> This caused the oversubscription limit to be calculated
>>>> incorrectly, leading to a missing chained runlist command at the
>>>> end of an oversubscribed runlist.
>>>>
>>>> Change-Id: Ic4a139c04b8a6d025fbb831a0a67e98728bfe461
>>>> Signed-off-by: Jay Cornwall <Jay.Cornwall at amd.com>
>>>> ---
>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>> index 7060daf..aa4006a 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>> @@ -140,7 +140,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
>>>>      /* According to linux/bitmap.h we shouldn't use bitmap_clear if
>>>>       * nbits is not compile time constant
>>>>       */
>>>> -    last_valid_bit = adev->gfx.mec.num_mec
>>>> +    last_valid_bit = 1 /* only first MEC can have compute queues */
>>> Hey Jay,
>>>
>>> Minor nitpick. We already have some similar resource patching in
>>> kgd2kfd_device_init(), and I think it would be good to keep all of
>>> these together.
>> OK. I see shared_resources.num_mec is set to 1 in kgd2kfd_device_init.
>> That's not very clear (the number of MECs doesn't change) and num_mec
>> doesn't appear to be used anywhere except in dead code in kfd_device.c.
>> That code also runs after the queue bitmap setup.
>>
>> How about I remove that field entirely?
> Yeah, that's fine with me.
>
Good with me as well.
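
For readers following the arithmetic in the quoted commit message, here is a minimal standalone sketch of the two counts, not the driver code itself. The struct and both helper names are hypothetical. The 2 MEC, 4 pipes per MEC, 8 queues per pipe split used in the note after the block is an assumption chosen only so that the totals match the 64 and 32 quoted in the thread; the thread does not state the per-pipe breakdown.

```c
#include <assert.h>

/* Hypothetical stand-in for the MEC fields discussed in the thread.
 * The names loosely mirror adev->gfx.mec but this is not the kernel struct.
 */
struct mec_topology {
	unsigned int num_mec;
	unsigned int num_pipe_per_mec;
	unsigned int num_queue_per_pipe;
};

/* Queue slots summed over every MEC. Per the commit message, using this
 * figure for the KFD overstates the compute queues that are available.
 */
static inline unsigned int total_queue_slots(const struct mec_topology *t)
{
	return t->num_mec * t->num_pipe_per_mec * t->num_queue_per_pipe;
}

/* Queue slots on the first MEC only, which is the count the patch
 * switches to by replacing num_mec with a literal 1.
 */
static inline unsigned int first_mec_queue_slots(const struct mec_topology *t)
{
	assert(t->num_mec >= 1); /* there must be at least one MEC */
	return 1 * t->num_pipe_per_mec * t->num_queue_per_pipe;
}
```

With the assumed topology { 2, 4, 8 }, total_queue_slots() gives 64 and first_mec_queue_slots() gives 32, which lines up with the "erroneously calculated as 64" and "32 queue slots" figures above.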