Re: [PATCH 3/3] drm/amdgpu: enable only one compute queue for raven

On 11/9/20 7:57 PM, Alex Deucher wrote:
On Mon, Nov 9, 2020 at 1:12 PM Nirmoy Das <nirmoy.das@xxxxxxx> wrote:
Because of a firmware bug, Raven ASICs can't handle jobs
scheduled to multiple compute queues. So enable only one
compute queue until we have a firmware fix.

Signed-off-by: Nirmoy Das <nirmoy.das@xxxxxxx>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 7 +++++++
  1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 97a8f786cf85..9352fcb77fe9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -812,6 +812,13 @@ void amdgpu_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v)
  int amdgpu_gfx_get_num_kcq(struct amdgpu_device *adev)
  {
         if (amdgpu_num_kcq == -1) {
+               /* raven firmware currently can not load balance jobs
+                * among multiple compute queues. Enable only one
+                * compute queue till we have a firmware fix.
+                */
+               if (adev->asic_type == CHIP_RAVEN)
+                       return 1;
+


Hi Alex,


I think this is fine as a workaround for now, but it would be worth
checking if the issues are only between queues on the same pipe or
pipes on an MEC.  E.g., can we safely enable one queue per MEC?  What
about one queue per pipe?


Guchun/Aaron's test machine with a recent VBIOS (113-PICASSO-117) seems
to pass amdgpu_test with one compute queue.


I can reproduce the compute queue hang even with one queue. With all
queues enabled, the issue seems to appear much faster. So I think the
cases above won't change anything on my test machine, which is running
an older VBIOS (113-PICASSO-115).


I will try to find a test machine with the latest VBIOS to test your suggestions.


Regards,

Nirmoy


Alex


                 return 8;
         } else if (amdgpu_num_kcq > 8 || amdgpu_num_kcq < 0) {
                 dev_warn(adev->dev, "set kernel compute queue number to 8 due to invalid parameter provided by user\n");
--
2.29.0
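
For anyone reading the hunk out of context, here is a minimal standalone sketch of the selection logic it touches. It is not the driver code: the sketch_* names and the SKETCH_MAX_KCQ constant are made up for illustration, and the two return statements after the warning are an assumption, since the patch context above ends at the dev_warn() line.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's asic_type values. */
enum sketch_asic { SKETCH_CHIP_RAVEN, SKETCH_CHIP_OTHER };

/* Upper bound used by the hunk above (the literal 8). */
#define SKETCH_MAX_KCQ 8

/*
 * num_kcq_param plays the role of the amdgpu_num_kcq module parameter;
 * -1 means the user did not set it.
 */
static int sketch_get_num_kcq(enum sketch_asic asic, int num_kcq_param)
{
	if (num_kcq_param == -1) {
		/* Workaround from the patch: one queue on Raven by default. */
		if (asic == SKETCH_CHIP_RAVEN)
			return 1;
		return SKETCH_MAX_KCQ;
	} else if (num_kcq_param > SKETCH_MAX_KCQ || num_kcq_param < 0) {
		fprintf(stderr,
			"set kernel compute queue number to %d due to invalid parameter provided by user\n",
			SKETCH_MAX_KCQ);
		/* Assumption: fall back to the maximum after warning. */
		return SKETCH_MAX_KCQ;
	}
	/* Assumption: a valid user-provided value is used as-is. */
	return num_kcq_param;
}
```

If the tail of the function behaves as assumed, the Raven check only applies when the parameter is left at -1, so an explicit value in the 0..8 range would still be honoured on Raven.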

_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


