Re: [PATCH] drm/amdgpu: Update irq disable flow during unload

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 08.01.24 um 09:13 schrieb Kamal, Asad:
[AMD Official Use Only - General]

Hi Christian,

Thank you for the comment.

This is not normal reset, it is reset done during unload for smu v_13_0_2.

Yeah, but this doesn't explain the rational for this.

IRQ enable/disable should be balanced in hw_init()/hw_fini(), independent of what else you do.

I'm not sure what you are trying to solve but this here is a complete no-go.

Regards,
Christian.


Thanks & Regards
Asad

-----Original Message-----
From: Koenig, Christian <Christian.Koenig@xxxxxxx>
Sent: Monday, January 8, 2024 1:33 PM
To: Kamal, Asad <Asad.Kamal@xxxxxxx>
Subject: Re: [PATCH] drm/amdgpu: Update irq disable flow during unload

Am 05.01.24 um 16:21 schrieb Asad Kamal:
In certain special cases, e.g device reset before module unload, irq
gets disabled as part of reset sequence and won't get enabled back.
Add special check to cover such scenarios
Well complete NAK to that. Resets shouldn't affect the IRQ state at all!

If this is an issue then something else is broken.

Regards,
Christian.

Signed-off-by: Asad Kamal <asad.kamal@xxxxxxx>
Suggested-by: Lijo Lazar <lijo.lazar@xxxxxxx>
---
   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 12 ++++++++++--
   drivers/gpu/drm/amd/amdgpu/soc15.c    | 13 +++++++++++--
   2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 372de9f1ce59..a4e1b9a58679 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -2361,6 +2361,7 @@ static void gmc_v9_0_gart_disable(struct amdgpu_device *adev)
   static int gmc_v9_0_hw_fini(void *handle)
   {
       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+     bool irq_release = true;

       gmc_v9_0_gart_disable(adev);

@@ -2378,9 +2379,16 @@ static int gmc_v9_0_hw_fini(void *handle)
       if (adev->mmhub.funcs->update_power_gating)
               adev->mmhub.funcs->update_power_gating(adev, false);

-     amdgpu_irq_put(adev, &adev->gmc.vm_fault, 0);
+     if (adev->shutdown)
+             irq_release = amdgpu_irq_enabled(adev, &adev->gmc.vm_fault, 0);

-     if (adev->gmc.ecc_irq.funcs &&
+     if (irq_release)
+             amdgpu_irq_put(adev, &adev->gmc.vm_fault, 0);
+
+     if (adev->shutdown)
+             irq_release = amdgpu_irq_enabled(adev, &adev->gmc.ecc_irq, 0);
+
+     if (adev->gmc.ecc_irq.funcs && irq_release &&
               amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__UMC))
               amdgpu_irq_put(adev, &adev->gmc.ecc_irq, 0);

diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c
b/drivers/gpu/drm/amd/amdgpu/soc15.c
index 15033efec2ba..7ee835049d57 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -1266,6 +1266,7 @@ static int soc15_common_hw_init(void *handle)
   static int soc15_common_hw_fini(void *handle)
   {
       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+     bool irq_release = true;

       /* Disable the doorbell aperture and selfring doorbell aperture
        * separately in hw_fini because soc15_enable_doorbell_aperture @@
-1280,10 +1281,18 @@ static int soc15_common_hw_fini(void *handle)

       if (adev->nbio.ras_if &&
           amdgpu_ras_is_supported(adev, adev->nbio.ras_if->block)) {
-             if (adev->nbio.ras &&
+             if (adev->shutdown)
+                     irq_release = amdgpu_irq_enabled(adev,
+&adev->nbio.ras_controller_irq, 0);
+
+             if (adev->nbio.ras && irq_release &&
                   adev->nbio.ras->init_ras_controller_interrupt)
                       amdgpu_irq_put(adev, &adev->nbio.ras_controller_irq, 0);
-             if (adev->nbio.ras &&
+
+             if (adev->shutdown)
+                     irq_release = amdgpu_irq_enabled(adev,
+                                     &adev->nbio.ras_err_event_athub_irq, 0);
+
+             if (adev->nbio.ras && irq_release &&
                   adev->nbio.ras->init_ras_err_event_athub_interrupt)
                       amdgpu_irq_put(adev, &adev->nbio.ras_err_event_athub_irq, 0);
       }




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux