[AMD Official Use Only - General]

Hi Christian,

Thank you for the comment. This is not a normal reset; it is the reset done during unload for smu v_13_0_2.

Thanks & Regards
Asad

-----Original Message-----
From: Koenig, Christian <Christian.Koenig@xxxxxxx>
Sent: Monday, January 8, 2024 1:33 PM
To: Kamal, Asad <Asad.Kamal@xxxxxxx>
Subject: Re: [PATCH] drm/amdgpu: Update irq disable flow during unload

Am 05.01.24 um 16:21 schrieb Asad Kamal:
> In certain special cases, e.g. device reset before module unload, irq
> gets disabled as part of reset sequence and won't get enabled back.
> Add special check to cover such scenarios.

Well, complete NAK to that.

Resets shouldn't affect the IRQ state at all! If this is an issue then something else is broken.

Regards,
Christian.

>
> Signed-off-by: Asad Kamal <asad.kamal@xxxxxxx>
> Suggested-by: Lijo Lazar <lijo.lazar@xxxxxxx>
> ---
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 12 ++++++++++--
>  drivers/gpu/drm/amd/amdgpu/soc15.c    | 13 +++++++++++--
>  2 files changed, 21 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index 372de9f1ce59..a4e1b9a58679 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -2361,6 +2361,7 @@ static void gmc_v9_0_gart_disable(struct amdgpu_device *adev)
>  static int gmc_v9_0_hw_fini(void *handle)
>  {
>  	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> +	bool irq_release = true;
>
>  	gmc_v9_0_gart_disable(adev);
>
> @@ -2378,9 +2379,16 @@ static int gmc_v9_0_hw_fini(void *handle)
>  	if (adev->mmhub.funcs->update_power_gating)
>  		adev->mmhub.funcs->update_power_gating(adev, false);
>
> -	amdgpu_irq_put(adev, &adev->gmc.vm_fault, 0);
> +	if (adev->shutdown)
> +		irq_release = amdgpu_irq_enabled(adev, &adev->gmc.vm_fault, 0);
>
> -	if (adev->gmc.ecc_irq.funcs &&
> +	if (irq_release)
> +		amdgpu_irq_put(adev, &adev->gmc.vm_fault, 0);
> +
> +	if (adev->shutdown)
> +		irq_release = amdgpu_irq_enabled(adev, &adev->gmc.ecc_irq, 0);
> +
> +	if (adev->gmc.ecc_irq.funcs && irq_release &&
>  		amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__UMC))
>  		amdgpu_irq_put(adev, &adev->gmc.ecc_irq, 0);
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
> index 15033efec2ba..7ee835049d57 100644
> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> @@ -1266,6 +1266,7 @@ static int soc15_common_hw_init(void *handle)
>  static int soc15_common_hw_fini(void *handle)
>  {
>  	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> +	bool irq_release = true;
>
>  	/* Disable the doorbell aperture and selfring doorbell aperture
>  	 * separately in hw_fini because soc15_enable_doorbell_aperture
> @@ -1280,10 +1281,18 @@ static int soc15_common_hw_fini(void *handle)
>
>  	if (adev->nbio.ras_if &&
>  	    amdgpu_ras_is_supported(adev, adev->nbio.ras_if->block)) {
> -		if (adev->nbio.ras &&
> +		if (adev->shutdown)
> +			irq_release = amdgpu_irq_enabled(adev,
> +					&adev->nbio.ras_controller_irq, 0);
> +
> +		if (adev->nbio.ras && irq_release &&
>  		    adev->nbio.ras->init_ras_controller_interrupt)
>  			amdgpu_irq_put(adev, &adev->nbio.ras_controller_irq, 0);
> -		if (adev->nbio.ras &&
> +
> +		if (adev->shutdown)
> +			irq_release = amdgpu_irq_enabled(adev,
> +					&adev->nbio.ras_err_event_athub_irq, 0);
> +
> +		if (adev->nbio.ras && irq_release &&
>  		    adev->nbio.ras->init_ras_err_event_athub_interrupt)
>  			amdgpu_irq_put(adev, &adev->nbio.ras_err_event_athub_irq, 0);
>  	}
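
For readers following the thread, the pattern under discussion can be modelled outside the driver. The following is a minimal sketch in plain C, not the amdgpu code: `struct irq_src`, `irq_get`, `irq_put`, `irq_enabled`, `reset_without_reenable` and `hw_fini_guarded` are all hypothetical names, and the sketch assumes a simple reference-counted interrupt source where an unbalanced put is rejected with `-EINVAL`. It shows what the guard in the patch does when a reset has already dropped the reference before teardown runs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for an interrupt source; not the amdgpu struct. */
struct irq_src {
	int refcount;	/* number of outstanding "get" calls */
};

/* The source counts as enabled while at least one reference is held. */
static bool irq_enabled(const struct irq_src *src)
{
	assert(src);
	return src->refcount > 0;
}

/* Take one reference on the source. */
static void irq_get(struct irq_src *src)
{
	assert(src);
	src->refcount++;
}

/* Drop one reference; an unbalanced put is reported as -EINVAL. */
static int irq_put(struct irq_src *src)
{
	if (!irq_enabled(src))
		return -EINVAL;
	src->refcount--;
	return 0;
}

/* Models a reset path that drops the reference and never takes it back. */
static void reset_without_reenable(struct irq_src *src)
{
	(void)irq_put(src);
}

/*
 * Teardown in the style of the patch: during shutdown, only put the
 * reference if the source is still enabled.
 */
static int hw_fini_guarded(struct irq_src *src, bool shutdown)
{
	bool irq_release = true;

	if (shutdown)
		irq_release = irq_enabled(src);

	return irq_release ? irq_put(src) : 0;
}
```

In this model, calling `irq_put` after `reset_without_reenable` fails, while `hw_fini_guarded(src, true)` silently skips the put. The guard avoids the unbalanced put but leaves the reset path's imbalance in place, which appears to be the concern raised in the review above.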