[AMD Official Use Only - Internal Distribution Only]

>-----Original Message-----
>From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Liu, Monk
>Sent: Tuesday, August 4, 2020 2:31 PM
>To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx
>Subject: RE: [PATCH 1/2] drm/amdgpu: fix reload KMD hang on KIQ
>
>Ping ... this is a severe bug fix
>
>_____________________________________
>Monk Liu|GPU Virtualization Team |AMD
>
>
>-----Original Message-----
>From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Liu, Monk
>Sent: Monday, August 3, 2020 9:55 AM
>To: Kuehling, Felix <Felix.Kuehling@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
>Subject: RE: [PATCH 1/2] drm/amdgpu: fix reload KMD hang on KIQ
>
>>>In gfx_v10_0_sw_fini the KIQ ring gets freed. Wouldn't that be the
>>>right place to stop the KIQ
>
>For SRIOV, KIQ (CPC) is never stopped in SW_FINI (the DISABLE on CPC is
>skipped for SRIOV), because SRIOV relies on KIQ to do the world switch.
>
>But this is really a weird bug: the same approach doesn't make KIQ (CP)
>hang on GFX9; only GFX10 needs this patch.
>
>So far I have no better explanation or fix than this one.
>
>_____________________________________
>Monk Liu|GPU Virtualization Team |AMD
>
>
>-----Original Message-----
>From: Kuehling, Felix <Felix.Kuehling@xxxxxxx>
>Sent: Friday, July 31, 2020 9:57 PM
>To: Liu, Monk <Monk.Liu@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
>Subject: Re: [PATCH 1/2] drm/amdgpu: fix reload KMD hang on KIQ
>
>In gfx_v10_0_sw_fini the KIQ ring gets freed. Wouldn't that be the right place
>to stop the KIQ?
>Otherwise KIQ will hang as soon as someone allocates the
>memory that was previously used for the KIQ ring buffer and overwrites it with
>something that's not a valid PM4 packet.
>
>Regards,
> Felix
>
>On 2020-07-31 at 3:51 a.m., Monk Liu wrote:
>> KIQ will hang if we try the steps below:
>> modprobe amdgpu
>> rmmod amdgpu
>> modprobe amdgpu sched_hw_submission=4
>>
>> The cause is that KIQ stays alive even after we unload the KMD, so
>> when the KMD is reloaded KIQ crashes once its registers are programmed
>> with values that differ from the previous configuration (settings such
>> as the HQD address and ring size change easily if we alter
>> sched_hw_submission).
>>
>> The fix is to deactivate KIQ first, before touching any of its
>> registers.
>>
>> Signed-off-by: Monk Liu <Monk.Liu@xxxxxxx>
>> ---
>> drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>> b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>> index db9f1e8..f571e25 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>> @@ -6433,6 +6433,9 @@ static int gfx_v10_0_kiq_init_register(struct amdgpu_ring *ring)
>> struct v10_compute_mqd *mqd = ring->mqd_ptr;
>> int j;
>>
>> +/* activate the queue */
>> +WREG32_SOC15(GC, 0, mmCP_HQD_ACTIVE, 0);
>> +

Could we move the following to here?

	if (RREG32_SOC15(GC, 0, mmCP_HQD_ACTIVE) & 1) {
		WREG32_SOC15(GC, 0, mmCP_HQD_DEQUEUE_REQUEST, 1);
		for (j = 0; j < adev->usec_timeout; j++) {
			if (!(RREG32_SOC15(GC, 0, mmCP_HQD_ACTIVE) & 1))
				break;
			udelay(1);
		}

>> /* disable wptr polling */
>> WREG32_FIELD15(GC, 0, CP_PQ_WPTR_POLL_CNTL, EN, 0);
>>
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
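
For readers following the review comment, the dequeue-and-poll sequence it
refers to can be modelled in user space. This is only a sketch under stated
assumptions, not driver code: struct fake_hqd, fake_read_active() and
fake_hqd_dequeue() are hypothetical names, the "registers" are plain struct
fields rather than RREG32_SOC15/WREG32_SOC15 accesses, and the timeout counts
polls instead of microseconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the HQD registers; not the amdgpu register API. */
struct fake_hqd {
	uint32_t active;          /* bit 0 plays the role of CP_HQD_ACTIVE */
	uint32_t dequeue_request; /* plays the role of CP_HQD_DEQUEUE_REQUEST */
	int polls_until_idle;     /* simulated hardware latency, in polls */
};

/* Simulated register read: once a dequeue is requested, the queue
 * goes idle after polls_until_idle further reads. */
static uint32_t fake_read_active(struct fake_hqd *q)
{
	if (q->dequeue_request && q->active) {
		if (q->polls_until_idle > 0)
			q->polls_until_idle--;
		else
			q->active = 0;
	}
	return q->active;
}

/* Request a dequeue and poll until the queue reports idle or the
 * timeout expires. Returns true if the queue is idle on return. */
static bool fake_hqd_dequeue(struct fake_hqd *q, int timeout)
{
	int j;

	assert(q != NULL);

	if (!(fake_read_active(q) & 1))
		return true; /* already idle, nothing to request */

	q->dequeue_request = 1;
	for (j = 0; j < timeout; j++) {
		if (!(fake_read_active(q) & 1))
			return true;
		/* the driver snippet calls udelay(1) at this point */
	}
	return false; /* still active after the timeout */
}
```

Unlike an unconditional write of 0 to the active bit, this pattern asks the
queue to drain and then waits for it to report idle, and it lets the caller
see a timeout instead of assuming the queue has stopped.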