This is a note to let you know that I've just added the patch titled

    drm/amdgpu: Reset CP_VMID_PREEMPT after trailing fence signaled

to the 6.3-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     drm-amdgpu-reset-cp_vmid_preempt-after-trailing-fence-signaled.patch
and it can be found in the queue-6.3 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.


From 1dbcf770cc2d15baf8a1e8174d6fd014a68b45ca Mon Sep 17 00:00:00 2001
From: Jiadong Zhu <Jiadong.Zhu@xxxxxxx>
Date: Wed, 24 May 2023 11:42:19 +0800
Subject: drm/amdgpu: Reset CP_VMID_PREEMPT after trailing fence signaled

From: Jiadong Zhu <Jiadong.Zhu@xxxxxxx>

commit 1dbcf770cc2d15baf8a1e8174d6fd014a68b45ca upstream.

When MEC executes unmap_queue for mid command buffer preemption, it kicks
the write pointer of the gfx ring, sets CP_VMID_PREEMPT to trigger the
preemption, and waits for CP_VMID_PREEMPT to become zero once the
preemption is done. There is a race condition: PFP may execute the
resetting command before MEC sets CP_VMID_PREEMPT. As a result, a hang
occurs because CP_VMID_PREEMPT stays at 0xffff.

To avoid this, send the command that resets CP_VMID_PREEMPT after the
trailing fence is signaled, and update the gfx write pointer explicitly.

Signed-off-by: Jiadong Zhu <Jiadong.Zhu@xxxxxxx>
Acked-by: Alex Deucher <alexander.deucher@xxxxxxx>
Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx # 6.3.x
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2535
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -5366,10 +5366,6 @@ static int gfx_v9_0_ring_preempt_ib(stru
 	amdgpu_ring_alloc(ring, 13);
 	gfx_v9_0_ring_emit_fence(ring, ring->trail_fence_gpu_addr,
 				 ring->trail_seq, AMDGPU_FENCE_FLAG_EXEC | AMDGPU_FENCE_FLAG_INT);
-	/*reset the CP_VMID_PREEMPT after trailing fence*/
-	amdgpu_ring_emit_wreg(ring,
-			      SOC15_REG_OFFSET(GC, 0, mmCP_VMID_PREEMPT),
-			      0x0);
 
 	/* assert IB preemption, emit the trailing fence */
 	kiq->pmf->kiq_unmap_queues(kiq_ring, ring, PREEMPT_QUEUES_NO_UNMAP,
@@ -5392,6 +5388,10 @@ static int gfx_v9_0_ring_preempt_ib(stru
 		DRM_WARN("ring %d timeout to preempt ib\n", ring->idx);
 	}
 
+	/*reset the CP_VMID_PREEMPT after trailing fence*/
+	amdgpu_ring_emit_wreg(ring,
+			      SOC15_REG_OFFSET(GC, 0, mmCP_VMID_PREEMPT),
+			      0x0);
 	amdgpu_ring_commit(ring);
 
 	/* deassert preemption condition */


Patches currently in stable-queue which might be from Jiadong.Zhu@xxxxxxx are

queue-6.3/drm-amdgpu-program-gds-backup-address-as-zero-if-no-gds-allocated.patch
queue-6.3/drm-amdgpu-modify-indirect-buffer-packages-for-resubmission.patch
queue-6.3/drm-amdgpu-implement-gfx9-patch-functions-for-resubmission.patch
queue-6.3/drm-amdgpu-reset-cp_vmid_preempt-after-trailing-fence-signaled.patch
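
Illustrative note (not part of the patch): the ordering problem described in
the commit message can be sketched as a small user-space toy model. Nothing
here is driver code. The struct, the function names, and the reduction of
MEC, PFP and the trailing fence to plain function calls are all assumptions
made for illustration; only the register name and the 0xffff value come from
the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model (hypothetical): one variable stands in for CP_VMID_PREEMPT. */
struct cp_model {
	uint32_t vmid_preempt;      /* models the CP_VMID_PREEMPT register */
	bool trail_fence_signaled;  /* models the trailing fence state */
};

/* MEC side of unmap_queue: set the register to request preemption. */
void mec_set_preempt(struct cp_model *cp)
{
	cp->vmid_preempt = 0xffff;
}

/* Preemption has finished and the trailing fence is signaled. */
void trailing_fence_signal(struct cp_model *cp)
{
	cp->trail_fence_signaled = true;
}

/* PFP executes the register write that resets the register to zero. */
void pfp_reset_preempt(struct cp_model *cp)
{
	cp->vmid_preempt = 0x0;
}

/*
 * Old ordering with the race lost: the reset was queued ahead of the
 * unmap request, so PFP may run it before MEC sets the register.
 * The register is then never cleared again.
 */
uint32_t run_old_order_race_lost(void)
{
	struct cp_model cp = { 0 };

	pfp_reset_preempt(&cp);  /* reset runs too early, has no effect */
	mec_set_preempt(&cp);    /* MEC sets the register afterwards */
	return cp.vmid_preempt;  /* stuck at the value MEC wrote */
}

/*
 * New ordering: the reset is only emitted once the trailing fence has
 * been observed, which in this model happens after MEC set the register.
 */
uint32_t run_new_order(void)
{
	struct cp_model cp = { 0 };

	mec_set_preempt(&cp);
	trailing_fence_signal(&cp);
	assert(cp.trail_fence_signaled); /* reset is emitted only after this */
	pfp_reset_preempt(&cp);
	return cp.vmid_preempt;
}
```

In this model run_old_order_race_lost() leaves the modelled register at the
value MEC wrote, while run_new_order() ends with it cleared, which is the
effect of moving the amdgpu_ring_emit_wreg() call below the fence wait.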