Patch "drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobs" has been added to the 6.1-stable tree

This is a note to let you know that I've just added the patch titled

    drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobs

to the 6.1-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     drm-amdgpu-force-signal-hw_fences-that-are-embedded-.patch
and it can be found in the queue-6.1 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 9b2bd985d8d074d8061b433eb03448dd428526a4
Author: YuBiao Wang <YuBiao.Wang@xxxxxxx>
Date:   Thu Mar 16 11:30:32 2023 +0800

    drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobs
    
    [ Upstream commit 033c56474acf567a450f8bafca50e0b610f2b716 ]
    
    [Why]
    For engines not supporting soft reset, i.e. VCN, there will be a failed
    ib test before mode 1 reset during asic reset. The fences in this case
    are never signaled and next time when we try to free the sa_bo, kernel
    will hang.
    
    [How]
    During pre_asic_reset, driver will clear job fences and afterwards the
    fences' refcount will be reduced to 1. For drm_sched_jobs it will be
    released in job_free_cb, and for non-sched jobs like ib_test, it's meant
    to be released in sa_bo_free but only when the fences are signaled. So
    we have to force signal the non_sched bad job's fence during
    pre_asic_reset or the clear is not complete.
    
    Signed-off-by: YuBiao Wang <YuBiao.Wang@xxxxxxx>
    Acked-by: Luben Tuikov <luben.tuikov@xxxxxxx>
    Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 6fdb679321d0d..3cc1929285fc0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -624,6 +624,15 @@ void amdgpu_fence_driver_clear_job_fences(struct amdgpu_ring *ring)
 		ptr = &ring->fence_drv.fences[i];
 		old = rcu_dereference_protected(*ptr, 1);
 		if (old && old->ops == &amdgpu_job_fence_ops) {
+			struct amdgpu_job *job;
+
+			/* For non-scheduler bad job, i.e. failed ib test, we need to signal
+			 * it right here or we won't be able to track them in fence_drv
+			 * and they will remain unsignaled during sa_bo free.
+			 */
+			job = container_of(old, struct amdgpu_job, hw_fence);
+			if (!job->base.s_fence && !dma_fence_is_signaled(old))
+				dma_fence_signal(old);
 			RCU_INIT_POINTER(*ptr, NULL);
 			dma_fence_put(old);
 		}
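
To illustrate the reasoning in the [How] section, here is a minimal user-space
sketch of the fence lifecycle it describes. It is not kernel code: every toy_*
name is hypothetical, the fence is reduced to a refcount plus a signaled flag,
and the ring slots point at jobs directly instead of recovering the job with
container_of() as the patch does. The force_signal parameter is an assumption
added only so the behaviour before and after the patch can be compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct dma_fence: only a refcount and a signaled flag. */
struct toy_fence {
	int refcount;
	bool signaled;
};

/* Toy stand-in for struct amdgpu_job. has_sched_fence models
 * job->base.s_fence != NULL, i.e. the job went through the scheduler.
 */
struct toy_job {
	struct toy_fence hw_fence;
	bool has_sched_fence;
};

#define TOY_SLOTS 4

/* Toy stand-in for the per-ring fence slots in fence_drv. */
struct toy_ring {
	struct toy_job *slots[TOY_SLOTS];
};

static void toy_fence_put(struct toy_fence *f)
{
	assert(f->refcount > 0);
	f->refcount--;
}

/* Models the clear loop touched by the patch. With force_signal set,
 * an unsignaled fence of a non-scheduler job is signaled before the
 * ring drops its slot reference; with it clear, the fence is only
 * dropped, which is the pre-patch behaviour.
 */
static void toy_clear_job_fences(struct toy_ring *ring, bool force_signal)
{
	for (size_t i = 0; i < TOY_SLOTS; i++) {
		struct toy_job *job = ring->slots[i];

		if (!job)
			continue;
		if (force_signal && !job->has_sched_fence &&
		    !job->hw_fence.signaled)
			job->hw_fence.signaled = true;
		ring->slots[i] = NULL;
		toy_fence_put(&job->hw_fence);
	}
}

/* Models the sa_bo side: the last reference is released only once the
 * fence is signaled. Returns true if the release happened.
 */
static bool toy_sa_bo_try_free(struct toy_fence *f)
{
	if (!f->signaled)
		return false;
	toy_fence_put(f);
	return true;
}
```

In this model, a failed ib test job starts with a refcount of 2 (one held by
the ring slot, one by the sa_bo side). Clearing without force_signal leaves the
refcount at 1 with the fence unsignaled, so toy_sa_bo_try_free() can never
release it. Clearing with force_signal lets the release proceed and the
refcount reach 0.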


