Re: [PATCH 2/2] drm: add tdr support for embeded hw_fence

Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx> · Thu, 22 Jul 2021 13:50:09 -0400

On 2021-07-22 1:27 p.m., Jingwen Chen wrote:
On Thu Jul 22, 2021 at 01:17:13PM -0400, Andrey Grodzovsky wrote:
On 2021-07-22 12:47 p.m., Jingwen Chen wrote:
On Thu Jul 22, 2021 at 06:24:28PM +0200, Christian König wrote:
Am 22.07.21 um 16:45 schrieb Andrey Grodzovsky:
On 2021-07-22 6:45 a.m., Jingwen Chen wrote:
On Wed Jul 21, 2021 at 12:53:51PM -0400, Andrey Grodzovsky wrote:
On 2021-07-20 11:13 p.m., Jingwen Chen wrote:
[Why]
After embeded hw_fence to amdgpu_job, we need to add tdr support
for this feature.

[How]
1. Add a resubmit_flag for resubmit jobs.
2. Clear job fence from RCU and force complete vm flush fences in
       pre_asic_reset
3. skip dma_fence_get for resubmit jobs and add a dma_fence_put
       for guilty jobs.

Signed-off-by: Jack Zhang <Jack.Zhang1@xxxxxxx>
Signed-off-by: Jingwen Chen <Jingwen.Chen2@xxxxxxx>
---
     drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 +++++++++++-
     drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  | 16 +++++++++++-----
     drivers/gpu/drm/amd/amdgpu/amdgpu_job.c    |  4 +++-
     drivers/gpu/drm/scheduler/sched_main.c     |  1 +
     include/drm/gpu_scheduler.h                |  1 +
     5 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 40461547701a..fe0237f72a09 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4382,7 +4382,7 @@ int amdgpu_device_mode1_reset(struct
amdgpu_device *adev)
     int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
                      struct amdgpu_reset_context *reset_context)
     {
-    int i, r = 0;
+    int i, j, r = 0;
         struct amdgpu_job *job = NULL;
         bool need_full_reset =
             test_bit(AMDGPU_NEED_FULL_RESET, &reset_context->flags);
@@ -4406,6 +4406,16 @@ int
amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
             if (!ring || !ring->sched.thread)
                 continue;
+        /*clear job fence from fence drv to avoid force_completion
+         *leave NULL and vm flush fence in fence drv */
+        for (j = 0; j <= ring->fence_drv.num_fences_mask; j ++) {
+            struct dma_fence *old,**ptr;
+            ptr = &ring->fence_drv.fences[j];
+            old = rcu_dereference_protected(*ptr, 1);
+            if (old && test_bit(DMA_FENCE_FLAG_USER_BITS,
&old->flags))) {
+                RCU_INIT_POINTER(*ptr, NULL);
+            }
Is this to avoid premature job free because of dma_fence_put inside
amdgpu_fence_process ?
I can't currently remember why but we probably want all the HW fences
currently in the ring to
be forced signaled - maybe better to test for DMA_FENCE_FLAG_USER_BITS
inside amdgpu_fence_process
and still do the signaling but not the dma_fence_put part

Andrey
Hi Andrey,

This is to avoid signaling the same fence twice. If we still do the
signaling, then the job in the pending list will be signaled first in
force_completion, and later be signaled in resubmit. This will go to
BUG() in amdgpu_fence_process.
Oh, i see, how about just adding 'skip' flag to amdgpu_ring and setting
it before calling
amdgpu_fence_driver_force_completion and resetting it after, then inside
amdgpu_fence_driver_force_completion
you can just skip the signaling part with this flag for fences with
DMA_FENCE_FLAG_USER_BITS set
Less lines of code at least.
Still sounds quite a bit hacky.

I would rather suggest to completely drop the approach with
amdgpu_fence_driver_force_completion(). I could never see why we would want
that in the first place.

Regards,
Christian.

Hi Christian,

I keep the amdgpu_fence_driver_force_completion here to make sure the vm
flush fence is signaled and put.

Right, so we need to do force completion for the sake of all the fences
without parent jobs since there is code which wait directly on them.


So the key question is whether the fence of ib test should be the first
fence to be signaled after reset.

What do you mean by 'after reset' - at this point in the code there was
no ASIC reset yet, we just stopped the schedulers and detached the
HW fences from their parent jobs (sched_fence)
I mean the ASIC reset. There will be a ib_test for each ring after ASIC
reset.


Then why wouldn't they be the first ? They will emit new fences into the 
ring
which will be signaled right away because the ASIC went through reset 
and is not
hung anymore.  All the previous fences, including VM flush once from 
before the
reset will be already signaled by this time from 
amdgpu_fence_driver_force_completion.



If it should be, it means not only fences with DMA_FENCE_FLAG_USER_BITS
but also vm flush fences should be removed from RCU fence array before
ib_test.


As I mentioned above, not clear to me why VM fences should get special 
treatment here.

Andrey



Which ib_test is it, the one after ASIC reset ?

Yes


Best Regards,
JingWen Chen
Andrey

   Then we must do the force_completion here for vm flush
fence, otherwise there will be a memory leak here as no one will signal
and put it after we remove it from RCU.
If not, then the first fence to signal could be vm flush fence. And I'm
OK with drop amdgpu_fence_driver_force_completion here.


Best Regards,
JingWen Chen
Andrey


+        }
             /* after all hw jobs are reset, hw fence is
meaningless, so force_completion */
             amdgpu_fence_driver_force_completion(ring);
         }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index eecf21d8ec33..815776c9a013 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -156,11 +156,17 @@ int amdgpu_fence_emit(struct
amdgpu_ring *ring, struct dma_fence **f, struct amd
             job->ring = ring;
         }
-    seq = ++ring->fence_drv.sync_seq;
-    dma_fence_init(fence, &amdgpu_fence_ops,
-               &ring->fence_drv.lock,
-               adev->fence_context + ring->idx,
-               seq);
+    if (job != NULL && job->base.resubmit_flag == 1) {
+        /* reinit seq for resubmitted jobs */
+        seq = ++ring->fence_drv.sync_seq;
+        fence->seqno = seq;
+    } else {
+        seq = ++ring->fence_drv.sync_seq;
Seems like you could do the above line only once above if-else
as it was
before
Sure, I will modify this.


Best Regards,
JingWen Chen
+        dma_fence_init(fence, &amdgpu_fence_ops,
+                &ring->fence_drv.lock,
+                adev->fence_context + ring->idx,
+                seq);
+    }
         if (job != NULL) {
             /* mark this fence has a parent job */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 7c426e225b24..d6f848adc3f4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -241,6 +241,7 @@ static struct dma_fence
*amdgpu_job_run(struct drm_sched_job *sched_job)
             dma_fence_set_error(finished, -ECANCELED);/* skip
IB as well if VRAM lost */
         if (finished->error < 0) {
+        dma_fence_put(&job->hw_fence);
             DRM_INFO("Skip scheduling IBs!\n");
         } else {
             r = amdgpu_ib_schedule(ring, job->num_ibs, job->ibs, job,
@@ -249,7 +250,8 @@ static struct dma_fence
*amdgpu_job_run(struct drm_sched_job *sched_job)
                 DRM_ERROR("Error scheduling IBs (%d)\n", r);
         }
-    dma_fence_get(fence);
+    if (!job->base.resubmit_flag)
+        dma_fence_get(fence);
         amdgpu_job_free_resources(job);
         fence = r ? ERR_PTR(r) : fence;
diff --git a/drivers/gpu/drm/scheduler/sched_main.c
b/drivers/gpu/drm/scheduler/sched_main.c
index f4f474944169..5a36ab5aea2d 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -544,6 +544,7 @@ void drm_sched_resubmit_jobs_ext(struct
drm_gpu_scheduler *sched, int max)
dma_fence_set_error(&s_fence->finished, -ECANCELED);
             dma_fence_put(s_job->s_fence->parent);
+        s_job->resubmit_flag = 1;
             fence = sched->ops->run_job(s_job);
             i++;
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 4ea8606d91fe..06c101af1f71 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -198,6 +198,7 @@ struct drm_sched_job {
         enum drm_sched_priority        s_priority;
         struct drm_sched_entity         *entity;
         struct dma_fence_cb        cb;
+    int resubmit_flag;
     };
     static inline bool drm_sched_invalidate_job(struct
drm_sched_job *s_job,
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=04%7C01%7Candrey.grodzovsky%40amd.com%7C9749ced7ce6645bd832408d94d305aaa%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637625692355112825%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=OhLndCUDlWcrhg%2FA0QQ%2FoONxdmQ46CT7P%2F8uvSTGnQ8%3D&amp;reserved=0
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx