[AMD Official Use Only]

Hi Christian,

How about this way:

#define AMDGPU_DMA_FENCE_FLAG_INSIDE_JOB DMA_FENCE_FLAG_USER_BITS

And we can use AMDGPU_DMA_FENCE_FLAG_INSIDE_JOB instead of DMA_FENCE_FLAG_USER_BITS everywhere.

Thanks

------------------------------------------
Monk Liu | Cloud-GPU Core team
------------------------------------------

-----Original Message-----
From: Jingwen Chen <Jingwen.Chen2@xxxxxxx>
Sent: Friday, July 23, 2021 3:07 PM
To: Christian König <ckoenig.leichtzumerken@xxxxxxxxx>; Grodzovsky, Andrey <Andrey.Grodzovsky@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Chen, Horace <Horace.Chen@xxxxxxx>; Liu, Monk <Monk.Liu@xxxxxxx>; Zhang, Jack (Jian) <Jack.Zhang1@xxxxxxx>
Subject: Re: [PATCH 2/2] drm: add tdr support for embeded hw_fence

On Fri Jul 23, 2021 at 08:33:02AM +0200, Christian König wrote:
> Am 22.07.21 um 18:47 schrieb Jingwen Chen:
> > On Thu Jul 22, 2021 at 06:24:28PM +0200, Christian König wrote:
> > > Am 22.07.21 um 16:45 schrieb Andrey Grodzovsky:
> > > > On 2021-07-22 6:45 a.m., Jingwen Chen wrote:
> > > > > On Wed Jul 21, 2021 at 12:53:51PM -0400, Andrey Grodzovsky wrote:
> > > > > > On 2021-07-20 11:13 p.m., Jingwen Chen wrote:
> > > > > > > [Why]
> > > > > > > After embedding the hw_fence into amdgpu_job, we need to add tdr
> > > > > > > support for this feature.
> > > > > > >
> > > > > > > [How]
> > > > > > > 1. Add a resubmit_flag for resubmitted jobs.
> > > > > > > 2. Clear job fences from the RCU array and force-complete vm flush
> > > > > > >    fences in pre_asic_reset.
> > > > > > > 3. Skip dma_fence_get for resubmitted jobs and add a dma_fence_put
> > > > > > >    for guilty jobs.
> > > > > > >
> > > > > > > Signed-off-by: Jack Zhang <Jack.Zhang1@xxxxxxx>
> > > > > > > Signed-off-by: Jingwen Chen <Jingwen.Chen2@xxxxxxx>
> > > > > > > ---
> > > > > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 +++++++++++-
> > > > > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  | 16 +++++++++++-----
> > > > > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c    |  4 +++-
> > > > > > >  drivers/gpu/drm/scheduler/sched_main.c     |  1 +
> > > > > > >  include/drm/gpu_scheduler.h                |  1 +
> > > > > > >  5 files changed, 27 insertions(+), 7 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > > > > > index 40461547701a..fe0237f72a09 100644
> > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > > > > > @@ -4382,7 +4382,7 @@ int amdgpu_device_mode1_reset(struct amdgpu_device *adev)
> > > > > > >  int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
> > > > > > >  				 struct amdgpu_reset_context *reset_context)
> > > > > > >  {
> > > > > > > -	int i, r = 0;
> > > > > > > +	int i, j, r = 0;
> > > > > > >  	struct amdgpu_job *job = NULL;
> > > > > > >  	bool need_full_reset =
> > > > > > >  		test_bit(AMDGPU_NEED_FULL_RESET, &reset_context->flags);
> > > > > > > @@ -4406,6 +4406,16 @@ int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
> > > > > > >  		if (!ring || !ring->sched.thread)
> > > > > > >  			continue;
> > > > > > > +		/* clear job fences from fence drv to avoid force_completion
> > > > > > > +		 * on them; leave NULL and vm flush fences in fence drv */
> > > > > > > +		for (j = 0; j <= ring->fence_drv.num_fences_mask; j++) {
> > > > > > > +			struct dma_fence *old, **ptr;
> > > > > > > +
> > > > > > > +			ptr = &ring->fence_drv.fences[j];
> > > > > > > +			old = rcu_dereference_protected(*ptr, 1);
> > > > > > > +			if (old && test_bit(DMA_FENCE_FLAG_USER_BITS, &old->flags)) {
> > > > > > > +				RCU_INIT_POINTER(*ptr, NULL);
> > > > > > > +			}
> > > > > > Is this to avoid premature job free because of dma_fence_put
> > > > > > inside amdgpu_fence_process?
> > > > > > I can't currently remember why, but we probably want all the
> > > > > > HW fences currently in the ring to be force-signaled -
> > > > > > maybe better to test for DMA_FENCE_FLAG_USER_BITS inside
> > > > > > amdgpu_fence_process and still do the signaling but not the
> > > > > > dma_fence_put part.
> > > > > >
> > > > > > Andrey
> > > > > Hi Andrey,
> > > > >
> > > > > This is to avoid signaling the same fence twice. If we still
> > > > > do the signaling, then the job in the pending list will be
> > > > > signaled first in force_completion, and later be signaled again
> > > > > on resubmit. This will hit the BUG() in amdgpu_fence_process.
> > > > Oh, I see. How about just adding a 'skip' flag to amdgpu_ring,
> > > > setting it before calling amdgpu_fence_driver_force_completion
> > > > and resetting it after? Then inside amdgpu_fence_driver_force_completion
> > > > you can just skip the signaling part with this flag for fences
> > > > with DMA_FENCE_FLAG_USER_BITS set. Fewer lines of code, at least.
> > > Still sounds quite a bit hacky.
> > >
> > > I would rather suggest completely dropping the approach with
> > > amdgpu_fence_driver_force_completion(). I could never see why we
> > > would want that in the first place.
> > >
> > > Regards,
> > > Christian.
> > >
> > Hi Christian,
> >
> > I keep the amdgpu_fence_driver_force_completion here to make sure
> > the vm flush fence is signaled and put.
> > So the key question is whether the ib test fence should be the
> > first fence to be signaled after reset.
> > If it should be, it means not only fences with
> > DMA_FENCE_FLAG_USER_BITS but also vm flush fences should be removed
> > from the RCU fence array before the ib_test. Then we must do the
> > force_completion here for the vm flush fence, otherwise there will be a
> > memory leak, as no one will signal and put it after we remove it from RCU.
> > If not, then the first fence to signal could be the vm flush fence, and
> > I'm OK with dropping amdgpu_fence_driver_force_completion here.
>
> The key problem is that amdgpu_fence_driver_force_completion() goes
> over the RCU array and signals everything there.
>
> What we should do instead is to go over the jobs, clean up the ones
> we don't want to re-submit and keep the rest.
>
Hi Christian,

The thing is that the vm flush fence has no job passed to amdgpu_fence_emit, so going through the jobs cannot help find the vm flush fence. And keeping the rest of the fences in the RCU array will lead to signaling them in the ib_test right after the ASIC reset, while they will be signaled again during resubmission.
What I'm doing here is just trying to clean up the fences without a parent job and make sure the rest of the fences won't be signaled twice.

> And please drop any usage of DMA_FENCE_FLAG_USER_BITS. That is not
> something which should be abused for reset handling.
>
The DMA_FENCE_FLAG_USER_BITS here is to mark that this fence has a parent job. If this is not a proper usage, do you have any suggestions on how to identify whether a fence has a parent job?

Best Regards,
JingWen Chen

> Regards,
> Christian.
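For illustration only, a minimal sketch of how the AMDGPU_DMA_FENCE_FLAG_INSIDE_JOB rename suggested at the top of this mail could look. The set_bit() site in amdgpu_fence_emit() is assumed here (the "mark this fence has a parent job" hunk quoted below only shows context), so treat names and placement as assumptions rather than posted code:

	/* assumed to live next to the other amdgpu fence definitions */
	#define AMDGPU_DMA_FENCE_FLAG_INSIDE_JOB	DMA_FENCE_FLAG_USER_BITS

	/* amdgpu_fence_emit(): mark hw fences that are embedded in a job */
	if (job != NULL) {
		/* mark this fence has a parent job */
		set_bit(AMDGPU_DMA_FENCE_FLAG_INSIDE_JOB, &fence->flags);
	}

	/* amdgpu_device_pre_asic_reset(): only clear fences that belong to a job */
	if (old && test_bit(AMDGPU_DMA_FENCE_FLAG_INSIDE_JOB, &old->flags))
		RCU_INIT_POINTER(*ptr, NULL);
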
> >
> > Best Regards,
> > JingWen Chen
> >
> > > > Andrey
> > > >
> > > > > > > +		}
> > > > > > >  		/* after all hw jobs are reset, hw fence is meaningless, so force_completion */
> > > > > > >  		amdgpu_fence_driver_force_completion(ring);
> > > > > > >  	}
> > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > > > > > > index eecf21d8ec33..815776c9a013 100644
> > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > > > > > > @@ -156,11 +156,17 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **f, struct amd
> > > > > > >  		job->ring = ring;
> > > > > > >  	}
> > > > > > > -	seq = ++ring->fence_drv.sync_seq;
> > > > > > > -	dma_fence_init(fence, &amdgpu_fence_ops,
> > > > > > > -		       &ring->fence_drv.lock,
> > > > > > > -		       adev->fence_context + ring->idx,
> > > > > > > -		       seq);
> > > > > > > +	if (job != NULL && job->base.resubmit_flag == 1) {
> > > > > > > +		/* reinit seq for resubmitted jobs */
> > > > > > > +		seq = ++ring->fence_drv.sync_seq;
> > > > > > > +		fence->seqno = seq;
> > > > > > > +	} else {
> > > > > > > +		seq = ++ring->fence_drv.sync_seq;
> > > > > > Seems like you could do the above line only once, above the
> > > > > > if-else, as it was before.
> > > > > Sure, I will modify this.
> > > > >
> > > > > Best Regards,
> > > > > JingWen Chen
> > > > > > > +		dma_fence_init(fence, &amdgpu_fence_ops,
> > > > > > > +			       &ring->fence_drv.lock,
> > > > > > > +			       adev->fence_context + ring->idx,
> > > > > > > +			       seq);
> > > > > > > +	}
> > > > > > >  	if (job != NULL) {
> > > > > > >  		/* mark this fence has a parent job */
> > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > > > > > index 7c426e225b24..d6f848adc3f4 100644
> > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > > > > > @@ -241,6 +241,7 @@ static struct dma_fence *amdgpu_job_run(struct drm_sched_job *sched_job)
> > > > > > >  		dma_fence_set_error(finished, -ECANCELED);/* skip IB as well if VRAM lost */
> > > > > > >  	if (finished->error < 0) {
> > > > > > > +		dma_fence_put(&job->hw_fence);
> > > > > > >  		DRM_INFO("Skip scheduling IBs!\n");
> > > > > > >  	} else {
> > > > > > >  		r = amdgpu_ib_schedule(ring, job->num_ibs, job->ibs, job,
> > > > > > > @@ -249,7 +250,8 @@ static struct dma_fence *amdgpu_job_run(struct drm_sched_job *sched_job)
> > > > > > >  			DRM_ERROR("Error scheduling IBs (%d)\n", r);
> > > > > > >  	}
> > > > > > > -	dma_fence_get(fence);
> > > > > > > +	if (!job->base.resubmit_flag)
> > > > > > > +		dma_fence_get(fence);
> > > > > > >  	amdgpu_job_free_resources(job);
> > > > > > >  	fence = r ? ERR_PTR(r) : fence;
> > > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > index f4f474944169..5a36ab5aea2d 100644
> > > > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > @@ -544,6 +544,7 @@ void drm_sched_resubmit_jobs_ext(struct drm_gpu_scheduler *sched, int max)
> > > > > > >  		dma_fence_set_error(&s_fence->finished, -ECANCELED);
> > > > > > >  		dma_fence_put(s_job->s_fence->parent);
> > > > > > > +		s_job->resubmit_flag = 1;
> > > > > > >  		fence = sched->ops->run_job(s_job);
> > > > > > >  		i++;
> > > > > > > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > > > > > > index 4ea8606d91fe..06c101af1f71 100644
> > > > > > > --- a/include/drm/gpu_scheduler.h
> > > > > > > +++ b/include/drm/gpu_scheduler.h
> > > > > > > @@ -198,6 +198,7 @@ struct drm_sched_job {
> > > > > > >  	enum drm_sched_priority		s_priority;
> > > > > > >  	struct drm_sched_entity		*entity;
> > > > > > >  	struct dma_fence_cb		cb;
> > > > > > > +	int				resubmit_flag;
> > > > > > >  };
> > > > > > >  static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
>
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
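
For reference, a rough sketch of the per-ring 'skip' flag alternative Andrey floated in the discussion above. The field name, its home in struct amdgpu_ring, and the exact check inside the fence-processing loop are assumptions for illustration, not code from any posted patch:

	/* struct amdgpu_ring: hypothetical flag, set only around reset */
	bool	no_job_fence_signaling;

	/* amdgpu_device_pre_asic_reset() */
	ring->no_job_fence_signaling = true;
	amdgpu_fence_driver_force_completion(ring);
	ring->no_job_fence_signaling = false;

	/* amdgpu_fence_process(): inside the loop over pending seqnos, skip
	 * fences that belong to a job; they get signaled again on resubmit */
	if (ring->no_job_fence_signaling &&
	    test_bit(DMA_FENCE_FLAG_USER_BITS, &fence->flags))
		continue;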