On Thu, Jul 13, 2023 at 1:03 PM Dmitry Baryshkov <dmitry.baryshkov@xxxxxxxxxx> wrote: > > On 13/07/2023 01:25, Rob Clark wrote: > > From: Rob Clark <robdclark@xxxxxxxxxxxx> > > > > In an error path where the submit is free'd without the job being run, > > the hw_fence pointer is simply a kzalloc'd block of memory. In this > > case we should just kfree() it, rather than trying to decrement it's > > reference count. Fortunately we can tell that this is the case by > > checking for a zero refcount, since if the job was run, the submit would > > be holding a reference to the hw_fence. > > > > Fixes: f94e6a51e17c ("drm/msm: Pre-allocate hw_fence") > > Signed-off-by: Rob Clark <robdclark@xxxxxxxxxxxx> > > --- > > drivers/gpu/drm/msm/msm_fence.c | 6 ++++++ > > drivers/gpu/drm/msm/msm_gem_submit.c | 14 +++++++++++++- > > 2 files changed, 19 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c > > index 96599ec3eb78..1a5d4f1c8b42 100644 > > --- a/drivers/gpu/drm/msm/msm_fence.c > > +++ b/drivers/gpu/drm/msm/msm_fence.c > > @@ -191,6 +191,12 @@ msm_fence_init(struct dma_fence *fence, struct msm_fence_context *fctx) > > > > f->fctx = fctx; > > > > + /* > > + * Until this point, the fence was just some pre-allocated memory, > > + * no-one should have taken a reference to it yet. > > + */ > > + WARN_ON(kref_read(&fence->refcount)); > > It this really correct to return a refcounted object with 0 refcount > (I'm looking at submit_create() / msm_fence_alloc() )? Maybe it would be > better to move dma_fence_get() to msm_fence_alloc() ? But don't > immediately see, which one should be moved. The issue is that we can't really initialize the fence until msm_job_run(), when it is known what order the fence would be signaled. But we don't want to do any allocations in msm_job_run() because that could trigger the shrinker, which could need to wait until jobs complete to release memory, forming a deadlock. BR, -R > > + > > dma_fence_init(&f->base, &msm_fence_ops, &fctx->spinlock, > > fctx->context, ++fctx->last_fence); > > } > > diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c > > index 3f1aa4de3b87..9d66498cdc04 100644 > > --- a/drivers/gpu/drm/msm/msm_gem_submit.c > > +++ b/drivers/gpu/drm/msm/msm_gem_submit.c > > @@ -86,7 +86,19 @@ void __msm_gem_submit_destroy(struct kref *kref) > > } > > > > dma_fence_put(submit->user_fence); > > - dma_fence_put(submit->hw_fence); > > + > > + /* > > + * If the submit is freed before msm_job_run(), then hw_fence is > > + * just some pre-allocated memory, not a reference counted fence. > > + * Once the job runs and the hw_fence is initialized, it will > > + * have a refcount of at least one, since the submit holds a ref > > + * to the hw_fence. > > + */ > > + if (kref_read(&submit->hw_fence->refcount) == 0) { > > + kfree(submit->hw_fence); > > + } else { > > + dma_fence_put(submit->hw_fence); > > + } > > > > put_pid(submit->pid); > > msm_submitqueue_put(submit->queue); > > -- > With best wishes > Dmitry >