On Thu, 31 Mar 2022 at 14:07, Christian König <ckoenig.leichtzumerken@xxxxxxxxx> wrote:
>
> Am 28.03.22 um 19:14 schrieb Daniel Vetter:
> > On Mon, Mar 21, 2022 at 02:58:45PM +0100, Christian König wrote:
> >> [SNIP]
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >> index ea0cde4904f0..2f808decd8d9 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >> @@ -1384,6 +1384,14 @@ void amdgpu_bo_fence(struct amdgpu_bo *bo, struct dma_fence *fence,
> >>                       bool shared)
> >>  {
> >>      struct dma_resv *resv = bo->tbo.base.resv;
> >> +    int r;
> >> +
> >> +    r = dma_resv_reserve_fences(resv, 1);
> > This is quite a hack, but I did scroll through all the callers of
> > amdgpu_bo_fence and I think it's fine - i.e. no recursion into the
> > shrinker from a calling context where recursion into shrinker/memalloc
> > isn't allowed.
> >
> > But it ain't pretty :-/
>
> Yeah, but one long-term goal of this is to remove all the hacky handling
> of manually adding fences to the resv object using this function. I
> could add a TODO if that helps.
>
> > [SNIP]
> >> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
> >> index ee9612a3ee5e..4de6500f3c55 100644
> >> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
> >> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
> >> @@ -596,7 +596,11 @@ int i915_gem_obj_copy_ttm(struct drm_i915_gem_object *dst,
> >>      assert_object_held(src);
> >>      i915_deps_init(&deps, GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN);
> >>
> >> -    ret = dma_resv_reserve_shared(src_bo->base.resv, 1);
> >> +    ret = dma_resv_reserve_fences(src_bo->base.resv, 1);
> >> +    if (ret)
> >> +        return ret;
> >> +
> >> +    ret = dma_resv_reserve_fences(dst_bo->base.resv, 1);
> > Can't we just reserve 2 slots instead of doing this 2x?
>
> *handing you some coffee* We reserve one slot on the source and one
> on the destination buffer :)

Ah, coffee, great :-)

> > [SNIP]
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> > index a6925dbb6224..c34114560e49 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> > @@ -247,6 +247,10 @@ static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
> >      int i, ret;
> >
> >      for (i = 0; i < bo_count; i++) {
> > +        ret = dma_resv_reserve_fences(bos[i]->resv, 1);
> > +        if (ret)
> > +            return ret;
> > +
> >          /* panfrost always uses write mode in its current uapi */
> >          ret = drm_sched_job_add_implicit_dependencies(job, bos[i],
> > I wonder whether we shouldn't move the dma-resv reserving into some shared
> > helpers eventually ...
>
> I was going back and forth adding this to
> drm_sched_job_add_implicit_dependencies(), but then decided against that
> because it is really two independent functionalities.

Yeah, it doesn't really fit. Maybe together as a combo packet of ttm eu
helpers (lifted to gem_bo level) combined with drm_sched. Definitely
something for another patch set.
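For readers following along: the conversion above boils down everywhere to
the same reserve-then-add pattern. A minimal sketch of that pattern
(illustrative only; example_attach_fence is a made-up name, and the calls
assume the dma_resv API as introduced by this series):

#include <linux/dma-fence.h>
#include <linux/dma-resv.h>

/*
 * Illustrative sketch of the reserve-then-add pattern; not taken from
 * any driver in this series. The reservation object must already be
 * locked by the caller.
 */
static int example_attach_fence(struct dma_resv *resv,
                                struct dma_fence *fence)
{
        int r;

        dma_resv_assert_held(resv);

        /* Reserving a slot may allocate memory; this is the step that can fail. */
        r = dma_resv_reserve_fences(resv, 1);
        if (r)
                return r;

        /* With a slot reserved, adding the fence itself cannot fail. */
        dma_resv_add_shared_fence(resv, fence);
        return 0;
}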
> >> [SNIP]
> >> @@ -120,9 +119,9 @@ int ttm_eu_reserve_buffers(struct ww_acquire_ctx *ticket,
> >>              ret = ttm_bo_reserve_slowpath(bo, intr, ticket);
> >>          }
> >>
> >> -        if (!ret && entry->num_shared)
> >> -            ret = dma_resv_reserve_shared(bo->base.resv,
> >> -                          entry->num_shared);
> >> +        if (!ret)
> >> +            ret = dma_resv_reserve_fences(bo->base.resv,
> >> +                          num_fences);
> >>
> >>          if (unlikely(ret != 0)) {
> >>              if (ticket) {
> > I didn't find the corresponding reserve for the dma_resv_add_excl_fence()
> > in ttm_bo_move_accel_cleanup(). Was that an oversight?
>
> Mhm, need to double check as well. Could be that I missed that one.
>
> >> [SNIP]
> >> diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c
> >> index 4abf10b66fe8..594bd6bb00d2 100644
> >> --- a/drivers/gpu/drm/vc4/vc4_gem.c
> >> +++ b/drivers/gpu/drm/vc4/vc4_gem.c
> >> @@ -644,7 +644,7 @@ vc4_lock_bo_reservations(struct drm_device *dev,
> >>      for (i = 0; i < exec->bo_count; i++) {
> >>          bo = &exec->bo[i]->base;
> >>
> >> -        ret = dma_resv_reserve_shared(bo->resv, 1);
> >> +        ret = dma_resv_reserve_fences(bo->resv, 1);
> >>          if (ret) {
> >>              vc4_unlock_bo_reservations(dev, exec, acquire_ctx);
> >>              return ret;
> > v3d and vc4 are missing in the conversion. I think for both you need to
> > add it before the call to like
> > with etnaviv.
>
> Both drivers already have the necessary calls. See
> vc4_lock_bo_reservations() and v3d_lock_bo_reservations().

Indeed, I missed that they unconditionally reserve slots and aren't trying
to be clever.

> >> [SNIP]
> >> diff --git a/drivers/gpu/drm/virtio/virtgpu_gem.c b/drivers/gpu/drm/virtio/virtgpu_gem.c
> >> index 48d3c9955f0d..1820ca6cf673 100644
> >> --- a/drivers/gpu/drm/virtio/virtgpu_gem.c
> >> +++ b/drivers/gpu/drm/virtio/virtgpu_gem.c
> >> @@ -214,6 +214,7 @@ void virtio_gpu_array_add_obj(struct virtio_gpu_object_array *objs,
> >>
> >>  int virtio_gpu_array_lock_resv(struct virtio_gpu_object_array *objs)
> >>  {
> >> +    unsigned int i;
> >>      int ret;
> >>
> >>      if (objs->nents == 1) {
> >> @@ -222,6 +223,14 @@ int virtio_gpu_array_lock_resv(struct virtio_gpu_object_array *objs)
> >>          ret = drm_gem_lock_reservations(objs->objs, objs->nents,
> >>                          &objs->ticket);
> >>      }
> >> +    if (ret)
> >> +        return ret;
> >> +
> >> +    for (i = 0; i < objs->nents; ++i) {
> >> +        ret = dma_resv_reserve_fences(objs->objs[i]->resv, 1);
> > I think you could stuff this into the same loop, but also probably doesn't
> > matter.
>
> Na, that loop is inside drm_gem_lock_reservations().

Hm, maybe another case for unified execbuf helpers that do this for
drivers :-)

> > [SNIP]
> >
> > I found a few things, but with those (vc4 and v3d plus the ttm question,
> > the other stuff is just comments) corrected this gets my
>
> Going to double check the TTM case once more, but apart from that I
> think it's solid.

Yeah, with ttm I'm just a bit too much out of my own depth, so if you can
reply with an explainer for dummies so I can check myself where all the
pieces are, I think we have it all now!
-Daniel

> > Reviewed-by: Daniel Vetter <daniel.vetter@xxxxxxxx>
>
> Thanks,
> Christian.
>
> >
> >> --
> >> 2.25.1
> >>
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
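For completeness, the per-BO variant used by vc4, v3d, and virtio above
amounts to roughly the following (illustrative sketch only;
example_reserve_bo_fences is a made-up helper, and each BO's reservation
lock must already be held by the caller):

#include <linux/dma-resv.h>
#include <drm/drm_gem.h>

/*
 * Illustrative sketch: reserve one fence slot on every locked BO up
 * front, so that adding fences after command submission cannot fail.
 * Not the actual vc4/v3d/virtio code.
 */
static int example_reserve_bo_fences(struct drm_gem_object **bos,
                                     unsigned int bo_count)
{
        unsigned int i;
        int ret;

        for (i = 0; i < bo_count; i++) {
                ret = dma_resv_reserve_fences(bos[i]->resv, 1);
                if (ret)
                        return ret;
        }

        return 0;
}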