On Thu, Jun 10, 2021 at 10:07 AM Tvrtko Ursulin
<tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote:
>
> On 10/06/2021 14:57, Jason Ekstrand wrote:
> > On Thu, Jun 10, 2021 at 5:04 AM Tvrtko Ursulin
> > <tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote:
> >>
> >> On 09/06/2021 22:29, Jason Ekstrand wrote:
> >>> This appears to break encapsulation by moving an intel_engine_cs
> >>> function to an i915_request file.  However, this function is
> >>> intrinsically tied to the lifetime rules and allocation scheme of
> >>> i915_request and having it in intel_engine_cs.c leaks details of
> >>> i915_request.  We have an abstraction leak either way.  Since
> >>> i915_request's allocation scheme is far more subtle than the simple
> >>> pointer that is intel_engine_cs.request_pool, it's probably better to
> >>> keep i915_request's details to itself.
> >>>
> >>> Signed-off-by: Jason Ekstrand <jason@xxxxxxxxxxxxxx>
> >>> Cc: Jon Bloomfield <jon.bloomfield@xxxxxxxxx>
> >>> Cc: Daniel Vetter <daniel.vetter@xxxxxxxxx>
> >>> Cc: Matthew Auld <matthew.auld@xxxxxxxxx>
> >>> Cc: Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxxxx>
> >>> ---
> >>>  drivers/gpu/drm/i915/gt/intel_engine_cs.c | 8 --------
> >>>  drivers/gpu/drm/i915/i915_request.c       | 7 +++++--
> >>>  drivers/gpu/drm/i915/i915_request.h       | 2 --
> >>>  3 files changed, 5 insertions(+), 12 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> >>> index 9ceddfbb1687d..df6b80ec84199 100644
> >>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> >>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> >>> @@ -422,14 +422,6 @@ void intel_engines_release(struct intel_gt *gt)
> >>>  	}
> >>>  }
> >>>
> >>> -void intel_engine_free_request_pool(struct intel_engine_cs *engine)
> >>> -{
> >>> -	if (!engine->request_pool)
> >>> -		return;
> >>> -
> >>> -	kmem_cache_free(i915_request_slab_cache(), engine->request_pool);
> >>
> >> The argument that the slab cache shouldn't be exported from i915_request.c
> >> sounds good to me.
> >>
> >> But I think a step better than simply reversing the break of
> >> encapsulation (and it's even worse because it leaks a much higher
> >> level object!) could be to export a freeing helper from
> >> i915_request.c, which the engine pool would then use:
> >>
> >> void __i915_request_free(...)
> >> {
> >>         kmem_cache_free(...);
> >> }
> >
> > That was what I did at first.  However, the semantics of how the
> > pointer is touched/modified are really also part of i915_request.  In
> > particular, the use of xchg and cmpxchg.  So I pulled the one other
> > access (besides NULL initializing) into i915_request.c, which meant
> > pulling in intel_engine_free_request_pool.
>
> Hmmm, in my view the only break of encapsulation at the moment is that
> intel_engine_cs.c knows requests have been allocated from a dedicated
> slab.
>
> The semantics of how the request pool pointer is managed, i.e. the
> xchg and cmpxchg, already are in i915_request.c, so I don't exactly
> follow what the problem is with wrapping the knowledge of how requests
> should be freed inside i915_request.c as well.
>
> Unless you view the fact that intel_engine_cs contains a pointer to an
> i915_request as a break as well? But even then... <continued below>
>
> > Really, if we wanted proper encapsulation here, we'd have
> >
> > struct i915_request_cache {
> >         struct i915_request *rq;
> > };
> >
> > void i915_request_cache_init(struct i915_request_cache *cache);
> > void i915_request_cache_finish(struct i915_request_cache *cache);
> >
> > all in i915_request.h and have all the gory details inside
> > i915_request.c.  Then all intel_engine_cs knows is that it has a
> > request cache.
>
> ... with this scheme you'd have intel_engine_cs contain a pointer to
> i915_request_cache,

No, it would contain an i915_request_cache, not a pointer to one.  It
wouldn't fundamentally change any data structures; it would just add
wrapping.
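To make the shape of that wrapper concrete, here is a rough, self-contained userspace sketch of the i915_request_cache idea being discussed. All names and types are illustrative stand-ins, not the real i915 API: the kernel version would free through the request slab cache rather than free(), and would use the driver's own xchg()/cmpxchg() helpers instead of C11 atomics.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Stand-in for struct i915_request; the real object is slab-allocated. */
struct request { int dummy; };

/* The engine would embed this by value; only the request code would know
 * how the pointer inside is managed. */
struct request_cache {
	_Atomic(struct request *) rq;
};

static void request_cache_init(struct request_cache *cache)
{
	atomic_store(&cache->rq, NULL);
}

/* Atomically take ownership of the cached request, if any (mirrors the
 * xchg() access pattern mentioned in the thread). */
static struct request *request_cache_take(struct request_cache *cache)
{
	return atomic_exchange(&cache->rq, NULL);
}

/* Stash a request for later reuse; if one is already cached, free the
 * new one instead (mirrors the cmpxchg() pattern). */
static void request_cache_put(struct request_cache *cache, struct request *rq)
{
	struct request *expected = NULL;

	if (!atomic_compare_exchange_strong(&cache->rq, &expected, rq))
		free(rq);
}

/* Teardown: drop whatever is still cached. */
static void request_cache_finish(struct request_cache *cache)
{
	free(request_cache_take(cache));
}
```

The point of the exercise is that intel_engine_cs would only see the opaque init/finish pair; the freeing scheme and the atomic access rules would stay private to the request code.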
> which does not seem a particularly exciting
> improvement to me since the wrapping would be extremely thin with no
> fundamental changes.

Yeah, it's not particularly exciting.

> So for me exporting a new __i915_request_free() from i915_request.c
> makes things a bit better and I don't think we need to go further than
> that.

I'm not sure it's necessary either.  The thing that bothers me is that
we have this pointer that's clearly managed by i915_request.c but is
initialized and finished by intel_engine_cs.c.  Certainly adding an
i915_request_free() is better than what we have today.  I'm not sure
it's enough better to really make me happy but, TBH, the whole request
cache thing is a bit of a mess....

> I mean there is the issue of i915_request.c knowing about engines
> having request pools, but I am not sure whether, with the
> i915_request_cache you proposed, you meant to remove that knowledge,
> and how?

It doesn't, really.  As long as we're stashing a request in the engine,
there's still an encapsulation problem no matter what we do.

> From the design point of view, given that the request pool is used
> only for engine pm, a clean design could be to manage this from engine
> pm.  Like, if parking cannot use GFP_KERNEL, then check if unparking
> can and explicitly allocate a request from there to be consumed at
> parking time.  It may require some splitting of the request creation
> path though: to allocate the request but not put it on the kernel
> timeline until park time.

And now we're getting to the heart of things. :-)  Daniel mentioned
this too.  Maybe if the real problem here is that engine parking can't
allocate memory, we need to just fix engine parking to either not
require an i915_request somehow or to do its own caching somehow.  I'm
going to look into this.

--Jason

> Regards,
>
> Tvrtko
>
> >
> > If we really want to go that far, we can, I suppose.
> >
> > --Jason
> >
> >> Regards,
> >>
> >> Tvrtko
> >>
> >>> -}
> >>> -
> >>>  void intel_engines_free(struct intel_gt *gt)
> >>>  {
> >>>  	struct intel_engine_cs *engine;
> >>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> >>> index 1014c71cf7f52..48c5f8527854b 100644
> >>> --- a/drivers/gpu/drm/i915/i915_request.c
> >>> +++ b/drivers/gpu/drm/i915/i915_request.c
> >>> @@ -106,9 +106,12 @@ static signed long i915_fence_wait(struct dma_fence *fence,
> >>>  				   timeout);
> >>>  }
> >>>
> >>> -struct kmem_cache *i915_request_slab_cache(void)
> >>> +void intel_engine_free_request_pool(struct intel_engine_cs *engine)
> >>>  {
> >>> -	return global.slab_requests;
> >>> +	if (!engine->request_pool)
> >>> +		return;
> >>> +
> >>> +	kmem_cache_free(global.slab_requests, engine->request_pool);
> >>>  }
> >>>
> >>>  static void i915_fence_release(struct dma_fence *fence)
> >>> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> >>> index 270f6cd37650c..f84c38d29f988 100644
> >>> --- a/drivers/gpu/drm/i915/i915_request.h
> >>> +++ b/drivers/gpu/drm/i915/i915_request.h
> >>> @@ -300,8 +300,6 @@ static inline bool dma_fence_is_i915(const struct dma_fence *fence)
> >>>  	return fence->ops == &i915_fence_ops;
> >>>  }
> >>>
> >>> -struct kmem_cache *i915_request_slab_cache(void);
> >>> -
> >>>  struct i915_request * __must_check
> >>>  __i915_request_create(struct intel_context *ce, gfp_t gfp);
> >>>  struct i915_request * __must_check
> >>>
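As a postscript, the engine-pm idea Tvrtko floats above (preallocate at unpark, where GFP_KERNEL is usable, and only consume at park) can be sketched in a self-contained userspace model. Everything here is hypothetical naming, not the real intel_engine_pm.c code; the kernel version would additionally have to split the request creation path so the request is allocated early but not put on the kernel timeline until park time.

```c
#include <stdlib.h>

/* Stand-in for struct i915_request. */
struct request { int emitted; };

/* Stand-in for the engine; in the kernel this would hang off
 * intel_engine_cs / its pm state. */
struct engine {
	struct request *park_rq;	/* preallocated at unpark */
};

/* Unpark runs in a context where blocking allocation is allowed, so any
 * allocation failure surfaces here rather than at park time. */
static int engine_unpark(struct engine *engine)
{
	engine->park_rq = calloc(1, sizeof(*engine->park_rq));
	return engine->park_rq ? 0 : -1;	/* -ENOMEM, in kernel terms */
}

/* Park must not allocate: it only consumes the request prepared at
 * unpark, e.g. by emitting it onto the kernel timeline. */
static void engine_park(struct engine *engine)
{
	struct request *rq = engine->park_rq;

	engine->park_rq = NULL;
	if (rq) {
		rq->emitted = 1;	/* stand-in for submitting the request */
		free(rq);		/* stand-in for the request's release */
	}
}
```

The design point is that the allocation context problem disappears: the only code touching the stashed request is the pm path itself, so neither i915_request.c nor intel_engine_cs.c needs to know about the other's internals.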