Quoting Tvrtko Ursulin (2020-07-08 13:26:24)
> 
> On 06/07/2020 07:19, Chris Wilson wrote:
> > Allocate a few dma fence context ids that we can use to associate async work
> > [for the CPU] launched on behalf of this context. For extra fun, we allow
> > a configurable concurrency width.
> > 
> > A current example would be that we spawn an unbound worker for every
> > userptr get_pages. In the future, we wish to charge this work to the
> > context that initiated the async work and to impose concurrency limits
> > based on the context.
> > 
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_context.c       | 4 ++++
> >  drivers/gpu/drm/i915/gem/i915_gem_context.h       | 6 ++++++
> >  drivers/gpu/drm/i915/gem/i915_gem_context_types.h | 6 ++++++
> >  3 files changed, 16 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index 41784df51e58..bd68746327b3 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -714,6 +714,10 @@ __create_context(struct drm_i915_private *i915)
> >  	ctx->sched.priority = I915_USER_PRIORITY(I915_PRIORITY_NORMAL);
> >  	mutex_init(&ctx->mutex);
> >  
> > +	ctx->async.width = rounddown_pow_of_two(num_online_cpus());
> > +	ctx->async.context = dma_fence_context_alloc(ctx->async.width);
> > +	ctx->async.width--;
> 
> Hey I had a tri-core CPU back in the day.. :) Really, I can only assume
> you are doing some tricks with masks which maybe only work with power of
> 2 num cpus? Hard to say.. please explain in a comment.

Just a pot (power-of-two) mask that fits in the currently available set
of CPUs; a worked example with concrete numbers is sketched below.

> I don't even understand what the context will be for yet and why it
> needs a separate context id.

The longer term view is that I want to pull the various async tasks we
use into CPU scheduling kthread[s] that share the same priority
inheritance as the tasks themselves. The issue at the moment is that we
use the system_wq, which imposes an implicit FIFO ordering on our tasks,
upsetting our context priorities. This is a step towards that: it starts
looking at how we might limit concurrency in the various stages by using
a bunch of timelines for each stage, and queuing our work along each
timeline before submitting to an unbound system_wq. [The immediate goal
is to limit how much of the CPU one client can hog by submitting
deferred work that would run in parallel, with a view to making that
configurable per-context.]
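To spell out the mask trick with concrete numbers (the CPU count below
is a hypothetical example, not something the patch assumes):

	/*
	 * Illustrative sketch only: suppose num_online_cpus() == 6.
	 *
	 *   ctx->async.width   = rounddown_pow_of_two(6);	-> 4
	 *   ctx->async.context = dma_fence_context_alloc(4);	-> reserves ids [context, context + 4)
	 *   ctx->async.width--;				-> 3 (0b11), now usable as a mask
	 *
	 * i915_gem_context_async_id() then returns
	 *   ctx->async.context + (atomic_fetch_inc(&ctx->async.cur) & 3)
	 * i.e. context + 0, 1, 2, 3, 0, 1, ... round-robin, which is why
	 * the width has to be rounded down to a power of two for the mask
	 * to work.
	 */
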
> >  	spin_lock_init(&ctx->stale.lock);
> >  	INIT_LIST_HEAD(&ctx->stale.engines);
> >  
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
> > index 3702b2fb27ab..e104ff0ae740 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
> > @@ -134,6 +134,12 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
> >  int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
> >  					struct drm_file *file);
> >  
> > +static inline u64 i915_gem_context_async_id(struct i915_gem_context *ctx)
> > +{
> > +	return (ctx->async.context +
> > +		(atomic_fetch_inc(&ctx->async.cur) & ctx->async.width));
> > +}
> > +
> >  static inline struct i915_gem_context *
> >  i915_gem_context_get(struct i915_gem_context *ctx)
> >  {
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > index ae14ca24a11f..52561f98000f 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > @@ -85,6 +85,12 @@ struct i915_gem_context {
> >  
> >  	struct intel_timeline *timeline;
> >  
> > +	struct {
> > +		u64 context;
> > +		atomic_t cur;
> 
> What is cur? In which patch does it get used? (Can't see it.)

See i915_gem_context_async_id() above; a hypothetical example of a
caller is sketched below.
-Chris
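As a sketch of how the helper is intended to be consumed (the caller,
ops and lock below are hypothetical, not part of this patch), an async
worker would seed its fence with one of the reserved fence contexts, so
that fences allocated from the same slot can be ordered against each
other while the different slots remain independent:

	/* Hypothetical caller of the new helper, for illustration only. */
	dma_fence_init(&work->dma,		/* the worker's fence */
		       &async_work_ops,		/* made-up dma_fence_ops */
		       &work->lock,		/* made-up spinlock */
		       i915_gem_context_async_id(ctx),
		       0);			/* seqno */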