Quoting Tvrtko Ursulin (2020-07-08 13:26:24)
> 
> On 06/07/2020 07:19, Chris Wilson wrote:
> > Allocate a few dma fence context ids that we can use to associate async work
> > [for the CPU] launched on behalf of this context. For extra fun, we allow
> > a configurable concurrency width.
> > 
> > A current example would be that we spawn an unbound worker for every
> > userptr get_pages. In the future, we wish to charge this work to the
> > context that initiated the async work and to impose concurrency limits
> > based on the context.
> > 
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_context.c       | 4 ++++
> >  drivers/gpu/drm/i915/gem/i915_gem_context.h       | 6 ++++++
> >  drivers/gpu/drm/i915/gem/i915_gem_context_types.h | 6 ++++++
> >  3 files changed, 16 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index 41784df51e58..bd68746327b3 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -714,6 +714,10 @@ __create_context(struct drm_i915_private *i915)
> >  	ctx->sched.priority = I915_USER_PRIORITY(I915_PRIORITY_NORMAL);
> >  	mutex_init(&ctx->mutex);
> >  
> > +	ctx->async.width = rounddown_pow_of_two(num_online_cpus());
> > +	ctx->async.context = dma_fence_context_alloc(ctx->async.width);
> > +	ctx->async.width--;
> 
> Hey I had a tri-core CPU back in the day.. :) Really, I can only assume
> you are doing some tricks with masks which maybe only work with power of
> 2 num cpus? Hard to say.. please explain in a comment.

Just a pot (power-of-two) mask that fits in the currently available set
of CPUs; a worked example with concrete numbers is sketched below.

> I don't even understand what the context will be for yet and why it
> needs a separate context id.

The longer term view is that I want to pull the various async tasks we
use into CPU scheduling kthread[s] that share the same priority
inheritance as the tasks themselves. The issue at the moment is that we
use the system_wq, which imposes an implicit FIFO ordering on our tasks,
upsetting our context priorities. This is a step towards that: it starts
looking at how we might limit concurrency in the various stages by using
a bunch of timelines for each stage, and queuing our work along each
timeline before submitting to an unbound system_wq. [The immediate goal
is to limit how much of the CPU one client can hog by submitting
deferred work that would run in parallel, with a view to making that
configurable per-context.]
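To spell out the mask trick with concrete numbers (the CPU count below
is a hypothetical example, not something the patch assumes):

	/*
	 * Illustrative sketch only: suppose num_online_cpus() == 6.
	 *
	 *   ctx->async.width   = rounddown_pow_of_two(6);	-> 4
	 *   ctx->async.context = dma_fence_context_alloc(4);	-> reserves ids [context, context + 4)
	 *   ctx->async.width--;				-> 3 (0b11), now usable as a mask
	 *
	 * i915_gem_context_async_id() then returns
	 *   ctx->async.context + (atomic_fetch_inc(&ctx->async.cur) & 3)
	 * i.e. context + 0, 1, 2, 3, 0, 1, ... round-robin, which is why
	 * the width has to be rounded down to a power of two for the mask
	 * to work.
	 */
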
> >  	spin_lock_init(&ctx->stale.lock);
> >  	INIT_LIST_HEAD(&ctx->stale.engines);
> >  
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
> > index 3702b2fb27ab..e104ff0ae740 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
> > @@ -134,6 +134,12 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
> >  int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
> >  					struct drm_file *file);
> >  
> > +static inline u64 i915_gem_context_async_id(struct i915_gem_context *ctx)
> > +{
> > +	return (ctx->async.context +
> > +		(atomic_fetch_inc(&ctx->async.cur) & ctx->async.width));
> > +}
> > +
> >  static inline struct i915_gem_context *
> >  i915_gem_context_get(struct i915_gem_context *ctx)
> >  {
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > index ae14ca24a11f..52561f98000f 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> > @@ -85,6 +85,12 @@ struct i915_gem_context {
> >  
> >  	struct intel_timeline *timeline;
> >  
> > +	struct {
> > +		u64 context;
> > +		atomic_t cur;
> 
> What is cur? In which patch does it get used? (Can't see it.)

See i915_gem_context_async_id() above; a hypothetical example of a
caller is sketched below.
-Chris
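As a sketch of how the helper is intended to be consumed (the caller,
ops and lock below are hypothetical, not part of this patch), an async
worker would seed its fence with one of the reserved fence contexts, so
that fences allocated from the same slot can be ordered against each
other while the different slots remain independent:

	/* Hypothetical caller of the new helper, for illustration only. */
	dma_fence_init(&work->dma,		/* the worker's fence */
		       &async_work_ops,		/* made-up dma_fence_ops */
		       &work->lock,		/* made-up spinlock */
		       i915_gem_context_async_id(ctx),
		       0);			/* seqno */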