Quoting Tvrtko Ursulin (2019-09-25 13:41:10) > [+ Daniele, I think he might want to have a look at this.] > > On 25/09/2019 11:01, Chris Wilson wrote: > > With the introduction of ctx->engines[] we allow multiple logical > > contexts to be used on the same engine (e.g. with virtual engines). Each > > logical context requires a unique tag in order for context-switching to > > occur correctly between them. > > > > We only need to keep a unique tag for the active lifetime of the > > context, and for as long as we need to identify that context. The HW > > uses the tag to determine if it should use a lite-restore (why not the > > LRCA?) and passes the tag back for various status identifies. The only > > status we need to track is for OA, so when using perf, we assign the > > specific context a unique tag. > > > > Fixes: 976b55f0e1db ("drm/i915: Allow a context to define its set of engines") > > Fixes? Why? The above paragraph explains that as we give two distinct contexts the same tag then it will only perform (according to bspec) a lite-restore. -Chris > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h > > index da4b45aa84b1..7bce176e696c 100644 > > --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h > > @@ -296,10 +296,12 @@ struct intel_engine_cs { > > u8 uabi_class; > > u8 uabi_instance; > > > > + u32 uabi_capabilities; > > u32 context_size; > > u32 mmio_base; > > > > - u32 uabi_capabilities; > > + unsigned int context_tag; > > +#define NUM_CONTEXT_TAG (256) > > AFAICT this define now defines how deep ELSP we can have before we start > getting hard to debug problems. GuC angle needs considering here. The guc is immaterial here, since they use their own mechanism. It just needs to be a value larger than 2 * EXECLIST_MAX_PORTS and less then BIT(10). > > @@ -978,6 +965,15 @@ __execlists_schedule_in(struct i915_request *rq) > > > > intel_context_get(ce); > > > > + if (ce->tag) { > > + ce->lrc_desc |= (u64)ce->tag << 32; > > + } else { > > + ce->lrc_desc &= ~GENMASK_ULL(47, 37); > > + ce->lrc_desc |= > > + (u64)(engine->context_tag++ % NUM_CONTEXT_TAG) << > > + GEN11_SW_CTX_ID_SHIFT; > > + } > > So hw id is valid only while context is in ELSP and it changes with > every submission except in the OA case? Aye. OA being a pita. > Shifts and masks look dodgy. For >= gen11 the current code has the hw_id > in [37, 42], otherwise [32, 52]. This patch does not seem to handle the > differences between gens. Because the values we supply do not matter (they are cookies), just the lifetime. > > + > > intel_gt_pm_get(engine->gt); > > execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN); > > intel_engine_context_in(engine); > > @@ -2224,7 +2220,6 @@ static void execlists_context_unpin(struct intel_context *ce) > > check_redzone((void *)ce->lrc_reg_state - LRC_STATE_PN * PAGE_SIZE, > > ce->engine); > > > > - i915_gem_context_unpin_hw_id(ce->gem_context); > > i915_gem_object_unpin_map(ce->state->obj); > > intel_ring_reset(ce->ring, ce->ring->tail); > > } > > @@ -2274,18 +2269,12 @@ __execlists_context_pin(struct intel_context *ce, > > goto unpin_active; > > } > > > > - ret = i915_gem_context_pin_hw_id(ce->gem_context); > > - if (ret) > > - goto unpin_map; > > - > > ce->lrc_desc = lrc_descriptor(ce, engine); > > ce->lrc_reg_state = vaddr + LRC_STATE_PN * PAGE_SIZE; > > __execlists_update_reg_state(ce, engine); > > > > return 0; > > > > -unpin_map: > > - i915_gem_object_unpin_map(ce->state->obj); > > unpin_active: > > intel_context_active_release(ce); > > err: > > diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c > > index 343d79c1cb7e..04a5a0d90823 100644 > > --- a/drivers/gpu/drm/i915/gvt/kvmgt.c > > +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c > > @@ -1564,27 +1564,10 @@ vgpu_id_show(struct device *dev, struct device_attribute *attr, > > return sprintf(buf, "\n"); > > } > > > > -static ssize_t > > -hw_id_show(struct device *dev, struct device_attribute *attr, > > - char *buf) > > -{ > > - struct mdev_device *mdev = mdev_from_dev(dev); > > - > > - if (mdev) { > > - struct intel_vgpu *vgpu = (struct intel_vgpu *) > > - mdev_get_drvdata(mdev); > > - return sprintf(buf, "%u\n", > > - vgpu->submission.shadow[0]->gem_context->hw_id); > > - } > > - return sprintf(buf, "\n"); > > -} > > - > > static DEVICE_ATTR_RO(vgpu_id); > > -static DEVICE_ATTR_RO(hw_id); > > > > static struct attribute *intel_vgpu_attrs[] = { > > &dev_attr_vgpu_id.attr, > > - &dev_attr_hw_id.attr, > > NULL > > }; > > > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c > > index fd41505d52ec..8caaa446490f 100644 > > --- a/drivers/gpu/drm/i915/i915_debugfs.c > > +++ b/drivers/gpu/drm/i915/i915_debugfs.c > > @@ -1504,9 +1504,6 @@ static int i915_context_status(struct seq_file *m, void *unused) > > struct intel_context *ce; > > > > seq_puts(m, "HW context "); > > - if (!list_empty(&ctx->hw_id_link)) > > - seq_printf(m, "%x [pin %u]", ctx->hw_id, > > - atomic_read(&ctx->hw_id_pin_count)); > > if (ctx->pid) { > > struct task_struct *task; > > > > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c > > index a86d28bda6dd..47239df653f2 100644 > > --- a/drivers/gpu/drm/i915/i915_gpu_error.c > > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c > > @@ -471,9 +471,9 @@ static void error_print_context(struct drm_i915_error_state_buf *m, > > const char *header, > > const struct drm_i915_error_context *ctx) > > { > > - err_printf(m, "%s%s[%d] hw_id %d, prio %d, guilty %d active %d\n", > > - header, ctx->comm, ctx->pid, ctx->hw_id, > > - ctx->sched_attr.priority, ctx->guilty, ctx->active); > > + err_printf(m, "%s%s[%d] prio %d, guilty %d active %d\n", > > + header, ctx->comm, ctx->pid, ctx->sched_attr.priority, > > + ctx->guilty, ctx->active); > > } > > > > static void error_print_engine(struct drm_i915_error_state_buf *m, > > @@ -1262,7 +1262,6 @@ static bool record_context(struct drm_i915_error_context *e, > > rcu_read_unlock(); > > } > > > > - e->hw_id = ctx->hw_id; > > e->sched_attr = ctx->sched; > > e->guilty = atomic_read(&ctx->guilty_count); > > e->active = atomic_read(&ctx->active_count); > > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h > > index caaed5093d95..4dc36d6ee3a2 100644 > > --- a/drivers/gpu/drm/i915/i915_gpu_error.h > > +++ b/drivers/gpu/drm/i915/i915_gpu_error.h > > @@ -117,7 +117,6 @@ struct i915_gpu_state { > > struct drm_i915_error_context { > > char comm[TASK_COMM_LEN]; > > pid_t pid; > > - u32 hw_id; > > int active; > > int guilty; > > struct i915_sched_attr sched_attr; > > diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c > > index 80055501eccb..d36ba248d438 100644 > > --- a/drivers/gpu/drm/i915/i915_perf.c > > +++ b/drivers/gpu/drm/i915/i915_perf.c > > @@ -1283,22 +1283,15 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream) > > } else { > > stream->specific_ctx_id_mask = > > (1U << GEN8_CTX_ID_WIDTH) - 1; > > - stream->specific_ctx_id = > > - upper_32_bits(ce->lrc_desc); > > - stream->specific_ctx_id &= > > - stream->specific_ctx_id_mask; > > + stream->specific_ctx_id = stream->specific_ctx_id_mask; > > } > > break; > > > > case 11: > > case 12: { > > stream->specific_ctx_id_mask = > > - ((1U << GEN11_SW_CTX_ID_WIDTH) - 1) << (GEN11_SW_CTX_ID_SHIFT - 32) | > > - ((1U << GEN11_ENGINE_INSTANCE_WIDTH) - 1) << (GEN11_ENGINE_INSTANCE_SHIFT - 32) | > > - ((1 << GEN11_ENGINE_CLASS_WIDTH) - 1) << (GEN11_ENGINE_CLASS_SHIFT - 32); > > - stream->specific_ctx_id = upper_32_bits(ce->lrc_desc); > > - stream->specific_ctx_id &= > > - stream->specific_ctx_id_mask; > > + ((1U << GEN11_SW_CTX_ID_WIDTH) - 1) << (GEN11_SW_CTX_ID_SHIFT - 32); > > + stream->specific_ctx_id = stream->specific_ctx_id_mask; > > break; > > } > > > > @@ -1306,6 +1299,8 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream) > > MISSING_CASE(INTEL_GEN(i915)); > > } > > > > + ce->tag = stream->specific_ctx_id_mask; > > + > > DRM_DEBUG_DRIVER("filtering on ctx_id=0x%x ctx_id_mask=0x%x\n", > > stream->specific_ctx_id, > > stream->specific_ctx_id_mask); > > @@ -1324,12 +1319,14 @@ static void oa_put_render_ctx_id(struct i915_perf_stream *stream) > > { > > struct intel_context *ce; > > > > - stream->specific_ctx_id = INVALID_CTX_ID; > > - stream->specific_ctx_id_mask = 0; > > - > > ce = fetch_and_zero(&stream->pinned_ctx); > > - if (ce) > > + if (ce) { > > + ce->tag = 0; > > intel_context_unpin(ce); > > + } > > + > > + stream->specific_ctx_id = INVALID_CTX_ID; > > + stream->specific_ctx_id_mask = 0; > > OA hunks looks dodgy as well. ce->tag is set to > stream->specific_ctx_id_mask while I think it should be id. Furthermore > id is assigned from mask which is fixed and not unique for different > contexts. It is singular in the entire system. The mask is a guaranteed unique value. > > diff --git a/drivers/gpu/drm/i915/selftests/i915_vma.c b/drivers/gpu/drm/i915/selftests/i915_vma.c > > index 1c9db08f7c28..ac1ff558eb90 100644 > > --- a/drivers/gpu/drm/i915/selftests/i915_vma.c > > +++ b/drivers/gpu/drm/i915/selftests/i915_vma.c > > @@ -170,7 +170,7 @@ static int igt_vma_create(void *arg) > > } > > > > nc = 0; > > - for_each_prime_number(num_ctx, MAX_CONTEXT_HW_ID) { > > + for_each_prime_number(num_ctx, 2 * NUM_CONTEXT_TAG) { > > for (; nc < num_ctx; nc++) { > > ctx = mock_context(i915, "mock"); > > if (!ctx) > > > > Unless I am missing something I see this patch as simplification and not > a fix. If we can get away with it in the GuC world then it's quite a big > simplification so it's fine by me. Bspec says that for lrc_desc with matching tags (at least for gen8-gen11 I believe), it performs a lite restore (regardless of lrca). Ergo it is quite a major fix. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx