Hi Chris, for some reason I didn’t receive the review email, so I copied your comments from patchwork and faked this email. >> static void execlists_dequeue(struct intel_engine_cs *engine) >> { >> struct intel_engine_execlists * const execlists = &engine->execlists; >> @@ -1538,6 +1566,16 @@ static void execlists_dequeue(struct intel_engine_cs *engine) >> } >> >> if (__i915_request_submit(rq)) { >> + /* hsdes: 1809175790 */ >> + if ((GRAPHICS_VER(engine->i915) == 12) && >> + rq->vd_ve_aux_inv && >> + (engine->class == VIDEO_DECODE_CLASS || >> + engine->class == VIDEO_ENHANCEMENT_CLASS)) { > We do not need the extra checks here; we just do as we are told. We only > tell ourselves to apply the fixup when required. Without checking GRAPHICS_VER, I’m seeing a lot of regressions on older platforms in the CI result. This workaround was only implemented for gen12 (gen12_emit_flush_xcs). Without checking engine->class, I’m seeing boot issues due to GEM_BUG_ON() in aux_inv_reg(). Any rq will go through this code regardless of engine class and gen version, so the checking seems to be necessary. >> + *rq->vd_ve_aux_inv = i915_mmio_reg_offset > Likewise, vd_ve is overspecific, aux_inv_fixup or aux_inv_wa (or > wa_aux_iv, fixup_aux_inv). I wanted to be specific because the workaround was only implemented for vd/ve engines. But I’m ok with your proposal. >> + (aux_inv_reg(engine)); >> + rq->vd_ve_aux_inv = NULL; > Move this to i915_request initialisation so that we only set aux_inv > when required, which probably explains the extra defence. The pointer is currently initialized with 0x5a5a. I set it to NULL in gen12_emit_flush_xcs, otherwise the rq will enter that if-statement with an invalid pointer. I’m not familiar with the code, there seems to be multiple functions allocating the structure. I agree it’s better to set it to NULL at initialization, but need some guidance on where is the best place to do so. >> + rq->execution_mask = engine->mask; >> + } >> if (!merge) { >> *port++ = i915_request_get(last); >> last = NULL; >> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h >> index 28b1f9db5487..69de32e5e15d 100644 >> --- a/drivers/gpu/drm/i915/i915_request.h >> +++ b/drivers/gpu/drm/i915/i915_request.h >> @@ -350,6 +350,8 @@ struct i915_request { >> struct list_head link; >> unsigned long delay; >> } mock;) >> + >> + u32 *vd_ve_aux_inv; > Not at the end of the struct; that's where we put things in the dungeon. > The selftest struct should be last; I do hope no one has been putting > things at random places in the struct without considering the layout and > semantics. From the flow, this is akin to batch, capture_list; before > emitted_jiffies would be a good spot. Got it, will change. I thought adding at the end would be safer, thanks for the explanation. > -Chris |