On Mon, Jul 17, 2023 at 07:30:58PM +0200, Andi Shyti wrote:
> From: Jonathan Cavitt <jonathan.cavitt@xxxxxxxxx>
>
> For platforms that use Aux CCS, wait for aux invalidation to
> complete by checking the aux invalidation register bit is
> cleared.
>
> Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@xxxxxxxxx>
> Signed-off-by: Andi Shyti <andi.shyti@xxxxxxxxxxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx> # v5.8+
> Reviewed-by: Nirmoy Das <nirmoy.das@xxxxxxxxx>
> ---
>  drivers/gpu/drm/i915/gt/gen8_engine_cs.c     | 17 +++++++++++++----
>  drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  1 +
>  2 files changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> index aa2fb9d72745a..fbc70f3b7f2fd 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> @@ -174,6 +174,16 @@ u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv
>  	*cs++ = AUX_INV;
>  	*cs++ = MI_NOOP;

We only need qword alignment for sequences of commands, not each
individual command, right?  So technically we could drop this noop...

>
> +	*cs++ = MI_SEMAPHORE_WAIT_TOKEN |
> +		MI_SEMAPHORE_REGISTER_POLL |
> +		MI_SEMAPHORE_POLL |
> +		MI_SEMAPHORE_SAD_EQ_SDD;
> +	*cs++ = 0;
> +	*cs++ = i915_mmio_reg_offset(inv_reg) + gsi_offset;
> +	*cs++ = 0;
> +	*cs++ = 0;
> +	*cs++ = MI_NOOP;

...and then we wouldn't need an extra one here.  If we drop the pair of
noops, that would also change the # of dwords farther down too.

> +
>  	return cs;
>  }
>
> @@ -284,10 +294,9 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
>  		else if (engine->class == COMPUTE_CLASS)
>  			flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
>
> +		count = 8;
>  		if (!HAS_FLAT_CCS(rq->engine->i915))

As noted on the earlier patch, we should probably make this check that
the platform actually has AuxCCS.

Anyway, up to you whether you want to make that change or not.  The
extra noops don't actually hurt anything.

Reviewed-by: Matt Roper <matthew.d.roper@xxxxxxxxx>

> -			count = 8 + 4;
> -		else
> -			count = 8;
> +			count += 10;
>
>  		cs = intel_ring_begin(rq, count);
>  		if (IS_ERR(cs))
> @@ -330,7 +339,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
>  			aux_inv = rq->engine->mask &
>  				~GENMASK(_BCS(I915_MAX_BCS - 1), BCS0);
>  			if (aux_inv)
> -				cmd += 4;
> +				cmd += 10;
>  		}
>  	}
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> index 5df7cce23197c..2bd8d98d21102 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> @@ -121,6 +121,7 @@
>  #define   MI_SEMAPHORE_TARGET(engine)	((engine)<<15)
>  #define MI_SEMAPHORE_WAIT	MI_INSTR(0x1c, 2) /* GEN8+ */
>  #define MI_SEMAPHORE_WAIT_TOKEN	MI_INSTR(0x1c, 3) /* GEN12+ */
> +#define   MI_SEMAPHORE_REGISTER_POLL	(1 << 16)
>  #define   MI_SEMAPHORE_POLL		(1 << 15)
>  #define   MI_SEMAPHORE_SAD_GT_SDD	(0 << 12)
>  #define   MI_SEMAPHORE_SAD_GTE_SDD	(1 << 12)
> --
> 2.40.1
>

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation
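
For reference, the dword arithmetic behind the noop suggestion can be
sketched in userspace C.  This is only an illustration under stated
assumptions, not driver code: HYPO_LRI_HEADER and HYPO_AUX_INV are
hypothetical placeholders for the first three dwords (which the quoted
hunk only partly shows), MI_INSTR() and MI_SEMAPHORE_SAD_EQ_SDD are
written as I believe the i915 header defines them, and emit_aux_inv()
and aux_inv_dwords() are names made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of the i915 macro: opcode at bit 23, flags/length below. */
#define MI_INSTR(opcode, flags)     (((uint32_t)(opcode) << 23) | (flags))
#define MI_NOOP                     MI_INSTR(0, 0)

/* From the quoted intel_gpu_commands.h hunk. */
#define MI_SEMAPHORE_WAIT_TOKEN     MI_INSTR(0x1c, 3)
#define MI_SEMAPHORE_REGISTER_POLL  (1u << 16)
#define MI_SEMAPHORE_POLL           (1u << 15)
/* Not in the quoted hunk; assumed value. */
#define MI_SEMAPHORE_SAD_EQ_SDD     (4u << 12)

/* Hypothetical stand-ins; only the dword count matters here. */
#define HYPO_LRI_HEADER             MI_INSTR(0x22, 1)
#define HYPO_AUX_INV                (1u << 0)

/* Mirrors the shape of the patched sequence; pad selects the two noops. */
static uint32_t *emit_aux_inv(uint32_t *cs, uint32_t reg_offset, bool pad)
{
	/* Register write: 3 dwords, plus an optional noop. */
	*cs++ = HYPO_LRI_HEADER;
	*cs++ = reg_offset;
	*cs++ = HYPO_AUX_INV;
	if (pad)
		*cs++ = MI_NOOP;

	/* Semaphore wait polling the register: 5 dwords, plus an optional noop. */
	*cs++ = MI_SEMAPHORE_WAIT_TOKEN |
		MI_SEMAPHORE_REGISTER_POLL |
		MI_SEMAPHORE_POLL |
		MI_SEMAPHORE_SAD_EQ_SDD;
	*cs++ = 0;
	*cs++ = reg_offset;
	*cs++ = 0;
	*cs++ = 0;
	if (pad)
		*cs++ = MI_NOOP;

	return cs;
}

/* Number of dwords the sequence occupies, with or without the noops. */
static size_t aux_inv_dwords(uint32_t reg_offset, bool pad)
{
	uint32_t buf[16];

	return (size_t)(emit_aux_inv(buf, reg_offset, pad) - buf);
}
```

With the noops the sequence is 3 + 1 + 5 + 1 = 10 dwords, matching the
"count += 10" and "cmd += 10" in the patch.  Without them it is 3 + 5 =
8 dwords, which is still an even number, so the sequence as a whole
stays qword aligned and the two counts would become 8.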