Quoting Michał Winiarski (2019-09-26 11:06:34) > We're currently doing one workaround where we're using scratch as a > temporary storage place, while we're overwriting the value of one > register with some known constant value in order to perform a > workaround. > While we could just do similar thing with CS_GPR register > and MI_LOAD_REGISTER_REG instead of scratch, since we would use CS_GPR > anyways, let's just drop the constant values and do the bitops using > MI_MATH. > > Signed-off-by: Michał Winiarski <michal.winiarski@xxxxxxxxx> > Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > --- > drivers/gpu/drm/i915/gt/intel_engine.h | 53 ++++++++++++++++++++++++ > drivers/gpu/drm/i915/gt/intel_gt_types.h | 3 -- > drivers/gpu/drm/i915/gt/intel_lrc.c | 27 +++--------- > 3 files changed, 58 insertions(+), 25 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h > index d3c6993f4f46..2dfe0b23af1d 100644 > --- a/drivers/gpu/drm/i915/gt/intel_engine.h > +++ b/drivers/gpu/drm/i915/gt/intel_engine.h > @@ -400,6 +400,59 @@ gen8_emit_ggtt_write(u32 *cs, u32 value, u32 gtt_offset, u32 flags) > return cs; > } > > +/* > + * We're using CS_GPR registers for the MI_MATH ops. > + * Note that user contexts are free to use those registers, which means that we > + * should only use this this function during context initialization, before > + * context restore (WA batch) or inside i915-owned contexts. > + */ > +static inline u32 * > +gen8_emit_bitop_reg_mask(struct intel_engine_cs *engine, > + u32 *cs, u32 op, i915_reg_t reg, u32 mask) Useful, we had discussed using MI_MATH to perform rmw for our workarounds. > +{ > + const u32 base = engine->mmio_base; > + > + GEM_BUG_ON(op != MI_MATH_OR && op != MI_MATH_AND); > + > + *cs++ = MI_LOAD_REGISTER_REG; > + *cs++ = i915_mmio_reg_offset(reg); > + *cs++ = i915_mmio_reg_offset(GEN8_RING_CS_GPR(base, 0)); > + *cs++ = MI_NOOP; > + > + *cs++ = MI_LOAD_REGISTER_IMM(1); > + *cs++ = i915_mmio_reg_offset(GEN8_RING_CS_GPR(base, 1)); > + *cs++ = mask; > + *cs++ = MI_NOOP; > + > + *cs++ = MI_MATH(4); > + *cs++ = MI_MATH_LOAD(MI_MATH_REG_SRCA, MI_MATH_REG(0)); > + *cs++ = MI_MATH_LOAD(MI_MATH_REG_SRCB, MI_MATH_REG(1)); > + *cs++ = op; > + *cs++ = MI_MATH_STORE(MI_MATH_REG(0), MI_MATH_REG_ACCU); > + *cs++ = MI_NOOP; > + > + *cs++ = MI_LOAD_REGISTER_REG; > + *cs++ = i915_mmio_reg_offset(GEN8_RING_CS_GPR(base, 0)); > + *cs++ = i915_mmio_reg_offset(reg); > + *cs++ = MI_NOOP; 4 nicely paired redundant MI_NOOP. Can be removed, leaving the packet still oword aligned. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx