The hardware takes the indirect context size as a count of cachelines,
where a count of zero means that the feature is disabled. If we simply
divide the size by the cacheline size in bytes, the integer division
truncates and we end up one cacheline short of execution whenever the
size is not a whole multiple of a cacheline. Round the division up to a
full cacheline so that the hardware executes everything intended.

Bspec: 11739
Fixes: 17ee950df38b ("drm/i915/gen8: Add infrastructure to initialize WA batch buffers")
Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 6fbad5e2343f..acbb36ad17ff 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -4739,7 +4739,8 @@ static void init_wa_bb_reg_state(u32 * const regs,
 
 		regs[pos_bb_per_ctx + 2] =
 			(ggtt_offset + wa_ctx->indirect_ctx.offset) |
-			(wa_ctx->indirect_ctx.size / CACHELINE_BYTES);
+			DIV_ROUND_UP(wa_ctx->indirect_ctx.size,
+				     CACHELINE_BYTES);
 
 		regs[pos_bb_per_ctx + 4] =
 			intel_lr_indirect_ctx_offset(engine) << 6;
-- 
2.17.1
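
For illustration only, the rounding difference described in the commit message can be sketched in standalone userspace C. This is not driver code: the 64-byte CACHELINE_BYTES value and the DIV_ROUND_UP macro are local stand-ins assumed to mirror the kernel definitions, and the two helper names are hypothetical.

```c
#include <stdint.h>

/* Assumption: 64-byte cachelines, local stand-in for the driver's constant. */
#define CACHELINE_BYTES 64u

/* Local stand-in assumed to mirror the kernel's DIV_ROUND_UP() helper. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper: old behaviour, plain integer division truncates. */
static uint32_t cachelines_truncated(uint32_t size_bytes)
{
	return size_bytes / CACHELINE_BYTES;
}

/* Hypothetical helper: new behaviour, a partial cacheline counts as one. */
static uint32_t cachelines_rounded_up(uint32_t size_bytes)
{
	return DIV_ROUND_UP(size_bytes, CACHELINE_BYTES);
}
```

With these definitions a 100-byte buffer yields a count of 1 when truncated and 2 when rounded up, while sizes that are exact multiples of 64 (and a size of zero) give the same count either way.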