On 08/02/2017 13:20, Chris Wilson wrote:
> On Wed, Feb 08, 2017 at 01:13:48PM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
>>
>> -static int gen9_init_perctx_bb(struct intel_engine_cs *engine,
>> -			       struct i915_wa_ctx_bb *wa_ctx,
>> -			       uint32_t *batch,
>> -			       uint32_t *offset)
>> +static u32 *gen9_init_perctx_bb(struct intel_engine_cs *engine, u32 *batch)
>>  {
>> -	uint32_t index = wa_ctx_start(wa_ctx, *offset, CACHELINE_DWORDS);
>> -
>> -	wa_ctx_emit(batch, index, MI_BATCH_BUFFER_END);
>> +	*batch++ = MI_BATCH_BUFFER_END;
>> -	return wa_ctx_end(wa_ctx, *offset = index, 1);
>> +	return batch;
>>  }
>
> Transformation looks reasonable, but I'd like to omit this per-ctx bb
> when empty.
Don't know if that is possible. It is always programming the offset at
the moment so documentation digging is required.
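
For readers following the hunk above, here is a minimal standalone sketch of the emit style the patch moves to: each emitter writes its dwords through the pointer it is given and returns the next free slot, so the caller no longer threads an index and an offset through every call. All sketch_* names are hypothetical, and the opcode values are placeholders for illustration rather than the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder command values for this sketch only (assumption). */
#define SKETCH_NOOP             0x00000000u
#define SKETCH_BATCH_BUFFER_END 0x05000000u

/* Each emitter writes at 'batch' and returns the next free dword. */
static uint32_t *sketch_emit_noop(uint32_t *batch)
{
	*batch++ = SKETCH_NOOP;
	return batch;
}

static uint32_t *sketch_emit_end(uint32_t *batch)
{
	*batch++ = SKETCH_BATCH_BUFFER_END;
	return batch;
}

/*
 * Chain emitters by feeding each one the pointer returned by the
 * previous one. Returns the number of dwords written into 'buf'.
 */
static size_t sketch_build(uint32_t *buf, size_t len)
{
	uint32_t *cs = buf;

	assert(len >= 3); /* this sketch always emits three dwords */

	cs = sketch_emit_noop(cs);
	cs = sketch_emit_noop(cs);
	cs = sketch_emit_end(cs);

	return (size_t)(cs - buf);
}
```

The size of what was emitted falls out of a pointer subtraction, which is the same arithmetic the loop in the next hunk relies on.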
>> +	/*
>> +	 * Emit the two workaround batch buffers, recording the offset from the
>> +	 * start of the workaround batch buffer object for each and their
>> +	 * respective sizes.
>> +	 */
>> +	for (i = 0; i < ARRAY_SIZE(wa_bb_f); i++) {
>> +		wa_bb[i]->offset = ALIGN(batch_ptr - batch, CACHELINE_DWORDS);
>> +		batch_ptr = wa_bb_f[i](engine, batch_ptr);
>> +		wa_bb[i]->size = batch_ptr - &batch[wa_bb[i]->offset];
>
> Which will break this pattern. At the least we could do
>
> 	if (!wa_bb_f[i])
> 		continue;
>
> And then skip loading into the context image if size==0.
>
> I'll double check the mechanical aspects in a bit.
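
As a rough illustration of the suggestion above, the loop could skip absent emitters and leave their size at zero, so that a later step can decide not to load that batch into the context image. This is a standalone sketch under stated assumptions, not the driver code: every sketch_* name is hypothetical, the cacheline is assumed to be 16 dwords, the opcode value is a placeholder, and emission is assumed to start at the aligned offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHELINE_DWORDS 16u         /* assumes 64-byte cachelines */
#define SKETCH_BATCH_BUFFER_END 0x05000000u /* placeholder value */

struct sketch_wa_bb {
	uint32_t offset; /* dwords from the start of the buffer */
	uint32_t size;   /* dwords emitted; 0 means nothing to load */
};

typedef uint32_t *(*sketch_wa_bb_fn)(uint32_t *batch);

/* Stand-in emitter: one dummy workaround dword, then the terminator. */
static uint32_t *sketch_emit_wa(uint32_t *batch)
{
	*batch++ = 0;
	*batch++ = SKETCH_BATCH_BUFFER_END;
	return batch;
}

/* Round 'v' up to the next multiple of 'a'. */
static uint32_t sketch_align_up(uint32_t v, uint32_t a)
{
	return (v + a - 1) / a * a;
}

/*
 * Run each emitter that is present. A NULL entry is skipped and its
 * record stays at offset 0, size 0. The caller must provide a buffer
 * large enough for every emitter at its aligned offset.
 */
static void sketch_emit_all(uint32_t *batch, const sketch_wa_bb_fn *fn,
			    struct sketch_wa_bb *bb, size_t count)
{
	uint32_t *ptr = batch;
	size_t i;

	for (i = 0; i < count; i++) {
		bb[i].offset = 0;
		bb[i].size = 0;

		if (!fn[i])
			continue;

		bb[i].offset = sketch_align_up((uint32_t)(ptr - batch),
					       SKETCH_CACHELINE_DWORDS);
		ptr = fn[i](batch + bb[i].offset);
		bb[i].size = (uint32_t)(ptr - (batch + bb[i].offset));
	}
}

/* Only a batch that actually emitted something needs programming. */
static int sketch_should_load(const struct sketch_wa_bb *bb)
{
	return bb->size != 0;
}
```

Whether the hardware tolerates leaving the per-ctx pointer unprogrammed is exactly the open question in this thread; the sketch only shows the bookkeeping side.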
This should have been an RFC; not marking it as such was just a mistake
on my part. It saves around 500-600 bytes AFAIR, but I was also thinking
of adding a helper to emit the often-used pipe control
(gen8_emit_flush_render), which could bring more gains.

Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx