I'm pretty happy with the code, I was just confused by the series changing the setup halfway through On Thu, Jun 18, 2015 at 02:07:30PM +0100, Arun Siluvery wrote: > +static int gen8_init_indirectctx_bb(struct intel_engine_cs *ring, > + uint32_t **wa_ctx_batch, > + uint32_t offset, > + uint32_t *num_dwords) > +{ > + uint32_t index; > + uint32_t *batch = *wa_ctx_batch; > + > + index = offset; > + > + /* FIXME: fill one cacheline with NOOPs. > + * Replace these instructions with WA > + */ > + while (index < (offset + 16)) > + wa_ctx_emit(batch, MI_NOOP); If this was /* Replace me with WA */ wa_ctx_emit(batch, MI_NOOP) /* Pad to end of cacheline */ while (index % 16) wa_ctx_emit(batch, MI_NOOP); You then don't need to alter the code when yo add the real w/a. Note that using (unsigned long)batch as you do later for cacheline calculation is wrong, as that is a local physical CPU address (not the virtual address used by the cache in the GPU) and was page aligned anyway. Similary, > +static int gen8_init_perctx_bb(struct intel_engine_cs *ring, > + uint32_t **wa_ctx_batch, > + uint32_t offset, > + uint32_t *num_dwords) > +{ > + uint32_t index; > + uint32_t *batch = *wa_ctx_batch; > + > + index = offset; > + If this just did wa_ctx_emit(batch, MI_BATCH_BUFFER_END); rather than insert a cacheline of noops, again you wouldn't need to touch this infrastructure as you added the w/a. As it stands, I was a little worried halfway through when the cache alignment suddenly disappeared - but this patch implied to me that it was necessary. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx