On Fri, 27 Jul 2012 07:17:36 +0200 Daniel Vetter <daniel at ffwll.ch> wrote: > On Thu, Jul 26, 2012 at 04:48:43PM -0700, Ben Widawsky wrote: > > The IVB simulator really doesn't like a TLB invalidate with no > > post-sync operation, in fact it blows up in an assertion failure. > > The documentation states that we must issue the TLB invalidate with > > a CS stall: "Also Requires stall bit ([20] of DW1) set." This patch > > doesn't comply with the docs, but we're able to satisfy the > > simulator with this very small change, and I think simulator has > > historically trumped docs. > > > > Note, I don't think this belongs in stable as our TLB invalidation > > should be correct since we use the global invalidation per batch. > > Using TLB invalidation is itself only a requirement of HW contexts. > > > > Signed-off-by: Ben Widawsky <ben at bwidawsk.net> > > I have another patch from Chris that kills the non-zero post-sync op > workaround for ivb ... So I guess we can't do this this easily. > -Daniel Rethink this. The need to emit the TLB invalidation as the simulator dictates essentially means we cannot remove the workaround as Chris' patch does. > > --- > > drivers/gpu/drm/i915/intel_ringbuffer.c | 10 +++++++--- > > 1 file changed, 7 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c > > b/drivers/gpu/drm/i915/intel_ringbuffer.c index c58f1b9..339712a > > 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > > @@ -198,8 +198,12 @@ intel_emit_post_sync_nonzero_flush(struct > > intel_ring_buffer *ring) if (ret) > > return ret; > > > > + /* NOTE: we want the TLB invalidate for render ring flush, > > but it must > > + * be sent with a non-zero post sync op. So it can be > > stuffed in here > > + * for convenience. > > + */ > > intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(5)); > > - intel_ring_emit(ring, PIPE_CONTROL_QW_WRITE); > > + intel_ring_emit(ring, PIPE_CONTROL_QW_WRITE | > > PIPE_CONTROL_TLB_INVALIDATE); intel_ring_emit(ring, scratch_addr | > > PIPE_CONTROL_GLOBAL_GTT); /* address */ intel_ring_emit(ring, 0); > > intel_ring_emit(ring, 0); > > @@ -225,10 +229,10 @@ gen6_render_ring_flush(struct > > intel_ring_buffer *ring, > > /* Just flush everything. Experiments have shown that > > reducing the > > * number of bits based on the write domains has little > > performance > > - * impact. > > + * impact. Note to copy/pasters: the TLB invlidate we want > > is tucked > > + * into intel_emit_post_sync_nonzero_flush(). > > */ > > flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH; > > - flags |= PIPE_CONTROL_TLB_INVALIDATE; > > flags |= PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE; > > flags |= PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE; > > flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH; > > -- > > 1.7.11.3 > > > > _______________________________________________ > > Intel-gfx mailing list > > Intel-gfx at lists.freedesktop.org > > http://lists.freedesktop.org/mailman/listinfo/intel-gfx > -- Ben Widawsky, Intel Open Source Technology Center