Re: [PATCH 2/6] drm/i915/vlv: Added a rendering specific Hw WA 'WaReadAfterWriteHazard'

Daniel Vetter <daniel@xxxxxxxx> · Fri, 21 Mar 2014 15:58:56 +0100



On Fri, Mar 21, 2014 at 11:53:40AM +0000, Gupta, Sourab wrote:
> On Wed, 2014-01-22 at 11:11 +0000, Chris Wilson wrote:
> > On Wed, Jan 22, 2014 at 12:54:51PM +0200, Ville Syrjälä wrote:
> > > On Wed, Jan 22, 2014 at 09:15:06AM +0530, akash.goel@xxxxxxxxx wrote:
> > > > From: Akash Goel <akash.goel@xxxxxxxxx>
> > > >
> > > > Added a new rendering specific Workaround 'WaReadAfterWriteHazard'.
> > > > In this WA, need to add 12 MI Store Dword commands to ensure proper
> > > > flush of h/w pipeline.
> > > >
> > > > Signed-off-by: Akash Goel <akash.goel@xxxxxxxxx>
> > > > ---
> > > >  drivers/gpu/drm/i915/intel_ringbuffer.c | 25 +++++++++++++++++++++++++
> > > >  1 file changed, 25 insertions(+)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > index 133d273..e8ec536 100644
> > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > @@ -2167,6 +2167,31 @@ intel_ring_flush_all_caches(struct intel_ring_buffer *ring)
> > > >
> > > >     trace_i915_gem_ring_flush(ring, 0, I915_GEM_GPU_DOMAINS);
> > > >
> > > > +   if (IS_VALLEYVIEW(ring->dev)) {
> > > > +           /*
> > > > +            * WaReadAfterWriteHazard
> > > > +            * Send a number of Store Data commands here to finish
> > > > +            * flushing hardware pipeline.This is needed in the case
> > > > +            * where the next workload tries reading from the same
> > > > +            * surface that this batch writes to. Without these StoreDWs,
> > > > +            * not all of the data will actually be flushd to the surface
> > > > +            * by the time the next batch starts reading it, possibly
> > > > +            * causing a small amount of corruption.
> > > > +            */
> > > > +           int i;
> > > > +           ret = intel_ring_begin(ring, 4 * 12);
> > >
> > > BSpec says 8 is enough. Is Bspec incorrect.
> > 
> > No, these are just figures they plucked out of the air. Last I heard
> > they were using 32...
> > 
> > > Also this workaround is also listed for everything SNB+.
> > 
> > And we already have a more effective workaround.
> > -Chris
> 
> Can you please let us know whether this workaround patch is required or
> not. If not, then how is this currently handled.

We emit XY_SETUP_BLT before certain blt operations to insert a
sufficiently long stall. The underlying bug this works around is that the
cache controller of the cpu falls over in certain very specific rwm
cycles. The official w/a is a full pipeline flush when switching between
reading and writing to a surface, which has a horribly perf impact on the
blitter.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx