On Thu, 12 Jul 2012 11:00:17 +0200, Daniel Vetter <daniel at ffwll.ch> wrote: > On Thu, Jul 12, 2012 at 09:29:42AM +0100, Chris Wilson wrote: > > During batch buffer emission we flag that the next request should > > generate a flush. The intention is that we queue a request following the > > batch buffer in order to provide the breadcrumb, and so always flush all > > caches after every batch, killing the flushing list. However, X > > carefully schedules a signal to arrive just as we are checking whether > > the ring has enough space to enqueue the flush and so we abort that > > addition with an ERESTARTSYS. The next batch then clears the write_domain, > > and we end up with an object with no write_domain on the ring's > > gpu_write_list. Havoc WARNs and BUGs ensue. > > > > The gpu_write_list is doomed as it is merely a remnant of the flushing > > list, so the easiest fix in the meantime is to clear all residual > > members from the gpu_write_list prior to updating the write_domain on > > the next batch. > > > > Fixes regression from > > > > commit cc889e0f6ce6a63c62db17d702ecfed86d58083f > > Author: Daniel Vetter <daniel.vetter at ffwll.ch> > > Date: Wed Jun 13 20:45:19 2012 +0200 > > > > drm/i915: disable flushing_list/gpu_write_lis > > > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk> > > Cc: Daniel Vetter <daniel.vetter at ffwll.ch> > > Thinking about this some more, this could actually be a regression from > > commit de2b998552c1534e87bfbc51ec5734b02bc89020 > Author: Daniel Vetter <daniel.vetter at ffwll.ch> > Date: Wed Jul 4 22:52:50 2012 +0200 > > drm/i915: don't return a spurious -EIO from intel_ring_begin > > Only with that patch does intel_ring_begin return -ERESTARTSYS. Ok, that would better fit in with the discovery timeline, and indeed it was a side-effect of the "not yet audited all the paths...". How true you were. :) > > --- > > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 5 +++-- > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > > index 88e2e11..4c521df 100644 > > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > > @@ -885,8 +885,9 @@ i915_gem_execbuffer_move_to_gpu(struct intel_ring_buffer *ring, > > return ret; > > } > > > > - /* Unconditionally invalidate gpu caches. */ > > - ret = i915_gem_flush_ring(ring, I915_GEM_GPU_DOMAINS, 0); > > + /* Unconditionally invalidate gpu caches and flush residual writes. */ > > Hm, can we have a big FIXME here that we should change this back once the > gpu_write_list is properly burried? I fully expect i915_gem_flush_ring() to die in a fire once the flushing list removal is complete. There will only be two callsites, both in i915_gem_execbuffer.c; one to invalidate and one to flush. So we might as well call them thus. If you want to tag an XXX in there, feel free, I was just trying to keep inside 80 cols. ;-) -Chris -- Chris Wilson, Intel Open Source Technology Centre