On Fri, 13 Jul 2012 17:46:20 +0200, Daniel Vetter <daniel at ffwll.ch> wrote:
> On Fri, Jul 13, 2012 at 02:14:08PM +0100, Chris Wilson wrote:
> > If we drop the breadcrumb request after a batch due to a signal for
> > example we aim to fix it up at the next opportunity. In this case we
> > emit a second batchbuffer with no waits upon the first and so no
> > opportunity to insert the missing request, so we need to emit the
> > missing flush for coherency. (Note that invalidating the render
> > cache is the same as flushing it, so there should have been no
> > observable corruption.)
> >
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>
> Imo still too meager commit message ;-) As I've said in the previous mail,
> I'd like some mention of the two commits that made this disaster possible
> (put the blame on me where it is due). And I think some more in-detail
> walk-thru of how things blow up would be great. And the Bugzilla link for
> the QA bugreport.

Sure, in the patch I thought I was sending I had an extra paragraph:

  As a side effect this will also paper over issues such as
  https://bugs.freedesktop.org/show_bug.cgi?id=52040 whereby we
  clear the write_domain on objects on the defunct gpu_write_list.

  References: https://bugs.freedesktop.org/show_bug.cgi?id=52040

> Also, I still don't understand why this patch here isn't enough to fix up
> the fallout. So if you can enlighten me where/why stuff blows up even with
> this I'd highly appreciate. Not just because not understanding bugs makes
> me queasy, but also to have a clear picture of what I'd need to send to
> Dave if this -next cycle misses 3.6.

The remaining fallout is that we still end up using the flushing-list,
as revealed by *adding* a WARN. To end up in that situation we must retire
an object with a write-domain still set. But how can this be possible if
we always clear the write_list prior to the request/retirement?

I thought I had it, being sneaky with the use of INSTRUCTION write domain
for pipe-control. However, looks like I'm going to have to reproduce with
some more debugging.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre