On 12/13/2011 09:22 AM, Eric Anholt wrote:
> On Mon, 12 Dec 2011 19:52:08 -0800, Ben Widawsky <ben at bwidawsk.net> wrote:
>> Since we don't differentiate on the different GPU read domains, it
>> should be safe to allow back to back reads to occur without issuing a
>> wait (or flush in the non-semaphore case).
>>
>> This has the unfortunate side effect that we need to keep track of all
>> the outstanding buffer reads so that we can synchronize on a write, to
>> another ring (since we don't know which read finishes first). In other
>> words, the code is quite simple for two rings, but gets more tricky for
>> > 2 rings.
>>
>> Here is a picture of the solution to the above problem
>>
>> Ring 0            Ring 1            Ring 2
>> batch 0           batch 1           batch 2
>> read buffer A     read buffer A     wait batch 0
>>                                     wait batch 1
>>                                     write buffer A
>>
>> This code is really untested. I'm hoping for some feedback if this is
>> worth cleaning up, and testing more thoroughly.
>
> You say it's an optimization -- do you have performance numbers?

33% improvement on a hacked version of gem_ring_sync_loop with. It's not
really a valid test as it's not coherent, but this is approximately the
best case improvement. Oddly, semaphores don't make much difference in
this test, which was surprising.
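
For anyone following the thread, here is a rough userspace sketch of the bookkeeping the quoted description implies: remember the last outstanding read per ring, and have a write wait on every other ring that still has a read pending. It is not the patch itself. `struct buf_state`, `buf_init`, `buf_read`, `buf_write` and `NUM_RINGS` are all hypothetical names made up for illustration, and the "wait" is reduced to a returned bitmask of rings rather than a real semaphore or flush.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_RINGS 3 /* assumption: three rings, as in the picture above */

/* Hypothetical per-buffer tracking state, not the real i915 structures. */
struct buf_state {
	uint32_t last_read_seqno[NUM_RINGS]; /* 0 means no outstanding read */
	uint32_t last_write_seqno;           /* 0 means no outstanding write */
	int last_write_ring;                 /* -1 means no outstanding write */
};

static void buf_init(struct buf_state *b)
{
	for (int i = 0; i < NUM_RINGS; i++)
		b->last_read_seqno[i] = 0;
	b->last_write_seqno = 0;
	b->last_write_ring = -1;
}

/*
 * Record a read of the buffer on 'ring'. Returns a bitmask of rings the
 * caller would have to wait on first. Reads never wait on other reads,
 * only on an outstanding write issued from a different ring.
 */
static unsigned int buf_read(struct buf_state *b, int ring, uint32_t seqno)
{
	unsigned int wait_mask = 0;

	assert(ring >= 0 && ring < NUM_RINGS);

	if (b->last_write_ring >= 0 && b->last_write_ring != ring)
		wait_mask |= 1u << b->last_write_ring;

	b->last_read_seqno[ring] = seqno;
	return wait_mask;
}

/*
 * Record a write of the buffer on 'ring'. Since we do not know which
 * outstanding read finishes first, the write waits on every other ring
 * that has one, plus any outstanding write on another ring.
 */
static unsigned int buf_write(struct buf_state *b, int ring, uint32_t seqno)
{
	unsigned int wait_mask = 0;

	assert(ring >= 0 && ring < NUM_RINGS);

	for (int i = 0; i < NUM_RINGS; i++) {
		if (i != ring && b->last_read_seqno[i] != 0)
			wait_mask |= 1u << i;
		/* The new write is ordered after these reads, so forget them. */
		b->last_read_seqno[i] = 0;
	}

	if (b->last_write_ring >= 0 && b->last_write_ring != ring)
		wait_mask |= 1u << b->last_write_ring;

	b->last_write_ring = ring;
	b->last_write_seqno = seqno;
	return wait_mask;
}
```

Replaying the picture with this sketch: the reads on ring 0 and ring 1 both return an empty mask, so they can go back to back, and the write on ring 2 returns a mask naming rings 0 and 1. With only two rings a single "last reader" would be enough, which is presumably why the two-ring case is the simple one.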