On Mon, 16 Jan 2012 14:20:55 -0800, Ben Widawsky <ben at bwidawsk.net> wrote:
> On 01/16/2012 01:50 PM, Daniel Vetter wrote:
> > On Tue, Dec 13, 2011 at 10:36:15AM -0800, Ben Widawsky wrote:
> >> On 12/13/2011 09:22 AM, Eric Anholt wrote:
> >>> On Mon, 12 Dec 2011 19:52:08 -0800, Ben Widawsky <ben at bwidawsk.net> wrote:
> >>>> Since we don't differentiate on the different GPU read domains, it
> >>>> should be safe to allow back to back reads to occur without issuing a
> >>>> wait (or flush in the non-semaphore case).
> >>>>
> >>>> This has the unfortunate side effect that we need to keep track of all
> >>>> the outstanding buffer reads so that we can synchronize on a write, to
> >>>> another ring (since we don't know which read finishes first). In other
> >>>> words, the code is quite simple for two rings, but gets more tricky for
> >>>> > 2 rings.
> >>>>
> >>>> Here is a picture of the solution to the above problem
> >>>>
> >>>> Ring 0            Ring 1            Ring 2
> >>>> batch 0           batch 1           batch 2
> >>>> read buffer A     read buffer A     wait batch 0
> >>>>                                     wait batch 1
> >>>>                                     write buffer A
> >>>>
> >>>> This code is really untested. I'm hoping for some feedback if this is
> >>>> worth cleaning up, and testing more thoroughly.
> >>>
> >>> You say it's an optimization -- do you have performance numbers?
> >>
> >> 33% improvement on a hacked version of gem_ring_sync_loop with.
> >>
> >> It's not really a valid test as it's not coherent, but this is
> >> approximately the best case improvement.
> >>
> >> Oddly semaphores don't make much difference in this test, which
> >> was surprising.
> >
> > Our domain tracking is already complicated in unfunny ways. And (at least
> > without a use-case showing gains with hard numbers in either perf or power
> > usage) I think this patch is the kind of "this looks cool" stuff that
> > added a lot to the current problem.
> >
> > So before adding more complexity on top I'd like to remove some of the
> > superfluous stuff we already have. I.e. all the flushing_list stuff and
> > maybe other things ...
>
> Can you be more clear on what exactly you want done before taking a
> patch like this? Maybe I can work on it during some down time.

If it claims to be an optimization, at a minimum the patch should
include performance numbers.
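
For readers following the thread, the bookkeeping in the quoted patch
description (reads on different rings need no wait against each other,
but a write must wait on every other ring's outstanding read) could be
sketched roughly as below. This is a standalone illustration, not the
patch and not the i915 driver's code: struct buffer_tracking,
track_read(), prepare_write(), NUM_RINGS, and the convention that
seqno 0 means "no outstanding read" are all hypothetical names and
assumptions made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_RINGS 3 /* assumption: three rings, as in the quoted picture */

/* Hypothetical per-buffer state; 0 means "no outstanding read". */
struct buffer_tracking {
	uint32_t last_read_seqno[NUM_RINGS];
};

/* Record a GPU read of the buffer on `ring`. Reads never wait on reads. */
static void track_read(struct buffer_tracking *buf, int ring, uint32_t seqno)
{
	assert(ring >= 0 && ring < NUM_RINGS);
	buf->last_read_seqno[ring] = seqno;
}

/*
 * Before `ring` writes the buffer, collect the seqno to wait for on every
 * other ring that still has an outstanding read. wait_seqno[i] is set to 0
 * when no wait on ring i is needed. Returns the number of waits required
 * and clears the tracking, since the write is ordered after all of them.
 */
static int prepare_write(struct buffer_tracking *buf, int ring,
			 uint32_t wait_seqno[NUM_RINGS])
{
	int waits = 0;

	assert(ring >= 0 && ring < NUM_RINGS);
	for (int i = 0; i < NUM_RINGS; i++) {
		wait_seqno[i] = 0;
		if (i != ring && buf->last_read_seqno[i] != 0) {
			wait_seqno[i] = buf->last_read_seqno[i];
			waits++;
		}
		buf->last_read_seqno[i] = 0;
	}
	return waits;
}
```

With reads tracked on rings 0 and 1, a write on ring 2 yields two
waits, matching the "wait batch 0 / wait batch 1" column in the quoted
picture. The per-ring array is what makes the more-than-two-rings case
cost extra state per buffer.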