On Tue, 19 Jun 2012 09:13:20 -0700, Ben Widawsky <ben at bwidawsk.net> wrote:
> On Tue, 19 Jun 2012 09:22:03 +0100
> Chris Wilson <chris at chris-wilson.co.uk> wrote:
>
> > On Mon, 18 Jun 2012 20:38:15 -0700, Ben Widawsky <ben at bwidawsk.net> wrote:
> > > The history on this patch goes back quite a way. This time around, the
> > > patch builds on top of the map_unsynchronized that Eric pushed. Eric's
> > > patch attempted only to solve the problem for LLC machines. Unlike
> > > my earlier versions of this patch (with the help from Daniel Vetter), we
> > > do not attempt to cpu map objects in a unsynchronized manner.
> > >
> > > The concept is fairly simple - once a buffer is moved into the GTT
> > > domain, we can assume it remains there unless we tell it otherwise (via
> > > cpu map). It therefore stands to reason that as long as we can keep the
> > > object in the GTT domain, and don't ever count on reading back contents,
> > > things might just work. I believe as long as we are doing GTT mappings
> > > only, we get to avoid worry about clflushing the dirtied cachelines, but
> > > that could use some fact checking.
> > >
> > > The patch makes some assumptions about how the kernel does buffer
> > > tracking, this could be conceived as an ABI dependency, but actually the
> > > behavior is pretty confined. It exploits the fact the BOs are only moved
> > > into the CPU domain under certain circumstances, and daintily dances
> > > around those conditions. The main thing here is we assume MADV_WILLNEED
> > > prevents the object from getting evicted.
> > >
> > > I am not aware of a good way to test it's effectiveness
> > > performance-wise; but it introduces no regressions with piglit on my
> > > ILK, or SNB.
> >
> > This is broken wrt to cache invalidation if I want to rewrite part of
> > the buffer that already has been read by the GPU.
> > -Chris
> >
>
> Well if you're talking about what I think you're talking about (ie. not
> clflushing, but simply dealing with the GPUs internal caching). It's a
> problem that has existed with all of the non-LLC non-blocking map
> patches; and sort of the point of non-blocking maps. Play it fast and
> loose, submit pipe controls if you get nervous.
>
> Did I catch your meaning, or were you just talking about clflushing
> stuff (we also miss chipset flush on really old platforms; I was
> thinking of restricting this to ILK only)?

Sorry, I actually meant GPU caches. However, I was under the false
impression that you were changing the existing API, not bringing GTT maps
into compliance with the new async mappings. My warning was merely about
the issue that can arise from missing the invalidate when reusing an
async map (or even if the sampler prefetches further than expected, which
is what I was stung by most recently).

Furthermore, Daniel has just added unconditional invalidates before each
batchbuffer, which neatly papers over this issue (in the future at least).

Again, sorry for the noise, please continue :)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre